3 options

Learning data mining with Python : use Python to manipulate data and build predictive models / Robert Layton.

EBSCOhost Academic eBook Collection (North America) Available online

Ebook Central College Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:: Book
Author/Creator:: Layton, Robert, 1986- author.
Language:: English
Subjects (All):: Python (Computer program language).
Physical Description:: 1 online resource (348 pages)
Edition:: Second edition.
Other Title:: Use Python to manipulate data and build predictive models
Place of Publication:: Birmingham, [England] ; Mumbai, [India] : Packt Publishing, 2017.
System Details:: text file
Summary:: Harness the power of Python to develop data mining applications, analyze data, delve into machine learning, explore object detection using Deep Neural Networks, and create insightful predictive models.
About This Book: Use a wide variety of Python libraries for practical data mining purposes. Learn how to find, manipulate, analyze, and visualize data using Python. Step-by-step instructions on data mining techniques with Python that have real-world applications.
Who This Book Is For: If you are a Python programmer who wants to get started with data mining, then this book is for you. If you are a data analyst who wants to leverage the power of Python to perform data mining efficiently, this book will also help you. No previous experience with data mining is expected.
What You Will Learn: Apply data mining concepts to real-world problems; Predict the outcome of sports matches based on past results; Determine the author of a document based on their writing style; Use APIs to download datasets from social media and other online services; Find and extract good features from difficult datasets; Create models that solve real-world problems; Design and develop data mining applications using a variety of datasets; Perform object detection in images using Deep Neural Networks; Find meaningful insights from your data through intuitive visualizations; Compute on big data, including real-time data from the internet.
In Detail: This book teaches you to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis. This book covers a large number of libraries available in Python, including the Jupyter Notebook, pandas, scikit-learn, and NLTK. You will gain hands-on experience with complex data types including text, images, and graphs. You will also discover object detection using Deep Neural Networks, which is one of the big, difficult areas of machine learning right now. With restructured examples and code samples updated for the latest edition of Python, each chapter of this book introduces you to new algorithms and techniques. By the end of the book, you will have great insights into using Python for data mining and an understanding of the algorithms as well as their implementations.
Style and approach: This book will be your comprehensive guide to learning the various data mining techniques and implementing them in Python. A variety of real-world datasets is used to explain data mining techniques in a very crisp...
Contents:: Cover; Copyright; Credits; About the Author; About the Reviewer; www.PacktPub.com; Customer Feedback; Table of Contents; Preface;
Chapter 1: Getting Started with Data Mining; Introducing data mining; Using Python and the Jupyter Notebook; Installing Python; Installing Jupyter Notebook; Installing scikit-learn; A simple affinity analysis example; What is affinity analysis?; Product recommendations; Loading the dataset with NumPy; Downloading the example code; Implementing a simple ranking of rules; Ranking to find the best rules; A simple classification example; What is classification?; Loading and preparing the dataset; Implementing the OneR algorithm; Testing the algorithm; Summary;
Chapter 2: Classifying with scikit-learn Estimators; scikit-learn estimators; Nearest neighbors; Distance metrics; Loading the dataset; Moving towards a standard workflow; Running the algorithm; Setting parameters; Preprocessing; Standard pre-processing; Putting it all together; Pipelines;
Chapter 3: Predicting Sports Winners with Decision Trees; Collecting the data; Using pandas to load the dataset; Cleaning up the dataset; Extracting new features; Decision trees; Parameters in decision trees; Using decision trees; Sports outcome prediction; Random forests; How do ensembles work?; Setting parameters in Random Forests; Applying random forests; Engineering new features;
Chapter 4: Recommending Movies Using Affinity Analysis; Affinity analysis; Algorithms for affinity analysis; Overall methodology; Dealing with the movie recommendation problem; Obtaining the dataset; Loading with pandas; Sparse data formats; Understanding the Apriori algorithm and its implementation; Looking into the basics of the Apriori algorithm; Implementing the Apriori algorithm; Extracting association rules; Evaluating the association rules;
Chapter 5: Features and scikit-learn Transformers; Feature extraction; Representing reality in models; Common feature patterns; Creating good features; Feature selection; Selecting the best individual features; Feature creation; Principal Component Analysis; Creating your own transformer; The transformer API; Implementing a Transformer; Unit testing;
Chapter 6: Social Media Insight using Naive Bayes; Disambiguation; Downloading data from a social network; Loading and classifying the dataset; Creating a replicable dataset from Twitter; Text transformers; Bag-of-words models; n-gram features; Other text features; Naive Bayes; Understanding Bayes' theorem; Naive Bayes algorithm; How it works; Applying of Naive Bayes; Extracting word counts; Converting dictionaries to a matrix; Evaluation using the F1-score; Getting useful features from models;
Chapter 7: Follow Recommendations Using Graph Mining; Classifying with an existing model; Getting follower information from Twitter; Building the network; Creating a graph; Creating a similarity graph; Finding subgraphs; Connected components; Optimizing criteria;
Chapter 8: Beating CAPTCHAs with Neural Networks; Artificial neural networks; An introduction to neural networks; Creating the dataset; Drawing basic CAPTCHAs; Splitting the image into individual letters; Creating a training dataset; Training and classifying; Back-propagation; Predicting words; Improving accuracy using a dictionary; Ranking mechanisms for word similarity; Putting it all together;
Chapter 9: Authorship Attribution; Attributing documents to authors; Applications and use cases; Authorship attribution; Getting the data; Using function words; Counting function words; Classifying with function words; Support Vector Machines; Classifying with SVMs; Kernels; Character n-grams; Extracting character n-grams; The Enron dataset; Accessing the Enron dataset; Creating a dataset loader; Evaluation;
Chapter 10: Clustering News Articles; Trending topic discovery; Using a web API to get data; Reddit as a data source; Extracting text from arbitrary websites; Finding the stories in arbitrary websites; Extracting the content; Grouping news articles; The k-means algorithm; Evaluating the results; Extracting topic information from clusters; Using clustering algorithms as transformers; Clustering ensembles; Evidence accumulation; Implementation; Online learning;
Chapter 11: Object Detection in Images using Deep Neural Networks; Object classification; Use cases; Application scenario; Deep neural networks; Intuition; Implementing deep neural networks; An Introduction to TensorFlow; Using Keras; Convolutional Neural Networks; GPU optimization; When to use GPUs for computation; Running our code on a GPU; Setting up the environment; Application; Creating the neural network;
Chapter 12: Working with Big Data; Big data; Applications of big data; MapReduce; The intuition behind MapReduce; A word count example; Hadoop MapReduce; Applying MapReduce; Naive Bayes prediction; The mrjob package; Extracting the blog posts; Training Naive Bayes; Training on Amazon's EMR infrastructure;
Appendix: Next Steps...; Getting Started with Data Mining; Scikit-learn tutorials; Extending the Jupyter Notebook; More datasets; Other Evaluation Metrics; More application ideas; Classifying with scikit-learn Estimators; Scalability with the nearest neighbor; More complex pipelines; Comparing classifiers; Automated Learning; Predicting Sports Winners with Decision Trees; More complex features; Dask; Research; Recommending Movies Using Affinity Analysis; New datasets; The Eclat algorithm; Collaborative Filtering; Extracting Features with Transformers; Adding noise; Vowpal Wabbit; word2vec; Social Media Insight Using Naive Bayes; Spam detection; Natural language processing and part-of-speech tagging; Discovering Accounts to Follow Using Graph Mining; More complex algorithms; NetworkX; Beating CAPTCHAs with Neural Networks; Better (worse?) CAPTCHAs; Deeper networks; Reinforcement learning; Authorship Attribution; Increasing the sample size; Blogs dataset; Local n-grams; Clustering News Articles; Clustering Evaluation; Temporal analysis; Real-time clusterings; Classifying Objects in Images Using Deep Learning; Mahotas; Magenta; Working with Big Data; Courses on Hadoop; Pydoop; Recommendation engine; W.I.L.L; More resources; Kaggle competitions; Coursera; Index.
Notes:: Includes bibliographical references and index.; Description based on online resource; title from PDF title page (ebrary, viewed July 12, 2017).
ISBN:: 9781787129566; 178712956X
OCLC:: 987331258

