R data analysis projects : build end to end analytics systems to get deeper insights from your data / Gopi Subramanian.
- Format:
- Book
- Author/Creator:
- Subramanian, Gopi, author.
- Language:
- English
- Subjects (All):
- R (Computer program language).
- Physical Description:
- 1 online resource (1 volume) : illustrations
- Edition:
- 1st edition
- Other Title:
- Build end to end analytics systems to get deeper insights from your data
- Place of Publication:
- Birmingham, England ; Mumbai, [India] : Packt Publishing, 2017.
- System Details:
- text file
- Summary:
- Get valuable insights from your data by building data analysis systems from scratch with R.
About This Book: A handy guide to take your understanding of data analysis with R to the next level; Real-world projects that focus on problems in finance, network analysis, social media, and more; From data manipulation to analysis to visualization in R, this book will teach you everything you need to know about building end-to-end data analysis pipelines using R.
Who This Book Is For: If you are looking for a book that takes you all the way through the practical application of advanced and effective analytics methodologies in R, then this is the book for you. A fundamental understanding of R and the basic concepts of data analysis is all you need to get started with this book.
What You Will Learn: Build end-to-end predictive analytics systems in R; Build an experimental design to gather your own data and conduct analysis; Build a recommender system from scratch using different approaches; Use and leverage RShiny to build reactive programming applications; Build systems for varied domains including market research, network analysis, social media analysis, and more; Explore various R Packages such as RShiny, ggplot, recommenderlab, dplyr, and find out how to use them effectively; Communicate modeling results using Shiny Dashboards; Perform multi-variate time-series analysis prediction, supplemented with sensitivity analysis and risk modeling.
In Detail: R offers a large variety of packages and libraries for fast and accurate data analysis and visualization. As a result, it's one of the most popularly used languages by data scientists and analysts, or anyone who wants to perform data analysis. This book will demonstrate how you can put to use your existing knowledge of data analysis in R to build highly efficient, end-to-end data analysis pipelines without any hassle.
You'll start by building a content-based recommendation system, followed by building a project on sentiment analysis with tweets. You'll implement time-series modeling for anomaly detection, and understand cluster analysis of streaming data. You'll work through projects on performing efficient market data research, building recommendation systems, and analyzing networks accurately, all provided with easy to follow codes. With the help of these real-world projects, you'll get a better understanding of the challenges faced when building data analysis pipelines, and see how you can overcome them without comp...
- Contents:
- Cover
- Title Page
- Copyright
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Association Rule Mining
- Understanding the recommender systems
- Transactions
- Weighted transactions
- Our web application
- Retailer use case and data
- Association rule mining
- Support and confidence thresholds
- The cross-selling campaign
- Leverage
- Conviction
- Weighted association rule mining
- Hyperlink-induced topic search (HITS)
- Negative association rules
- Rules visualization
- Wrapping up
- Summary
- Chapter 2: Fuzzy Logic Induced Content-Based Recommendation
- Introducing content-based recommendation
- News aggregator use case and data
- Designing the content-based recommendation engine
- Building a similarity index
- Bag-of-words
- Term frequency
- Document frequency
- Inverse document frequency (IDF)
- TFIDF
- Why cosine similarity?
- Searching
- Polarity scores
- Jaccard's distance
- Jaccards distance/index
- Ranking search results
- Fuzzy logic
- Fuzzification
- Defining the rules
- Evaluating the rules
- Defuzzification
- Complete R Code
- Chapter 3: Collaborative Filtering
- Collaborative filtering
- Memory-based approach
- Model-based approach
- Latent factor approach
- Recommenderlab package
- Popular approach
- Use case and data
- Designing and implementing collaborative filtering
- Ratings matrix
- Normalization
- Train test split
- Train model
- User-based models
- Item-based models
- Factor-based models
- Chapter 4: Taming Time Series Data Using Deep Neural Networks
- Time series data
- Non-seasonal time series
- Seasonal time series
- Time series as a regression problem
- Deep neural networks
- Forward cycle
- Backward cycle
- Introduction to the MXNet R package
- Symbolic programming in MXNet
- Softmax activation
- Deep networks for time series prediction
- Training test split
- Complete R code
- Chapter 5: Twitter Text Sentiment Classification Using Kernel Density Estimates
- Kernel density estimation
- Twitter text
- Sentiment classification
- Dictionary methods
- Machine learning methods
- Our approach
- Dictionary based scoring
- Text pre-processing
- Term-frequency inverse document frequency (TFIDF)
- Delta TFIDF
- Building a sentiment classifier
- Assembling an RShiny application
- Chapter 6: Record Linkage - Stochastic and Machine Learning Approaches
- Introducing our use case
- Demonstrating the use of RecordLinkage package
- Feature generation
- String features
- Phonetic features
- Stochastic record linkage
- Expectation maximization method
- Weights-based method
- Machine learning-based record linkage
- Unsupervised learning
- Supervised learning
- Building an RShiny application
- Machine learning method
- RShiny application
- Chapter 7: Streaming Data Clustering Analysis in R
- Streaming data and its challenges
- Bounded problems
- Drift
- Single pass
- Real time
- Introducing stream clustering
- Macro-cluster
- Introducing the stream package
- Data stream data
- DSD as a static simulator
- DSD as a simulator with drift
- DSD connecting to memory, file, or database
- Inflight operation
- Can we connect this DSD to an actual data stream?
- Data stream task
- Speed layer
- Batch layer
- Reservoir sampling
- Chapter 8: Analyze and Understand Networks Using R
- Graphs in R
- Degree of a vertex
- Strength of a vertex
- Adjacency Matrix
- More networks in R
- Centrality of a vertex
- Farness and Closeness of a node
- Finding the shortest path between nodes
- Random walk on a graph
- Data preparation
- Product network analysis
- Building a RShiny application
- The complete R script
- Index.
- Notes:
- Includes index.
- Description based on online resource; title from PDF title page (EBC, viewed December 18, 2017).
- OCLC:
- 1017754231