Hands-on data science with Anaconda : utilize right mix of tools to create high performance data science applications / Yuxing Yan, James Yan.
- Format:
- Book
- Author/Creator:
- Yan, Yuxing, author.
- Yan, James, author.
- Language:
- English
- Subjects (All):
- Python (Computer program language).
- Physical Description:
- 1 online resource (356 pages)
- Edition:
- 1st ed.
- Place of Publication:
- Birmingham ; Mumbai : Packt, 2018.
- Biography/History:
- Yan Yuxing: Yuxing Yan graduated from McGill University with a PhD in finance. Over the years, he has been teaching various finance courses at eight universities: McGill University and Wilfrid Laurier University (in Canada), Nanyang Technological University (in Singapore), Loyola University of Maryland, UMUC, Hofstra University, University at Buffalo, and Canisius College (in the US). His research and teaching areas include: market microstructure, open-source finance and financial data analytics. He has 22 publications including papers published in the Journal of Accounting and Finance, Journal of Banking and Finance, Journal of Empirical Finance, Real Estate Review, Pacific Basin Finance Journal, Applied Financial Economics, and Annals of Operations Research. He is good at several computer languages, such as SAS, R, Python, Matlab, and C. His four books are related to applying two pieces of open-source software to finance: Python for Finance (2014), Python for Finance (2nd ed., expected 2017), Python for Finance (Chinese version, expected 2017), and Financial Modeling Using R (2016). In addition, he is an expert on data, especially on financial databases. From 2003 to 2010, he worked at Wharton School as a consultant, helping researchers with their programs and data issues. In 2007, he published a book titled Financial Databases (with S. W. Zhu). This book is written in Chinese. Currently, he is writing a new book called Financial Modeling Using Excel in an R-Assisted Learning Environment. The phrase "R-Assisted" distinguishes it from other similar books related to Excel and financial modeling. New features include using a huge amount of public data related to economics, finance, and accounting; an efficient way to retrieve data: 3 seconds for each time series; a free financial calculator, showing 50 financial formulas instantly, 300 websites, 100 YouTube videos, 80 references, paperless for homework, midterms, and final exams; easy to extend for instructors; and especially, no need to learn R.
- Yan James: James Yan is an undergraduate student at the University of Toronto (UofT), currently double-majoring in computer science and statistics. He has hands-on knowledge of Python, R, Java, MATLAB, and SQL. During his study at UofT, he has taken many related courses, such as Methods of Data Analysis I and II, Methods of Applied Statistics, Introduction to Databases, Introduction to Artificial Intelligence, and Numerical Methods, including a capstone course on AI in clinical medicine.
- Summary:
- Hands-On Data Science with Anaconda gets you started with Anaconda and demonstrates how you can use it to perform data science operations in the real world. You will learn different ways to retrieve data from various sources, and explore the different visualization tools and packages available in Python, R, and Julia.
- Contents:
- Cover
- Title Page
- Copyright and Credits
- Dedication
- Packt Upsell
- Contributors
- Table of Contents
- Preface
- Chapter 1: Ecosystem of Anaconda
- Introduction
- Reasons for using Jupyter via Anaconda
- Using Jupyter without pre-installation
- Miniconda
- Anaconda Cloud
- Finding help
- Summary
- Review questions and exercises
- Chapter 2: Anaconda Installation
- Installing Anaconda
- Anaconda for Windows
- Testing Python
- Using IPython
- Using Python via Jupyter
- Introducing Spyder
- Installing R via Conda
- Installing Julia and linking it to Jupyter
- Installing Octave and linking it to Jupyter
- Chapter 3: Data Basics
- Sources of data
- UCI machine learning
- Introduction to the Python pandas package
- Several ways to input data
- Inputting data using R
- Inputting data using Python
- Introduction to the Quandl data delivery platform
- Dealing with missing data
- Data sorting
- Slicing and dicing datasets
- Merging different datasets
- Data output
- Introduction to the cbsodata Python package
- Introduction to the datadotworld Python package
- Introduction to the haven and foreign R packages
- Introduction to the dslabs R package
- Generating Python datasets
- Generating R datasets
- Chapter 4: Data Visualization
- Importance of data visualization
- Data visualization in R
- Data visualization in Python
- Data visualization in Julia
- Drawing simple graphs
- Various bar charts, pie charts, and histograms
- Adding a trend
- Adding legends and other explanations
- Visualization packages for R
- Visualization packages for Python
- Visualization packages for Julia
- Dynamic visualization
- Saving pictures as pdf
- Saving dynamic visualization as HTML file
- Summary
- Review questions and exercises
- Chapter 5: Statistical Modeling in Anaconda
- Introduction to linear models
- Running a linear regression in R, Python, Julia, and Octave
- Critical value and the decision rule
- F-test, critical value, and the decision rule
- An application of a linear regression in finance
- Removing missing data
- Replacing missing data with another value
- Detecting outliers and treatments
- Several multivariate linear models
- Collinearity and its solution
- A model's performance measure
- Chapter 6: Managing Packages
- Introduction to packages, modules, or toolboxes
- Two examples of using packages
- Finding all R packages
- Finding all Python packages
- Finding all Julia packages
- Finding all Octave packages
- Task views for R
- Finding manuals
- Package dependencies
- Package management in R
- Package management in Python
- Package management in Julia
- Package management in Octave
- Conda - the package manager
- Creating a set of programs in R and Python
- Finding environmental variables
- Chapter 7: Optimization in Anaconda
- Why optimization is important
- General issues for optimization problems
- Expressing various kinds of optimization problems as LPP
- Quadratic optimization
- Optimization in R
- Optimization in Python
- Optimization in Julia
- Optimization in Octave
- Example #1 - stock portfolio optimization
- Example #2 - optimal tax policy
- Packages for optimization in R
- Packages for optimization in Python
- Packages for optimization in Octave
- Packages for optimization in Julia
- Chapter 8: Unsupervised Learning in Anaconda
- Introduction to unsupervised learning
- Hierarchical clustering
- k-means clustering
- Introduction to Python packages - scipy
- Introduction to Python packages - contrastive
- Introduction to Python packages - sklearn (scikit-learn)
- Introduction to R packages - rattle
- Introduction to R packages - randomUniformForest
- Introduction to R packages - Rmixmod
- Implementation using Julia
- Task view for Cluster Analysis
- Chapter 9: Supervised Learning in Anaconda
- A glance at supervised learning
- Classification
- The k-nearest neighbors algorithm
- Bayes classifiers
- Reinforcement learning
- Implementation of supervised learning via R
- Introduction to RTextTools
- Implementation via Python
- Using the scikit-learn (sklearn) module
- Implementation via Octave
- Implementation via Julia
- Task view for machine learning in R
- Chapter 10: Predictive Data Analytics - Modeling and Validation
- Understanding predictive data analytics
- Useful datasets
- The AppliedPredictiveModeling R package
- Time series analytics
- Predicting future events
- Seasonality
- Visualizing components
- R package - LiblineaR
- R package - datarobot
- R package - eclust
- Model selection
- Python package - model-catwalk
- Python package - sklearn
- Julia package - QuantEcon
- Octave package - ltfat
- Granger causality test
- Chapter 11: Anaconda Cloud
- Introduction to Anaconda Cloud
- Jupyter Notebook in depth
- Formats of Jupyter Notebook
- Sharing of notebooks
- Sharing of projects
- Sharing of environments
- Replicating others' environments locally
- Downloading a package from Anaconda
- Chapter 12: Distributed Computing, Parallel Computing, and HPCC
- Introduction to distributed versus parallel computing
- Task view for parallel processing
- Sample programs in Python
- Understanding MPI
- R package Rmpi
- R package plyr
- R package parallel
- R package snow
- Parallel processing in Python
- Parallel processing for word frequency
- Parallel Monte-Carlo options pricing
- Compute nodes
- Anaconda add-on
- Introduction to HPCC
- References
- Chapter 01: Ecosystem of Anaconda
- Chapter 02: Anaconda Installation
- Chapter 03: Data Basics
- Chapter 04: Data Visualization
- Chapter 05: Statistical Modeling in Anaconda
- Chapter 06: Managing Packages
- Chapter 07: Optimization in Anaconda
- Chapter 08: Unsupervised Learning in Anaconda
- Chapter 09: Supervised Learning in Anaconda
- Chapter 10: Predictive Data Analytics - Modeling and Validation
- Other Books You May Enjoy
- Index.
- Notes:
- Description based on print version record.
- ISBN:
- 9781788834735
- 1788834739
- OCLC:
- 1039690173