Java for data science : examine the techniques and Java tools supporting the growing field of data science / Richard M. Reese, Jennifer L. Reese.
- Format:
- Book
- Author/Creator:
- Reese, Richard M., author.
- Reese, Jennifer L., author.
- Language:
- English
- Subjects (All):
- Java (Computer program language).
- Machine learning.
- Physical Description:
- 1 online resource (376 pages)
- Edition:
- 1st edition
- Place of Publication:
- Birmingham, England ; Mumbai, [India] : Packt, 2017.
- System Details:
- text file
- Biography/History:
- Reese, Richard M.: Richard Reese has worked in industry and academia for the past 29 years. For 10 years he provided software development support at Lockheed and at one point developed a C-based network application. He was a contract instructor providing software training to industry for 5 years. Richard is currently an Associate Professor at Tarleton State University in Stephenville, Texas. He is the author of various books and video courses, including Natural Language Processing with Java, Java for Data Science, and Getting Started with Natural Language Processing in Java.
- Reese, Jennifer L.: Jennifer L. Reese studied computer science at Tarleton State University. She also earned her M.Ed. from Tarleton in December 2016. She currently teaches computer science to high-school students. Her interests include the integration of computer science concepts with other academic disciplines, increasing diversity in computer science courses, and the application of data science to the field of education. She has co-authored two books: Java for Data Science and Java 7 New Features Cookbook. She previously worked as a software engineer. In her free time she enjoys reading, cooking, and traveling, especially to any destination with a beach. She is a musician and appreciates a variety of musical genres.
- Summary:
- Examine the techniques and Java tools supporting the growing field of data science.
- About This Book: Your entry ticket to the world of data science with the stability and power of Java. Explore, analyse, and visualize your data effectively using easy-to-follow examples. Make your Java applications more capable using machine learning.
- Who This Book Is For: This book is for Java developers who are comfortable developing applications in Java. Those who now want to enter the world of data science or wish to build intelligent applications will find this book ideal. Aspiring data scientists will also find this book very helpful.
- What You Will Learn: Understand the nature and key concepts used in the field of data science. Grasp how data is collected, cleaned, and processed. Become comfortable with key data analysis techniques. See specialized analysis techniques centered on machine learning. Master the effective visualization of your data. Work with the Java APIs and techniques used to perform data analysis.
- In Detail: Data science is concerned with extracting knowledge and insights from a wide variety of data sources to analyse patterns or predict future behaviour. It draws from a wide array of disciplines including statistics, computer science, mathematics, machine learning, and data mining. In this book, we cover the important data science concepts and how they are supported by Java, as well as the often statistically challenging techniques, to provide you with an understanding of their purpose and application. The book starts with an introduction of data science, followed by the basic data science tasks of data collection, data cleaning, data analysis, and data visualization. This is followed by a discussion of statistical techniques and more advanced topics including machine learning, neural networks, and deep learning. The next section examines the major categories of data analysis including text, visual, and audio data, followed by a discussion of resources that support parallel implementation. The final chapter illustrates an in-depth data science problem and provides a comprehensive, Java-based solution. Due to the nature of the topic, simple examples of techniques are presented early followed by a more detailed treatment later in the book. This permits a more natural introduction to the techniques and concepts presented in the book.
- Style and approach: This book follows a tutorial approach, providing examples of each of the major concepts covered. With a...
- Contents:
- Cover
- Copyright
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Getting Started with Data Science
- Problems solved using data science
- Understanding the data science problem-solving approach
- Using Java to support data science
- Acquiring data for an application
- The importance and process of cleaning data
- Visualizing data to enhance understanding
- The use of statistical methods in data science
- Machine learning applied to data science
- Using neural networks in data science
- Deep learning approaches
- Performing text analysis
- Visual and audio analysis
- Improving application performance using parallel techniques
- Assembling the pieces
- Summary
- Chapter 2: Data Acquisition
- Understanding the data formats used in data science applications
- Overview of CSV data
- Overview of spreadsheets
- Overview of databases
- Overview of PDF files
- Overview of JSON
- Overview of XML
- Overview of streaming data
- Overview of audio/video/images in Java
- Data acquisition techniques
- Using the HttpUrlConnection class
- Web crawlers in Java
- Creating your own web crawler
- Using the crawler4j web crawler
- Web scraping in Java
- Using API calls to access common social media sites
- Using OAuth to authenticate users
- Handling Twitter
- Handling Wikipedia
- Handling Flickr
- Handling YouTube
- Searching by keyword
- Chapter 3: Data Cleaning
- Handling data formats
- Handling CSV data
- Handling spreadsheets
- Handling Excel spreadsheets
- Handling PDF files
- Handling JSON
- Using JSON streaming API
- Using the JSON tree API
- The nitty gritty of cleaning text
- Using Java tokenizers to extract words
- Java core tokenizers
- Third-party tokenizers and libraries
- Transforming data into a usable form
- Simple text cleaning
- Removing stop words
- Finding words in text
- Finding and replacing text
- Data imputation
- Subsetting data
- Sorting text
- Data validation
- Validating data types
- Validating dates
- Validating e-mail addresses
- Validating ZIP codes
- Validating names
- Cleaning images
- Changing the contrast of an image
- Smoothing an image
- Brightening an image
- Resizing an image
- Converting images to different formats
- Chapter 4: Data Visualization
- Understanding plots and graphs
- Visual analysis goals
- Creating index charts
- Creating bar charts
- Using country as the category
- Using decade as the category
- Creating stacked graphs
- Creating pie charts
- Creating scatter charts
- Creating histograms
- Creating donut charts
- Creating bubble charts
- Chapter 5: Statistical Data Analysis Techniques
- Working with mean, mode, and median
- Calculating the mean
- Using simple Java techniques to find mean
- Using Java 8 techniques to find mean
- Using Google Guava to find mean
- Using Apache Commons to find mean
- Calculating the median
- Using simple Java techniques to find median
- Using Apache Commons to find the median
- Calculating the mode
- Using ArrayLists to find multiple modes
- Using a HashMap to find multiple modes
- Using Apache Commons to find multiple modes
- Standard deviation
- Sample size determination
- Hypothesis testing
- Regression analysis
- Using simple linear regression
- Using multiple regression
- Chapter 6: Machine Learning
- Supervised learning techniques
- Decision trees
- Decision tree types
- Decision tree libraries
- Using a decision tree with a book dataset
- Testing the book decision tree
- Support vector machines
- Using an SVM for camping data
- Testing individual instances
- Bayesian networks
- Using a Bayesian network
- Unsupervised machine learning
- Association rule learning
- Using association rule learning to find buying relationships
- Reinforcement learning
- Chapter 7: Neural Networks
- Training a neural network
- Getting started with neural network architectures
- Understanding static neural networks
- A basic Java example
- Understanding dynamic neural networks
- Multilayer perceptron networks
- Building the model
- Evaluating the model
- Predicting other values
- Saving and retrieving the model
- Learning vector quantization
- Self-Organizing Maps
- Using a SOM
- Displaying the SOM results
- Additional network architectures and algorithms
- The k-Nearest Neighbors algorithm
- Instantaneously trained networks
- Spiking neural networks
- Cascading neural networks
- Holographic associative memory
- Backpropagation and neural networks
- Chapter 8: Deep Learning
- Deeplearning4j architecture
- Acquiring and manipulating data
- Reading in a CSV file
- Configuring and building a model
- Using hyperparameters in ND4J
- Instantiating the network model
- Training a model
- Testing a model
- Deep learning and regression analysis
- Preparing the data
- Setting up the class
- Reading and preparing the data
- Restricted Boltzmann Machines
- Reconstruction in an RBM
- Configuring an RBM
- Deep autoencoders
- Building an autoencoder in DL4J
- Configuring the network
- Building and training the network
- Saving and retrieving a network
- Specialized autoencoders
- Convolutional networks
- Recurrent Neural Networks
- Chapter 9: Text Analysis
- Implementing named entity recognition
- Using OpenNLP to perform NER
- Identifying location entities
- Classifying text
- Word2Vec and Doc2Vec
- Classifying text by labels
- Classifying text by similarity
- Understanding tagging and POS
- Using OpenNLP to identify POS
- Understanding POS tags
- Extracting relationships from sentences
- Using OpenNLP to extract relationships
- Sentiment analysis
- Downloading and extracting the Word2Vec model
- Building our model and classifying text
- Chapter 10: Visual and Audio Analysis
- Text-to-speech
- Using FreeTTS
- Getting information about voices
- Gathering voice information
- Understanding speech recognition
- Using CMUSphinx to convert speech to text
- Obtaining more detail about the words
- Extracting text from an image
- Using Tess4j to extract text
- Identifying faces
- Using OpenCV to detect faces
- Classifying visual data
- Creating a Neuroph Studio project for classifying visual images
- Training the model
- Chapter 11: Mathematical and Parallel Techniques for Data Analysis
- Implementing basic matrix operations
- Using GPUs with Deeplearning4j
- Using map-reduce
- Using Apache's Hadoop to perform map-reduce
- Writing the map method
- Writing the reduce method
- Creating and executing a new Hadoop job
- Various mathematical libraries
- Using the jblas API
- Using the Apache Commons math API
- Using the ND4J API
- Using OpenCL
- Using Aparapi
- Creating an Aparapi application
- Using Aparapi for matrix multiplication
- Using Java 8 streams
- Understanding Java 8 lambda expressions and streams
- Using Java 8 to perform matrix multiplication
- Using Java 8 to perform map-reduce
- Chapter 12: Bringing It All Together
- Defining the purpose and scope of our application
- Understanding the application's architecture
- Data acquisition using Twitter
- Understanding the TweetHandler class
- Extracting data for a sentiment analysis model
- Building the sentiment model
- Processing the JSON input
- Cleaning data to improve our results
- Performing sentiment analysis
- Analysing the results
- Other optional enhancements
- Index.
- Notes:
- Includes index.
- Description based on online resource; title from PDF title page (ebrary, viewed March 2, 2017).
- OCLC:
- 970818593