Natural language processing with Java : techniques for building machine learning and neural network models for NLP / Richard M. Reese, AshishSingh Bhatia.

EBSCOhost Academic eBook Collection (North America) Available online

Ebook Central Academic Complete Available online

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Reese, Richard M., author.
Bhatia, AshishSingh, author.
Language:
English
Subjects (All):
Natural language processing (Computer science).
Java (Computer program language).
Physical Description:
1 online resource (308 pages) : illustrations
Edition:
Second edition.
Place of Publication:
Birmingham ; Mumbai : Packt, 2018.
System Details:
text file
Biography/History:
Richard M. Reese: Richard Reese has worked in industry and academia for the past 29 years. For 10 years, he provided software development support at Lockheed and at one point developed a C-based network application. He was a contract instructor providing software training to industry for 5 years. Richard is currently an Associate Professor at Tarleton State University in Stephenville, Texas. Richard is the author of various books and video courses, including Natural Language Processing with Java, Java for Data Science, and Getting Started with Natural Language Processing in Java.
Summary:
Explore various approaches to organize and extract useful text from unstructured data using Java

Key Features
Use deep learning and NLP techniques in Java to discover hidden insights in text
Work with popular Java libraries such as CoreNLP, OpenNLP, and Mallet
Explore machine translation, identifying parts of speech, and topic modeling

Book Description
Natural Language Processing (NLP) allows you to take any sentence and identify patterns, special names, company names, and more. The second edition of Natural Language Processing with Java teaches you how to perform language analysis with the help of Java libraries, while constantly gaining insights from the outcomes. You'll start by understanding how NLP and its various concepts work. Having got to grips with the basics, you'll explore important tools and libraries in Java for NLP, such as CoreNLP, OpenNLP, Neuroph, and Mallet. You'll then start performing NLP on different inputs and tasks, such as tokenization, model training, parts-of-speech and parsing trees. You'll learn about statistical machine translation, summarization, dialog systems, complex searches, supervised and unsupervised NLP, and more. By the end of this book, you'll have learned more about NLP, neural networks, and various other trained models in Java for enhancing the performance of NLP applications.

What you will learn
Understand basic NLP tasks and how they relate to one another
Discover and use the available tokenization engines
Apply search techniques to find people, as well as things, within a document
Construct solutions to identify parts of speech within sentences
Use parsers to extract relationships between elements of a document
Identify topics in a set of documents
Explore topic modeling from a document

Who this book is for
Natural Language Processing with Java is for you if you are a data analyst, data scientist, or machine learning engineer who wants to extract information from a language using Java. Knowledge of Java programming is needed, while a basic understanding of statistics will be useful but not mandatory.
Contents:
Cover
Title Page
Copyright and Credits
Dedication
Packt Upsell
Contributors
Table of Contents
Preface
Chapter 1: Introduction to NLP
What is NLP?
Why use NLP?
Why is NLP so hard?
Survey of NLP tools
Apache OpenNLP
Stanford NLP
LingPipe
GATE
UIMA
Apache Lucene Core
Deep learning for Java
Overview of text-processing tasks
Finding parts of text
Finding sentences
Feature-engineering
Finding people and things
Detecting parts of speech
Classifying text and documents
Extracting relationships
Using combined approaches
Understanding NLP models
Identifying the task
Selecting a model
Building and training the model
Verifying the model
Using the model
Preparing data
Summary
Chapter 2: Finding Parts of Text
Understanding the parts of text
What is tokenization?
Uses of tokenizers
Simple Java tokenizers
Using the Scanner class
Specifying the delimiter
Using the split method
Using the BreakIterator class
Using the StreamTokenizer class
Using the StringTokenizer class
Performance considerations with Java core tokenization
NLP tokenizer APIs
Using the OpenNLPTokenizer class
Using the SimpleTokenizer class
Using the WhitespaceTokenizer class
Using the TokenizerME class
Using the Stanford tokenizer
Using the PTBTokenizer class
Using the DocumentPreprocessor class
Using a pipeline
Using LingPipe tokenizers
Training a tokenizer to find parts of text
Comparing tokenizers
Understanding normalization
Converting to lowercase
Removing stopwords
Creating a StopWords class
Using LingPipe to remove stopwords
Using stemming
Using the Porter Stemmer
Stemming with LingPipe
Using lemmatization
Using the StanfordLemmatizer class
Using lemmatization in OpenNLP
Normalizing using a pipeline
Chapter 3: Finding Sentences
The SBD process
What makes SBD difficult?
Understanding the SBD rules of LingPipe's HeuristicSentenceModel class
Simple Java SBDs
Using regular expressions
Using NLP APIs
Using OpenNLP
Using the SentenceDetectorME class
Using the sentPosDetect method
Using the Stanford API
Using the StanfordCoreNLP class
Using LingPipe
Using the IndoEuropeanSentenceModel class
Using the SentenceChunker class
Using the MedlineSentenceModel class
Training a sentence-detector model
Using the Trained model
Evaluating the model using the SentenceDetectorEvaluator class
Chapter 4: Finding People and Things
Why is NER difficult?
Techniques for name recognition
Lists and regular expressions
Statistical classifiers
Using regular expressions for NER
Using Java's regular expressions to find entities
Using the RegExChunker class of LingPipe
Using OpenNLP for NER
Determining the accuracy of the entity
Using other entity types
Processing multiple entity types
Using the Stanford API for NER
Using LingPipe for NER
Using LingPipe's named entity models
Using the ExactDictionaryChunker class
Building a new dataset with the NER annotation tool
Training a model
Evaluating a model
Chapter 5: Detecting Part of Speech
The tagging process
The importance of POS taggers
What makes POS difficult?
Using the NLP APIs
Using OpenNLP POS taggers
Using the OpenNLP POSTaggerME class for POS taggers
Using OpenNLP chunking
Using the POSDictionary class
Obtaining the tag dictionary for a tagger
Determining a word's tags
Changing a word's tags
Adding a new tag dictionary
Creating a dictionary from a file
Using Stanford POS taggers
Using Stanford MaxentTagger
Using the MaxentTagger class to tag textese
Using the Stanford pipeline to perform tagging
Using LingPipe POS taggers
Using the HmmDecoder class with Best_First tags
Using the HmmDecoder class with NBest tags
Determining tag confidence with the HmmDecoder class
Training the OpenNLP POSModel
Chapter 6: Representing Text with Features
N-grams
Word embedding
GloVe
Word2vec
Dimensionality reduction
Principal component analysis
Distributed stochastic neighbor embedding
Chapter 7: Information Retrieval
Boolean retrieval
Dictionaries and tolerant retrieval
Wildcard queries
Spelling correction
Soundex
Vector space model
Scoring and term weighting
Inverse document frequency
TF-IDF weighting
Evaluation of information retrieval systems
Chapter 8: Classifying Texts and Documents
How classification is used
Understanding sentiment analysis
Text-classifying techniques
Using APIs to classify text
Training an OpenNLP classification model
Using DocumentCategorizerME to classify text
Using the ColumnDataClassifier class for classification
Using the Stanford pipeline to perform sentiment analysis
Using LingPipe to classify text
Training text using the Classified class
Using other training categories
Classifying text using LingPipe
Sentiment analysis using LingPipe
Language identification using LingPipe
Chapter 9: Topic Modeling
What is topic modeling?
The basics of LDA
Topic modeling with MALLET
Training
Evaluation
Chapter 10: Using Parsers to Extract Relationships
Relationship types
Understanding parse trees
Using extracted relationships
Using the LexicalizedParser class
Using the TreePrint class
Finding word dependencies using the GrammaticalStructure class
Finding coreference resolution entities
Extracting relationships for a question-answer system
Finding the word dependencies
Determining the question type
Searching for the answer
Chapter 11: Combined Pipeline
Using boilerpipe to extract text from HTML
Using POI to extract text from Word documents
Using PDFBox to extract text from PDF documents
Using Apache Tika for content analysis and extraction
Pipelines
Using the Stanford pipeline
Using multiple cores with the Stanford pipeline
Creating a pipeline to search text
Chapter 12: Creating a Chatbot
Chatbot architecture
Artificial Linguistic Internet Computer Entity
Understanding AIML
Developing a chatbot using ALICE and AIML
Other Books You May Enjoy
Index.
Notes:
Includes index.
Description based on print version record.
ISBN:
9781788993067
1788993063
OCLC:
1048799933
