Python natural language processing : explore NLP with machine learning and deep learning techniques / Jalaj Thanaki.

Ebook Central College Complete Available online

Format:
Book
Author/Creator:
Thanaki, Jalaj, author.
Language:
English
Subjects (All):
Python (Computer program language).
Natural language processing (Computer science).
Machine learning.
Physical Description:
1 online resource (476 pages) : illustrations
Edition:
1st ed.
Place of Publication:
Birmingham, England ; Mumbai, India : Packt Publishing, 2017.
Summary:
Leverage the power of machine learning and deep learning to extract information from text data

Key Features
- Implement Machine Learning and Deep Learning techniques for efficient natural language processing
- Get started with NLTK and implement NLP in your applications with ease
- Understand and interpret human languages with the power of text analysis via Python

Book Description
This book starts off by laying the foundation for Natural Language Processing and explaining why Python is one of the best options for building an NLP-based expert system, with advantages such as community support, availability of frameworks and so on. It then gives you a better understanding of freely available corpora and the different types of dataset. After this, you will know how to choose a dataset for natural language processing applications and find the right NLP techniques to process sentences in datasets and understand their structure. You will also learn how to tokenize different parts of sentences and ways to analyze them.

During the course of the book, you will explore the semantic as well as syntactic analysis of text. You will understand how to solve various ambiguities in processing human language and will come across various scenarios while performing text analysis. You will learn the very basics of getting the environment ready for natural language processing, move on to the initial setup, and then quickly understand sentences and language parts. You will learn the power of Machine Learning and Deep Learning to extract information from text data.

By the end of the book, you will have a clear understanding of natural language processing and will have worked on multiple examples that implement NLP in the real world.

What you will learn
- Focus on Python programming paradigms, which are used to develop NLP applications
- Understand corpus analysis and different types of data attributes
- Learn NLP using Python libraries such as NLTK, Polyglot, SpaCy, Stanford CoreNLP and so on
- Learn about Feature Extraction and Feature Selection as part of Feature Engineering
- Explore the advantages of vectorization in Deep Learning
- Get a better understanding of the architecture of a rule-based system
- Optimize and fine-tune Supervised and Unsupervised Machine Learning algorithms for NLP problems
- Identify Deep Learning techniques for Natural Language Processing and Natural Language Generation problems

Who this book is for
This book is intended for Python developers who wish to start with natural language processing and want to make their applications smarter by implementing NLP in them.
Contents:
Cover
Copyright
Credits
Foreword
About the Author
Acknowledgement
About the Reviewers
www.PacktPub.com
Customer Feedback
Table of Contents
Preface
Chapter 1: Introduction
Understanding natural language processing
Understanding basic applications
Understanding advanced applications
Advantages of togetherness - NLP and Python
Environment setup for NLTK
Tips for readers
Summary
Chapter 2: Practical Understanding of a Corpus and Dataset
What is a corpus?
Why do we need a corpus?
Understanding corpus analysis
Exercise
Understanding types of data attributes
Categorical or qualitative data attributes
Numeric or quantitative data attributes
Exploring different file formats for corpora
Resources for accessing free corpora
Preparing a dataset for NLP applications
Selecting data
Preprocessing the dataset
Formatting
Cleaning
Sampling
Transforming data
Web scraping
Chapter 3: Understanding the Structure of a Sentences
Understanding components of NLP
Natural language understanding
Natural language generation
Differences between NLU and NLG
Branches of NLP
Defining context-free grammar
Morphological analysis
What is morphology?
What are morphemes?
What is a stem?
What is morphological analysis?
What is a word?
Classification of morphemes
Free morphemes
Bound morphemes
Derivational morphemes
Inflectional morphemes
What is the difference between a stem and a root?
Lexical analysis
What is a token?
What are part of speech tags?
Process of deriving tokens
Difference between stemming and lemmatization
Applications
Syntactic analysis
What is syntactic analysis?
Semantic analysis
What is semantic analysis?
Lexical semantics
Hyponymy and hyponyms
Homonymy
Polysemy
What is the difference between polysemy and homonymy?
Application of semantic analysis
Handling ambiguity
Lexical ambiguity
Syntactic ambiguity
Approach to handle syntactic ambiguity
Semantic ambiguity
Pragmatic ambiguity
Discourse integration
Pragmatic analysis
Chapter 4: Preprocessing
Handling corpus-raw text
Getting raw text
Lowercase conversion
Sentence tokenization
Challenges of sentence tokenization
Stemming for raw text
Challenges of stemming for raw text
Lemmatization of raw text
Challenges of lemmatization of raw text
Stop word removal
Handling corpus-raw sentences
Word tokenization
Challenges for word tokenization
Word lemmatization
Challenges for word lemmatization
Basic preprocessing
Regular expressions
Basic level regular expression
Basic flags
Advanced level regular expression
Positive lookahead
Positive lookbehind
Negative lookahead
Negative lookbehind
Practical and customized preprocessing
Decide by yourself
Is preprocessing required?
What kind of preprocessing is required?
Understanding case studies of preprocessing
Grammar correction system
Sentiment analysis
Machine translation
Spelling correction
Approach
Chapter 5: Feature Engineering and NLP Algorithms
Understanding feature engineering
What is feature engineering?
What is the purpose of feature engineering?
Challenges
Basic feature of NLP
Parsers and parsing
Understanding the basics of parsers
Understanding the concept of parsing
Developing a parser from scratch
Types of grammar
Context-free grammar
Probabilistic context-free grammar
Calculating the probability of a tree
Calculating the probability of a string
Grammar transformation
Developing a parser with the Cocke-Kasami-Younger Algorithm
Developing parsers step-by-step
Existing parser tools
The Stanford parser
The spaCy parser
Extracting and understanding the features
Customizing parser tools
POS tagging and POS taggers
Understanding the concept of POS tagging and POS taggers
Developing POS taggers step-by-step
Plug and play with existing POS taggers
A Stanford POS tagger example
Using polyglot to generate POS tagging
Using POS tags as features
Name entity recognition
Classes of NER
Plug and play with existing NER tools
A Stanford NER example
A Spacy NER example
n-grams
Understanding n-gram using a practice example
Application
Bag of words
Understanding BOW
Understanding BOW using a practical example
Comparing n-grams and BOW
Semantic tools and resources
Basic statistical features for NLP
Basic mathematics
Basic concepts of linear algebra for NLP
Basic concepts of the probabilistic theory for NLP
Probability
Independent event and dependent event
Conditional probability
TF-IDF
Understanding TF-IDF
Understanding TF-IDF with a practical example
Using textblob
Using scikit-learn
Vectorization
Encoders and decoders
One-hot encoding
Understanding a practical example for one-hot encoding
Normalization
The linguistics aspect of normalization
The statistical aspect of normalization
Probabilistic models
Understanding probabilistic language modeling
Application of LM
Indexing
Ranking
Advantages of features engineering
Challenges of features engineering
Summary
Chapter 6: Advanced Feature Engineering and NLP Algorithms
Recall word embedding
Understanding the basics of word2vec
Distributional semantics
Defining word2vec
Necessity of unsupervised distribution semantic model - word2vec
Converting the word2vec model from black box to white box
Distributional similarity based representation
Understanding the components of the word2vec model
Input of the word2vec
Output of word2vec
Construction components of the word2vec model
Architectural component
Understanding the logic of the word2vec model
Vocabulary builder
Context builder
Neural network with two layers
Structural details of a word2vec neural network
Word2vec neural network layer's details
Softmax function
Main processing algorithms
Continuous bag of words
Skip-gram
Understanding algorithmic techniques and the mathematics behind the word2vec model
Understanding the basic mathematics for the word2vec algorithm
Techniques used at the vocabulary building stage
Lossy counting
Using it at the stage of vocabulary building
Techniques used at the context building stage
Dynamic window scaling
Understanding dynamic context window techniques
Subsampling
Pruning
Algorithms used by neural networks
Structure of the neurons
Basic neuron structure
Training a simple neuron
Define error function
Understanding gradient descent in word2vec
Single neuron application
Multi-layer neural networks
Backpropagation
Mathematics behind the word2vec model
Techniques used to generate final vectors and probability prediction stage
Hierarchical softmax
Negative sampling
Some of the facts related to word2vec
Applications of word2vec
Implementation of simple examples
Famous example (king - man + woman)
Advantages of word2vec
Challenges of word2vec
How is word2vec used in real-life applications?
When should you use word2vec?
Developing something interesting
Extension of the word2vec concept
Para2Vec
Doc2Vec
Applications of Doc2vec
GloVe
Importance of vectorization in deep learning
Chapter 7: Rule-Based System for NLP
Understanding of the rule-based system
What does the RB system mean?
Purpose of having the rule-based system
Why do we need the rule-based system?
Which kind of applications can use the RB approach over the other approaches?
What kind of resources do you need if you want to develop a rule-based system?
Architecture of the RB system
General architecture of the rule-based system as an expert system
Practical architecture of the rule-based system for NLP applications
Custom architecture - the RB system for NLP applications
Apache UIMA - the RB system for NLP applications
Understanding the RB system development life cycle
NLP applications using the rule-based system
Generalized AI applications using the rule-based system
Developing NLP applications using the RB system
Thinking process for making rules
Start with simple rules
Scraping the text data
Defining the rule for our goal
Coding our rule and generating a prototype and result
Python for pattern-matching rules for a proofreading application
Grammar correction
Template-based chatbot application
Flow of code
Advantages of template-based chatbot
Disadvantages of template-based chatbot
Comparing the rule-based approach with other approaches
Advantages of the rule-based system
Disadvantages of the rule-based system
Challenges for the rule-based system
Understanding word-sense disambiguation basics
Notes:
Includes index.
Description based on online resource; title from PDF title page (ebrary, viewed August 25, 2017).
OCLC:
999637356
