Natural language processing recipes : unlocking text data with machine learning and deep learning using Python / Akshay Kulkarni, Adarsha Shivananda.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Kulkarni, Akshay, author.
Shivananda, Adarsha, author.
Language:
English
Subjects (All):
Natural language processing (Computer science).
Physical Description:
1 online resource (302 pages)
Edition:
Second edition.
Place of Publication:
[Place of publication not identified] : Apress, [2021]
Summary:
Focus on implementing end-to-end projects in Python using state-of-the-art algorithms. This book teaches you to use a wide range of natural language processing (NLP) packages to implement text classification, part-of-speech identification, topic modeling, text summarization, sentiment analysis, information retrieval, and many other NLP applications. The book begins with text data collection, web scraping, and the different types of data sources. It explains how to clean and preprocess text data and offers ways to analyze data with advanced algorithms. You then explore semantic and syntactic analysis of text. Complex NLP solutions that involve text normalization are covered, along with advanced preprocessing methods, POS tagging, parsing, text summarization, sentiment analysis, word2vec, seq2seq, and much more. The book presents the fundamentals necessary for applying machine learning and deep learning in NLP. This second edition covers advanced techniques for converting text to features, such as GloVe, ELMo, and BERT. It also explains how transformers work, taking Sentence-BERT and GPT as examples. The final chapters explain advanced industrial applications of NLP with solution implementations that leverage deep learning techniques. The book also employs advanced RNNs, such as long short-term memory (LSTM), to solve complex text generation tasks. After reading this book, you will have a clear understanding of the challenges faced by different industries and will have worked on multiple examples of implementing NLP in the real world.
What You Will Learn
* Know the core concepts of natural language processing (NLP) and various approaches to implementing it, including Python libraries such as NLTK, TextBlob, spaCy, Stanford CoreNLP, and more
* Implement text preprocessing and feature engineering in NLP, including advanced methods of feature engineering
* Understand and implement the concepts of information retrieval, text summarization, sentiment analysis, text classification, and other advanced NLP techniques leveraging machine learning and deep learning
Who This Book Is For
Data scientists who want to refresh and learn various concepts of natural language processing (NLP) through coding exercises
Contents:
Intro
Table of Contents
About the Authors
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Extracting the Data
Client Data
Free Sources
Web Scraping
Recipe 1-1. Collecting Data
Problem
Solution
How It Works
Step 1-1. Log in to the Twitter developer portal
Step 1-2. Execute query in Python
Recipe 1-2. Collecting Data from PDFs
Step 2-1. Install and import all the necessary libraries
Step 2-2. Extract text from a PDF file
Recipe 1-3. Collecting Data from Word Files
Step 3-1. Install and import all the necessary libraries
Step 3-2. Extract text from a Word file
Recipe 1-4. Collecting Data from JSON
Step 4-1. Install and import all the necessary libraries
Step 4-2. Extract text from a JSON file
Recipe 1-5. Collecting Data from HTML
Step 5-1. Install and import all the necessary libraries
Step 5-2. Fetch the HTML file
Step 5-3. Parse the HTML file
Step 5-4. Extract a tag value
Step 5-5. Extract all instances of a particular tag
Step 5-6. Extract all text from a particular tag
Recipe 1-6. Parsing Text Using Regular Expressions
Tokenizing
Extracting Email IDs
Replacing Email IDs
Extracting Data from an eBook and Performing regex
Recipe 1-7. Handling Strings
Replacing Content
Concatenating Two Strings
Searching for a Substring in a String
Recipe 1-8. Scraping Text from the Web
Step 8-1. Install all the necessary libraries
Step 8-2. Import the libraries
Step 8-3. Identify the URL to extract the data
Step 8-4. Request the URL and download the content using Beautiful Soup
Step 8-5. Understand the website's structure to extract the required information
Step 8-6. Use Beautiful Soup to extract and parse the data from HTML tags
Step 8-7. Convert lists to a data frame and perform an analysis that meets business requirements
Step 8-8. Download the data frame
Chapter 2: Exploring and Processing Text Data
Recipe 2-1. Converting Text Data to Lowercase
Step 1-1. Read/create the text data
Step 1-2. Execute the lower() function on the text data
Recipe 2-2. Removing Punctuation
Step 2-1. Read/create the text data
Step 2-2. Execute the replace() function on the text data
Recipe 2-3. Removing Stop Words
Step 3-1. Read/create the text data
Step 3-2. Remove punctuation from the text data
Recipe 2-4. Standardizing Text
Step 4-1. Create a custom lookup dictionary
Step 4-2. Create a custom function for text standardization
Step 4-3. Run the text_std function
Recipe 2-5. Correcting Spelling
Step 5-1. Read/create the text data
Step 5-2. Execute spelling correction on the text data
Recipe 2-6. Tokenizing Text
Step 6-1. Read/create the text data
Step 6-2. Tokenize the text data
Recipe 2-7. Stemming
Step 7-1. Read the text data
Step 7-2. Stem the text
Recipe 2-8. Lemmatizing
Step 8-1. Read the text data
Step 8-2. Lemmatize the data
Recipe 2-9. Exploring Text Data
Step 9-1. Read the text data
Step 9-2. Import necessary libraries
Step 9-3. Check the number of words in the data
Step 9-4. Compute the frequency of all words in the reviews
Step 9-5. Consider words with length greater than 3 and plot
Step 9-6. Build a word cloud
Recipe 2-10. Dealing with Emojis and Emoticons
Step 10-A1. Read the text data
Step 10-A2. Install and import necessary libraries
Step 10-A3. Write a function that converts emojis into words
Step 10-A4. Pass text with an emoji to the function
Step 10-B1. Read the text data
Step 10-B2. Install and import necessary libraries
Step 10-B3. Write a function to remove emojis
Step 10-B4. Pass text with an emoji to the function
Step 10-C1. Read the text data
Step 10-C2. Install and import necessary libraries
Step 10-C3. Write a function to convert emoticons into words
Step 10-C4. Pass text with emoticons to the function
Step 10-D1. Read the text data
Step 10-D2. Install and import necessary libraries
Step 10-D3. Write a function to remove emoticons
Step 10-D4. Pass text with emoticons to the function
Step 10-E1. Read the text data
Step 10-E2. Install and import necessary libraries
Step 10-E3. Find all emojis and determine their meaning
Recipe 2-11. Building a Text Preprocessing Pipeline
Step 11-1. Read/create the text data
Step 11-2. Process the text
Chapter 3: Converting Text to Features
Recipe 3-1. Converting Text to Features Using One-Hot Encoding
Step 1-1. Store the text in a variable
Step 1-2. Execute a function on the text data
Recipe 3-2. Converting Text to Features Using a Count Vectorizer
Problem
Solution
Recipe 3-3. Generating n-grams
Step 3-1. Generate n-grams using TextBlob
Step 3-2. Generate bigram-based features for a document
Recipe 3-4. Generating a Co-occurrence Matrix
Step 4-1. Import the necessary libraries
Step 4-2. Create a function for a co-occurrence matrix
Step 4-3. Generate a co-occurrence matrix
Recipe 3-5. Hash Vectorizing
Step 5-1. Import the necessary libraries and create a document
Step 5-2. Generate a hash vectorizer matrix
Recipe 3-6. Converting Text to Features Using TF-IDF
Step 6-1. Read the text data
Step 6-2. Create the features
Recipe 3-7. Implementing Word Embeddings
skip-gram
Continuous Bag of Words (CBOW)
Recipe 3-8. Implementing fastText
Recipe 3-9. Converting Text to Features Using State-of-the-Art Embeddings
ELMo
Sentence Encoders
doc2vec
Sentence-BERT
Universal Encoder
InferSent
OpenAI GPT
Step 9-1. Import a notebook and data to Google Colab
Step 9-2. Install and import libraries
Step 9-3. Read text data
Step 9-4. Process text data
Step 9-5. Generate a feature vector
InferSent
Step 9-6. Generate a feature vector function automatically using a selected embedding method
Chapter 4: Advanced Natural Language Processing
Recipe 4-1. Extracting Noun Phrases
Recipe 4-2. Finding Similarity Between Texts
Step 2-1. Create/read the text data
Step 2-2. Find similarities
Phonetic Matching
Recipe 4-3. Tagging Part of Speech
Step 3-1. Store the text in a variable
Step 3-2. Import NLTK for POS
Recipe 4-4. Extracting Entities from Text
Step 4-1. Read/create the text data
Step 4-2. Extract the entities
Using NLTK
Using spaCy
Recipe 4-5. Extracting Topics from Text
Step 5-1. Create the text data
Step 5-2. Clean and preprocess the data
Step 5-3. Prepare the document term matrix
Step 5-4. Create the LDA model
Recipe 4-6. Classifying Text
Step 6-1. Collect and understand the data
Step 6-2. Text processing and feature engineering
Step 6-3. Model training
Recipe 4-7. Carrying Out Sentiment Analysis
Step 7-1. Create the sample data
Step 7-2. Clean and preprocess the data
Step 7-3. Get the sentiment scores
Recipe 4-8. Disambiguating Text
Step 8-1. Import libraries
Step 8-2. Disambiguate word sense
Recipe 4-9. Converting Speech to Text
Step 9-1. Define the business problem
Step 9-2. Install and import necessary libraries
Step 9-3. Run the code
Recipe 4-10. Converting Text to Speech
Step 10-1. Install and import necessary libraries
Step 10-2. Run the code with the gTTS function
Recipe 4-11. Translating Speech
Step 11-1. Install and import necessary libraries
Step 11-2. Input text
Step 11-3. Run the goslate function
Chapter 5: Implementing Industry Applications
Recipe 5-1. Implementing Multiclass Classification
Step 1-1. Get the data from Kaggle
Step 1-2. Import the libraries
Notes:
Description based on print version record.
Includes index.
ISBN:
9781523150915
1523150912
9781484273517
1484273516
OCLC:
1265462358
