Natural language processing recipes : unlocking text data with machine learning and deep learning using Python / Akshay Kulkarni, Adarsha Shivananda.
- Format:
- Book
- Author/Creator:
- Kulkarni, Akshay, author.
- Shivananda, Adarsha, author.
- Language:
- English
- Subjects (All):
- Natural language processing (Computer science).
- Physical Description:
- 1 online resource (302 pages)
- Edition:
- Second edition.
- Place of Publication:
- [Place of publication not identified] : Apress, [2021]
- Summary:
- Focus on implementing end-to-end projects using Python and leverage state-of-the-art algorithms. This book teaches you to efficiently use a wide range of natural language processing (NLP) packages to implement text classification, part-of-speech identification, topic modeling, text summarization, sentiment analysis, information retrieval, and many more applications of NLP. The book begins with text data collection, web scraping, and the different types of data sources. It explains how to clean and pre-process text data, and offers ways to analyze data with advanced algorithms. You then explore semantic and syntactic analysis of the text. Complex NLP solutions that involve text normalization are covered along with advanced pre-processing methods, POS tagging, parsing, text summarization, sentiment analysis, word2vec, seq2seq, and much more. The book presents the fundamentals necessary for applications of machine learning and deep learning in NLP. This second edition goes over advanced techniques to convert text to features, such as GloVe, ELMo, and BERT. It also explains how transformers work, taking Sentence-BERT and GPT as examples. The final chapters explain advanced industrial applications of NLP with solution implementation, leveraging the power of deep learning techniques for NLP problems. It also employs state-of-the-art advanced RNNs, such as long short-term memory, to solve complex text generation tasks. After reading this book, you will have a clear understanding of the challenges faced by different industries and you will have worked on multiple examples of implementing NLP in the real world.
What You Will Learn
* Know the core concepts of implementing NLP and various approaches to natural language processing (NLP), including NLP using Python libraries such as NLTK, TextBlob, spaCy, Stanford CoreNLP, and more
* Implement text pre-processing and feature engineering in NLP, including advanced methods of feature engineering
* Understand and implement the concepts of information retrieval, text summarization, sentiment analysis, text classification, and other advanced NLP techniques leveraging machine learning and deep learning
Who This Book Is For
Data scientists who want to refresh and learn various concepts of natural language processing (NLP) through coding exercises
- Contents:
- Intro
- Table of Contents
- About the Authors
- About the Technical Reviewer
- Acknowledgments
- Introduction
- Chapter 1: Extracting the Data
- Client Data
- Free Sources
- Web Scraping
- Recipe 1-1. Collecting Data
- Problem
- Solution
- How It Works
- Step 1-1. Log in to the Twitter developer portal
- Step 1-2. Execute query in Python
- Recipe 1-2. Collecting Data from PDFs
- Step 2-1. Install and import all the necessary libraries
- Step 2-2. Extract text from a PDF file
- Recipe 1-3. Collecting Data from Word Files
- Step 3-1. Install and import all the necessary libraries
- Step 3-2. Extract text from a Word file
- Recipe 1-4. Collecting Data from JSON
- Step 4-1. Install and import all the necessary libraries
- Step 4-2. Extract text from a JSON file
- Recipe 1-5. Collecting Data from HTML
- Step 5-1. Install and import all the necessary libraries
- Step 5-2. Fetch the HTML file
- Step 5-3. Parse the HTML file
- Step 5-4. Extract a tag value
- Step 5-5. Extract all instances of a particular tag
- Step 5-6. Extract all text from a particular tag
- Recipe 1-6. Parsing Text Using Regular Expressions
- Tokenizing
- Extracting Email IDs
- Replacing Email IDs
- Extracting Data from an eBook and Performing regex
- Recipe 1-7. Handling Strings
- Replacing Content
- Concatenating Two Strings
- Searching for a Substring in a String
- Recipe 1-8. Scraping Text from the Web
- Step 8-1. Install all the necessary libraries
- Step 8-2. Import the libraries
- Step 8-3. Identify the URL to extract the data
- Step 8-4. Request the URL and download the content using Beautiful Soup
- Step 8-5. Understand the website's structure to extract the required information
- Step 8-6. Use Beautiful Soup to extract and parse the data from HTML tags
- Step 8-7. Convert lists to a data frame and perform an analysis that meets business requirements
- Step 8-8. Download the data frame
- Chapter 2: Exploring and Processing Text Data
- Recipe 2-1. Converting Text Data to Lowercase
- Step 1-1. Read/create the text data
- Step 1-2. Execute the lower() function on the text data
- Recipe 2-2. Removing Punctuation
- Step 2-1. Read/create the text data
- Step 2-2. Execute the replace() function on the text data
- Recipe 2-3. Removing Stop Words
- Step 3-1. Read/create the text data
- Step 3-2. Remove punctuation from the text data
- Recipe 2-4. Standardizing Text
- Step 4-1. Create a custom lookup dictionary
- Step 4-2. Create a custom function for text standardization
- Step 4-3. Run the text_std function
- Recipe 2-5. Correcting Spelling
- Step 5-1. Read/create the text data
- Step 5-2. Execute spelling correction on the text data
- Recipe 2-6. Tokenizing Text
- Step 6-1. Read/create the text data
- Step 6-2. Tokenize the text data
- Recipe 2-7. Stemming
- Step 7-1. Read the text data
- Step 7-2. Stem the text
- Recipe 2-8. Lemmatizing
- Step 8-1. Read the text data
- Step 8-2. Lemmatize the data
- Recipe 2-9. Exploring Text Data
- Step 9-1. Read the text data
- Step 9-2. Import necessary libraries
- Step 9-3. Check the number of words in the data
- Step 9-4. Compute the frequency of all words in the reviews
- Step 9-5. Consider words with length greater than 3 and plot
- Step 9-6. Build a word cloud
- Recipe 2-10. Dealing with Emojis and Emoticons
- Step 10-A1. Read the text data
- Step 10-A2. Install and import necessary libraries
- Step 10-A3. Write a function that converts emojis into words
- Step 10-A4. Pass text with an emoji to the function
- Step 10-B1. Read the text data
- Step 10-B2. Install and import necessary libraries
- Step 10-B3. Write a function to remove emojis
- Step 10-B4. Pass text with an emoji to the function
- Step 10-C1. Read the text data
- Step 10-C2. Install and import necessary libraries
- Step 10-C3. Write function to convert emoticons into words
- Step 10-C4. Pass text with emoticons to the function
- Step 10-D1. Read the text data
- Step 10-D2. Install and import necessary libraries
- Step 10-D3. Write function to remove emoticons
- Step 10-D4. Pass text with emoticons to the function
- Step 10-E1. Read the text data
- Step 10-E2. Install and import necessary libraries
- Step 10-E3. Find all emojis and determine their meaning
- Recipe 2-11. Building a Text Preprocessing Pipeline
- Step 11-1. Read/create the text data
- Step 11-2. Process the text
- Chapter 3: Converting Text to Features
- Recipe 3-1. Converting Text to Features Using One-Hot Encoding
- Step 1-1. Store the text in a variable
- Step 1-2. Execute a function on the text data
- Recipe 3-2. Converting Text to Features Using a Count Vectorizer
- Problem
- Solution
- Recipe 3-3. Generating n-grams
- Step 3-1. Generate n-grams using TextBlob
- Step 3-2. Generate bigram-based features for a document
- Recipe 3-4. Generating a Co-occurrence Matrix
- Step 4-1. Import the necessary libraries
- Step 4-2. Create function for a co-occurrence matrix
- Step 4-3. Generate a co-occurrence matrix
- Recipe 3-5. Hash Vectorizing
- Step 5-1. Import the necessary libraries and create a document
- Step 5-2. Generate a hash vectorizer matrix
- Recipe 3-6. Converting Text to Features Using TF-IDF
- Step 6-1. Read the text data
- Step 6-2. Create the features
- Recipe 3-7. Implementing Word Embeddings
- skip-gram
- Continuous Bag of Words (CBOW)
- Recipe 3-8. Implementing fastText
- Recipe 3-9. Converting Text to Features Using State-of-the-Art Embeddings
- ELMo
- Sentence Encoders
- doc2vec
- Sentence-BERT
- Universal Encoder
- InferSent
- Open-AI GPT
- Step 9-1. Import a notebook and data to Google Colab
- Step 9-2. Install and import libraries
- Step 9-3. Read text data
- Step 9-4. Process text data
- Step 9-5. Generate a feature vector
- InferSent
- Step 9-6. Generate a feature vector function automatically using a selected embedding method
- Chapter 4: Advanced Natural Language Processing
- Recipe 4-1. Extracting Noun Phrases
- Recipe 4-2. Finding Similarity Between Texts
- Step 2-1. Create/read the text data
- Step 2-2. Find similarities
- Phonetic Matching
- Recipe 4-3. Tagging Part of Speech
- Step 3-1. Store the text in a variable
- Step 3-2. Import NLTK for POS
- Recipe 4-4. Extracting Entities from Text
- Step 4-1. Read/create the text data
- Step 4-2. Extract the entities
- Using NLTK
- Using spaCy
- Recipe 4-5. Extracting Topics from Text
- Step 5-1. Create the text data
- Step 5-2. Clean and preprocess the data
- Step 5-3. Prepare the document term matrix
- Step 5-4. Create the LDA model
- Recipe 4-6. Classifying Text
- Step 6-1. Collect and understand the data
- Step 6-2. Text processing and feature engineering
- Step 6-3. Model training
- Recipe 4-7. Carrying Out Sentiment Analysis
- Step 7-1. Create the sample data
- Step 7-2. Clean and preprocess the data
- Step 7-3. Get the sentiment scores
- Recipe 4-8. Disambiguating Text
- Step 8-1. Import libraries
- Step 8-2. Disambiguate word sense
- Recipe 4-9. Converting Speech to Text
- Step 9-1. Define the business problem
- Step 9-2. Install and import necessary libraries
- Step 9-3. Run the code
- Recipe 4-10. Converting Text to Speech
- Step 10-1. Install and import necessary libraries
- Step 10-2. Run the code with the gTTS function
- Recipe 4-11. Translating Speech
- Step 11-1. Install and import necessary libraries
- Step 11-2. Input text
- Step 11-3. Run the goslate function
- Chapter 5: Implementing Industry Applications
- Recipe 5-1. Implementing Multiclass Classification
- Step 1-1. Get the data from Kaggle
- Step 1-2. Import the libraries
- Notes:
- Description based on print version record.
- Includes index.
- ISBN:
- 9781523150915
- 1523150912
- 9781484273517
- 1484273516
- OCLC:
- 1265462358