Applied Natural Language Processing in the Enterprise / Patel, Ankur.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Patel, Ankur, author.
Arasanipalai, Ajay Uppili, author.
Language:
English
Subjects (All):
Natural language processing (Computer science).
Machine learning.
Physical Description:
1 online resource (350 pages)
Edition:
1st edition
Place of Publication:
O'Reilly Media, Inc., 2021.
System Details:
text file
Summary:
NLP is one of the hottest topics in AI today. Having lagged for years behind other deep learning fields such as computer vision, NLP only recently gained mainstream popularity. Google, Facebook, and OpenAI have open-sourced large pretrained language models, but many organizations today still struggle with building and adopting NLP applications. This hands-on guide helps you learn the process quickly. If you have a basic to intermediate understanding of machine learning and programming experience with Python, you’ll learn how to build and deploy real-world NLP applications in your organization. Authors Ankur Patel and Ajay Uppili Arasanipalai walk you through the process without bogging you down in theory.
Understand how state-of-the-art NLP models work
Learn the tools of the trade, including frameworks popular today
Perform NLP tasks such as text classification, semantic search, and reading comprehension
Solve problems using new models like transformers and techniques such as transfer learning
Build NLP models from scratch with performance comparable or superior to out-of-the-box systems
Deploy your models to production and maintain their performance
Implement a suite of NLP algorithms using Python and PyTorch
Contents:
Intro
Copyright
Table of Contents
Preface
What Is Natural Language Processing?
Why Should I Read This Book?
What Do I Need to Know Already?
What Is This Book All About?
How Is This Book Organized?
Conventions Used in This Book
Using Code Examples
O'Reilly Online Learning
How to Contact Us
Acknowledgments
Ajay
Ankur
Part I. Scratching the Surface
Chapter 1. Introduction to NLP
What Is NLP?
Popular Applications
History
Inflection Points
A Final Word
Basic NLP
Defining NLP Tasks
Set Up the Programming Environment
spaCy, fast.ai, and Hugging Face
Perform NLP Tasks Using spaCy
Conclusion
Chapter 2. Transformers and Transfer Learning
Training with fastai
Using the fastai Library
ULMFiT for Transfer Learning
Fine-Tuning a Language Model on IMDb
Training a Text Classifier
Inference with Hugging Face
Loading Models
Generating Predictions
Chapter 3. NLP Tasks and Applications
Pretrained Language Models
Transfer Learning and Fine-Tuning
NLP Tasks
Natural Language Dataset
Explore the AG Dataset
NLP Task #1: Named Entity Recognition
Perform Inference Using the Original spaCy Model
Custom NER
Annotate via Prodigy: NER
Train the Custom NER Model Using spaCy
Custom NER Model Versus Original NER Model
NLP Task #2: Text Classification
Annotate via Prodigy: Text Classification
Train Text Classification Models Using spaCy
Part II. The Cogs in the Machine
Chapter 4. Tokenization
A Minimal Tokenizer
Hugging Face Tokenizers
Subword Tokenization
Building Your Own Tokenizer
Chapter 5. Embeddings: How Machines "Understand" Words
Understanding Versus Reading Text
Word Vectors
Word2Vec
Embeddings in the Age of Transfer Learning
Embeddings in Practice
Preprocessing
Model
Training
Validation
Embedding Things That Aren't Words
Making Vectorized Music
Some General Tips for Making Custom Embeddings
Chapter 6. Recurrent Neural Networks and Other Sequence Models
Recurrent Neural Networks
RNNs in PyTorch from Scratch
Bidirectional RNN
Sequence to Sequence Using RNNs
Long Short-Term Memory
Gated Recurrent Units
Chapter 7. Transformers
Building a Transformer from Scratch
Attention Mechanisms
Dot Product Attention
Scaled Dot Product Attention
Multi-Head Self-Attention
Adaptive Attention Span
Persistent Memory/All-Attention
Product-Key Memory
Transformers for Computer Vision
Chapter 8. BERTology: Putting It All Together
ImageNet
The Power of Pretrained Models
The Path to NLP's ImageNet Moment
Pretrained Word Embeddings
The Limitations of One-Hot Encoding
GloVe
fastText
Context-Aware Pretrained Word Embeddings
Sequential Models
Sequential Data and the Importance of Sequential Models
RNNs
Vanilla RNNs
LSTM Networks
GRUs
Transformers
Transformer-XL
NLP's ImageNet Moment
Universal Language Model Fine-Tuning
ELMo
BERT
BERTology
GPT-1, GPT-2, GPT-3
Part III. Outside the Wall
Chapter 9. Tools of the Trade
Deep Learning Frameworks
PyTorch
TensorFlow
Jax
Julia
Visualization and Experiment Tracking
TensorBoard
Weights & Biases
Neptune
Comet
MLflow
AutoML
H2O.ai
Dataiku
DataRobot
ML Infrastructure and Compute
Paperspace
FloydHub
Google Colab
Kaggle Kernels
Lambda GPU Cloud
Edge/On-Device Inference
ONNX
Core ML
Edge Accelerators
Cloud Inference and Machine Learning as a Service
AWS
Microsoft Azure
Google Cloud Platform
Continuous Integration and Delivery
Chapter 10. Visualization
Our First Streamlit App
Build the Streamlit App
Deploy the Streamlit App
Explore the Streamlit Web App
Build and Deploy a Streamlit App for Custom NER
Build and Deploy a Streamlit App for Text Classification on AG News Dataset
Build and Deploy a Streamlit App for Text Classification on Custom Text
Chapter 11. Productionization
Data Scientists, Engineers, and Analysts
Prototyping, Deployment, and Maintenance
Notebooks and Scripts
Databricks: Your Unified Data Analytics Platform
Support for Big Data
Support for Multiple Programming Languages
Support for ML Frameworks
Support for Model Repository, Access Control, Data Lineage, and Versioning
Databricks Setup
Set Up Access to S3 Bucket
Set Up Libraries
Create Cluster
Create Notebook
Enable Init Script and Restart Cluster
Run Speed Test: Inference on NER Using spaCy
Machine Learning Jobs
Production Pipeline Notebook
Scheduled Machine Learning Jobs
Event-Driven Machine Learning Pipeline
Log and Register Model
MLflow Model Serving
Alternatives to Databricks
Amazon SageMaker
Saturn Cloud
Chapter 12. Conclusion
Ten Final Lessons
Lesson 1: Start with Simple Approaches First
Lesson 2: Leverage the Community
Lesson 3: Do Not Create from Scratch, When Possible
Lesson 4: Intuition and Experience Trounces Theory
Lesson 5: Fight Decision Fatigue
Lesson 6: Data Is King
Lesson 7: Lean on Humans
Lesson 8: Pair Yourself with Really Great Engineers
Lesson 9: Ensemble
Lesson 10: Have Fun
Final Word
Appendix A. Scaling
Multi-GPU Training
Distributed Training
What Makes Deep Training Fast?
Appendix B. CUDA
Threads and Thread Blocks
Writing CUDA Kernels
CUDA in Practice
Index
About the Authors
Colophon
Notes:
Online resource; Title from title page (viewed April 25, 2021)
Description based on publisher supplied metadata and other sources.
ISBN:
1-4920-6252-9
1-4920-6254-5
1-4920-6256-1
OCLC:
1192526432
