Mastering NLP from Foundations to LLMs : Apply Advanced Rule-Based Techniques to LLMs and Solve Real-world Business Problems Using Python.
- Format:
- Book
- Author/Creator:
- Gazit, Lior.
- Language:
- English
- Subjects (All):
- ChatGPT.
- Artificial intelligence--Data processing.
- Artificial intelligence.
- Natural language processing (Computer science).
- Cloud computing.
- Physical Description:
- 1 online resource (340 pages)
- Edition:
- 1st ed.
- Place of Publication:
- Birmingham : Packt Publishing, Limited, 2024.
- Biography/History:
- Gazit Lior: Lior Gazit is a highly skilled Machine Learning professional with a proven track record of success in building and leading teams that drive business growth. He is an expert in Natural Language Processing and has successfully developed innovative Machine Learning pipelines and products. He holds a Master's degree and has published in peer-reviewed journals and conferences. As a Senior Director of the Machine Learning group in the Financial sector, and a Principal Machine Learning Advisor at an emerging startup, Lior is a respected leader in the industry, with a wealth of knowledge and experience to share. With much passion and inspiration, Lior is dedicated to using Machine Learning to drive positive change and growth in his organizations.
- Ghaffari Meysam: Meysam Ghaffari is a Senior Data Scientist with a strong background in Natural Language Processing and Deep Learning. He currently works at MSKCC, where he specializes in developing and improving Machine Learning and NLP models for healthcare problems. He has over 9 years of experience in Machine Learning and over 4 years of experience in NLP and Deep Learning. He received his Ph.D. in Computer Science from Florida State University, his MS in Computer Science - Artificial Intelligence from Isfahan University of Technology, and his B.S. in Computer Science from Iran University of Science and Technology. He also worked as a postdoctoral research associate at the University of Wisconsin-Madison before joining MSKCC.
- Summary:
- Enhance your NLP proficiency with modern frameworks like LangChain, explore mathematical foundations and code samples, and gain expert insights into current and future trends.
- Key Features: Learn how to build Python-driven solutions with a focus on NLP, LLMs, RAGs, and GPT. Master embedding techniques and machine learning principles for real-world applications. Understand the mathematical foundations of NLP and deep learning designs. Purchase of the print or Kindle book includes a free PDF eBook.
- Book Description: Do you want to master Natural Language Processing (NLP) but don't know where to begin? This book will give you the right head start. Written by leaders in machine learning and NLP, Mastering NLP from Foundations to LLMs provides an in-depth introduction to techniques. Starting with the mathematical foundations of machine learning (ML), you'll gradually progress to advanced NLP applications such as large language models (LLMs) and AI applications. You'll get to grips with linear algebra, optimization, probability, and statistics, which are essential for understanding and implementing machine learning and NLP algorithms. You'll also explore general machine learning techniques and find out how they relate to NLP. Next, you'll learn how to preprocess text data, explore methods for cleaning and preparing text for analysis, and understand how to do text classification. You'll get all of this and more along with complete Python code samples. By the end of the book, the advanced topics of LLMs' theory, design, and applications will be discussed along with the future trends in NLP, which will feature expert opinions. You'll also get to strengthen your practical skills by working on sample real-world NLP business problems and solutions.
- What you will learn: Master the mathematical foundations of machine learning and NLP. Implement advanced techniques for preprocessing text data and analysis. Design ML-NLP systems in Python. Model and classify text using traditional machine learning and deep learning methods. Understand the theory and design of LLMs and their implementation for various applications in AI. Explore NLP insights, trends, and expert opinions on its future direction and potential.
- Who this book is for: This book is for deep learning and machine learning researchers, NLP practitioners, ML/NLP educators, and STEM students. Professionals working with text data as part of their projects will also find plenty of useful information in this book. Beginner-level familiarity with machine learning and a basic working knowledge of Python will help you get the best out of this book.
- Contents:
- Cover
- Title page
- Copyright and credits
- Dedication
- Foreword
- Contributors
- Disclaimer
- Table of Contents
- Preface
- Chapter 1: Navigating the NLP Landscape: A comprehensive introduction
- Who this book is for
- What is natural language processing?
- The history and evolution of natural language processing
- Initial strategies in the machine processing of natural language
- A winning synergy - the coming together of NLP and ML
- Introduction to math and statistics in NLP
- Understanding language models - ChatGPT example
- Summary
- Questions and answers
- Chapter 2: Linear Algebra, Probability and Statistics, and Estimation for Machine Learning and Natur
- Introduction to linear algebra
- Basic operations on matrices and vectors
- Matrix definitions
- Eigenvalues and eigenvectors
- Numerical methods for finding eigenvectors
- Eigenvalue decomposition
- Singular value decomposition
- Basic probability for machine learning
- Statistically independent
- Discrete random variables and their distribution
- Probability density function
- Bayesian estimation
- Further reading
- References
- Chapter 3: Machine Learning for Natural Language Processing
- Technical requirements
- Data exploration
- Data visualization
- Data cleaning
- Feature selection
- Feature engineering
- Common machine learning models
- Linear regression
- Logistic regression
- Decision trees
- Random forest
- Support vector machines (SVMs)
- Neural networks and transformers
- Model underfitting and overfitting
- Splitting data
- Hyperparameter tuning
- Ensemble models
- Bagging
- Boosting
- Stacking
- Random forests
- Gradient boosting
- Handling imbalanced data
- SMOTE
- The NearMiss algorithm
- Cost-sensitive learning
- Data augmentation
- Dealing with correlated data
- References
- Chapter 4: Streamlining Text Preprocessing Techniques for Optimal NLP Performance
- Lowercasing in NLP
- Removing special characters and punctuation
- Stop word removal
- NER
- POS tagging
- Rule-based methods
- Statistical methods
- Deep learning-based methods
- Regular expressions
- Tokenization
- Explaining the preprocessing pipeline
- Code for NER and POS
- Chapter 5: Text Classification, Part 1 - Using Traditional Machine Learning
- Types of text classification
- Supervised learning
- Unsupervised learning
- Semi-supervised learning
- Sentence classification using one-hot encoding vector representation
- Text classification using TF-IDF
- Text classification using Word2Vec
- Word2Vec
- Model evaluation
- Overfitting and underfitting
- Additional topics in applied text classification
- Topic modeling - a particular use case of unsupervised text classification
- LDA
- Real-world ML system design for NLP text classification
- Implementing an ML solution
- Reviewing our use case - ML system design for NLP classification in a Jupyter Notebook
- The pipeline
- Code settings
- Generating the chosen model
- Chapter 6: Text Classification Reimagined: Delving Deep into Deep Learning Language Models
- Understanding deep learning basics
- What is a neural network?
- The basic design of a neural network
- Neural network common terms
- The architecture of different neural networks
- The challenges of training neural networks
- Language models
- Transfer learning
- Understanding transformers
- Architecture of transformers
- Applications of transformers
- Learning more about large language models
- The challenges of training language models
- Specific designs of language models
- Challenges of using GPT-3
- Reviewing our use case - ML/DL system design for NLP classification in a Jupyter Notebook
- The business objective
- The technical objective
- Chapter 7: Demystifying Large Language Models: Theory, Design, and LangChain Implementation
- What are LLMs and how are they different from LMs?
- n-gram models
- Hidden Markov models (HMMs)
- Recurrent neural networks (RNNs)
- How LLMs stand out
- Motivations for developing and using LLMs
- Improved performance
- Broad generalization
- Few-shot learning
- Understanding complex contexts
- Multilingual capabilities
- Human-like text generation
- Challenges in developing LLMs
- Amounts of data
- Computational resources
- Risk of bias
- Model robustness
- Interpretability and debugging
- Environmental impact
- Different types of LLMs
- Transformer models
- Example designs of state-of-the-art LLMs
- GPT-3.5 and ChatGPT
- LM pretraining
- Training the reward model
- How to fine-tune the model using reinforcement learning
- GPT-4
- LLaMA
- PaLM
- Open-source tools for RLHF
- Chapter 8: Accessing the Power of Large Language Models: Advanced Setup and Integration with RAG
- Setting up an LLM application - API-based closed source models
- Choosing a remote LLM provider
- Prompt engineering and priming GPT
- Experimenting with OpenAI's GPT model
- Setting up an LLM application - local open source models
- About the different aspects that distinguish between open source and closed source
- Hugging Face's hub of models
- Employing LLMs from Hugging Face via Python
- Exploring advanced system design - RAG and LangChain
- LangChain's design concepts
- Data sources
- Data that is not pre-embedded
- Chains
- Agents
- Long-term memory and referring to prior conversations
- Ensuring continuous relevance through incremental updates and automated monitoring
- Reviewing a simple LangChain setup in a Jupyter notebook
- Setting up a LangChain pipeline with Python
- LLMs in the cloud
- AWS
- Microsoft Azure
- GCP
- Concluding cloud services
- Chapter 9: Exploring the Frontiers: Advanced Applications and Innovations Driven by LLMs
- Enhancing LLM performance with RAG and LangChain - a dive into advanced functionalities
- LangChain pipeline with Python - enhancing performance with LLMs
- Advanced methods with chains
- Asking the LLM a general knowledge question
- Requesting output structure - making the LLM provide output in a particular data format
- Evolving to a fluent conversation - inserting an element of memory to have previous interactions as reference and context for follow-up prompts
- Retrieving information from various web sources automatically
- Retrieving content from a YouTube video and summarizing it
- Prompt compression and API cost reduction
- Prompt compression
- Experimenting with prompt compression and evaluating trade-offs
- Multiple agents - forming a team of LLMs that collaborate
- Potential advantages of multiple LLM agents working simultaneously
- Concluding thoughts on the multiple-agent team
- Chapter 10: Riding the Wave: Analyzing Past, Present, and Future Trends Shaped by LLMs and AI
- Key technical trends around LLMs and AI
- Computation power - the engine behind LLMs
- The future of computational power in NLP
- Large datasets and their indelible mark on NLP and LLMs
- Purpose - training, benchmarking, and domain expertise
- Value - robustness, diversity, and efficiency
- Impact - democratization, proficiency, and new concerns
- Evolution of large language models - purpose, value, and impact
- Purpose - why the push for bigger and better LLMs?
- Value - the LLM advantage
- Impact - changing the landscape
- NLP and LLMs in the business world
- Business sectors
- Customer interactions and service - the early adopter
- Change management driven by AI's impact
- Behavioral trends induced by AI and LLMs - the social aspect
- Personal assistants becoming indispensable
- Ease in communication and bridging language barriers
- Ethical implications of delegated decisions
- Ethics and risks - growing concerns around the implementation of AI
- Chapter 11: Exclusive Industry Insights: Perspectives and Predictions from World Class Experts
- Overview of our experts
- Nitzan Mekel-Bobrov, PhD
- David Sontag, PhD
- John D. Halamka, M.D., M.S.
- Xavier Amatriain, PhD
- Melanie Garson, PhD
- Our questions and the experts' answers
- Nitzan Mekel-Bobrov
- Q1.1 - Future of LLM - hybrid learning paradigms: In light of the evolving landscape of learning schemes, what do you envision as the next breakthrough in combining different learning paradigms within LLMs?
- Q2.1 - As the Chief AI Officer becomes more integral to the corporate hierarchy, what unique challenges do you foresee in bridging the gap between AI potential and practical business applications, and how should the CAIO's role evolve to meet these challe
- Q3 - How do foundation models and the strategies of major tech companies toward open sourcing affect data ownership and its value for businesses?
- David Sontag
- Q1 - As we progress toward creating more equitable and unbiased datasets, what strategies do you believe are most effective in identifying and mitigating implicit biases within large datasets?
- Q2 - How do you see these strategies evolving with the advancement of NLP technologies, and what do you envision as the next breakthrough in combining different learning paradigms within LLMs?
- Notes:
- Description based on publisher supplied metadata and other sources.
- ISBN:
- 9781804616383
- 1804616389
- OCLC:
- 1430322280