Transformers for Natural Language Processing and Computer Vision : Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3 / Denis Rothman.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:: Book
Author/Creator:: Rothman, Denis, author.
Series:: Expert insight
Language:: English
Subjects (All):: ChatGPT.; Artificial intelligence--Data processing.; Artificial intelligence.; Natural language processing (Computer science).; Cloud computing.
Physical Description:: 1 online resource (729 pages)
Edition:: Third edition.
Place of Publication:: Birmingham, England : Packt Publishing Ltd., [2024]
Biography/History:: Denis Rothman graduated from Sorbonne University and Paris-Diderot University and designed one of the very first patented word2matrix embeddings and patented AI conversational agents. He began his career by authoring one of the first AI cognitive Natural Language Processing (NLP) chatbots, applied as an automated language teacher for Moët et Chandon and other companies. He authored an AI resource optimizer for IBM and apparel producers. He then authored an Advanced Planning and Scheduling (APS) solution used worldwide.
Summary:: Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, applications, and various platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV). The book guides you through different transformer architectures to the latest Foundation Models and Generative AI. You’ll pretrain and fine-tune LLMs and work through different use cases, from summarization to implementing question-answering systems with embedding-based search techniques. You will also learn the risks of LLMs, from hallucinations and memorization to privacy, and how to mitigate such risks using moderation models with rule and knowledge bases. You’ll implement Retrieval Augmented Generation (RAG) with LLMs to improve the accuracy of your models and gain greater control over LLM outputs. Dive into generative vision transformers and multimodal model architectures and build applications, such as image and video-to-text classifiers. Go further by combining different models and platforms and learning about AI agent replication. This book provides you with an understanding of transformer architectures, pretraining, fine-tuning, LLM use cases, and best practices.
Contents:: Cover; Copyright; Contributors; Table of Contents; Preface;
Chapter 1: What Are Transformers?; How constant time complexity O(1) changed our lives forever; O(1) attention conquers O(n) recurrent methods; Attention layer; Recurrent layer; The magic of the computational time complexity of an attention layer; Computational time complexity with a CPU; Computational time complexity with a GPU; Computational time complexity with a TPU; TPU-LLM; A brief journey from recurrent to attention; A brief history; From one token to an AI revolution; From one token to everything; Foundation Models; From general purpose to specific tasks; The role of AI professionals; The future of AI professionals; What resources should we use?; Decision-making guidelines; The rise of transformer seamless APIs and assistants; Choosing ready-to-use API-driven libraries; Choosing a cloud platform and transformer model; Summary; Questions; References; Further reading;
Chapter 2: Getting Started with the Architecture of the Transformer Model; The rise of the Transformer: Attention Is All You Need; The encoder stack; Input embedding; Positional encoding; Sublayer 1: Multi-head attention; Sublayer 2: Feedforward network; The decoder stack; Output embedding and position encoding; The attention layers; The FFN sublayer, the post-LN, and the linear layer; Training and performance; Hugging Face transformer models;
Chapter 3: Emergent vs Downstream Tasks: The Unseen Depths of Transformers; The paradigm shift: What is an NLP task?; Inside the head of the attention sublayer of a transformer; Exploring emergence with ChatGPT; Investigating the potential of downstream tasks; Evaluating models with metrics; Accuracy score; F1-score; MCC; Human evaluation; Benchmark tasks and datasets; Defining the SuperGLUE benchmark tasks; Running downstream tasks; The Corpus of Linguistic Acceptability (CoLA); Stanford Sentiment TreeBank (SST-2); Microsoft Research Paraphrase Corpus (MRPC); Winograd schemas;
Chapter 4: Advancements in Translations with Google Trax, Google Translate, and Gemini; Defining machine translation; Human transductions and translations; Machine transductions and translations; Evaluating machine translations; Preprocessing a WMT dataset; Preprocessing the raw data; Finalizing the preprocessing of the datasets; Evaluating machine translations with BLEU; Geometric evaluations; Applying a smoothing technique; Translations with Google Trax; Installing Trax; Creating the Original Transformer model; Initializing the model using pretrained weights; Tokenizing a sentence; Decoding from the Transformer; De-tokenizing and displaying the translation; Translation with Google Translate; Translation with a Google Translate AJAX API Wrapper; Implementing googletrans; Translation with Gemini; Gemini's potential;
Chapter 5: Diving into Fine-Tuning through BERT; The architecture of BERT; Preparing the pretraining input environment; Pretraining and fine-tuning a BERT model; Fine-tuning BERT; Defining a goal; Hardware constraints; Installing Hugging Face Transformers; Importing the modules; Specifying CUDA as the device for torch; Loading the CoLA dataset; Creating sentences, label lists, and adding BERT tokens; Activating the BERT tokenizer; Processing the data; Creating attention masks; Splitting the data into training and validation sets; Converting all the data into torch tensors; Selecting a batch size and creating an iterator; BERT model configuration; Loading the Hugging Face BERT uncased base model; Optimizer grouped parameters; The hyperparameters for the training loop; The training loop; Training evaluation; Predicting and evaluating using the holdout dataset; Exploring the prediction process; Evaluating using the Matthews correlation coefficient; Matthews correlation coefficient evaluation for the whole dataset; Building a Python interface to interact with the model; Saving the model; Creating an interface for the trained model; Interacting with the model;
Chapter 6: Pretraining a Transformer from Scratch through RoBERTa; Training a tokenizer and pretraining a transformer; Building KantaiBERT from scratch; Step 1: Loading the dataset; Step 2: Installing Hugging Face transformers; Step 3: Training a tokenizer; Step 4: Saving the files to disk; Step 5: Loading the trained tokenizer files; Step 6: Checking resource constraints: GPU and CUDA; Step 7: Defining the configuration of the model; Step 8: Reloading the tokenizer in transformers; Step 9: Initializing a model from scratch; Exploring the parameters; Step 10: Building the dataset; Step 11: Defining a data collator; Step 12: Initializing the trainer; Step 13: Pretraining the model; Step 14: Saving the final model (+tokenizer + config) to disk; Step 15: Language modeling with FillMaskPipeline; Pretraining a Generative AI customer support model on X data; Step 1: Downloading the dataset; Step 3: Loading and filtering the data; Step 4: Checking Resource Constraints: GPU and CUDA; Step 5: Defining the configuration of the model; Step 6: Creating and processing the dataset; Step 7: Initializing the trainer; Step 8: Pretraining the model; Step 9: Saving the model; Step 10: User interface to chat with the Generative AI agent; Further pretraining; Limitations; Next steps;
Chapter 7: The Generative AI Revolution with ChatGPT; GPTs as GPTs; Improvement; Diffusion; New application sectors; Self-service assistants; Development assistants; Pervasiveness; The architecture of OpenAI GPT transformer models; The rise of billion-parameter transformer models; The increasing size of transformer models; Context size and maximum path length; From fine-tuning to zero-shot models; Stacking decoder layers; GPT models; OpenAI models as assistants; ChatGPT provides source code; GitHub Copilot code assistant; General-purpose prompt examples; Getting started with ChatGPT - GPT-4 as an assistant; 1. GPT-4 helps to explain how to write source code; 2. GPT-4 creates a function to show the YouTube presentation of GPT-4 by Greg Brockman on March 14, 2023; 3. GPT-4 creates an application for WikiArt to display images; 4. GPT-4 creates an application to display IMDb reviews; 5. GPT-4 creates an application to display a newsfeed; 6. GPT-4 creates a k-means clustering (KMC) algorithm; Getting started with the GPT-4 API; Running our first NLP task with GPT-4; Steps 1: Installing OpenAI and Step 2: Entering the API key; Step 3: Running an NLP task with GPT-4; Key hyperparameters; Running multiple NLP tasks; Retrieval Augmented Generation (RAG) with GPT-4; Installation; Document retrieval; Augmented retrieval generation;
Chapter 8: Fine-Tuning OpenAI GPT Models; Risk management; Fine-tuning a GPT model for completion (generative); 1. Preparing the dataset; 1.1. Preparing the data in JSON; 1.2. Converting the data to JSONL; 2. Fine-tuning an original model; 3. Running the fine-tuned GPT model; 4. Managing fine-tuned jobs and models; Before leaving;
Chapter 9: Shattering the Black Box with Interpretable Tools; Transformer visualization with BertViz; Running BertViz; Step 1: Installing BertViz and importing the modules; Step 2: Load the models and retrieve attention; Step 3: Head view; Step 4: Processing and displaying attention heads; Step 5: Model view; Step 6: Displaying the output probabilities of attention heads; Streaming the output of the attention heads; Visualizing word relationships using attention scores with pandas; exBERT; Interpreting Hugging Face transformers with SHAP; Introducing SHAP; Explaining Hugging Face outputs with SHAP; Transformer visualization via dictionary learning; Transformer factors; Introducing LIME; The visualization interface; Other interpretable AI tools; LIT; PCA; Running LIT; OpenAI LLMs explain neurons in transformers; Limitations and human control;
Chapter 10: Investigating the Role of Tokenizers in Shaping Transformer Models; Matching datasets and tokenizers; Best practices; Step 1: Preprocessing; Step 2: Quality control; Step 3: Continuous human quality control; Word2Vec tokenization; Case 0: Words in the dataset and the dictionary; Case 1: Words not in the dataset or the dictionary; Case 2: Noisy relationships; Case 3: Words in a text but not in the dictionary; Case 4: Rare words; Case 5: Replacing rare words; Exploring sentence and WordPiece tokenizers to understand the efficiency of subword tokenizers for transformers.
Notes:: Includes bibliographical references and index.; Description based on publisher supplied metadata and other sources.; Description based on print version record.
ISBN:: 9781805123743; 1805123742
OCLC:: 1424949941