Learn Amazon SageMaker : a guide to building, training, and deploying machine learning models for developers and data scientists / Julien Simon.
- Format:
- Book
- Author/Creator:
- Simon, Julien, 1952- author.
- Language:
- English
- Subjects (All):
- Machine learning.
- Cloud computing.
- Physical Description:
- 1 online resource (554 pages)
- Edition:
- Second edition.
- Place of Publication:
- London, England : Packt Publishing, [2021]
- Summary:
- Swiftly build and deploy machine learning models without managing infrastructure and boost productivity using the latest Amazon SageMaker capabilities such as Studio, Autopilot, Data Wrangler, Pipelines, and Feature Store.
- Key Features: Build, train, and deploy machine learning models quickly using Amazon SageMaker. Optimize the accuracy, cost, and fairness of your models. Create and automate end-to-end machine learning workflows on Amazon Web Services (AWS).
- Book Description: Amazon SageMaker enables you to quickly build, train, and deploy machine learning models at scale without managing any infrastructure. It helps you focus on the machine learning problem at hand and deploy high-quality models by eliminating the heavy lifting typically involved in each step of the ML process. This second edition will help data scientists and ML developers to explore new features such as SageMaker Data Wrangler, Pipelines, Clarify, Feature Store, and much more. You'll start by learning how to use various capabilities of SageMaker as a single toolset to solve ML challenges and progress to cover features such as AutoML, built-in algorithms and frameworks, and writing your own code and algorithms to build ML models. The book will then show you how to integrate Amazon SageMaker with popular deep learning libraries, such as TensorFlow and PyTorch, to extend the capabilities of existing models. You'll also see how automating your workflows can help you get to production faster with minimum effort and at a lower cost. Finally, you'll explore SageMaker Debugger and SageMaker Model Monitor to detect quality issues in training and production. By the end of this Amazon book, you'll be able to use Amazon SageMaker on the full spectrum of ML workflows, from experimentation, training, and monitoring to scaling, deployment, and automation.
- What you will learn: Become well-versed with data annotation and preparation techniques. Use AutoML features to build and train machine learning models with Autopilot. Create models using built-in algorithms and frameworks and your own code. Train computer vision and natural language processing (NLP) models using real-world examples. Cover training techniques for scaling, model optimization, model debugging, and cost optimization. Automate deployment tasks in a variety of configurations using SDK and several automation tools.
- Who this book is for: This book is for software engineers, machine learning developers, data scientists, and AWS users who are new to using Amazon SageMaker and want to build high-quality machine learning models without worrying about infrastructure. Knowledge of AWS basics is required to grasp the concepts covered in this book more effectively. A solid understanding of machine learning concepts and the Python programming language will also be beneficial.
- Contents:
- Cover
- Title Page
- Copyright and Credits
- Contributors
- Table of Contents
- Preface
- Section 1: Introduction to Amazon SageMaker
- Chapter 1: Introducing Amazon SageMaker
- Technical requirements
- Exploring the capabilities of Amazon SageMaker
- The main capabilities of Amazon SageMaker
- The Amazon SageMaker API
- Setting up Amazon SageMaker on your local machine
- Installing the SageMaker SDK with virtualenv
- Installing the SageMaker SDK with Anaconda
- A word about AWS permissions
- Setting up Amazon SageMaker Studio
- Onboarding to Amazon SageMaker Studio
- Onboarding with the quick start procedure
- Deploying one-click solutions and models with Amazon SageMaker JumpStart
- Deploying a solution
- Deploying a model
- Fine-tuning a model
- Summary
- Chapter 2: Handling Data Preparation Techniques
- Labeling data with Amazon SageMaker Ground Truth
- Using workforces
- Creating a private workforce
- Uploading data for labeling
- Creating a labeling job
- Labeling images
- Labeling text
- Transforming data with Amazon SageMaker Data Wrangler
- Loading a dataset in SageMaker Data Wrangler
- Transforming a dataset in SageMaker Data Wrangler
- Exporting a SageMaker Data Wrangler pipeline
- Running batch jobs with Amazon SageMaker Processing
- Discovering the Amazon SageMaker Processing API
- Processing a dataset with scikit-learn
- Processing a dataset with your own code
- Section 2: Building and Training Models
- Chapter 3: AutoML with Amazon SageMaker Autopilot
- Discovering Amazon SageMaker Autopilot
- Analyzing data
- Feature engineering
- Model tuning
- Using Amazon SageMaker Autopilot in SageMaker Studio
- Launching a job
- Monitoring a job
- Comparing jobs
- Deploying and invoking a model
- Using the SageMaker Autopilot SDK
- Cleaning up
- Diving deep on SageMaker Autopilot
- The job artifacts
- The data exploration notebook
- The candidate generation notebook
- Chapter 4: Training Machine Learning Models
- Discovering the built-in algorithms in Amazon SageMaker
- Supervised learning
- Unsupervised learning
- A word about scalability
- Training and deploying models with built-in algorithms
- Understanding the end-to-end workflow
- Using alternative workflows
- Using fully managed infrastructure
- Using the SageMaker SDK with built-in algorithms
- Preparing data
- Configuring a training job
- Launching a training job
- Working with more built-in algorithms
- Regression with XGBoost
- Recommendation with Factorization Machines
- Using Principal Component Analysis
- Detecting anomalies with Random Cut Forest
- Chapter 5: Training CV Models
- Discovering the CV built-in algorithms in Amazon SageMaker
- Discovering the image classification algorithm
- Discovering the object detection algorithm
- Discovering the semantic segmentation algorithm
- Training with CV algorithms
- Preparing image datasets
- Working with image files
- Working with RecordIO files
- Working with SageMaker Ground Truth files
- Using the built-in CV algorithms
- Training an image classification model
- Fine-tuning an image classification model
- Training an object detection model
- Training a semantic segmentation model
- Chapter 6: Training Natural Language Processing Models
- Discovering the NLP built-in algorithms in Amazon SageMaker
- Discovering the BlazingText algorithm
- Discovering the LDA algorithm
- Discovering the NTM algorithm
- Discovering the seq2seq algorithm
- Training with NLP algorithms
- Preparing natural language datasets
- Preparing data for classification with BlazingText
- Preparing data for classification with BlazingText, version 2
- Preparing data for word vectors with BlazingText
- Preparing data for topic modeling with LDA and NTM
- Using datasets labeled with SageMaker Ground Truth
- Using the built-in algorithms for NLP
- Classifying text with BlazingText
- Computing word vectors with BlazingText
- Using BlazingText models with FastText
- Modeling topics with LDA
- Modeling topics with NTM
- Chapter 7: Extending Machine Learning Services Using Built-In Frameworks
- Discovering the built-in frameworks in Amazon SageMaker
- Running a first example with XGBoost
- Working with framework containers
- Training and deploying locally
- Training with script mode
- Understanding model deployment
- Managing dependencies
- Putting it all together
- Running your framework code on Amazon SageMaker
- Using the built-in frameworks
- Working with TensorFlow and Keras
- Working with PyTorch
- Working with Hugging Face
- Working with Apache Spark
- Chapter 8: Using Your Algorithms and Code
- Understanding how SageMaker invokes your code
- Customizing an existing framework container
- Setting up your build environment on EC2
- Building training and inference containers
- Using the SageMaker Training Toolkit with scikit-learn
- Building a fully custom container for scikit-learn
- Training with a fully custom container
- Deploying a fully custom container
- Building a fully custom container for R
- Coding with R and plumber
- Building a custom container
- Training and deploying a custom container on SageMaker
- Training and deploying with your own code on MLflow
- Installing MLflow
- Training a model with MLflow
- Building a SageMaker container with MLflow
- Building a fully custom container for SageMaker Processing
- Section 3: Diving Deeper into Training
- Chapter 9: Scaling Your Training Jobs
- Understanding when and how to scale
- Understanding what scaling means
- Adapting training time to business requirements
- Right-sizing training infrastructure
- Deciding when to scale
- Deciding how to scale
- Scaling a BlazingText training job
- Monitoring and profiling training jobs with Amazon SageMaker Debugger
- Viewing monitoring and profiling information in SageMaker Studio
- Enabling profiling in SageMaker Debugger
- Solving training challenges
- Streaming datasets with pipe mode
- Using pipe mode with built-in algorithms
- Using pipe mode with other algorithms and frameworks
- Simplifying data loading with MLIO
- Training factorization machines with pipe mode
- Distributing training jobs
- Understanding data parallelism and model parallelism
- Distributing training for built-in algorithms
- Distributing training for built-in frameworks
- Distributing training for custom containers
- Scaling an image classification model on ImageNet
- Preparing the ImageNet dataset
- Defining our training job
- Training on ImageNet
- Updating batch size
- Adding more instances
- Summing things up
- Training with the SageMaker data and model parallel libraries
- Training on TensorFlow with SageMaker DDP
- Training on Hugging Face with SageMaker DDP
- Training on Hugging Face with SageMaker DMP
- Using other storage services
- Working with SageMaker and Amazon EFS
- Working with SageMaker and Amazon FSx for Lustre
- Chapter 10: Advanced Training Techniques
- Optimizing training costs with managed spot training
- Comparing costs
- Understanding Amazon EC2 Spot Instances
- Understanding managed spot training
- Using managed spot training with object detection
- Using managed spot training and checkpointing with Keras
- Optimizing hyperparameters with automatic model tuning
- Understanding automatic model tuning
- Using automatic model tuning with object detection
- Using automatic model tuning with Keras
- Using automatic model tuning for architecture search
- Exploring models with SageMaker Debugger
- Debugging an XGBoost job
- Inspecting an XGBoost job
- Debugging and inspecting a Keras job
- Managing features and building datasets with SageMaker Feature Store
- Engineering features with SageMaker Processing
- Creating a feature group
- Ingesting features
- Querying features to build a dataset
- Exploring other capabilities of SageMaker Feature Store
- Detecting bias in datasets and explaining predictions with SageMaker Clarify
- Configuring a bias analysis with SageMaker Clarify
- Running a bias analysis
- Analyzing bias metrics
- Running an explainability analysis
- Mitigating bias
- Section 4: Managing Models in Production
- Chapter 11: Deploying Machine Learning Models
- Examining model artifacts and exporting models
- Examining and exporting built-in models
- Examining and exporting built-in CV models
- Examining and exporting XGBoost models
- Examining and exporting scikit-learn models
- Examining and exporting TensorFlow models
- Examining and exporting Hugging Face models
- Deploying models on real-time endpoints
- Managing endpoints with the SageMaker SDK
- Managing endpoints with the boto3 SDK
- Deploying models on batch transformers
- Deploying models on inference pipelines
- Monitoring prediction quality with Amazon SageMaker Model Monitor
- Capturing data
- Creating a baseline
- Notes:
- Description based on print version record.
- Includes bibliographical references and index.
- ISBN:
- 9781801814157
- OCLC:
- 1281955521