Machine learning engineering on AWS : build, scale, and secure machine learning systems and MLOps pipelines in production / Joshua Arvin Lat.
- Format:
- Book
- Author/Creator:
- Lat, Joshua Arvin, author.
- Language:
- English
- Subjects (All):
- Machine learning.
- Physical Description:
- 1 online resource (530 pages)
- Edition:
- First edition.
- Place of Publication:
- Birmingham, England : Packt Publishing Ltd., [2022]
- Summary:
- Work seamlessly with production-ready machine learning systems and pipelines on AWS by addressing key pain points encountered in the ML life cycle.
- Key Features: Gain practical knowledge of managing ML workloads on AWS using Amazon SageMaker, Amazon EKS, and more. Use container and serverless services to solve a variety of ML engineering requirements. Design, build, and secure automated MLOps pipelines and workflows on AWS.
- Book Description: There is a growing need for professionals with experience in working on machine learning (ML) engineering requirements as well as those with knowledge of automating complex MLOps pipelines in the cloud. This book explores a variety of AWS services, such as Amazon Elastic Kubernetes Service, AWS Glue, AWS Lambda, Amazon Redshift, and AWS Lake Formation, which ML practitioners can leverage to meet various data engineering and ML engineering requirements in production. This machine learning book covers the essential concepts as well as step-by-step instructions that are designed to help you get a solid understanding of how to manage and secure ML workloads in the cloud. As you progress through the chapters, you'll discover how to use several container and serverless solutions when training and deploying TensorFlow and PyTorch deep learning models on AWS. You'll also delve into proven cost optimization techniques as well as data privacy and model privacy preservation strategies in detail as you explore best practices when using each AWS service. By the end of this AWS book, you'll be able to build, scale, and secure your own ML systems and pipelines, which will give you the experience and confidence needed to architect custom solutions using a variety of AWS services for ML engineering requirements.
- What you will learn: Find out how to train and deploy TensorFlow and PyTorch models on AWS. Use containers and serverless services for ML engineering requirements. Discover how to set up a serverless data warehouse and data lake on AWS. Build automated end-to-end MLOps pipelines using a variety of services. Use AWS Glue DataBrew and SageMaker Data Wrangler for data engineering. Explore different solutions for deploying deep learning models on AWS. Apply cost optimization techniques to ML environments and systems. Preserve data privacy and model privacy using a variety of techniques.
- Who this book is for: This book is for machine learning engineers, data scientists, and AWS cloud engineers interested in working on production data engineering, machine learning engineering, and MLOps requirements using a variety of AWS services such as Amazon EC2, Amazon Elastic Kubernetes Service (EKS), Amazon SageMaker, AWS Glue, Amazon Redshift, AWS Lake Formation, and AWS Lambda. All you need is an AWS account to get started. Prior knowledge of AWS, machine learning, and the Python programming language will help you to grasp the concepts covered in this book more effectively.
- Contents:
- Cover
- Title Page
- Copyright and Credits
- Contributors
- Table of Contents
- Preface
- Part 1: Getting Started with Machine Learning Engineering on AWS
- Chapter 1: Introduction to ML Engineering on AWS
- Technical requirements
- What is expected from ML engineers?
- How ML engineers can get the most out of AWS
- Essential prerequisites
- Creating the Cloud9 environment
- Increasing Cloud9's storage
- Installing the Python prerequisites
- Preparing the dataset
- Generating a synthetic dataset using a deep learning model
- Exploratory data analysis
- Train-test split
- Uploading the dataset to Amazon S3
- AutoML with AutoGluon
- Setting up and installing AutoGluon
- Performing your first AutoGluon AutoML experiment
- Getting started with SageMaker and SageMaker Studio
- Onboarding with SageMaker Studio
- Adding a user to an existing SageMaker Domain
- No-code machine learning with SageMaker Canvas
- AutoML with SageMaker Autopilot
- Summary
- Further reading
- Chapter 2: Deep Learning AMIs
- Getting started with Deep Learning AMIs
- Launching an EC2 instance using a Deep Learning AMI
- Locating the framework-specific DLAMI
- Choosing the instance type
- Ensuring a default secure configuration
- Launching the instance and connecting to it using EC2 Instance Connect
- Downloading the sample dataset
- Training an ML model
- Loading and evaluating the model
- Cleaning up
- Understanding how AWS pricing works for EC2 instances
- Using multiple smaller instances to reduce the overall cost of running ML workloads
- Using spot instances to reduce the cost of running training jobs
- Chapter 3: Deep Learning Containers
- Getting started with AWS Deep Learning Containers
- Essential prerequisites
- Preparing the Cloud9 environment
- Using AWS Deep Learning Containers to train an ML model
- Serverless ML deployment with Lambda's container image support
- Building the custom container image
- Testing the container image
- Pushing the container image to Amazon ECR
- Running ML predictions on AWS Lambda
- Completing and testing the serverless API setup
- Part 2: Solving Data Engineering and Analysis Requirements
- Chapter 4: Serverless Data Management on AWS
- Getting started with serverless data management
- Preparing the essential prerequisites
- Opening a text editor on your local machine
- Creating an IAM user
- Creating a new VPC
- Uploading the dataset to S3
- Running analytics at scale with Amazon Redshift Serverless
- Setting up a Redshift Serverless endpoint
- Opening Redshift query editor v2
- Creating a table
- Loading data from S3
- Querying the database
- Unloading data to S3
- Setting up Lake Formation
- Creating a database
- Creating a table using an AWS Glue Crawler
- Using Amazon Athena to query data in Amazon S3
- Setting up the query result location
- Running SQL queries using Athena
- Chapter 5: Pragmatic Data Processing and Analysis
- Getting started with data processing and analysis
- Downloading the Parquet file
- Preparing the S3 bucket
- Automating data preparation and analysis with AWS Glue DataBrew
- Creating a new dataset
- Creating and running a profile job
- Creating a project and configuring a recipe
- Creating and running a recipe job
- Verifying the results
- Preparing ML data with Amazon SageMaker Data Wrangler
- Accessing Data Wrangler
- Importing data
- Transforming the data
- Analyzing the data
- Exporting the data flow
- Turning off the resources
- Part 3: Diving Deeper with Relevant Model Training and Deployment Solutions
- Chapter 6: SageMaker Training and Debugging Solutions
- Getting started with the SageMaker Python SDK
- Creating a service limit increase request
- Training an image classification model with the SageMaker Python SDK
- Creating a new Notebook in SageMaker Studio
- Downloading the training, validation, and test datasets
- Uploading the data to S3
- Using the SageMaker Python SDK to train an ML model
- Using the %store magic to store data
- Using the SageMaker Python SDK to deploy an ML model
- Using the Debugger Insights Dashboard
- Utilizing Managed Spot Training and Checkpoints
- Chapter 7: SageMaker Deployment Solutions
- Getting started with model deployments in SageMaker
- Preparing the pre-trained model artifacts
- Preparing the SageMaker script mode prerequisites
- Preparing the inference.py file
- Preparing the requirements.txt file
- Preparing the setup.py file
- Deploying a pre-trained model to a real-time inference endpoint
- Deploying a pre-trained model to a serverless inference endpoint
- Deploying a pre-trained model to an asynchronous inference endpoint
- Creating the input JSON file
- Adding an artificial delay to the inference script
- Deploying and testing an asynchronous inference endpoint
- Deployment strategies and best practices
- Part 4: Securing, Monitoring, and Managing Machine Learning Systems and Environments
- Chapter 8: Model Monitoring and Management Solutions
- Technical prerequisites
- Registering models to SageMaker Model Registry
- Creating a new notebook in SageMaker Studio
- Registering models to SageMaker Model Registry using the boto3 library
- Deploying models from SageMaker Model Registry
- Enabling data capture and simulating predictions
- Scheduled monitoring with SageMaker Model Monitor
- Analyzing the captured data
- Deleting an endpoint with a monitoring schedule
- Chapter 9: Security, Governance, and Compliance Strategies
- Managing the security and compliance of ML environments
- Authentication and authorization
- Network security
- Encryption at rest and in transit
- Managing compliance reports
- Vulnerability management
- Preserving data privacy and model privacy
- Federated Learning
- Differential Privacy
- Privacy-preserving machine learning
- Other solutions and options
- Establishing ML governance
- Lineage Tracking and reproducibility
- Model inventory
- Model validation
- ML explainability
- Bias detection
- Model monitoring
- Traceability, observability, and auditing
- Data quality analysis and reporting
- Data integrity management
- Part 5: Designing and Building End-to-end MLOps Pipelines
- Chapter 10: Machine Learning Pipelines with Kubeflow on Amazon EKS
- Diving deeper into Kubeflow, Kubernetes, and EKS
- Preparing the IAM role for the EC2 instance of the Cloud9 environment
- Attaching the IAM role to the EC2 instance of the Cloud9 environment
- Updating the Cloud9 environment with the essential prerequisites
- Setting up Kubeflow on Amazon EKS
- Running our first Kubeflow pipeline
- Using the Kubeflow Pipelines SDK to build ML workflows
- Recommended strategies and best practices
- Further reading
- Chapter 11: Machine Learning Pipelines with SageMaker Pipelines
- Diving deeper into SageMaker Pipelines
- Running our first pipeline with SageMaker Pipelines
- Defining and preparing our first ML pipeline
- Running our first ML pipeline
- Creating Lambda functions for deployment
- Preparing the Lambda function for deploying a model to a new endpoint
- Preparing the Lambda function for checking whether an endpoint exists
- Preparing the Lambda function for deploying a model to an existing endpoint
- Testing our ML inference endpoint
- Completing the end-to-end ML pipeline
- Defining and preparing the complete ML pipeline
- Running the complete ML pipeline
- Index
- Other Books You May Enjoy
- Notes:
- Description based on publisher supplied metadata and other sources.
- Description based on print version record.
- ISBN:
- 9781523151516
- 152315151X
- 9781803231389
- 1803231386
- OCLC:
- 1348491798