Serverless machine learning with Amazon Redshift ML : create, train, and deploy machine learning models using familiar SQL commands / Debu Panda [and four others].
- Format:
- Book
- Author/Creator:
- Panda, Debu, author.
- Language:
- English
- Subjects (All):
- Amazon Web Services (Firm).
- Machine learning.
- Cloud computing.
- Physical Description:
- 1 online resource (290 pages)
- Edition:
- 1st ed.
- Place of Publication:
- Birmingham, England : Packt Publishing, [2023]
- Biography/History:
- Panda, Debu: Debu Panda, a Senior Manager, Product Management at AWS, is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world. Debu has published numerous articles on analytics, enterprise Java, and databases and has presented at multiple conferences such as re:Invent, Oracle OpenWorld, and JavaOne. He is the lead author of EJB 3 in Action (Manning Publications, 2007, 2014) and Middleware Management (Packt, 2009).
- Bates, Phil: Phil Bates is a Senior Analytics Specialist Solutions Architect at AWS. He has more than 25 years of experience implementing large-scale data warehouse solutions. He is passionate about helping customers through their cloud journey and leveraging the power of ML within their data warehouse.
- Pittampally, Bhanu: Bhanu Pittampally is an Analytics Specialist Solutions Architect at Amazon Web Services. His background is in data and analytics, and he has worked in the field for over 16 years. He currently lives in Frisco, TX, with his wife Kavitha and daughters Vibha and Medha.
- Joshi, Sumeet: Sumeet Joshi is an Analytics Specialist Solutions Architect based out of New York. He specializes in building large-scale data warehousing solutions. He has over 17 years of experience in the data warehousing and analytical space.
- Summary:
- Amazon Redshift Serverless enables organizations to run petabyte-scale cloud data warehouses quickly and cost-effectively, so that data science professionals can efficiently deploy cloud data warehouses and use easy-to-use tools to train models and run predictions. This practical guide will help developers and data professionals working with Amazon Redshift data warehouses to put their SQL knowledge to work for training and deploying machine learning models. The book begins by helping you to explore the inner workings of Redshift Serverless as well as the foundations of data analytics and types of machine learning. With the help of step-by-step explanations of essential concepts and practical examples, you'll then learn to build your own classification and regression models. As you advance, you'll find out how to deploy various types of machine learning projects using familiar SQL code, before delving into Redshift ML. In the concluding chapters, you'll discover best practices for implementing serverless architecture with Redshift. By the end of this book, you'll be able to configure and deploy Amazon Redshift Serverless, train and deploy machine learning models using Amazon Redshift ML, and run inference queries at scale.
- Contents:
- Cover
- Title page
- Copyright
- Dedication
- Foreword
- Contributors
- Table of Contents
- Preface
- Part 1: Redshift Overview: Getting Started with Redshift Serverless and an Introduction to Machine Learning
- Chapter 1: Introduction to Amazon Redshift Serverless
- What is Amazon Redshift?
- Getting started with Amazon Redshift Serverless
- What is a namespace?
- What is a workgroup?
- Connecting to your data warehouse
- Using Amazon Redshift query editor v2
- Loading sample data
- Running your first query
- Summary
- Chapter 2: Data Loading and Analytics on Redshift Serverless
- Technical requirements
- Data loading using Amazon Redshift Query Editor v2
- Creating tables
- Loading data from Amazon S3
- Loading data from a local drive
- Data loading from Amazon S3 using the COPY command
- Loading data from a Parquet file
- Automating file ingestion with a COPY job
- Best practices for the COPY command
- Data loading using the Redshift Data API
- Creating table
- Loading data using the Redshift Data API
- Chapter 3: Applying Machine Learning in Your Data Warehouse
- Understanding the basics of ML
- Comparing supervised and unsupervised learning
- Classification
- Regression
- Traditional steps to implement ML
- Data preparation
- Evaluating an ML model
- Overcoming the challenges of implementing ML today
- Exploring the benefits of ML
- Part 2: Getting Started with Redshift ML
- Chapter 4: Leveraging Amazon Redshift ML
- Why Amazon Redshift ML?
- An introduction to Amazon Redshift ML
- A CREATE MODEL overview
- AUTO everything
- AUTO with user guidance
- XGBoost (AUTO OFF)
- K-means (AUTO OFF)
- BYOM
- Chapter 5: Building Your First Machine Learning Model
- Redshift ML simple CREATE MODEL
- Uploading and analyzing the data
- Diving deep into the Redshift ML CREATE MODEL syntax
- Creating your first machine learning model
- Evaluating model performance
- Checking the Redshift ML objectives
- Running predictions
- Comparing ground truth to predictions
- Feature importance
- Model performance
- Chapter 6: Building Classification Models
- An introduction to classification algorithms
- Diving into the Redshift CREATE MODEL syntax
- Training a binary classification model using the XGBoost algorithm
- Establishing the business problem
- Uploading and analyzing the data
- Using XGBoost to train a binary classification model
- Prediction probabilities
- Training a multi-class classification model using the Linear Learner model type
- Using Linear Learner to predict the customer segment
- Evaluating the model quality
- Running prediction queries
- Exploring other CREATE MODEL options
- Chapter 7: Building Regression Models
- Introducing regression algorithms
- Redshift's CREATE MODEL with user guidance
- Creating a simple linear regression model using XGBoost
- Splitting data into training and validation sets
- Creating a simple linear regression model
- Creating multi-input regression models
- Linear Learner algorithm
- Understanding model evaluation
- Prediction query
- Chapter 8: Building Unsupervised Models with K-Means Clustering
- Grouping data through cluster analysis
- Determining the optimal number of clusters
- Creating a K-means ML model
- Creating a model syntax overview for K-means clustering
- Creating the K-means model
- Evaluating the results of the K-means clustering
- Summary
- Part 3: Deploying Models with Redshift ML
- Chapter 9: Deep Learning with Redshift ML
- Introduction to deep learning
- Business problem
- Prediction goal
- Splitting data into training and test datasets
- Creating a multiclass classification model using MLP
- Chapter 10: Creating a Custom ML Model with XGBoost
- Introducing XGBoost
- Introducing an XGBoost use case
- Defining the business problem
- Uploading, analyzing, and preparing data for training
- Splitting data into train and test datasets
- Preprocessing the input variables
- Creating a model using XGBoost with Auto Off
- Creating a binary classification model using XGBoost
- Generating predictions and evaluating model performance
- Chapter 11: Bringing Your Own Models for Database Inference
- Benefits of BYOM
- Supported model types
- Creating the BYOM local inference model
- Creating a local inference model
- Running local inference on Redshift
- BYOM using a SageMaker endpoint for remote inference
- Creating BYOM remote inference
- Generating the BYOM remote inference command
- Chapter 12: Time-Series Forecasting in Your Data Warehouse
- Forecasting and time-series data
- Types of forecasting methods
- What is time-series forecasting?
- Time trending data
- Seasonality
- Structural breaks
- What is Amazon Forecast?
- Configuration and security
- Creating forecasting models using Redshift ML
- Creating a table with output results
- Chapter 13: Operationalizing and Optimizing Amazon Redshift ML Models
- Operationalizing your ML models
- Model retraining process without versioning
- The model retraining process with versioning
- Automating the CREATE MODEL statement for versioning
- Optimizing the Redshift models' accuracy
- Model quality
- Model explainability
- Probabilities
- Using SageMaker Autopilot notebooks
- Index
- About Packt
- Other Books You May Enjoy
- Notes:
- Includes index.
- Description based on print version record.
- ISBN:
- 9781804619698
- 1804619698
- OCLC:
- 1396226259