AWS Certified Data Engineer Study Guide : Associate (DEA-C01) Exam.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Humair, Syed.
Contributor:
Gumbo, Chenjerai.
Gatt, Adam.
Abbasi, Asif.
Nair, Lakshmi.
Series:
Sybex Study Guide Series
Language:
English
Subjects (All):
Electronic data processing personnel--Certification--Examinations--Study guides.
Electronic data processing personnel.
Computer networks--Management--Examinations--Study guides.
Computer networks.
Computer systems--Examinations--Study guides.
Computer systems.
Cloud computing--Examinations--Study guides.
Cloud computing.
Computer technicians--Certification.
Computer technicians.
Physical Description:
1 online resource (659 pages)
Edition:
1st ed.
Place of Publication:
Newark : John Wiley & Sons, Incorporated, 2025.
Summary:
Your complete guide to preparing for the AWS® Certified Data Engineer: Associate exam.
The AWS® Certified Data Engineer Study Guide is your one-stop resource for complete coverage of the challenging DEA-C01 Associate exam. This Sybex Study Guide covers 100% of the DEA-C01 objectives. Prepare for the exam faster and smarter with Sybex thanks to accurate content, including an assessment test that validates and measures exam readiness, real-world examples and scenarios, practical exercises, and challenging chapter review questions. Reinforce and retain what you've learned with the Sybex online learning environment and test bank, accessible across multiple devices. Get ready for the AWS Certified Data Engineer exam - quickly and efficiently - with Sybex.
Coverage of 100% of all exam objectives in this Study Guide means you'll be ready for:
• Data Ingestion and Transformation
• Data Store Management
• Data Operations and Support
• Data Security and Governance
ABOUT THE AWS DATA ENGINEER - ASSOCIATE CERTIFICATION
The AWS Data Engineer - Associate certification validates skills and knowledge in core data-related Amazon Web Services. It recognizes your ability to implement data pipelines and to monitor, troubleshoot, and optimize cost and performance issues in accordance with best practices.
Interactive learning environment
Take your exam prep to the next level with Sybex's superior interactive online study tools. To access our learning environment, simply visit www.wiley.com/go/sybextestprep, register your book to receive your unique PIN, and instantly gain one year of FREE access after activation to:
• Interactive test bank with 5 practice exams to help you identify areas where further review is needed. Get more than 90% of the answers correct, and you're ready to take the certification exam.
• 100 electronic flashcards to reinforce learning and support last-minute prep before the exam.
• Comprehensive glossary in PDF format that gives you instant access to the key terms so you are fully prepared.
Contents:
Cover
Title Page
Copyright
Acknowledgments
About the Authors
Contents at a Glance
Contents
Foreword
Introduction
The AWS Certified Data Engineering Associate Certification
The Purpose of This Book
The AWS Certified Data Engineer - Associate Exam
Study Guide Features
Interactive Online Learning Environment and TestBank
AWS Certified Data Engineer - Associate Exam (DEA-C01) Objectives
How to Contact the Publisher
Assessment Test
Answers to Assessment Test
Chapter 1 Streaming and Batch Data Ingestion
The Evolution of Application Architectures and Data Stores
Introduction to the Modern Data Architecture
Introduction to Data Ingestion
Data Generation
Understanding Data Sources and Storage
Ingestion Patterns and AWS Services
Data Ingestion
Streaming Ingestion
Amazon Kinesis Introduction
Amazon Kinesis Data Streams
Amazon Data Firehose
Amazon Managed Service for Apache Flink
Amazon Managed Streaming for Apache Kafka
Comparison of Streaming Services
Batch Ingestion
AWS Glue
Amazon Data Migration Service
AWS DataSync
Large-Scale Data Transfer Solutions
AWS Direct Connect
Summary
Exam Essentials
Review Questions
Chapter 2 Building Automated Data Pipelines
Introduction to Automated Data Pipelines
Data Pipeline Orchestration
AWS Step Functions
Amazon Managed Workflows for Apache Airflow
AWS Glue Workflows for Data Pipeline Orchestration
When to Use What (AWS Step Functions, Amazon MWAA, or AWS Glue)
Best Practices for Data Pipelines Orchestration
Supporting AWS Services for Enhanced Orchestration
AWS Lambda
Amazon EventBridge
Notification and Queuing Services
Applying Programming Concepts
CI/CD
Using AWS SAM for Serverless Data Pipeline Deployment
SQL Queries in Data Pipeline Orchestration
Infrastructure as Code for Repeatable Data Pipeline Deployments
Data Structures and Algorithms
Optimizing Code to Reduce Runtime for Data Ingestion and Transformation
Structuring SQL Queries to Meet Data Pipeline Requirements
Using Git Commands for Data Pipeline Development
Testing and Debugging Techniques
Logging, Monitoring, and Auditing for Data Pipeline Orchestration
Extracting Logs for Audits
Deploying Logging and Monitoring Solutions
Using Notifications for Alerts
Case Studies and Real-World Examples
Case Study 1: Batch Data Processing Pipeline for Financial Transactions
Case Study 2: Real-Time Streaming Data Pipeline for IoT Sensor Data
Case Study 3: Data Lake Ingestion and Processing Pipeline
Case Study 4: Machine Learning Pipeline for Image Classification
Case Study 5: ETL Pipeline for Data Warehouse Loading
Chapter 3 Data Transformation
Introduction to Data Integration
Data Transformation
Amazon Athena
Amazon Redshift
Amazon Elastic MapReduce
Stream Processing
Introduction to Containers
Amazon ECS
Amazon Elastic Container Registry
Amazon Elastic Kubernetes Service
Optimizing Container Usage
API-Driven Data Pipelines on AWS
Amazon API Gateway
Introduction to Data Quality
Data Quality Challenges to Overcome
AWS Glue Data Quality
Data Quality Definition Language
Data Quality in Transit
Alerts and Monitoring
Extended Use Cases
Chapter 4 Storage Services
Introduction to Data Stores
Storage Platforms
Object Storage
Block Storage
Cloud File Storage
Comparison of Storage Types
Data Storage Formats
Storage Services and Configurations for Specific Performance Demands
Amazon Simple Storage Service
S3 Storage Classes
Amazon S3 Store Management
Access Management and Security
Data Processing
Storage Logging and Monitoring
Analytics and Insights
Strong Consistency
Accessing S3
Paying for Amazon S3
Getting Started with S3 Buckets
Amazon Elastic Block Storage
EBS Volume Types
Data Protection
Amazon Elastic File System
Amazon EFS Features
Amazon File Cache
Amazon File Cache Features
Amazon FSx
Implementing the Appropriate Storage Services for Specific Cost and Performance Requirements
Aligning Data Storage with Data Migration Requirements
Determining the Appropriate Storage Solution for Specific Access Patterns
Managing the Lifecycle of Data
Legal and Compliance Requirements
Cost Optimization
Performance Improvement
Data Security
Disaster Recovery and Business Continuity
Selecting the Appropriate Storage Solutions for Hot and Cold Data
Optimizing Storage Costs Based on the Data Lifecycle
Deleting Data to Meet Business and Legal Requirements
Implementing Data Retention Policies and Archiving Strategies
Protecting Data with Appropriate Resiliency and Availability
Chapter 5 Databases and Data Warehouses on AWS
What Is a Data Warehouse?
Redshift Architecture
Data API
Data Distribution
Data Sorting
Vacuum
Compression Encoding
SQL Query Optimization
Workload Management (WLM)
Data Sharing
Cluster Resizing
Loading Data into Redshift
Unloading Data from Redshift
Transforming Data in Redshift
Materialized Views
Data Modeling on Redshift
Data Security in Redshift
Amazon DynamoDB
What Is a NoSQL Database?
DynamoDB Main Concepts
DynamoDB Read Consistency
Global Tables
DynamoDB Read/Write Capacity
DynamoDB Accelerator
Amazon Relational Database Service
RDS Scalability
RDS Availability
Amazon Aurora
Amazon Neptune
What Is a Graph Database?
Amazon DocumentDB (with MongoDB Compatibility)
What Is a Document Database?
Amazon DocumentDB Architecture
Connecting to Amazon DocumentDB
Amazon MemoryDB for Redis
What Is Redis?
What Is Amazon MemoryDB?
Amazon Keyspaces (for Apache Cassandra)
What Is Apache Cassandra?
What Is a Wide-Column Store Database?
What Is Amazon Keyspaces?
AWS Database Comparison
Chapter 6 Data Catalogs
Data Catalogs
Benefits of Data Catalogs
AWS Glue Data Catalog
Data Quality at Rest
Rule Recommendations
Data Quality Rules
Chapter 7 Visualizing Your Data
Introduction to Data Visualization
Types of Data Visualizations
Bar Charts
Histograms
Line Charts
Scatter Plots
Pie and Donut Charts
Heatmaps
Treemaps
Geospatial Maps
Principles of Effective Data Visualization
Trade-offs Between Provisioned and Serverless Services
AWS Services for Data Analysis and Visualization
Visualizing Data with Amazon QuickSight
AWS Glue DataBrew
Amazon SageMaker Data Wrangler
Advanced SQL Techniques for Data Analysis
The Evolution of SQL in Modern Data Platforms
Complex Join Operations: Beyond Simple Table Combinations
Window Functions: The Analytical Powerhouse
The Art of Subqueries and Common Table Expressions
Pivot and Unpivot: Transforming Data Perspectives
The Future of SQL Analysis
Data Cleansing and Preparation
Common Data Quality Issues in Modern Analytics
Data Cleansing and Transformation Techniques
Data Aggregation and Transformation Techniques
Best Practices for Data Analysis
Chapter 8 Monitoring and Auditing Data
How to Log Application Data
Logging Best Practices
Logging Levels
Cautions and Exclusions
Special Data Types
Access and Change Management
How to Log Access to AWS Services
AWS CloudTrail
Amazon CloudWatch
CloudWatch Logs
VPC Flow Logs
AWS X-Ray
Amazon Macie
Analyzing Logs Using AWS Services
Chapter 9 Maintaining and Troubleshooting Data Operations
Introduction to Automating Data Processing Using AWS Services
Maintaining and Troubleshooting Data Processing
API Calls for Data Processing
Services That Accept Scripting
Orchestrating Data Pipelines
Troubleshooting Amazon Managed Workflows
Airflow Logs
Audit Logs
Monitoring and Alarms
Using AWS Services for Data Processing
Consuming and Maintaining Data APIs
Amazon Redshift Data API
Calling the Data API and Available Commands
Example Statements and Their Output
Considerations When Using the Redshift Data API
Amazon Redshift Data API Use Case
Monitoring the Redshift Data API
Troubleshooting Common Issues for the Redshift Data API
Using Lambda for Data Processing
Chapter 10 Authentication and Authorization
Introduction to Authentication
API Endpoints
AWS Identity and Access Management
IAM Users and Groups
IAM Roles
Access Keys and Credentials
Multi-Factor Authentication
AWS Security Token Service
Assuming Roles
Federation
Amazon Cognito
Kerberos-Based Authentication
Data Services Authentication Mechanisms
Authentication in the Data Engineering Exam
Introduction to Authorization
Notes:
Description based on publisher supplied metadata and other sources.
ISBN:
9781394286591
1394286597
9781394286607
1394286600
OCLC:
1507845637

