MCA Microsoft Certified Associate Azure Data Engineer Study Guide : Exam DP-203.
- Format:
- Book
- Author/Creator:
- Perkins, Benjamin.
- Series:
- Sybex Study Guide Series
- Language:
- English
- Subjects (All):
- Microsoft Azure (Computing platform)--Examinations--Study guides.
- Microsoft Azure (Computing platform).
- Database management--Examinations--Study guides.
- Database management.
- Physical Description:
- 1 online resource (1011 pages)
- Edition:
- 1st ed.
- Place of Publication:
- Newark : John Wiley & Sons, Incorporated, 2023.
- Summary:
- Prepare for the Azure Data Engineering certification, and an exciting new career in analytics, with this must-have study aid. In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech. In the book, you'll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. You'll get up to speed quickly and efficiently on integrating, transforming, and consolidating data from various structured and unstructured data systems into a structure that is suitable for building analytics solutions, using Sybex's easy-to-use study aids and tools. This Study Guide also offers: career-ready advice for anyone hoping to ace their first data engineering job interview and excel on their first day in the field; indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety; and complimentary access to Sybex's expansive online study tools, accessible across multiple devices, with hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms. A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.
- Contents:
- Cover Page
- Title Page
- Copyright Page
- Acknowledgments
- About the Author
- About the Technical Editor
- Contents at a Glance
- Contents
- Table of Exercises
- Introduction
- Part I Azure Data Engineer Certification and Azure Products
- Chapter 1 Gaining the Azure Data Engineer Associate Certification
- The Journey to Certification
- How to Pass Exam DP-203
- Understanding the Exam Expectations and Requirements
- Use Azure Daily
- Read Azure Articles to Stay Current
- Have an Understanding of All Azure Products
- Azure Product Name Recognition
- Azure Data Analytics
- Azure Synapse Analytics
- Azure Databricks
- Azure HDInsight
- Azure Analysis Services
- Azure Data Factory
- Azure Event Hubs
- Azure Stream Analytics
- Other Products
- Azure Storage Products
- Azure Data Lake Storage
- Azure Storage
- Azure Databases
- Azure Cosmos DB
- Azure SQL Server Products
- Additional Azure Databases
- Azure Security
- Azure Active Directory
- Role-Based Access Control
- Attribute-Based Access Control
- Azure Key Vault
- Azure Networking
- Virtual Networks
- Azure Compute
- Azure Virtual Machines
- Azure Virtual Machine Scale Sets
- Azure App Service Web Apps
- Azure Functions
- Azure Batch
- Azure Management and Governance
- Azure Monitor
- Azure Purview
- Azure Policy
- Azure Blueprints (Preview)
- Azure Lighthouse
- Azure Cost Management and Billing
- Summary
- Exam Essentials
- Review Questions
- Chapter 2 CREATE DATABASE dbName
- The Brainjammer
- A Historical Look at Data
- Variety
- Velocity
- Volume
- Data Locations
- Data File Formats
- Data Structures, Types, and Concepts
- Data Structures
- Data Types and Management
- Data Concepts
- Data Programming and Querying for Data Engineers
- Data Programming
- Querying Data
- Understanding Big Data Processing
- Big Data Stages
- ETL, ELT, ELTL
- Analytics Types
- Big Data Layers
- Part II Design and Implement Data Storage
- Chapter 3 Data Sources and Ingestion
- Where Does Data Come From?
- Design a Data Storage Structure
- Design an Azure Data Lake Solution
- Recommended File Types for Storage
- Recommended File Types for Analytical Queries
- Design for Efficient Querying
- Design for Data Pruning
- Design a Folder Structure That Represents the Levels of Data Transformation
- Design a Distribution Strategy
- Design a Data Archiving Solution
- Design a Partition Strategy
- Design a Partition Strategy for Files
- Design a Partition Strategy for Analytical Workloads
- Design a Partition Strategy for Efficiency and Performance
- Design a Partition Strategy for Azure Synapse Analytics
- Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2
- Design the Serving/Data Exploration Layer
- Design Star Schemas
- Design Slowly Changing Dimensions
- Design a Dimensional Hierarchy
- Design a Solution for Temporal Data
- Design for Incremental Loading
- Design Analytical Stores
- Design Metastores in Azure Synapse Analytics and Azure Databricks
- The Ingestion of Data into a Pipeline
- Event Hubs and IoT Hub
- Apache Kafka for HDInsight
- Migrating and Moving Data
- Chapter 4 The Storage of Data
- Implement Physical Data Storage Structures
- Implement Compression
- Implement Partitioning
- Implement Sharding
- Implement Different Table Geometries with Azure Synapse Analytics Pools
- Implement Data Redundancy
- Implement Distributions
- Implement Data Archiving
- Azure Synapse Analytics Develop Hub
- Implement Logical Data Structures
- Build a Temporal Data Solution
- Build a Slowly Changing Dimension
- Build a Logical Folder Structure
- Build External Tables
- Implement File and Folder Structures for Efficient Querying and Data Pruning
- Implement a Partition Strategy
- Implement a Partition Strategy for Files
- Implement a Partition Strategy for Analytical Workloads
- Implement a Partition Strategy for Streaming Workloads
- Implement a Partition Strategy for Azure Synapse Analytics
- Design and Implement the Data Exploration Layer
- Deliver Data in a Relational Star Schema
- Deliver Data in Parquet Files
- Maintain Metadata
- Implement a Dimensional Hierarchy
- Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster
- Recommend Azure Synapse Analytics Database Templates
- Implement Azure Synapse Analytics Database Templates
- Additional Data Storage Topics
- Storing Raw Data in Azure Databricks for Transformation
- Storing Data Using Azure HDInsight
- Storing Prepared, Trained, and Modeled Data
- Part III Develop Data Processing
- Chapter 5 Transform, Manage, and Prepare Data
- Ingest and Transform Data
- Transform Data Using Azure Synapse Pipelines
- Transform Data Using Azure Data Factory
- Transform Data Using Apache Spark
- Transform Data Using Transact-SQL
- Transform Data Using Stream Analytics
- Cleanse Data
- Split Data
- Shred JSON
- Encode and Decode Data
- Configure Error Handling for the Transformation
- Normalize and Denormalize Values
- Transform Data by Using Scala
- Perform Exploratory Data Analysis
- Transformation and Data Management Concepts
- Transformation
- Data Management
- Data Modeling and Usage
- Data Modeling with Machine Learning
- Usage
- Chapter 6 Create and Manage Batch Processing and Pipelines
- Design and Develop a Batch Processing Solution
- Design a Batch Processing Solution
- Develop Batch Processing Solutions
- Create Data Pipelines
- Handle Duplicate Data
- Handle Missing Data
- Handle Late-Arriving Data
- Upsert Data
- Configure the Batch Size
- Configure Batch Retention
- Design and Develop Slowly Changing Dimensions
- Design and Implement Incremental Data Loads
- Integrate Jupyter/IPython Notebooks into a Data Pipeline
- Revert Data to a Previous State
- Handle Security and Compliance Requirements
- Design and Create Tests for Data Pipelines
- Scale Resources
- Design and Configure Exception Handling
- Debug Spark Jobs Using the Spark UI
- Implement Azure Synapse Link and Query the Replicated Data
- Use PolyBase to Load Data to a SQL Pool
- Read from and Write to a Delta Table
- Manage Batches and Pipelines
- Trigger Batches
- Schedule Data Pipelines
- Validate Batch Loads
- Implement Version Control for Pipeline Artifacts
- Manage Data Pipelines
- Manage Spark Jobs in a Pipeline
- Handle Failed Batch Loads
- Chapter 7 Design and Implement a Data Stream Processing Solution
- Develop a Stream Processing Solution
- Design a Stream Processing Solution
- Create a Stream Processing Solution
- Process Time Series Data
- Design and Create Windowed Aggregates
- Process Data Within One Partition
- Process Data Across Partitions
- Handle Schema Drift
- Configure Checkpoints/Watermarking During Processing
- Replay Archived Stream Data
- Monitor for Performance and Functional Regressions
- Optimize Pipelines for Analytical or Transactional Purposes
- Handle Interruptions
- Transform Data Using Azure Stream Analytics
- Monitor Data Storage and Data Processing
- Monitor Stream Processing
- Part IV Secure, Monitor, and Optimize Data Storage and Data Processing
- Chapter 8 Keeping Data Safe and Secure
- Design Security for Data Policies and Standards
- Design a Data Auditing Strategy
- Design a Data Retention Policy
- Design for Data Privacy
- Design to Purge Data Based on Business Requirements
- Design Data Encryption for Data at Rest and in Transit
- Design Row-Level and Column-Level Security
- Design a Data Masking Strategy
- Design Access Control for Azure Data Lake Storage Gen2
- Implement Data Security
- Implement a Data Auditing Strategy
- Manage Sensitive Information
- Implement a Data Retention Policy
- Encrypt Data at Rest and in Motion
- Implement Row-Level and Column-Level Security
- Implement Data Masking
- Manage Identities, Keys, and Secrets Across Different Data Platform Technologies
- Implement Access Control for Azure Data Lake Storage Gen2
- Implement Secure Endpoints (Private and Public)
- Implement Resource Tokens in Azure Databricks
- Load a DataFrame with Sensitive Information
- Write Encrypted Data to Tables or Parquet Files
- Develop a Batch Processing Solution
- Browse and Search Metadata in Microsoft Purview Data Catalog
- Push New or Updated Data Lineage to Microsoft Purview
- Chapter 9 Monitoring Azure Data Storage and Processing
- Monitoring Data Storage and Data Processing
- Implement Logging Used by Azure Monitor
- Notes:
- Description based on publisher supplied metadata and other sources.
- Includes index.
- ISBN:
- 9781119885436
- 1119885434
- 9781119885443
- 1119885442
- OCLC:
- 1392402601