MCA Microsoft Certified Associate Azure Data Engineer Study Guide : Exam DP-203.

O'Reilly Online Learning: Academic/Public Library Edition Available online

Format:
Book
Author/Creator:
Perkins, Benjamin.
Series:
Sybex Study Guide Series
Language:
English
Subjects (All):
Microsoft Azure (Computing platform)--Examinations--Study guides.
Microsoft Azure (Computing platform).
Database management--Examinations--Study guides.
Database management.
Physical Description:
1 online resource (1011 pages)
Edition:
1st ed.
Place of Publication:
Newark : John Wiley & Sons, Incorporated, 2023.
Summary:
Prepare for the Azure Data Engineering certification, and an exciting new career in analytics, with this must-have study aid.
In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech. In the book, you'll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. As you learn to integrate, transform, and consolidate data from various structured and unstructured data systems into a structure that is suitable for building analytics solutions, you'll get up to speed quickly and efficiently with Sybex's easy-to-use study aids and tools.
This Study Guide also offers:
Career-ready advice for anyone hoping to ace their first data engineering job interview and excel on their first day in the field
Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety
Complimentary access to Sybex's expansive online study tools, accessible across multiple devices, and offering access to hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms
A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.
Contents:
Cover Page
Title Page
Copyright Page
Acknowledgments
About the Author
About the Technical Editor
Contents at a Glance
Contents
Table of Exercises
Introduction
Part I Azure Data Engineer Certification and Azure Products
Chapter 1 Gaining the Azure Data Engineer Associate Certification
The Journey to Certification
How to Pass Exam DP-203
Understanding the Exam Expectations and Requirements
Use Azure Daily
Read Azure Articles to Stay Current
Have an Understanding of All Azure Products
Azure Product Name Recognition
Azure Data Analytics
Azure Synapse Analytics
Azure Databricks
Azure HDInsight
Azure Analysis Services
Azure Data Factory
Azure Event Hubs
Azure Stream Analytics
Other Products
Azure Storage Products
Azure Data Lake Storage
Azure Storage
Azure Databases
Azure Cosmos DB
Azure SQL Server Products
Additional Azure Databases
Azure Security
Azure Active Directory
Role-Based Access Control
Attribute-Based Access Control
Azure Key Vault
Azure Networking
Virtual Networks
Azure Compute
Azure Virtual Machines
Azure Virtual Machine Scale Sets
Azure App Service Web Apps
Azure Functions
Azure Batch
Azure Management and Governance
Azure Monitor
Azure Purview
Azure Policy
Azure Blueprints (Preview)
Azure Lighthouse
Azure Cost Management and Billing
Summary
Exam Essentials
Review Questions
Chapter 2 CREATE DATABASE dbName
The Brainjammer
A Historical Look at Data
Variety
Velocity
Volume
Data Locations
Data File Formats
Data Structures, Types, and Concepts
Data Structures
Data Types and Management
Data Concepts
Data Programming and Querying for Data Engineers
Data Programming
Querying Data
Understanding Big Data Processing
Big Data Stages
ETL, ELT, ELTL
Analytics Types
Big Data Layers
Part II Design and Implement Data Storage
Chapter 3 Data Sources and Ingestion
Where Does Data Come From?
Design a Data Storage Structure
Design an Azure Data Lake Solution
Recommended File Types for Storage
Recommended File Types for Analytical Queries
Design for Efficient Querying
Design for Data Pruning
Design a Folder Structure That Represents the Levels of Data Transformation
Design a Distribution Strategy
Design a Data Archiving Solution
Design a Partition Strategy
Design a Partition Strategy for Files
Design a Partition Strategy for Analytical Workloads
Design a Partition Strategy for Efficiency and Performance
Design a Partition Strategy for Azure Synapse Analytics
Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2
Design the Serving/Data Exploration Layer
Design Star Schemas
Design Slowly Changing Dimensions
Design a Dimensional Hierarchy
Design a Solution for Temporal Data
Design for Incremental Loading
Design Analytical Stores
Design Metastores in Azure Synapse Analytics and Azure Databricks
The Ingestion of Data into a Pipeline
Event Hubs and IoT Hub
Apache Kafka for HDInsight
Migrating and Moving Data
Chapter 4 The Storage of Data
Implement Physical Data Storage Structures
Implement Compression
Implement Partitioning
Implement Sharding
Implement Different Table Geometries with Azure Synapse Analytics Pools
Implement Data Redundancy
Implement Distributions
Implement Data Archiving
Azure Synapse Analytics Develop Hub
Implement Logical Data Structures
Build a Temporal Data Solution
Build a Slowly Changing Dimension
Build a Logical Folder Structure
Build External Tables
Implement File and Folder Structures for Efficient Querying and Data Pruning
Implement a Partition Strategy
Implement a Partition Strategy for Files
Implement a Partition Strategy for Analytical Workloads
Implement a Partition Strategy for Streaming Workloads
Implement a Partition Strategy for Azure Synapse Analytics
Design and Implement the Data Exploration Layer
Deliver Data in a Relational Star Schema
Deliver Data in Parquet Files
Maintain Metadata
Implement a Dimensional Hierarchy
Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster
Recommend Azure Synapse Analytics Database Templates
Implement Azure Synapse Analytics Database Templates
Additional Data Storage Topics
Storing Raw Data in Azure Databricks for Transformation
Storing Data Using Azure HDInsight
Storing Prepared, Trained, and Modeled Data
Part III Develop Data Processing
Chapter 5 Transform, Manage, and Prepare Data
Ingest and Transform Data
Transform Data Using Azure Synapse Pipelines
Transform Data Using Azure Data Factory
Transform Data Using Apache Spark
Transform Data Using Transact-SQL
Transform Data Using Stream Analytics
Cleanse Data
Split Data
Shred JSON
Encode and Decode Data
Configure Error Handling for the Transformation
Normalize and Denormalize Values
Transform Data by Using Scala
Perform Exploratory Data Analysis
Transformation and Data Management Concepts
Transformation
Data Management
Data Modeling and Usage
Data Modeling with Machine Learning
Usage
Chapter 6 Create and Manage Batch Processing and Pipelines
Design and Develop a Batch Processing Solution
Design a Batch Processing Solution
Develop Batch Processing Solutions
Create Data Pipelines
Handle Duplicate Data
Handle Missing Data
Handle Late-Arriving Data
Upsert Data
Configure the Batch Size
Configure Batch Retention
Design and Develop Slowly Changing Dimensions
Design and Implement Incremental Data Loads
Integrate Jupyter/IPython Notebooks into a Data Pipeline
Revert Data to a Previous State
Handle Security and Compliance Requirements
Design and Create Tests for Data Pipelines
Scale Resources
Design and Configure Exception Handling
Debug Spark Jobs Using the Spark UI
Implement Azure Synapse Link and Query the Replicated Data
Use PolyBase to Load Data to a SQL Pool
Read from and Write to a Delta Table
Manage Batches and Pipelines
Trigger Batches
Schedule Data Pipelines
Validate Batch Loads
Implement Version Control for Pipeline Artifacts
Manage Data Pipelines
Manage Spark Jobs in a Pipeline
Handle Failed Batch Loads
Chapter 7 Design and Implement a Data Stream Processing Solution
Develop a Stream Processing Solution
Design a Stream Processing Solution
Create a Stream Processing Solution
Process Time Series Data
Design and Create Windowed Aggregates
Process Data Within One Partition
Process Data Across Partitions
Handle Schema Drift
Configure Checkpoints/Watermarking During Processing
Replay Archived Stream Data
Monitor for Performance and Functional Regressions
Optimize Pipelines for Analytical or Transactional Purposes
Handle Interruptions
Transform Data Using Azure Stream Analytics
Monitor Data Storage and Data Processing
Monitor Stream Processing
Part IV Secure, Monitor, and Optimize Data Storage and Data Processing
Chapter 8 Keeping Data Safe and Secure
Design Security for Data Policies and Standards
Design a Data Auditing Strategy
Design a Data Retention Policy
Design for Data Privacy
Design to Purge Data Based on Business Requirements
Design Data Encryption for Data at Rest and in Transit
Design Row-Level and Column-Level Security
Design a Data Masking Strategy
Design Access Control for Azure Data Lake Storage Gen2
Implement Data Security
Implement a Data Auditing Strategy
Manage Sensitive Information
Implement a Data Retention Policy
Encrypt Data at Rest and in Motion
Implement Row-Level and Column-Level Security
Implement Data Masking
Manage Identities, Keys, and Secrets Across Different Data Platform Technologies
Implement Access Control for Azure Data Lake Storage Gen2
Implement Secure Endpoints (Private and Public)
Implement Resource Tokens in Azure Databricks
Load a DataFrame with Sensitive Information
Write Encrypted Data to Tables or Parquet Files
Develop a Batch Processing Solution
Browse and Search Metadata in Microsoft Purview Data Catalog
Push New or Updated Data Lineage to Microsoft Purview
Chapter 9 Monitoring Azure Data Storage and Processing
Monitoring Data Storage and Data Processing
Implement Logging Used by Azure Monitor
Notes:
Description based on publisher supplied metadata and other sources.
Includes index.
ISBN:
9781119885436
1119885434
9781119885443
1119885442
OCLC:
1392402601
