DP-203: Microsoft Azure Data Engineering

About Course
Are you ready to take your career to the next level in Azure Data Engineering?
Look no further! This comprehensive Udemy course is your ultimate guide to preparing for the DP-203: Microsoft Certified Azure Data Engineer Associate exam.
In today’s data-driven world, the demand for skilled Azure Data Engineers is higher than ever. Organizations seek professionals who can design, implement, and manage data solutions using Microsoft Azure services. Achieving the DP-203 certification is a testament to your expertise in the field, making you a valuable asset in the industry.
What You’ll Learn:
- Master Azure Data Services: Dive deep into Azure Data Factory, Azure Databricks, Azure HDInsight, and other essential Azure data services. Understand how these services can be leveraged to build data solutions.
- Data Storage and Processing: Learn how to store and process data effectively using Azure SQL Data Warehouse, Azure Cosmos DB, and Azure Data Lake Storage.
- Data Integration: Explore data integration techniques and best practices using Azure Data Factory, Azure Logic Apps, and more.
- Data Orchestration: Gain expertise in orchestrating data workflows, data movement, and data transformation.
- Data Security and Compliance: Understand how to secure your data solutions and ensure compliance with industry standards and regulations.
- Monitoring and Optimization: Learn how to monitor the performance of your data solutions and optimize them for efficiency.
- Real-world Scenarios: Get hands-on experience with real-world scenarios and practical exercises that simulate the exam environment.
- Exam Preparation: Receive expert guidance on the DP-203 exam structure, question types, and test-taking strategies.
Regards
Vishmita Data Labs
What Will You Learn?
- Pass the Microsoft Azure DP-203 Data Engineering exam
- Learn Basic Data Engineering & Database Concepts
- Learn Various Azure Storage Products like Blob, Queue, File, and Table Storage
- Build Structured Data Solutions with the Various SQL Database Options
- Store Semi-structured Data with Cosmos DB
- Develop ETL/ELT Solutions with Azure Data Factory
- Analyze Data with a Data Warehousing System – Synapse Analytics
- Synapse SQL in Azure Synapse Analytics
- Azure Data Factory Control Flow Activities
- Apache Spark in Azure Synapse Analytics
- Azure Databricks
- Delta Lake & Data warehouse in Azure Databricks
- Azure Stream Analytics
Course Content
Basics of Data
- Structured vs Unstructured vs Semi-Structured Data
- Batch vs Streaming Data
- OLTP vs OLAP
- Data Lake vs Data Warehouse
- Section Quiz
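The first lecture’s distinction can be made concrete with a small sketch (plain Python; the record fields are illustrative, not from the course): the same kind of data as a structured CSV row versus a semi-structured JSON document.

```python
import csv
import io
import json

# Structured data: a fixed schema where every record has the same columns
# (the shape relational databases and CSV files expect).
structured = io.StringIO()
writer = csv.writer(structured)
writer.writerow(["customer_id", "name", "city"])
writer.writerow([1, "Alice", "Pune"])

# Semi-structured data: self-describing records whose fields can vary and nest
# (the shape JSON documents in Cosmos DB or a data lake take).
semi_structured = json.dumps({
    "customer_id": 2,
    "name": "Bob",
    "orders": [{"sku": "A1", "qty": 3}],  # nesting a flat table cannot hold directly
})

record = json.loads(semi_structured)
print(record["orders"][0]["qty"])  # nested fields are addressed by key: prints 3
```

Unstructured data (images, free text, video) has neither shape and is typically parked in Blob or Data Lake storage until something downstream interprets it.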
Azure Basic Services
- Introduction to Cloud Computing
- Introduction to Azure
- Create an Azure Free Account
- Azure Portal Walkthrough
- Managed vs Unmanaged Services
- Resource Groups, Management Groups, and Subscriptions
- Tagging
- Access Control
- Set a Budget
- Create Azure Resource Groups
- Create Azure Resources
- Delete Resources
Azure Storage
- Different Services for Azure Storage
- Azure Blob Storage
- Azure Queue
- Azure File Share
- Azure Disk Storage
Azure SQL Database
- Introduction to Azure SQL Database
- Azure SQL IaaS & PaaS Offerings
- Different PaaS Deployment Options
- SQL Pricing Models & Service Tiers
- Azure SQL Server in a Virtual Machine (IaaS)
- Azure SQL Database Backup and Restore
- Azure SQL Database Scaling
- Azure SQL Database Security Options
- Azure SQL Managed Instance Advanced Security
- Encrypting Data at Rest and in Motion
- Dynamic Data Masking
- High Availability vs Disaster Recovery
- RTO vs RPO
- Azure SQL Database High Availability and Disaster Recovery Options
- Azure SQL Database vs Azure Data Warehouse
Transact SQL (T-SQL)
- Introduction
- Lab – Installing Azure Data Studio
- Lab – T-SQL – CREATE TABLE Command
- Lab – T-SQL – SELECT Clause
- Lab – T-SQL – WHERE Clause
- Lab – T-SQL – ORDER BY Clause
- Lab – T-SQL – Aggregate Functions
- Lab – T-SQL – GROUP BY Clause
- Lab – Using PARTITION BY
- Lab – LEAD and LAG Functions
- Lab – WITH Clause
- Lab – T-SQL – FOREIGN KEY Constraints
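As a taste of the LEAD/LAG labs, here is a runnable sketch using Python’s built-in SQLite (version 3.25+ supports window functions with the same syntax T-SQL uses for these functions); the table and data are illustrative, not from the course labs.

```python
import sqlite3

# LAG/LEAD read the previous/next row's value within a window, ordered by a key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (month INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 100), (2, 150), (3, 120)])

rows = conn.execute("""
    SELECT month,
           amount,
           LAG(amount)  OVER (ORDER BY month) AS prev_amount,
           LEAD(amount) OVER (ORDER BY month) AS next_amount
    FROM sales
    ORDER BY month
""").fetchall()

for row in rows:
    print(row)
# The first row has no previous value, so prev_amount is NULL (None in Python),
# and the last row has no next value, so next_amount is NULL.
```

The same query runs unchanged against Azure SQL Database or a Synapse SQL pool from Azure Data Studio.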
Azure Synapse Analytics
- Introduction to Data Warehouses
- Traditional vs Modern Warehouse Architecture
- What is the Synapse Analytics Service?
- Azure Synapse Benefits
- Azure Synapse MPP Architecture
- Lab – Let’s Create an Azure Synapse Workspace
- Demo: Explore Synapse Studio V2
- Demo: Monitor Synapse Analytics Studio
- Storage and Sharding Patterns
- Data Distribution and Distribution Keys
- About the Serverless SQL Pool
- Lab – Using External Tables – CSV – Part 1
- Lab – Using External Tables – CSV – Part 2
- Lab – External Tables – Parquet File
- Lab – External Tables – Multiple Parquet Files
- Lab – OPENROWSET – JSON Files
- The Dedicated SQL Pool
- Lab – Creating a SQL Pool
- Lab – SQL Pool – External Tables – CSV
- Lab – SQL Pool – External Tables – Parquet
- Lab – External Tables – Hidden Files and Folders
- Pausing the SQL Pool
- Lab – Loading Data into a SQL Pool Using PolyBase
- Lab – Loading Data into a Table – COPY Command – CSV
- Lab – Loading Data into a Table – COPY Command – Parquet
- Lab – Loading Data – Pipelines – Storage Accounts
- Lab – Loading Data – Pipelines – Azure SQL Database
- Designing a Data Warehouse
- Fact and Dimension Tables
- Lab – Building a Fact Table
- Lab – Building a Dimension Table
- Lab – Transferring Data to Our SQL Pool
- Lab – Using Power BI for a Star Schema
- Understanding Table Types
- Lab – Creating Hash-Distributed Tables
- Lab – Creating Replicated Tables
- Lab – Surrogate Keys for Dimension Tables
- Facts as Hash-Distributed, Dimensions as Replicated
- Slowly Changing Dimensions
- Indexes in Azure Synapse
- Which Load Method to Use
- Partitioning in Synapse Analytics
- Lab – Creating a Table with Partitions
- Lab – Switching Partitions
- Lab – CASE Statement
- Analyze Data Using an Apache Spark Notebook
- Demo: Analyze Data Using the Serverless SQL Pool
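One way to picture the hash-distribution topics above: a dedicated SQL pool always spreads data across 60 distributions, and a hash-distributed table assigns each row by a deterministic hash of its distribution column. The sketch below imitates that idea in plain Python; the hash function is a stand-in, since Synapse’s internal algorithm isn’t exposed.

```python
import hashlib

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool always uses 60 distributions

def distribution_for(key: str) -> int:
    """Illustrative stand-in for the engine's hash: a deterministic hash of the
    distribution column value, mapped to one of the 60 distributions."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_DISTRIBUTIONS

# The same key always lands in the same distribution, which is why joins and
# aggregations on the distribution key avoid data movement between nodes.
assert distribution_for("customer_42") == distribution_for("customer_42")

counts = {}
for i in range(10_000):
    d = distribution_for(f"customer_{i}")
    counts[d] = counts.get(d, 0) + 1
print(len(counts))  # a high-cardinality key spreads rows over all 60 distributions
```

This is also why a low-cardinality or skewed distribution key is a poor choice: a few distributions end up holding most of the rows.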
Security layers in Azure Synapse Service
- Introduction
- Advanced Data Security
- Auditing
- Network Security
- Transparent Data Encryption
- Dynamic Data Masking
- Access Management
Azure Data Lake
- What is a Data Lake, and What Problems Does It Solve?
- Blob Storage vs Data Lake
- Hierarchical Namespace
- Demo – Create an Azure Data Lake Gen2 Account
- Demo – On-Premises to Data Lake Gen2 Using the Portal and Storage Explorer
- Demo – On-Premises to Data Lake Gen2 Using AzCopy
- Demo – Azure Blob to Data Lake Gen2 Using Data Factory
- Demo – Azure SQL Database to Data Lake Gen2 Using Data Factory
- Data Flow Around the Data Lake
- Data Lakes and Transient Clusters
- Data Processing Using HDInsight
Data Lake Security Layers
- Introduction
- Storage Access Keys
- SAS – Shared Access Signatures
- Microsoft Entra ID (formerly Azure Active Directory)
- Role-Based Access Control (RBAC)
- Access Control Lists (ACLs)
- Firewalls and Virtual Networks
- Encryption in Transit
- Encryption at Rest
- Advanced Threat Protection
Azure Cosmos DB
- Introduction to NoSQL
- SQL vs NoSQL
- Introduction to Azure Cosmos DB
- Components of Azure Cosmos DB for NoSQL
- Cosmos DB Features
- Cosmos DB – Multi-Model: the 5 APIs
- APIs in Azure Cosmos DB
- Azure Table Storage vs Cosmos DB Table API
- Provision a Cosmos DB Account
- Cosmos DB – Databases, Containers, and Items
- Cosmos DB – Throughput and Request Units
- Cosmos DB – Horizontal Scalability
- Cosmos DB – Partitions and the Partition Key
- Cosmos DB – Dedicated vs Shared Throughput
- Cosmos DB – Avoiding Hot Partitions
- Cosmos DB – Single-Partition vs Cross-Partition Queries
- Cosmos DB – Composite Keys
- Cosmos DB – Partition Key Best Practices
- Cosmos DB – Automatic Indexing
- Demo – Insert and Query Data in Your Cosmos DB
- Cosmos DB – Time to Live Feature
- Cosmos DB – Global Distribution Feature
- Cosmos DB – Multi-Master Feature
- Cosmos DB – Manual vs Automatic Failover
- Cosmos DB – The 5 Consistency Levels
- Cosmos DB – Azure CLI
- Cosmos DB – Pricing
- Cosmos DB – Monitoring Through Azure Monitor
- Cosmos DB – Monitoring Through the Cosmos DB Portal
- Cosmos DB – Security
- Cosmos DB – High Availability and Disaster Recovery Options
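The composite-key and hot-partition lectures come down to one pattern: when no single property spreads load well, build a synthetic partition key by concatenating two of them. A plain-Python sketch (the property names are illustrative, not from a specific course demo):

```python
# A synthetic (composite) partition key combines two properties so that one
# busy value of the first property does not become a single hot partition.
def synthetic_partition_key(tenant_id: str, month: str) -> str:
    return f"{tenant_id}-{month}"

docs = [
    {"id": "1", "tenant_id": "contoso", "month": "2024-01"},
    {"id": "2", "tenant_id": "contoso", "month": "2024-02"},
    {"id": "3", "tenant_id": "fabrikam", "month": "2024-01"},
]
for d in docs:
    # Stored as its own property; the container is then partitioned on it.
    d["partitionKey"] = synthetic_partition_key(d["tenant_id"], d["month"])

# One tenant's writes now spread across multiple logical partitions instead of
# all landing in a single "tenant_id" partition.
print(sorted({d["partitionKey"] for d in docs}))
```

The trade-off is that queries for a whole tenant become cross-partition, so the composite should match how the data is actually read.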
Azure Databricks
- Spark Basics
- Why Databricks Evolved
- Introduction to Azure Databricks
- Databricks Architecture
- How to Save on Databricks Demo Costs
- Azure Databricks Clusters
- Lab: Azure Databricks Workspace Creation
- Lab: Provision Clusters and Notebooks
- Lab: Create a Databricks Community Edition Account
- Lab: Magic Commands
- Lab: Databricks Utilities (dbutils)
- DBFS
- Lab: Create a DataFrame
- Lab: Read a CSV File
- Lab: Read a Text File
- Lab: Read a JSON File
- Lab: Read a Parquet File
- Lab: Write to a Parquet File
- Introduction to Spark SQL
- Lab: Running SQL on DataFrames
- Lab: Views in Spark SQL
- Hive Metastore
- Lab: Create Databases
- Lab: Managed Tables
- Lab: Unmanaged Tables
- Delta Tables
- Lab: Write to a Delta Table
Transformation in ADB
- Select Columns
- Add a New Column
- Rename a Column
- Calculated Columns
- Drop Columns
- Sort Columns
- Manual Schema
- Read a CSV File Using a Manual Schema
- Changing Data Types (Type Casting)
- Math Functions
- Date Functions
- String Functions
- Sort Functions
- UNION
- JOIN
- Broadcast Join
- Filter
- Grouping
- repartition()
- coalesce()
- Salting
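Salting, the last item above, splits a skewed key across several buckets and then re-aggregates. A plain-Python simulation of the two-step aggregation (keys and values are illustrative; in Databricks the same idea is applied to DataFrame columns so one hot key stops overloading a single partition):

```python
import random

random.seed(0)  # deterministic for the sketch

SALT_BUCKETS = 4  # how many pieces the hot key is split into

def salt_key(key: str) -> str:
    # Append a random suffix so "US" becomes "US_0" .. "US_3".
    return f"{key}_{random.randrange(SALT_BUCKETS)}"

rows = [("US", 1)] * 8 + [("NL", 1)] * 2  # heavily skewed toward "US"
salted = [(salt_key(k), v) for k, v in rows]

# Step 1: partial aggregation on the salted key spreads the hot key's work.
partial = {}
for k, v in salted:
    partial[k] = partial.get(k, 0) + v

# Step 2: strip the salt and aggregate again to recombine the pieces.
final = {}
for k, v in partial.items():
    original = k.rsplit("_", 1)[0]
    final[original] = final.get(original, 0) + v

print(final)  # same totals as without salting, but the skew was spread out
```

The extra shuffle in step 2 is the cost; salting only pays off when the skew is bad enough to stall a single task.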
Connect ADB to ADLS Gen2 using Azure credentials
- Access ADLS Using Access Keys
- Access ADLS Using a SAS Token
- Access ADLS Using a Service Principal
- Access ADLS Using Azure Key Vault
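For the service-principal path, a Databricks notebook typically sets the ABFS OAuth options on the Spark conf. The sketch below only builds the key/value pairs; the account, client, and tenant values are placeholders, and in a real notebook each pair is applied with spark.conf.set while the secret is fetched from Key Vault via dbutils.secrets.get rather than written inline.

```python
# Placeholders, not real values.
storage_account = "mystorageacct"
tenant_id = "<tenant-id>"
suffix = f"{storage_account}.dfs.core.windows.net"

# The Hadoop ABFS driver's OAuth (client-credentials) settings for ADLS Gen2.
adls_oauth_conf = {
    f"fs.azure.account.auth.type.{suffix}": "OAuth",
    f"fs.azure.account.oauth.provider.type.{suffix}":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    f"fs.azure.account.oauth2.client.id.{suffix}": "<application-client-id>",
    f"fs.azure.account.oauth2.client.secret.{suffix}": "<client-secret>",
    f"fs.azure.account.oauth2.client.endpoint.{suffix}":
        f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
}

# In a notebook: for k, v in adls_oauth_conf.items(): spark.conf.set(k, v)
print(len(adls_oauth_conf))
```

With these set, paths like `abfss://<container>@mystorageacct.dfs.core.windows.net/...` resolve using the service principal’s RBAC and ACL permissions.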
Azure Data Factory (ADF)
- What is Azure Data Factory?
- Costing Aspects of Azure Data Factory
- Lab: Provision an Azure Data Factory Instance
- Data Factory Components
- Data Factory – Pipelines and Activities
- Types of Activities
- Data Factory – Linked Services and Datasets
- Data Factory – Integration Runtimes
- Lab: Create SSIR
- Copy Data Activity in ADF
- Lab: Create Your First Pipeline
- Debug Your ADF Pipeline
- Lab: Copy Data from GitHub to ADLS
- Lab: Copy from a REST API to ADLS
- Introduction to Parameterization
- Parameterize Linked Services in Azure Data Factory
- Parameterize Datasets in Azure Data Factory
- Parameterize Pipelines in Azure Data Factory
- System Variables in ADF
- Connectors in ADF
- Supported File Formats in ADF
- User Properties in ADF
- Lab: Copy from an Azure SQL Database
- Get Metadata Activity
- Delete Activity
- Fail Activity
- Set Variable Activity
- Append Variable Activity
- Execute Activity
- Deactivate Activity
- Lookup Activity
- ForEach Activity
- If Condition Activity
- Monitoring in Azure Data Factory
- Mini Project
- Execute Pipeline Activity
- Notebook Activity
- Web Activity
- Triggers in ADF
- Data Flow Concepts
- Mapping Data Flows
- Wrangling Data Flows
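Tying the Lookup and ForEach activities together: ADF stores a pipeline as JSON, and the common Lookup-then-ForEach pattern looks roughly like the structure below, modeled here as a Python dict. Activity names are illustrative, and the authoritative schema is whatever the Data Factory UI exports, not this sketch.

```python
import json

pipeline = {
    "name": "CopyEachFile",
    "properties": {
        "activities": [
            {
                "name": "LookupFileList",
                "type": "Lookup",  # produces the list of items to iterate over
            },
            {
                "name": "ForEachFile",
                "type": "ForEach",
                # Runs only after the Lookup succeeds.
                "dependsOn": [
                    {"activity": "LookupFileList", "dependencyConditions": ["Succeeded"]}
                ],
                "typeProperties": {
                    # ForEach consumes the Lookup output via an expression;
                    # inside the loop, @item() refers to the current element.
                    "items": {
                        "value": "@activity('LookupFileList').output.value",
                        "type": "Expression",
                    },
                    "activities": [{"name": "CopyOneFile", "type": "Copy"}],
                },
            },
        ]
    },
}

print(json.dumps(pipeline["properties"]["activities"][1]["typeProperties"]["items"]))
```

This JSON view is also what Git-enabled factories commit per pipeline, which is why the CI/CD section below works on the repository rather than the live service.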
CI/CD via Azure DevOps
- Introduction to CI/CD
- Lab: Configure Git in Azure Data Factory
- Lab: Create a Feature Branch & an ADF Pipeline
- Lab: Azure Data Factory – Git – Pull Requests
Azure Event Hubs & Stream Analytics
- Introduction
- Lab – Streaming from Azure Event Hubs – Setup
- Lab – Streaming from Azure Event Hubs
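Stream Analytics queries typically aggregate events over windows; a tumbling window slices the stream into fixed, non-overlapping intervals so every event belongs to exactly one window. A plain-Python imitation of that bucketing (timestamps and window size are illustrative):

```python
WINDOW_SECONDS = 10  # fixed window length, like TumblingWindow(second, 10)

# (event_time_in_seconds, value) pairs standing in for Event Hubs messages.
events = [(3, 1), (7, 1), (12, 1), (19, 1), (21, 1)]

windows = {}
for ts, value in events:
    # Each event maps to exactly one window, keyed by the window's start time.
    window_start = (ts // WINDOW_SECONDS) * WINDOW_SECONDS
    windows[window_start] = windows.get(window_start, 0) + value

print(windows)  # {0: 2, 10: 2, 20: 1}
```

Hopping and sliding windows relax the non-overlap property, so one event can then count toward several windows.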
Projects
- IoT Hub: Streaming Data Project
- Retail Chain Analytics: End-to-End (E2E) Project Using CI/CD