Mastering Azure Data Factory: ETL & Data Engineering for Beginners
Categories: Microsoft Azure
About Course
In this course, you will work primarily with Azure Data Factory on Microsoft Azure and Microsoft Fabric, alongside other services such as Azure Blob Storage, Azure Data Lake Storage Gen2, and Azure SQL Database.
The course is packed with lectures, code-along videos and a dedicated course project. As an added benefit you will also have lifetime access to all of the lectures…
Outcome of This Course
By the end, students will:
- Understand data engineering and ETL concepts deeply
- Be confident in building, monitoring, and managing ADF pipelines
- Have multiple hands-on projects in their portfolio
- Be ready for entry-level data engineering interviews
What Will You Learn?
- You will learn how to build a real-world data pipeline in Azure Data Factory (ADF).
- You will acquire good Data Engineering skills in Azure using Azure Data Factory (ADF), Azure Data Lake Storage Gen2, Azure SQL Database and Azure Monitor.
- You will learn how to ingest data from sources such as HTTP and Azure Blob Storage into Azure Data Lake Storage Gen2 using Azure Data Factory (ADF).
- You will learn how to transform data using Data Flows in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2.
- You will learn how to transform data using Databricks Notebook Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2.
- You will learn how to transform data using Azure HDInsight Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2.
- You will learn how to load transformed data from Azure Data Lake Storage Gen2 to Azure SQL Database using Azure Data Factory (ADF).
- You will learn extensively about Triggers in Azure Data Factory (ADF) and how to use them to schedule the data pipelines.
- You will learn how to monitor pipelines using Azure Data Factory (ADF), Azure Monitor and Log Analytics with a real-world project.
- You will learn how to build production-ready pipelines, following good practices and naming standards.
- You will learn the Azure Data Factory topics required to pass the Azure Data Engineer Associate certification exam (DP-203).
- You will learn how to create CI/CD pipelines in Azure DevOps to release ADF pipelines to higher environments (Testing/Production).
Course Content
Welcome & Course Introduction
- Welcome to the Course
- What You Will Learn
- Tools You’ll Need
- How to Make the Most Out of This Course
Introduction to Data Engineering & ETL Concepts
Objective: Build strong foundational knowledge
- What is Data Engineering?
- What is ETL, ELT, and Data Pipelines?
- OLTP vs OLAP
- Data Warehousing Concepts
- What is a Data Lake and Data Lakehouse?
- Understanding Schema, Tables, Rows & Columns
- Introduction to Star Schema and Snowflake Schema
- Fact Tables vs Dimension Tables (with examples)
- Understanding Slowly Changing Dimensions
Introduction to Azure Ecosystem
Objective: Provide context around where ADF fits
- What is Azure? Overview of Azure Services
- What is Azure Data Factory?
- ADF in the Modern Data Architecture
- Key Azure Services You Should Know
- Setting Up an Azure Free Account
Azure Data Factory Basics
Objective: Hands-on ADF usage
- ADF Architecture Overview
- Understanding Linked Services, Datasets, Pipelines
- ADF UI Walkthrough
- Integration Runtime (IR): Self-hosted vs Azure IR
- Creating Your First Pipeline
Source & Sink Integrations
Objective: Work with real data sources
- Lab: Connecting to Azure Blob Storage
- Lab: Connecting to Azure Data Lake Storage Gen2
- Lab: Connecting to Azure SQL Database
- Lab: Connecting to On-Premises SQL Server
- Lab: Connecting to REST API
- Lab: Connecting to GitHub
Transformation Activities
Objective: Build simple to complex pipelines
- Lab: Lookup Activity
- Lab: ForEach Activity
- Lab: Until Activity
- Lab: Filter Activity
- Lab: If Condition Activity
- Lab: Switch Activity
- Lab: Get Metadata Activity
- Lab: Delete Activity
- Lab: Set Variable Activity
- Lab: Append Variable Activity
- Lab: Execute Activity
- Lab: Notebook Activity
- Lab: Web Activity
- Lab: Fail Activity
- Data Flow vs Control Flow
- Mapping Data Flows Overview
- Lab: Creating Your First Data Flow
- Lab: Data Flow Debugging
Real-time Scenarios with Data Flows
Objective: Work on realistic data transformation scenarios
- Lab: Data Cleaning & Filtering
- Lab: Join, Lookup, Aggregate, Derived Column
- Lab: Conditional Split
- Lab: Surrogate Key Generation
- Lab: Slowly Changing Dimension (SCD)
- Incremental Load Strategy in ADF
- Lab: Incremental Load
Parameterization & Reusability
Objective: Make pipelines dynamic
- Parameters in Pipelines and Datasets
- Variables and Expressions
- Dynamic Content & Functions
- Using Config Files (JSON-driven pipelines)
- Reusable Pipelines & Templates
Monitoring, Triggers & Error Handling
Objective: Production-ready practices
- Lab: Monitoring Pipelines & Data Flows
- Lab: Alerts and Logs
- Lab: Retry Policies and Error Handling
- Lab: Debugging Failures
- Lab: ADF Triggers & Their Types
- Lab: Creating Event-Driven Pipelines
DevOps & CI/CD with ADF
Objective: Bring ADF to enterprise level
- Introduction to DevOps for Data Engineers
- Lab: Source Control with Git Integration in ADF
- Lab: Branching Strategy for ADF
- Lab: Creating ARM Templates for Deployment
- Lab: CI/CD Pipeline using Azure DevOps for ADF
- Lab: Environment-based Deployment
Capstone Projects
Objective: Apply all the knowledge in real-life scenarios
- Project 1: Retail Sales Data Pipeline
  - Load CSV files from Blob Storage
  - Clean & Transform Sales and Product Data
  - Join with Dimension Tables
  - Load into Azure SQL DB
  - Create Fact & Dimension Tables
- Project 2: Incremental Data Load for HR System
  - Use SCD Type 1 & 2 on Employee Data
  - Maintain history in Data Warehouse
  - Use Mapping Data Flows + Lookup
- Project 3: Event-based Pipeline using Azure Event Grid
  - Trigger pipeline when a new file lands in Blob
  - Process JSON data and push to SQL
- Project 4: ETL from REST API to Azure SQL
  - Pull data from REST API (e.g., COVID data or dummy API)
  - Transform and store in SQL
  - Parameterize API calls
- Project 5: CI/CD Pipeline for ADF
  - Integrate ADF with GitHub/Azure DevOps
  - Automate deployment across environments
Career Guidance & Next Steps
Objective: Help students take the next step
- Resume Tips for Data Engineering Roles
- Common Interview Questions (ADF, ETL, SQL)
- How to Practice ADF Further
- Learning Roadmap after ADF: Synapse, Databricks, Fabric
- Final Course Recap