Starts 13 June, 2026 · Cohort 1 · Live Data Engineering Bootcamp for Analysts

Bootcamp built to take you from Data Analyst to AI-Enabled Data Engineer

Become the end-to-end data person on your team.

An 8-week live cohort that takes you from dashboards to the full stack. Model the warehouse, build the pipeline, ship to production.

★ Inner Circle · Until 9 June, 2026
US$630 (standard price US$840)
A private feedback session with the core team to refine the curriculum before Cohort 1 begins. Price lowered in return for your input.
Standard · From 10 June, 2026
US$840
Cohort begins Sat, 13 June, 2026

Save US$210 · Enroll on or before 9 June, 2026


The Inner Circle

Have a say in what Cohort 1 becomes.

Price lowered in return for your input.

A few days before the cohort begins, the core team will host a private feedback session with Inner Circle members. You bring the stacks your team is moving to, the analyst pain you're tired of, the engineering gaps you want filled. We listen. Then we shape the deep-dives, migration labs, and capstone directions around what you asked for.

This room exists once. After 9 June, 2026, the Inner Circle closes.

That's the trade. Your time and input now, in exchange for a price US$210 lower and a curriculum shaped by what you asked for.

This isn't a webinar. It's the feedback session that makes Cohort 1 yours.
Refine the curriculum
A private feedback session with the core team. You bring real analyst-to-engineer pain; we shape the deep-dives, migration labs, and capstone tracks around it.
A price benefit for your input
US$630 instead of US$840. Not a discount. Your time and feedback before Day 1 are what earn the lower price.
Lock in the Inner Circle price
US$630 secured before standard pricing opens on 10 June, 2026. No waitlist, no uncertainty, no chance of paying more later.
Claim your Inner Circle price
Save US$210 · Closes 9 June, 2026


The Promise

8 weeks from now, you'll have built and shipped a production data pipeline.

Not a tutorial. Not a notebook. A working end-to-end pipeline on the platform of your choice: defended in front of mentors, published on your GitHub, and ready to talk through confidently in interviews.

16 Live Sessions · 8 Capstone Layers · 3 Platforms Taught
Weeks 1-2 · 4 sessions

Foundations

Advanced SQL engineering, data modeling, production Python, OOP for pipelines.

Weeks 3-4 · 4 sessions

Platforms & Cloud

PySpark, Delta Lake, Azure stack, Microsoft Fabric. The platforms and engines job ads keep asking for.

Weeks 5-6 · 4 sessions

Transformation & Orchestration

dbt Core & Cloud, Airflow, enterprise pipelines, semantic layers, data quality.

Weeks 7-8 · 4 sessions

Production & Launch

Streaming, CI/CD, capstone build, system design and interview prep.


Who This Is For

Built for working analysts ready to own both the dashboard and the pipeline.

SQL fluency assumed. No deep engineering background needed. Just analyst experience, the right work ethic, and the willingness to ship.

The Working Data Analyst

1-4 years building dashboards in Power BI, Tableau, or Looker. You see "dbt," "Databricks," and "Microsoft Fabric" in job ads and recognise the words but not the work. You're ready to own the pipeline that feeds your dashboards, not hand it off to someone else.

The DA Bootcamp Alumnus

You completed the Codebasics Data Analytics Bootcamp and you're 12-18 months into your analyst role. Now you want to own the work end-to-end (the warehouse model, the pipeline, and the dashboard on top), instead of waiting on the senior analyst for the parts you don't touch yet.

The BI Developer or Business Analyst

You're already adjacent to the data team, writing SQL and building reports. You want to add the engineering layer, so you're the person who can model, pipe, and present a dataset end-to-end. This cohort takes you there.

Prerequisites

  • 1+ years of working experience as a data analyst, BI developer, business analyst, or equivalent
  • Comfort writing SQL queries (joins, aggregations, basic CTEs)
  • Willingness to ship a production-grade data pipeline by Week 8
This is not a beginner course. We assume SQL fluency and analyst-level data literacy. If you're starting from zero, our Data Analytics Bootcamp is the right first step.

The Framework

Model. Pipeline. Ship.

Most courses teach one layer. We teach all three, and how they connect into the engineering work analysts get hired to do.

01

Model

Engineer the data layer the warehouse actually runs on. Advanced SQL, dimensional modeling, SCD logic, medallion architecture, data contracts. The work that sits underneath every dashboard.

02

Pipeline

Build the pipelines an enterprise team can trust in production. PySpark, Delta Lake, Fabric, dbt, Airflow, CI/CD, streaming. The infrastructure your job ads keep mentioning.

03

Ship

An end-to-end production pipeline you can defend in system design rounds. By Week 8 you have a GitHub repo, an architecture diagram, passing dbt tests, and a 5-minute walkthrough video.


What "AI-Enabled" Means Here

AI is woven into how you build, not bolted on as a topic.

The job market hires engineers who use AI to ship faster, not engineers who studied AI as theory. Through every phase, you learn the AI-assisted workflows working data teams actually use in 2026.

AI-assisted SQL & modelling

Use Copilot, Cursor, and Claude to draft, refactor, and document advanced SQL and dbt models, then own the final logic.

LLM-augmented pipeline debugging

Drive an LLM through stack traces, broken Airflow DAGs, and PySpark errors productively, instead of pasting raw logs and hoping.

AI-driven data quality

Generate dbt tests, schema contracts, and anomaly checks with AI assistance, so your quality layer scales with the pipeline.

Semantic layers & RAG over your warehouse

Make the warehouse queryable in natural language. We cover the architecture and how to build one.

Documentation & data contracts at AI speed

Auto-generated lineage, model documentation, and stakeholder data dictionaries. The unglamorous work that quietly makes you more valuable on the team.

The judgment AI can't replace

When to trust the AI, when to override it, and how to make architecture decisions an LLM can't make for you. The taste layer on top of the tooling.


How It Works

Learn. Practice. Build. Ship.

Designed for working analysts. Two live weekend sessions a week.

STEP 01
Learn

2 live weekend sessions per week · recordings within 24 hours.

STEP 02
Practice

Every session ships one hands-on lab. SQL optimization, PySpark workloads, dbt models, Fabric pipelines: you build each one as we teach it.

STEP 03
Build

From Week 5, every learner builds one end-to-end capstone pipeline integrating eight production layers. Reviewed weekly by mentors.

STEP 04
Ship

Final capstone walkthrough · GitHub repo · architecture diagram · LinkedIn technical post · system design interview rehearsal.


The Curriculum

8 weeks. 16 sessions. One end-to-end pipeline you ship.

Two live weekend sessions per week. Recordings within 24 hours.

Phase 01 · Foundations · Sessions 1-4
Phase 02 · Platforms & Cloud · Sessions 5-8
Phase 03 · Transformation & Orchestration · Sessions 9-12
Phase 04 · Production & Launch · Sessions 13-16
Session 1 - SQL for Modern Data Engineering
Advanced joins & query optimization · window functions deep dive · recursive CTEs · MERGE statements · incremental loading patterns · CDC concepts · query execution plans · warehouse optimization
Hands-on: Optimize enterprise-scale SQL workloads · build incremental transformation logic
Session 2 - Data Modeling & Warehouse Engineering
OLTP vs OLAP · star schema · snowflake schema · fact vs dimension tables · SCD Type 1 & 2 · partitioning strategies · medallion architecture · data contracts
Hands-on: Design retail analytics warehouse · implement SCD Type 2 logic
Session 3 - Production Python for Data Engineers
Modular Python architecture · OOP for pipelines · config-driven frameworks · logging · exception handling · retry mechanisms · environment management · secrets handling
Hands-on: Build a reusable ingestion framework
Session 4 - Advanced Python Data Processing
APIs & ingestion patterns · async processing · parallel execution · file streaming · memory optimization · testing with pytest · packaging basics
Hands-on: Build an API ingestion pipeline
Session 5 - PySpark Deep Dive
Spark architecture · executors & DAGs · lazy evaluation · partitioning · broadcast joins · shuffle optimization · Spark UI analysis · caching strategies
Hands-on: Optimize large-scale Spark workloads
Session 6 - Delta Lake & Lakehouse Engineering
Delta internals · ACID transactions · OPTIMIZE & ZORDER · time travel · schema evolution · Change Data Feed · incremental ETL · Bronze / Silver / Gold architecture
Hands-on: Build a medallion architecture pipeline
Session 7 - Azure Data Engineering Stack
ADLS Gen2 · Event Hubs · Key Vault · managed identities · Integration Runtime · networking basics · Synapse vs Databricks vs Fabric
Hands-on: Build a secure cloud ingestion architecture
Session 8 - Microsoft Fabric Engineering
OneLake · Lakehouse · Warehouse · Fabric Data Factory · Eventstream · Real-Time Intelligence · DirectLake · Fabric governance
Hands-on: End-to-end Fabric implementation
Session 9 - dbt Core Fundamentals
Models · sources · ref() · materializations · snapshots · incremental models · tests · documentation
Hands-on: Build a modular dbt transformation project
Session 10 - Advanced Analytics Engineering
Macros & Jinja · semantic layer · MetricFlow · SQLFluff · lineage · data quality frameworks · governance · reusable transformation patterns
Hands-on: Enterprise dbt framework implementation
Session 11 - Apache Airflow Engineering
DAG architecture · dynamic DAGs · sensors · XCom · scheduling · monitoring · retry patterns · failure handling
Hands-on: Build orchestrated ETL workflows
Session 12 - Enterprise Data Pipelines
Azure Data Factory · Fabric Pipelines · Databricks Workflows · metadata-driven pipelines · config-based orchestration · parameterization · reusable frameworks
Hands-on: Build a metadata-driven orchestration framework
Session 13 - Streaming Data Engineering
Kafka fundamentals · event-driven architecture · Structured Streaming · watermarking · windowing · event-time processing · CDC streaming · Event Hub integration
Hands-on: Real-time streaming pipeline
Session 14 - CI/CD & Reliability Engineering
Git branching strategies · GitHub Actions · automated testing · deployment pipelines · monitoring · freshness checks · cost optimization · incident management
Hands-on: CI/CD pipeline for data engineering workloads
Session 15 - End-to-End Capstone Project
API ingestion · lakehouse architecture · PySpark transformations · dbt modeling · Airflow orchestration · CI/CD · Power BI reporting · monitoring layer
Hands-on: Enterprise-grade end-to-end implementation
Session 16 - Interview Preparation & System Design
SQL interview rounds · PySpark interview questions · data modeling rounds · system design · resume transformation · LinkedIn optimization · mock interviews
Hands-on: Mock interview + architecture discussion sessions
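
To give a flavour of the Session 2 lab, here is a minimal sketch of SCD Type 2 logic in plain Python. It is an illustration only, not the course's lab code: the function name `apply_scd2`, the column names (`customer_id`, `city`, `valid_from`, `valid_to`, `is_current`), and the sample rows are all assumptions made for this example. In a warehouse the same idea would more likely be expressed as a MERGE statement or a dbt snapshot.

```python
from datetime import date


def apply_scd2(dimension, incoming, key, tracked, as_of):
    """Return a new dimension list with SCD Type 2 history applied.

    dimension: list of dict rows carrying valid_from, valid_to, is_current
    incoming:  list of dict rows holding the latest source values
    key:       business-key column name (hypothetical, e.g. "customer_id")
    tracked:   columns whose change should open a new version
    as_of:     effective date for this load
    """
    result = [dict(row) for row in dimension]  # copy so the input is untouched
    current = {row[key]: row for row in result if row["is_current"]}

    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # nothing changed: keep the current version as-is
        if old is not None:
            # Close the old version instead of overwriting it
            # (overwriting in place would be SCD Type 1).
            old["valid_to"] = as_of
            old["is_current"] = False
        version = {key: new[key], **{c: new[c] for c in tracked}}
        version.update(valid_from=as_of, valid_to=None, is_current=True)
        result.append(version)
        current[new[key]] = version
    return result


# Illustrative data: one existing customer who moved, one brand-new customer.
dim = [{"customer_id": 1, "city": "Pune", "valid_from": date(2025, 1, 1),
        "valid_to": None, "is_current": True}]
updates = [{"customer_id": 1, "city": "Mumbai"},
           {"customer_id": 2, "city": "Delhi"}]
history = apply_scd2(dim, updates, "customer_id", ["city"], date(2026, 6, 13))
```

After this load, `history` holds three rows: the closed Pune version, a current Mumbai version, and a current Delhi row. Re-running the same load against `history` adds nothing, which is the property an incremental pipeline relies on.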

Faculty

Taught by founders and operating leaders who build production data infrastructure for a living.

Dhaval Patel
Founder, Codebasics · Ex-Bloomberg, Ex-NVIDIA · 17+ years in data & AI
79K+ trusted learners · 22K+ data analysts trained
Hemanand Vadivel
Co-founder, Codebasics · Ex-Data Analytics Manager, UK & Germany · 10+ years in analytics leadership
Built and led analytics teams across international markets.
Harun
Lead Faculty · 7+ years building enterprise data platforms on Azure & Databricks at Hitachi Solutions
Azure Data Engineer · Databricks Certified Spark Developer · Global Speaker
Naveen
Faculty · Head of Content & Analytics, Codebasics · Microsoft Certified: Fabric Analytics Engineer Associate
Owns the production pipelines and Power BI reports running on Codebasics' own Microsoft Fabric stack.
Kiran
Content Curator & Program Manager · Codebasics
Analytics Engineer at Codebasics.

Tools You'll Work With Hands-On

The modern data engineering stack you'll ship with.

Every tool below is used hands-on across the 8 weeks. You'll leave fluent in the stack data teams actually run on, not just the names of things.

Advanced SQL
Python & pandas
PySpark
Databricks & Delta Lake
Microsoft Fabric
ADLS Gen2
dbt Core & Cloud
Apache Airflow
Azure Data Factory
Apache Kafka
Azure Event Hubs
GitHub Actions
CI/CD
Great Expectations
Power BI

Investment

The earlier you join, the less you pay.

Inner Circle members shape the curriculum and lock in the lower price. Standard pricing opens 10 June, 2026, ahead of the Cohort 1 start.

★ Cohort 1 Enrollment
Live Data Engineering Bootcamp for Analysts
Starts Saturday, 13 June, 2026 · 2 live weekend sessions per week
★ Inner Circle · Until 9 June, 2026
US$630
Standard · From 10 June, 2026
US$840

One-time payment

What's included in both tiers
  • 8-week live bootcamp · 16 live sessions across 4 phases
  • One end-to-end capstone pipeline (8 production layers, from API ingestion to Power BI)
  • Weekly hands-on labs on the platform of your choice (Databricks or Microsoft Fabric)
  • System design & mock interview prep with the core faculty
  • 1 year access to all session recordings
  • Certificate of completion
  • No-questions-asked refund on or before 15 June, 2026
  • Data Engineering Bootcamp 1.0 (worth US$240)

Both tiers receive the same Cohort 1 curriculum, the same 16 live sessions, the same capstone, the same faculty. The only difference: Inner Circle members enroll on or before 9 June, 2026 and join a private feedback session with the core team a few days before the cohort to refine the curriculum, deep-dives, and capstone direction before Day 1. That contribution is what earns the US$210 lower price. It isn't a flash sale. It isn't a downgrade. It's a different deal for people willing to help build Cohort 1, not just consume it.


No-Questions-Asked Refund

Enroll with zero risk. Attend the first two live sessions of the cohort, and if you feel it's not the right fit, request a full refund on or before 15 June, 2026. 100% money back, no questions asked.

Included: Data Engineering Bootcamp (US$240 value)

Every new enrollment includes full access to the Data Engineering Bootcamp 1.0, with Job Assistance, Live Problem Solving & Virtual Internship, at no extra cost. You get both bootcamps for the price of one.

Already own the Data Engineering Bootcamp?

You only pay the difference. Your investment of US$240 is fully adjusted, so at the Inner Circle price you pay just US$390 for this bootcamp (US$630 minus US$240). Get your adjusted pricing →


Questions

Frequently Asked.

Saturday, 13 June 2026, a few days after the Inner Circle session.
You are securing your seat now. Full bootcamp access opens on 13 June 2026.
The Inner Circle is early enrollment, open until 9 June 2026. Inner Circle members enroll at a reduced price and get a dedicated live session a few days before the bootcamp launches on 13 June to help shape the curriculum through their feedback. You are not just enrolling early; you are influencing what gets built.
Yes. Every enrollment includes full access to the Data Engineering Bootcamp 1.0 at no extra cost. It includes Job Assistance, Live Problem Solving, and a Virtual Internship. You get both for the price of one.
Saturdays and Sundays, 4 to 7 PM IST. Sessions are fully live and interactive with hands-on labs and real-time Q&A. Recordings are available for revision.
All live sessions are recorded and available within 24 hours. You can catch up at your own pace, though live attendance is strongly recommended as the labs and discussions are where most of the real learning happens.
No. You keep access to all session recordings for 1 year from the bootcamp start date.
All Inner Circle members join a live session with the core team a few days before 13 June 2026. You bring real analyst-to-engineer problems, the stacks your team is moving to, and the gaps you want filled. What you share directly shapes the deep-dives, labs, and capstone tracks. The exact date is communicated to Inner Circle members after enrollment.
No. This bootcamp is built for working data analysts who want to cross into data engineering. If you have at least 1 year of analyst experience and are comfortable with SQL, you are ready.
We strongly advise against it. This bootcamp moves fast and assumes analyst-level SQL fluency and data literacy. If you are starting from zero, the Codebasics Data Analytics Bootcamp is the right first step. Build that foundation and come back.
Working data analysts, BI developers, and business analysts with 1 to 4 years of experience who want to own the full data stack, not just the dashboard layer. If you already work as a data engineer, this bootcamp is likely below your current level.
Every enrolled learner gets access to the Discord community where you can ask questions, connect with fellow learners, share progress, and learn from each other throughout the bootcamp. The mentor team also provides weekly hands-on lab support.
The Data Engineering Bootcamp 1.0 included with your enrollment has dedicated job assistance. The bootcamp itself focuses on building your skills, shipping a production-grade capstone on GitHub, and preparing you for system design interviews with the core faculty.
You get back the amount you actually paid for this bootcamp. Your original purchase stays intact and you keep access to it.
No. Once your existing purchase is applied as a subsidy to reduce your price, that original purchase becomes non-refundable.
Full refund, no questions asked, if you request it on or before 15 June 2026. That is after the first two live sessions (13 and 14 June), so you can see exactly how the bootcamp runs before deciding.
Inner Circle enrollment is open until 9 June 2026. After that, Standard pricing opens from 10 June 2026 until the bootcamp starts on 13 June 2026.
The amount you paid for the Data Engineering Bootcamp 1.0 is fully adjusted and deducted from your enrollment fee.
The amount you paid for those individual courses is deducted from your enrollment fee.
Please check them here: https://codebasics.io/courses/bootcamp/5/Welcome-to-The-Data-Engineering-Bootcamp-Experience/lecture/4359

The analysts who own the full data stack become the most indispensable people on their teams.

Join Inner Circle - US$630 →

Inner Circle closes 9 June, 2026 · Cohort starts Sat, 13 June, 2026
