All times in Coordinated Universal Time (UTC).

Wednesday, October 15, 2025

14:00 - 16:00. Online
Introducing Apache Airflow® 3 – The Next Evolution in Orchestration
By Amogh Desai, Ash Berlin-Taylor, Brent Bovenzi, Bugra Ozturk, Daniel Standish, Jed Cunningham, Jens Scheffler, Kaxil Naik, Pierre Jeambrun, Tzu-ping Chung, Vikram Koka, Vincent Beck & Constance Martineau
Track: Keynote

Apache Airflow® 3 is here, bringing major improvements to data orchestration. In this keynote, core Airflow contributors will walk through key enhancements that boost flexibility, efficiency, and user experience.

Vikram Koka will kick things off with an overview of Airflow 3, followed by deep dives into DAG versioning (Jed Cunningham), enhanced backfilling (Daniel Standish), and a modernized UI (Brent Bovenzi & Pierre Jeambrun).

Next, Ash Berlin-Taylor, Kaxil Naik, and Amogh Desai will introduce the Task Execution Interface and Task SDK, enabling tasks in any environment and language. Jens Scheffler will showcase the Edge Executor, while Tzu-ping Chung and Vincent Beck will demo event-driven scheduling and data assets. Finally, Buğra Öztürk will unveil CLI enhancements for automation and debugging.

This keynote sets the stage for Airflow 3—don’t miss the chance to learn from the experts shaping the future of workflow orchestration!
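
As a taste of the event-driven scheduling and data assets the keynote covers, here is a minimal, hedged sketch of an asset-triggered DAG. It assumes the Airflow 3 Task SDK import path (airflow.sdk); the asset URI and DAG/task names are hypothetical.

    # Illustrative sketch only (not from the keynote): a DAG scheduled on a data asset.
    # Assumes the Airflow 3 Task SDK; the S3 URI and DAG/task names are hypothetical.
    from airflow.sdk import Asset, dag, task

    raw_orders = Asset("s3://example-bucket/raw/orders.parquet")  # hypothetical upstream asset

    @dag(schedule=[raw_orders])  # runs whenever a producing task updates the asset
    def orders_report_refresh():
        @task
        def rebuild_report():
            print("rebuilding report from the updated orders asset")

        rebuild_report()

    orders_report_refresh()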

16:00 - 16:30. Online
Securing Airflow CLI with API
By Bugra Ozturk
Track: Airflow 3

This talk will explore the key changes introduced by AIP-81, focusing on security enhancements and user experience improvements across the entire software development lifecycle.

We will break down the technical advancements from both a security and usability perspective, addressing key questions for Apache Airflow users of all levels. Topics include, but are not limited to: isolating CLI communication by leveraging Role-Based Access Control (RBAC) within the API for secure database interactions, clearly defining local vs. remote command execution, and planned future improvements.
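
For orientation, a minimal sketch of the kind of API-backed call that replaces direct database access. It assumes the stable REST API path and uses a placeholder host, token, and DAG id; it is not taken from AIP-81 itself.

    # Illustrative only: triggering a DAG run through the REST API instead of touching
    # the metadata database directly. Host, token, and dag_id are placeholders; the
    # /api/v1 path assumes the stable REST API and a token-based auth backend.
    import requests

    AIRFLOW_HOST = "https://airflow.example.com"  # placeholder
    TOKEN = "<access-token>"                      # placeholder credential

    resp = requests.post(
        f"{AIRFLOW_HOST}/api/v1/dags/example_dag/dagRuns",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={},  # server assigns the run id and logical date
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json()["dag_run_id"])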

16:30 - 17:00. Online
Enterprise Auditing: "The Verifiable Data Pipeline"
By Rafal Biegacz & Piotr Wieczorek
Track: Sponsored

In this session we will dive deep into leveraging the robust logging and audit capabilities of Google Cloud Platform, Cloud Composer and Apache Airflow to establish a fully transparent and verifiable data orchestration layer.

We’ll demonstrate how to track and attribute every change—from environment configuration to individual task execution—essential for meeting stringent enterprise governance, compliance, and auditing requirements.

17:00 - 17:30. Online
Orchestrator of Orchestrators: Uniting Airflow Pipelines with Business Applications in Production
By Basil Faruqui
Track: Sponsored

Airflow powers thousands of data and ML pipelines—but in the enterprise, these pipelines often need to interact with business-critical systems like ERPs, CRMs, and core banking platforms.

In this demo-driven session we will connect Airflow with Control-M from BMC and showcase how Airflow can participate in end-to-end workflows that span not just data platforms but also transactional business applications.

Session highlights:

  • Trigger Airflow DAGs based on business events (e.g., invoice approvals, trade settlements)
  • Feed Airflow pipeline outputs into ERP systems (e.g., SAP) or CRMs (e.g., Salesforce)
  • Orchestrate multi-platform workflows from cloud to mainframe with SLA enforcement, dependency management, and centralized control
  • Provide unified monitoring and auditing across data and application layers

17:30 - 18:00. Online
Using Airflow for Real-Time Data Processing at Scale: Architecture, Challenges & Wins
By Vishvesh Pandey
Track: Use cases

Airflow is a powerhouse for batch data pipelines—but can it be tuned for real-time workloads? In this session, we’ll share how we adapted Apache Airflow to orchestrate near-real-time data processing at scale. From leveraging event-driven triggers and external APIs to minimizing latency with smart DAG design, we’ll dive into real-world architectural patterns, challenges, and optimizations that helped us handle time-sensitive data workflows with confidence. This talk is ideal for teams seeking to expand beyond batch and explore hybrid or real-time orchestration using Airflow.
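
As a hint of what event-driven triggers and latency-aware DAG design can look like in practice, here is a small, hedged sketch (not the speaker's architecture): a rescheduling sensor polls a hypothetical external API so no worker slot is held while waiting.

    # Sketch only: a rescheduling sensor that polls an external API and hands the
    # payload downstream. The endpoint and DAG/task names are hypothetical.
    import requests
    from airflow.decorators import dag, task
    from airflow.sensors.base import PokeReturnValue

    @dag(schedule=None)  # in practice this DAG could be triggered by an external event
    def near_real_time_sketch():
        @task.sensor(poke_interval=30, timeout=600, mode="reschedule")
        def wait_for_batch() -> PokeReturnValue:
            r = requests.get("https://api.example.com/batches/latest", timeout=10)
            ready = r.ok and r.json().get("status") == "ready"
            return PokeReturnValue(is_done=ready, xcom_value=r.json() if ready else None)

        @task
        def process(batch: dict):
            print(f"processing batch {batch.get('id')}")

        process(wait_for_batch())

    near_real_time_sketch()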

18:00 - 18:30. Online
Simplifying DAG creation with an AI-powered IDE for Airflow
By Julian LaNeve
Track: Sponsored

As the demand for data products grows, data engineering teams face mounting pressure to deliver more and even faster, often becoming bottlenecks. Astro IDE changes the game.

Astro IDE is an AI-powered code editor built for Apache Airflow. It helps data teams go from idea to production in minutes—generating production-ready DAGs, enabling in-browser testing, and integrating directly with Git.

In this session, see how Astro IDE accelerates DAG creation, debugging, and deployment so data engineering teams can deliver more, 10x faster.

18:30 - 19:00. Online
Automating Healthcare Triage with Airflow and Large Language Models
By Milcah Mbithi
Track: Use cases

I will talk about how Apache Airflow is used in the healthcare sector with the integration of LLMs to enhance efficiency.

Healthcare generates vast volumes of unstructured data daily, from clinical notes and patient intake forms to chatbot conversations and telehealth reports. Medical teams struggle to keep up, leading to delays in triage and missed critical symptoms. This session explores how Apache Airflow can be the backbone of an automated healthcare triage system powered by Large Language Models (LLMs).

I’ll demonstrate how I designed and implemented an Airflow DAG orchestration pipeline that automates the ingestion, processing, and analysis of patient data from diverse sources in real-time. Airflow schedules and coordinates data extraction, preprocessing, LLM-based symptom extraction, and urgency classification, and finally routes actionable insights to healthcare professionals.

The session will focus on the following (a minimal pipeline sketch follows the list):

  • Managing complex workflows in healthcare data pipelines
  • Safely integrating LLM inference calls into Airflow tasks
  • Designing human-in-the-loop checkpoints for ethical AI usage
  • Monitoring workflow health and data quality with Airflow.
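
A minimal sketch of the pipeline shape described above, assuming the TaskFlow API; the records, the LLM step, and the routing logic are placeholder stand-ins, not the implementation presented in the talk.

    # Sketch only: the ingest -> extract -> classify -> route shape, with placeholder
    # data and a stand-in where an LLM inference call would go.
    from datetime import datetime
    from airflow.decorators import dag, task

    @dag(schedule="@hourly", start_date=datetime(2025, 1, 1), catchup=False)
    def triage_pipeline_sketch():
        @task
        def ingest() -> list[dict]:
            return [{"patient_id": 1, "note": "chest pain for two days"}]  # placeholder intake data

        @task
        def extract_symptoms(records: list[dict]) -> list[dict]:
            # a real pipeline would call an LLM here; this is a stand-in
            return [{**r, "symptoms": ["chest pain"]} for r in records]

        @task
        def classify_urgency(records: list[dict]) -> list[dict]:
            return [{**r, "urgency": "high" if "chest pain" in r["symptoms"] else "routine"} for r in records]

        @task
        def route_to_clinicians(records: list[dict]) -> None:
            for r in records:
                print(f"patient {r['patient_id']} routed as {r['urgency']}")

        route_to_clinicians(classify_urgency(extract_symptoms(ingest())))

    triage_pipeline_sketch()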

19:00 - 19:30. Online
Managed Workflow for Apache Airflow (MWAA): What's New?
By Rajesh Bishundeo
Track: Sponsored

MWAA is an AWS-managed service that simplifies the deployment and maintenance of the open-source Apache Airflow data orchestration platform. MWAA has recently introduced several new features to enhance the experience for data engineering teams. Features such as the Graceful Worker Replacement strategy, which enables seamless MWAA environment updates with zero downtime, IPv6 support, and in-place minor Airflow version downgrades are among the many improvements MWAA has brought to its users in 2025. Last but not least, the release of Airflow 3.0 support brings the latest open-source features, introducing a new web-server UI and better isolation and security for environments. These enhancements demonstrate Amazon’s continued investment in making Airflow more accessible and scalable for enterprises through the MWAA service.

Thursday, October 16, 2025

14:00 - 14:30. Online
Empowering Precision Healthcare with Apache Airflow – iKang Healthcare Group's DataHub Journey
By Yuan Luo & Huiliang Zhang
Track: Use cases

iKang Healthcare Group, serving nearly 10 million patients annually, built a centralized healthcare data hub powered by Apache Airflow to support its large-scale, real-time clinical operations. The platform integrates batch and streaming data in a lakehouse architecture, orchestrating complex workflows from data ingestion (HL7/FHIR) to clinical decision support.

Healthcare data’s inherent complexity—spanning structured lab results to unstructured clinical notes—requires dynamic, reliable orchestration. iKang uses Airflow’s DAGs, extensibility, and workflow-as-code capabilities to address challenges like multi-system coordination, semantic data linking, and fault-tolerant automation.

iKang extended Airflow with cross-DAG event triggers, task priority weights, LLM-driven clinical text processing, and a visual drag-and-drop DAG builder for medical teams. These innovations improved diagnostic turnaround, patient safety, and cross-system workflow visibility.

iKang’s work demonstrates Airflow’s power in transforming healthcare data infrastructure and advancing intelligent, scalable patient care.
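
Two of the stock Airflow building blocks behind the extensions mentioned above, sketched with hypothetical DAG and task names; the TriggerDagRunOperator import path shown is its Airflow 2 location.

    # Sketch only: task priority weights plus cross-DAG triggering.
    # DAG ids and task names are hypothetical.
    from datetime import datetime
    from airflow.decorators import dag, task
    from airflow.operators.trigger_dagrun import TriggerDagRunOperator

    @dag(schedule="@hourly", start_date=datetime(2025, 1, 1), catchup=False)
    def lab_results_ingestion_sketch():
        @task(priority_weight=10)  # urgent results are scheduled ahead of routine work
        def ingest_hl7_messages():
            print("ingesting HL7/FHIR messages")

        trigger_decision_support = TriggerDagRunOperator(
            task_id="trigger_decision_support",
            trigger_dag_id="clinical_decision_support",  # hypothetical downstream DAG
        )

        ingest_hl7_messages() >> trigger_decision_support

    lab_results_ingestion_sketch()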

14:30 - 15:00. Online
DAG Doctor: AI-Powered Validation and Optimization for Airflow Pipelines
By Supriya Badgujar & Vishesh Garg
Track: Use cases

We’ve all watched our Airflow DAGs grow from simple pipelines into complex beasts that nobody wants to touch. But what if AI could be your DAG whisperer? In this session, I’ll show you how we’re teaching machines to speak Airflow.

By combining the pattern-recognition superpowers of Large Language Models with traditional code analysis, we’ve built a framework that doesn’t just find problems—it fixes them. Think of it as a smart co-pilot for your data orchestration that catches missing dependencies before they cause midnight alerts and suggests restructuring that actually makes sense.

This isn’t theoretical—we’ve seen 40% fewer pipeline failures and 30% faster execution times when teams adopt these approaches. I’ll walk through real examples where AI spotted inefficient task groupings and resource bottlenecks that humans had missed for months. You’ll discover practical ways to build AI validation into your workflow, train models to recognize Airflow’s quirks, and even generate documentation that people actually want to read. Best of all, you’ll leave with code samples you can implement right away and a framework for building your own AI assistants.
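
One "traditional code analysis" building block that such a framework can start from, sketched with a placeholder DAG-folder path:

    # Sketch only: parse a DAG folder with DagBag and fail fast on import errors,
    # a common first gate before any AI-assisted review. The path is a placeholder.
    from airflow.models import DagBag

    def validate_dag_folder(path: str = "dags/") -> None:
        bag = DagBag(dag_folder=path, include_examples=False)
        if bag.import_errors:
            for file, err in bag.import_errors.items():
                print(f"{file}: {err}")
            raise SystemExit(1)
        print(f"{len(bag.dags)} DAGs parsed cleanly")

    if __name__ == "__main__":
        validate_dag_folder()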

15:00 - 15:30. Online
Airflow on Kubernetes & Firecracker: From Chaos to Clarity with Observability
By Prerit Munjal
Track: Airflow & ...

Four months ago, our Airflow environment was a black box. DAGs would fail silently, resource contention caused random task failures, and our data engineers were bypassing Airflow entirely because they couldn’t trust it. Our solution? Rebuild Airflow as an observable, self-scaling platform on Kubernetes and Firecracker.

This talk dissects our transformation from Airflow uncertainty to operational confidence. I’ll walk through how we instrumented our entire Airflow stack to provide real-time visibility into workflow execution, built a dynamic scaling system that handles our 5-10x daily traffic spikes, and implemented ephemeral environments (offering true isolation) that cut development cycles by 70%. We’ll explore the exact architecture behind our “Airflow Observatory” – a monitoring system that tracks task execution across thousands of DAGs and predicts failures before they happen. We’ll close the session with a live demo of intelligent autoscaling and a look at the custom sensors that dramatically improved pipeline reliability.

15:30 - 16:00. Online
Airflow & Your Automation CoE: Streamlining Integration for Enterprise-Wide Governance and Value
By Jon Hiett
Track: Sponsored

As Apache Airflow adoption accelerates for data pipeline orchestration, integrating it effectively into your enterprise’s Automation Center of Excellence (CoE) is crucial for maximizing ROI, ensuring governance, and standardizing best practices. This session explores common challenges faced when bringing specialized tools like Airflow into a broader CoE framework. We’ll demonstrate how leveraging enterprise automation platforms like Automic Automation can simplify this integration by providing centralized orchestration, standardized lifecycle management, and unified auditing for Airflow DAGs alongside other enterprise workloads. Furthermore, discover how Automation Analytics & Intelligence (AAI) can offer the CoE a single pane of glass for monitoring performance, tracking SLAs, and proving the business value of Airflow initiatives within the complete automation landscape. Learn practical strategies to ensure Airflow becomes a well-governed, high-performing component of your overall automation strategy.

16:00 - 16:30.
Building an MLOps Platform for 300+ ML/DS Specialists on Top of Airflow
By Aleksandr Shirokov, Roman Khomenko & Tarasov Alexey
Track: Airflow & ...

As your organization scales to 20+ data science teams and 300+ DS/ML/DE engineers, you face a critical challenge: how to build a secure, reliable, and scalable orchestration layer that supports both fast experimentation and stable production workflows. We chose Airflow — and didn’t regret it! But to make it truly work at our scale, we had to rethink its architecture from the ground up.

In this talk, we’ll share how we turned Airflow into a powerful MLOps platform through its core capability: running pipelines across multiple K8s GPU clusters from a single UI (!) using per-cluster worker pools. To support ease of use, we developed MLTool — our own library for fast and standardized DAG development, integrated Vault for secure secret management across teams, enabled real-time logging with S3 persistence and built a custom SparkSubmitOperator for Kerberos-authenticated Spark/Hadoop jobs in Kubernetes. We also streamlined the developer experience — users can generate a GitLab repo and deploy a versioned pipeline to prod in under 10 minutes!

We’re proud of what we’ve built — and our users are too. Now we want to share it with the world!
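
For readers unfamiliar with the pattern, a hedged sketch of how tasks can be routed to cluster-specific worker pools via Celery queues; the queue and DAG names are hypothetical and not the platform described in the talk.

    # Sketch only: routing tasks to dedicated Celery queues so workers attached to
    # each Kubernetes GPU cluster pick up only their own work. Names are hypothetical.
    from airflow.decorators import dag, task

    @dag(schedule=None)
    def multi_cluster_training_sketch():
        @task(queue="gpu-cluster-a")  # served by workers started with: airflow celery worker -q gpu-cluster-a
        def train_on_cluster_a():
            print("training on GPU cluster A")

        @task(queue="gpu-cluster-b")
        def train_on_cluster_b():
            print("training on GPU cluster B")

        train_on_cluster_a() >> train_on_cluster_b()

    multi_cluster_training_sketch()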

16:30 - 17:00. Online
Orchestrating Databricks with Airflow: Unlocking the Power of MVs, Streaming Tables, and AI
By Shanelle Roman & Tahir Fayyaz
Track: Sponsored

As data workloads grow in complexity, teams need seamless orchestration to manage pipelines across batch, streaming, and AI/ML workflows. Apache Airflow provides a flexible and open-source way to orchestrate Databricks’ entire platform, from SQL analytics with Materialized Views (MVs) and Streaming Tables (STs) to AI/ML model training and deployment.

In this session, we’ll showcase how Airflow can automate and optimize Databricks workflows, reducing costs and improving performance for large-scale data processing. We’ll highlight how MVs and STs eliminate manual incremental logic, enable real-time ingestion, and enhance query performance—all while maintaining governance and flexibility. Additionally, we’ll demonstrate how Airflow simplifies ML model lifecycle management by integrating Databricks’ AI/ML capabilities into end-to-end data pipelines.

Whether you’re a dbt user seeking better performance, a data engineer managing streaming pipelines, or an ML practitioner scaling AI workloads, this session will provide actionable insights on using Airflow and Databricks together to build efficient, cost-effective, and future-proof data platforms.

17:00 - 17:30. Online
Driving Analytics with Open Source: Airbyte, dbt, Airflow & Metabase
By Ayoade Adegbite
Track: Airflow & ...

In this talk, I’ll walk through how we built an end-to-end analytics pipeline using open-source tools (Airbyte, dbt, Airflow, and Metabase). At Kunda Kids, we extract data from multiple sources using Airbyte OSS into PostgreSQL, transform it into business-specific data marts with dbt, and automate the entire workflow using Airflow. Our Metabase dashboards provide real-time insights, and we integrate Slack notifications to alert stakeholders when key business metrics change.

This session will cover (a minimal orchestration sketch follows the list):

  • Data extraction: Using Airbyte OSS to pull data from multiple sources
  • Transformation & Modeling: How dbt helps create reusable data marts
  • Automation & Orchestration: Managing the workflow with Airflow
  • Data-driven decision-making: Delivering insights through Metabase & Slack alerts
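
A minimal orchestration sketch of that flow, assuming the Airbyte provider package and a dbt project invoked from a BashOperator; connection IDs, the Airbyte connection UUID, and paths are placeholders.

    # Sketch only: Airbyte sync followed by dbt transformations. Connection IDs,
    # the Airbyte connection UUID, and dbt paths are placeholders.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

    with DAG(
        dag_id="analytics_pipeline_sketch",
        schedule="@daily",
        start_date=datetime(2025, 1, 1),
        catchup=False,
    ):
        extract = AirbyteTriggerSyncOperator(
            task_id="airbyte_sync",
            airbyte_conn_id="airbyte_default",
            connection_id="<airbyte-connection-uuid>",  # placeholder
        )
        transform = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
        )
        extract >> transform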

17:30 - 18:00. Online
Safe Airflow Upgrades with LLM Guardians: Automating Version Migration and Error Prevention
By Azhar Izzannada Elbachtiar
Track: Best practices

Upgrading Airflow can be challenging: a small change may trigger pipeline failures and lengthy fixes. This session presents a solution that simplifies migrations across multiple Airflow releases by automatically analyzing code repositories, change notes, best practices, and other essential documentation with an LLM Guardian that identifies potential issues. It scans DAGs and dependencies to predict incompatibilities and generates fixes (e.g., deprecated operator replacements).

You’ll see how the system scans DAGs for hidden risks, auto-generates targeted code patches, and runs post-migration checks to verify stability and performance. The demo will highlight rapid, error-free transitions even for legacy systems, using backward-compatible code adjustments and rollback plans. Imagine turning an upgrade that took months into just a couple of hours while reducing risk and keeping your environment secure.

Key takeaways (a simple static-scan sketch follows the list):

  • Intelligent Analysis: Cross-references changelogs, GitHub histories, and provider packages to flag risks.
  • Precision Fixes: Updates syntax, dependencies, and configurations while preserving workflows.
  • Confidence Building: Generates audit-ready reports and dry-run results to justify upgrades.
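
A simple static-scan sketch of that analysis starting point, flagging deprecated import paths before any LLM is involved; the deprecation mapping and DAG-folder path are illustrative, not exhaustive.

    # Sketch only: flag deprecated Airflow import paths in DAG files with the ast module.
    # The mapping below is illustrative, not a complete migration table.
    import ast
    import pathlib

    DEPRECATED = {
        "airflow.operators.bash_operator": "airflow.operators.bash",
        "airflow.operators.python_operator": "airflow.operators.python",
    }

    def scan_dag_file(path: str) -> list[str]:
        findings = []
        tree = ast.parse(pathlib.Path(path).read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.ImportFrom) and node.module in DEPRECATED:
                findings.append(
                    f"{path}:{node.lineno} imports {node.module}; consider {DEPRECATED[node.module]}"
                )
        return findings

    if __name__ == "__main__":
        for dag_file in pathlib.Path("dags").glob("**/*.py"):  # placeholder DAG folder
            for finding in scan_dag_file(str(dag_file)):
                print(finding)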