All times in Coordinated Universal Time (UTC).

Wednesday, October 15, 2025

Introducing Apache Airflow® 3 – The Next Evolution in Orchestration
14:00 - 16:00. Online. Track: Keynote
By Amogh Desai, Ash Berlin-Taylor, Brent Bovenzi, Buğra Öztürk, Daniel Standish, Jed Cunningham, Jens Scheffler, Kaxil Naik, Pierre Jeambrun, Tzu-ping Chung, Vikram Koka, Vincent Beck & Constance Martineau

Apache Airflow® 3 is here, bringing major improvements to data orchestration. In this keynote, core Airflow contributors will walk through key enhancements that boost flexibility, efficiency, and user experience.

Vikram Koka will kick things off with an overview of Airflow 3, followed by deep dives into DAG versioning (Jed Cunningham), enhanced backfilling (Daniel Standish), and a modernized UI (Brent Bovenzi & Pierre Jeambrun).

Next, Ash Berlin-Taylor, Kaxil Naik, and Amogh Desai will introduce the Task Execution Interface and Task SDK, enabling tasks in any environment and language. Jens Scheffler will showcase the Edge Executor, while Tzu-ping Chung and Vincent Beck will demo event-driven scheduling and data assets (a minimal sketch appears below). Finally, Buğra Öztürk will unveil CLI enhancements for automation and debugging.

This keynote sets the stage for Airflow 3—don’t miss the chance to learn from the experts shaping the future of workflow orchestration!
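
For a feel for the data assets mentioned above, a minimal, hedged sketch of event-driven scheduling, assuming Airflow 3’s Task SDK; the asset URI and task bodies are illustrative, not taken from the keynote:

    from airflow.sdk import Asset, dag, task

    orders = Asset("s3://example-bucket/orders.parquet")  # hypothetical asset URI

    @dag(schedule="@hourly")
    def produce_orders():
        @task(outlets=[orders])  # finishing this task records an asset event
        def refresh_orders():
            ...  # write the new orders snapshot

        refresh_orders()

    @dag(schedule=[orders])  # scheduled by asset updates, not by a cron expression
    def consume_orders():
        @task
        def build_report():
            ...  # read the asset and publish a report

        build_report()

    produce_orders()
    consume_orders()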

Securing Airflow CLI with API
16:00 - 16:30. Online. Track: Airflow 3
By Buğra Öztürk

This talk will explore the key changes introduced by AIP-81, focusing on security enhancements and user experience improvements across the entire software development lifecycle.

We will break down the technical advancements from both a security and a usability perspective, addressing key questions for Apache Airflow users of all levels. Topics include, but are not limited to, isolating CLI communication to enhance security by leveraging Role-Based Access Control (RBAC) within the API for secure database interactions, clearly defining local vs. remote command execution, and future improvements.

Enterprise Auditing: "The Verifiable Data Pipeline"
16:30 - 17:00. Online. Track: Sponsored
By Rafal Biegacz & Piotr Wieczorek

In this session we will dive deep into leveraging the robust logging and audit capabilities of Google Cloud Platform, Cloud Composer and Apache Airflow to establish a fully transparent and verifiable data orchestration layer.

We’ll demonstrate how to track and attribute every change—from environment configuration to individual task execution—essential for meeting stringent enterprise governance, compliance, and auditing requirements.

Orchestrator of Orchestrators: Uniting Airflow Pipelines with Business Applications in Production
17:00 - 17:30. Online. Track: Sponsored
By Basil Faruqui

Airflow powers thousands of data and ML pipelines—but in the enterprise, these pipelines often need to interact with business-critical systems like ERPs, CRMs, and core banking platforms.

In this demo-driven session, we will connect Airflow with Control-M from BMC and showcase how Airflow can participate in end-to-end workflows that span not just data platforms but also transactional business applications. A sketch of one such business-event trigger follows the highlights below.

Session highlights:

  • Trigger Airflow DAGs based on business events (e.g., invoice approvals, trade settlements)
  • Feed Airflow pipeline outputs into ERP systems (e.g., SAP) or CRMs (e.g., Salesforce)
  • Orchestrate multi-platform workflows from cloud to mainframe with SLA enforcement, dependency management, and centralized control
  • Provide unified monitoring and auditing across data and application layers
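
A hedged sketch of how an external scheduler such as Control-M could start an Airflow DAG run over Airflow’s stable REST API (the endpoint exists in Airflow 2.x; the host, credentials, and DAG id below are placeholders):

    import requests

    AIRFLOW_API = "https://airflow.example.com/api/v1"  # hypothetical Airflow host

    # POST a new DAG run, passing the business event as run configuration.
    resp = requests.post(
        f"{AIRFLOW_API}/dags/settle_trades/dagRuns",  # hypothetical DAG id
        auth=("svc_controlm", "********"),            # example basic-auth service account
        json={"conf": {"event": "trade_settlement", "batch_id": "2025-10-15-01"}},
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json()["dag_run_id"])  # id of the run the business event started
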
Using Airflow for Real-Time Data Processing at Scale: Architecture, Challenges & Wins
17:30 - 18:00. Online. Track: Use cases
By Vishvesh Pandey

Airflow is a powerhouse for batch data pipelines—but can it be tuned for real-time workloads? In this session, we’ll share how we adapted Apache Airflow to orchestrate near-real-time data processing at scale. From leveraging event-driven triggers and external APIs to minimizing latency with smart DAG design, we’ll dive into real-world architectural patterns, challenges, and optimizations that helped us handle time-sensitive data workflows with confidence. This talk is ideal for teams seeking to expand beyond batch and explore hybrid or real-time orchestration using Airflow.
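
One latency-friendly pattern the abstract gestures at, sketched under assumptions (the Amazon provider’s S3KeySensor with deferrable support; the bucket path and cadence are illustrative, not the speaker’s code): a deferrable sensor releases its worker slot while waiting, keeping frequent, time-sensitive runs cheap.

    import pendulum
    from airflow.decorators import dag, task
    from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor

    @dag(
        schedule="* * * * *",  # every minute, for near-real-time cadence
        start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
        catchup=False,
        max_active_runs=1,
    )
    def near_real_time_ingest():
        wait = S3KeySensor(
            task_id="wait_for_batch",
            bucket_key="s3://example-bucket/incoming/*.json",  # hypothetical path
            wildcard_match=True,
            deferrable=True,  # hand the wait to the triggerer, freeing the worker
            poke_interval=10,
        )

        @task
        def process():
            ...  # parse and load the newly arrived records

        wait >> process()

    near_real_time_ingest()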

Simplifying DAG creation with an AI-powered IDE for Airflow
18:00 - 18:30. Online. Track: Sponsored
By Julian LaNeve

As the demand for data products grows, data engineering teams face mounting pressure to deliver more and even faster, often becoming bottlenecks. Astro IDE changes the game.

Astro IDE is an AI-powered code editor built for Apache Airflow. It helps data teams go from idea to production in minutes—generating production-ready DAGs, enabling in-browser testing, and integrating directly with Git.

In this session, see how Astro IDE accelerates DAG creation, debugging, and deployment so data engineering teams can deliver more, 10x faster.

Automating Healthcare Triage with Airflow and Large Language Models
18:30 - 19:00. Online. Track: Use cases
By Milcah Mbithi

I will talk about how Apache Airflow, integrated with LLMs, is used in the healthcare sector to enhance efficiency.

Healthcare generates vast volumes of unstructured data daily, from clinical notes and patient intake forms to chatbot conversations and telehealth reports. Medical teams struggle to keep up, leading to delays in triage and missed critical symptoms. This session explores how Apache Airflow can be the backbone of an automated healthcare triage system powered by Large Language Models (LLMs).

I’ll demonstrate how I designed and implemented an Airflow DAG orchestration pipeline that automates the ingestion, processing, and analysis of patient data from diverse sources in real time. Airflow schedules and coordinates data extraction, preprocessing, LLM-based symptom extraction, and urgency classification, and finally routes actionable insights to healthcare professionals. A rough sketch of this pipeline shape appears after the list below.

The session will focus on the following:

  • Managing complex workflows in healthcare data pipelines
  • Safely integrating LLM inference calls into Airflow tasks
  • Designing human-in-the-loop checkpoints for ethical AI usage
  • Monitoring workflow health and data quality with Airflow
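
A rough, hypothetical sketch of the pipeline shape described above (function bodies and the model call are placeholders, not the presented system):

    import pendulum
    from airflow.decorators import dag, task

    @dag(
        schedule="*/15 * * * *",  # frequent runs to keep triage timely
        start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
        catchup=False,
    )
    def healthcare_triage():
        @task
        def ingest() -> list:
            return []  # pull intake forms, clinical notes, chat transcripts

        @task
        def extract_symptoms(records: list) -> list:
            return []  # one LLM call per record; retries/timeouts guard flaky inference

        @task
        def classify_urgency(symptoms: list) -> list:
            return []  # label each case, e.g. routine / urgent / critical

        @task
        def route_to_clinicians(cases: list) -> None:
            # page on critical cases; queue the rest for human review --
            # the human-in-the-loop checkpoint the talk emphasizes
            ...

        route_to_clinicians(classify_urgency(extract_symptoms(ingest())))

    healthcare_triage()
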
Managed Workflows for Apache Airflow (MWAA): What's New?
19:00 - 19:30. Online. Track: Sponsored
By Rajesh Bishundeo & Krystal Nzeadibe

MWAA is an AWS-managed service that simplifies the deployment and maintenance of the open-source Apache Airflow data orchestration platform. MWAA has recently introduced several new features to enhance the experience for data engineering teams. A graceful worker replacement strategy that enables seamless environment updates with zero downtime, IPv6 support, and in-place minor Airflow version downgrades are some of the many improvements MWAA has brought to its users in 2025. Last but not least, Airflow 3.0 support brings the latest open-source features, introducing a new web-server UI and better isolation and security for environments. These enhancements demonstrate Amazon’s continued investment in making Airflow more accessible and scalable for enterprises through the MWAA service.

Thursday, October 16, 2025

Empowering Precision Healthcare with Apache Airflow – iKang Healthcare Group's DataHub Journey
14:00 - 14:30. Online. Track: Use cases
By Yuan Luo & Huiliang Zhang

iKang Healthcare Group, serving nearly 10 million patients annually, built a centralized healthcare data hub powered by Apache Airflow to support its large-scale, real-time clinical operations. The platform integrates batch and streaming data in a lakehouse architecture, orchestrating complex workflows from data ingestion (HL7/FHIR) to clinical decision support.

Healthcare data’s inherent complexity—spanning structured lab results to unstructured clinical notes—requires dynamic, reliable orchestration. iKang uses Airflow’s DAGs, extensibility, and workflow-as-code capabilities to address challenges like multi-system coordination, semantic data linking, and fault-tolerant automation.

iKang extended Airflow with cross-DAG event triggers, task priority weights, LLM-driven clinical text processing, and a visual drag-and-drop DAG builder for medical teams. These innovations improved diagnostic turnaround, patient safety, and cross-system workflow visibility.

iKang’s work demonstrates Airflow’s power in transforming healthcare data infrastructure and advancing intelligent, scalable patient care.
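
For orientation, a hedged illustration (not iKang’s code) of two of the extensions named above, using stock Airflow pieces: a task priority weight and a cross-DAG hand-off. The DAG and task names are invented.

    import pendulum
    from airflow.decorators import dag, task
    from airflow.operators.trigger_dagrun import TriggerDagRunOperator

    @dag(schedule=None, start_date=pendulum.datetime(2025, 1, 1, tz="UTC"), catchup=False)
    def lab_results_ingest():
        @task(priority_weight=10)  # urgent lab results jump the executor queue
        def parse_hl7():
            ...  # normalize incoming HL7/FHIR messages

        hand_off = TriggerDagRunOperator(
            task_id="start_decision_support",
            trigger_dag_id="clinical_decision_support",  # hypothetical downstream DAG
            wait_for_completion=False,
        )

        parse_hl7() >> hand_off

    lab_results_ingest()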

Vayu: The Airflow Copilot
14:30 - 15:00. Online. Track: Use cases
By Muhammed Irshad & Sanchit Sreekanth

Vayu is a conversational copilot for Apache Airflow, developed at Prevalent AI to help data engineers manage, troubleshoot, and fix pipelines using natural language. Deployments often fail silently due to misconfigurations, missing connections, or runtime issues that are impossible to identify in unit tests. Vayu tackles these via a troubleshooting agent that inspects logs, metrics, configs, and runtime state to find root causes and suggest fixes, saving engineers significant troubleshooting time. It can also apply approved fixes to DAG code and commit them to your version control system.

Key Capabilities:

  • Troubleshooting Agent: Inspects logs, configs, variables, and connections to find root causes and suggest fixes.
  • Pipeline Mechanic Agent: Suggests code-level fixes (e.g., missing connections or bad imports) and, once approved, commits them to version control.
  • DAG Manager Agent: Understands DAG logic, suggests improvements, and can trigger DAGs conversationally.

Architecture: Built with LangGraph and the open-source Airflow MCP server, ensuring secure, audited interactions. LLMs never access Airflow directly. The full codebase will be open-sourced.

From Complexity to Simplicity with TaskHarbor: Trendyol's Path to a Unified Orchestration Platform
15:00 - 15:30. Online. Track: Airflow & ...
By Salih Goktug Kose & Burak Ozdemir

At Trendyol, Turkey’s leading e-commerce company, Apache Airflow powers our task orchestration, handling DAGs with 500+ tasks, complex interdependencies, and diverse environments. Managing on-prem Airflow instances posed challenges in scalability, maintenance, and deployment. To address these, we built TaskHarbor, a fully managed orchestration platform with a hybrid architecture—combining Airflow on GKE with on-prem resources for optimal performance and efficiency.

This talk covers how we:

  • Enabled seamless DAG synchronization across environments using GCS Fuse.
  • Optimized workload distribution via GCP’s HTTPS & TCP Load Balancers.
  • Automated infrastructure provisioning (GKE, CloudSQL, Kubernetes) using Terraform.
  • Simplified Airflow deployments by replacing Helm YAML files with a custom templating tool, reducing configurations to 10-15 lines.
  • Built a fully automated deployment pipeline, ensuring zero developer intervention.

We enhanced efficiency, reliability, and automation in hybrid orchestration by embracing a scalable, maintainable, and cloud-native strategy. Attendees will obtain practical insights into architecting Airflow at scale and optimizing deployments.

Airflow & Your Automation CoE: Streamlining Integration for Enterprise-Wide Governance and Value
15:30 - 16:00. Online. Track: Sponsored
By Jon Hiett

As Apache Airflow adoption accelerates for data pipeline orchestration, integrating it effectively into your enterprise’s Automation Center of Excellence (CoE) is crucial for maximizing ROI, ensuring governance, and standardizing best practices. This session explores common challenges faced when bringing specialized tools like Airflow into a broader CoE framework. We’ll demonstrate how leveraging enterprise automation platforms like Automic Automation can simplify this integration by providing centralized orchestration, standardized lifecycle management, and unified auditing for Airflow DAGs alongside other enterprise workloads. Furthermore, discover how Automation Analytics & Intelligence (AAI) can offer the CoE a single pane of glass for monitoring performance, tracking SLAs, and proving the business value of Airflow initiatives within the complete automation landscape. Learn practical strategies to ensure Airflow becomes a well-governed, high-performing component of your overall automation strategy.

Building an MLOps Platform for 300+ ML/DS Specialists on Top of Airflow
16:00 - 16:30. Track: Airflow & ...
By Aleksandr Shirokov, Roman Khomenko & Tarasov Alexey

As your organization scales to 20+ data science teams and 300+ DS/ML/DE engineers, you face a critical challenge: how to build a secure, reliable, and scalable orchestration layer that supports both fast experimentation and stable production workflows. We chose Airflow — and didn’t regret it! But to make it truly work at our scale, we had to rethink its architecture from the ground up.

In this talk, we’ll share how we turned Airflow into a powerful MLOps platform through its core capability: running pipelines across multiple K8s GPU clusters from a single UI (!) using per-cluster worker pools. To support ease of use, we developed MLTool — our own library for fast and standardized DAG development, integrated Vault for secure secret management across teams, enabled real-time logging with S3 persistence and built a custom SparkSubmitOperator for Kerberos-authenticated Spark/Hadoop jobs in Kubernetes. We also streamlined the developer experience — users can generate a GitLab repo and deploy a versioned pipeline to prod in under 10 minutes!

We’re proud of what we’ve built — and our users are too. Now we want to share it with the world!
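
A hedged sketch of the routing idea (assumed Celery-style queues, not the authors’ MLTool code; queue and pool names are invented): each Kubernetes cluster runs workers that consume a dedicated queue, so a single Airflow deployment schedules onto many GPU clusters.

    import pendulum
    from airflow.decorators import dag, task

    @dag(schedule=None, start_date=pendulum.datetime(2025, 1, 1, tz="UTC"), catchup=False)
    def train_on_cluster_a():
        # Runs only on workers started with: airflow celery worker --queues gpu-cluster-a
        @task(queue="gpu-cluster-a", pool="cluster_a_gpus")
        def train():
            ...  # launch the GPU training job on that cluster

        train()

    train_on_cluster_a()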

Orchestrating Databricks with Airflow: Unlocking the Power of MVs, Streaming Tables, and AI
16:30 - 17:00. Online. Track: Sponsored
By Shanelle Roman & Tahir Fayyaz

As data workloads grow in complexity, teams need seamless orchestration to manage pipelines across batch, streaming, and AI/ML workflows. Apache Airflow provides a flexible and open-source way to orchestrate Databricks’ entire platform, from SQL analytics with Materialized Views (MVs) and Streaming Tables (STs) to AI/ML model training and deployment.

In this session, we’ll showcase how Airflow can automate and optimize Databricks workflows, reducing costs and improving performance for large-scale data processing. We’ll highlight how MVs and STs eliminate manual incremental logic, enable real-time ingestion, and enhance query performance—all while maintaining governance and flexibility. Additionally, we’ll demonstrate how Airflow simplifies ML model lifecycle management by integrating Databricks’ AI/ML capabilities into end-to-end data pipelines.

Whether you’re a dbt user seeking better performance, a data engineer managing streaming pipelines, or an ML practitioner scaling AI workloads, this session will provide actionable insights on using Airflow and Databricks together to build efficient, cost-effective, and future-proof data platforms.
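
As one hedged example of the MV pattern (assumed usage of the Databricks provider’s SQL operator; the connection id, warehouse name, and view name are placeholders): refreshing a Materialized View on a schedule in place of hand-rolled incremental logic.

    import pendulum
    from airflow import DAG
    from airflow.providers.databricks.operators.databricks_sql import DatabricksSqlOperator

    with DAG(
        dag_id="refresh_databricks_mv",
        schedule="@hourly",
        start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
        catchup=False,
    ):
        DatabricksSqlOperator(
            task_id="refresh_orders_mv",
            databricks_conn_id="databricks_default",
            sql_endpoint_name="example-warehouse",                # hypothetical warehouse
            sql="REFRESH MATERIALIZED VIEW analytics.orders_mv",  # hypothetical MV
        )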

Driving Analytics with Open Source: Airbyte, dbt, Airflow & Metabase
17:00 - 17:30. Online. Track: Airflow & ...
By Ayoade Adegbite

In this talk, I’ll walk through how we built an end-to-end analytics pipeline using open-source tools (Airbyte, dbt, Airflow, and Metabase). At WirePick, we extract data from multiple sources using Airbyte OSS into PostgreSQL, transform it into business-specific data marts with dbt, and automate the entire workflow using Airflow. Our Metabase dashboards provide real-time insights, and we integrate Slack notifications to alert stakeholders when key business metrics change. A sketch of the core orchestration step follows the list below.

This session will cover:

  • Data extraction: Using Airbyte OSS to pull data from multiple sources
  • Transformation & Modeling: How dbt helps create reusable data marts
  • Automation & Orchestration: Managing the workflow with Airflow
  • Data-driven decision-making: Delivering insights through Metabase & Slack alerts
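
A hedged sketch of that orchestration step (assumed provider operators; the connection ids, Airbyte connection UUID, and paths are placeholders): Airflow triggers the Airbyte sync, then runs the dbt models.

    import pendulum
    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

    with DAG(
        dag_id="elt_pipeline",
        schedule="@daily",
        start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
        catchup=False,
    ):
        extract = AirbyteTriggerSyncOperator(
            task_id="airbyte_sync",
            airbyte_conn_id="airbyte_default",
            connection_id="REPLACE-WITH-CONNECTION-UUID",  # the Airbyte connection to sync
        )
        transform = BashOperator(
            task_id="dbt_run",
            bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
        )
        extract >> transform
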
Safe Airflow Upgrades with LLM Guardians: Automating Version Migration and Error Prevention
17:30 - 18:00. Online. Track: Best practices
By Azhar Izzannada Elbachtiar

Upgrading Airflow can be challenging: a small change may trigger pipeline failures and lengthy fixes. This session presents a solution that simplifies migrations across multiple Airflow releases by automatically analyzing code repositories, change notes, best practices, and other essential documentation with an LLM Guardian that identifies potential issues. It scans DAGs and dependencies to predict incompatibilities and generates fixes (e.g., deprecated operator replacements).

You’ll see how the system scans DAGs for hidden risks, auto-generates targeted code patches, and verifies stability and performance after migration. The demo will highlight rapid, error-free transitions, even for legacy systems, using backward-compatible code adjustments and rollback plans. Imagine turning an upgrade that once took months into a couple of hours, while reducing risk and keeping your environment secure.
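
A toy illustration of the scanning step described above (a generic sketch, not the presented tool; the deprecation map is a tiny assumed sample): statically flagging imports that newer Airflow releases have moved or removed.

    import ast
    import pathlib

    DEPRECATED = {  # old module -> suggested replacement (illustrative entries)
        "airflow.operators.bash_operator": "airflow.operators.bash",
        "airflow.operators.python_operator": "airflow.operators.python",
    }

    def scan_dag_file(path: pathlib.Path) -> list:
        findings = []
        tree = ast.parse(path.read_text())
        for node in ast.walk(tree):
            if isinstance(node, ast.ImportFrom) and node.module in DEPRECATED:
                findings.append(
                    f"{path}:{node.lineno}: '{node.module}' is deprecated; "
                    f"use '{DEPRECATED[node.module]}'"
                )
        return findings

    for dag_file in pathlib.Path("dags").rglob("*.py"):
        for finding in scan_dag_file(dag_file):
            print(finding)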

Key takeaways:

  • Intelligent Analysis: Cross-references changelogs, GitHub histories, and provider packages to flag risks.
  • Precision Fixes: Updates syntax, dependencies, and configurations while preserving workflows.
  • Confidence Building: Generates audit-ready reports and dry-run results to justify upgrades.