These are the confirmed sessions for Airflow Summit 2026.

Advanced Deadline Alerts: Writing your own custom References and Callbacks

by Dennis Ferruzzi

Airflow 3’s Deadline Alerts let you set “need-by” times on DAGs and fire callbacks when deadlines are missed. The built-in references cover common cases, but the real power is the feature’s extensibility. In this workshop, led by the feature’s author, we will go beyond the basics and explore these more advanced features.

We start with an overview of how DeadlineAlert, DeadlineReference, and Callback fit together, and how the scheduler detects misses. Then, a guided project: coding our own Callback implementation and building custom DeadlineReference classes using the @deadline_reference decorator, implementing _evaluate_with(), serialization, and required_kwargs. We wrap up with a hackathon-style “competition” to build the most creative WORKING DeadlineReference (business hours, the last time it didn’t rain in Vancouver, the moon phases, the last time the Leafs won the cup… anything goes, as long as it serializes and returns a valid datetime).
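
For a flavor of the workshop material, here is a minimal sketch of a custom reference built from the pieces named above (the import path and exact signatures are assumptions for illustration, not the workshop’s actual code):

```python
from datetime import datetime, timedelta, timezone

# Assumed import path; the real location may differ.
from airflow.sdk.definitions.deadline import deadline_reference


@deadline_reference
class NextBusinessMorning:
    """Resolve deadlines to 09:00 UTC on the next day, plus a grace interval."""

    required_kwargs = {"grace"}  # callers must supply `grace` when using this reference

    def _evaluate_with(self, *, grace: timedelta, **kwargs) -> datetime:
        now = datetime.now(timezone.utc)
        morning = now.replace(hour=9, minute=0, second=0, microsecond=0)
        if morning <= now:
            morning += timedelta(days=1)
        return morning + grace  # must serialize and return a valid datetime
```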

Airflow 3.0 Asset Watchers: Cross-Domain Data Mesh Orchestration with AI-Assisted Deployment

by Corrine Tan & Haofei Feng

Data Mesh decentralises data ownership across business domains. In regulated industries, each domain operates in its own account where producers publish data products and consumers subscribe. This enforces governance, limits blast radius, and preserves autonomy. When each domain runs its own Airflow, orchestrating across these boundaries is the central challenge. Airflow 2.4 introduced data-aware scheduling, but it was designed for a single Airflow instance, with no native cross-instance event propagation. In practice this meant building polling sensors that queried the producer’s REST API to check upstream completion, which proved unreliable: events were lost and ordering was not guaranteed. Airflow 3.0 resolves this with event-driven scheduling via AssetWatcher: the Triggerer monitors a message queue and triggers the consumer DAG when the producer publishes a completion event. This talk traces that journey through a regulated enterprise Data Mesh. We also share how we built an agentic AI skills framework that encodes operational Airflow knowledge into reusable skills, enabling an AI agent to autonomously deploy, validate, and troubleshoot the cross-environment pattern end-to-end.
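
A minimal sketch of the Airflow 3 consumer side of this pattern (the queue URI and names are placeholders; exact trigger arguments depend on your broker and provider version):

```python
from airflow.providers.common.messaging.triggers.msg_queue import MessageQueueTrigger
from airflow.sdk import DAG, Asset, AssetWatcher

# The producer domain publishes a completion event to this queue (placeholder URI).
trigger = MessageQueueTrigger(
    queue="https://sqs.us-east-1.amazonaws.com/123456789012/orders-events"
)

orders = Asset(
    "orders_data_product",
    watchers=[AssetWatcher(name="orders_watcher", trigger=trigger)],
)

with DAG(dag_id="consumer_domain_pipeline", schedule=[orders]):
    ...  # consumer tasks run when the producer's completion event arrives
```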

Airflow as a Harness: The Workflow That Merged Itself

by Ryan Hatter

This talk is the story of getting a PR merged into Apache Airflow without writing a single line of code, using Apache Airflow itself as an agentic orchestration harness to replicate the functionality of Claude Code for any pluggable LLM.

We’ll walk through how Airflow’s AIP-99 Dag functionality maps naturally onto the tool-use loops, context management, and decision branching that power modern agentic coding workflows. The result is a model-agnostic harness that can read a codebase, reason about changes, write and test code, and deploy a commit to a git repository, all orchestrated as an Airflow Dag.
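
As a toy illustration of decision branching inside a Dag (all names here are ours, not the speaker’s harness):

```python
from airflow.sdk import dag, task


@dag
def agent_harness():
    @task
    def plan() -> str:
        return "write_code"  # stand-in for an LLM choosing the next tool

    @task.branch
    def route(action: str) -> str:
        return action  # return the task_id of the branch to follow

    @task
    def write_code():
        print("editing files")

    @task
    def run_tests():
        print("running pytest")

    choice = route(plan())
    choice >> [write_code(), run_tests()]


agent_harness()
```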

Airflow at the Heart of Equifax’s Data Processing

by Yuvaraj Sankaran

At Equifax, Apache Airflow is used across many departments, helping Data Engineers, Data Scientists, and Business Analysts in their daily work.

This presentation is about how to use modern orchestration technology at the heart of data processing and business processes to support daily company operations.

Airflow Autopilot: The Generate-Verify-Refine Loop That Makes Pipeline Authoring Truly AI-Native

by Yifan Wang

Today’s pipeline authoring is synchronous: writing code, chasing errors, every step blocking the engineer until it is resolved. You can’t step away or parallelize. Airflow Autopilot reimagines this to be AI-native and asynchronous. Describe your pipeline’s intent, and the agent takes over, orchestrating two classes of purpose-built tools: tools that generate the DAG code and automate setup, and scorer tools that evaluate it across dimensions such as data discovery, auth, compliance, DAG validation, even end-to-end execution. Every scorer returns a deterministic result and structured, prioritized hints. The agent runs the generate → verify → refine loop — calling scorers, reading hints, fixing code, re-scoring — until every dimension passes. You come back to a PR with DAGs that have been iteratively built, tested, and made ready for review. For 10,000+ Airflow users, this shifts the engineer from executor to reviewer: you own the intent and final judgment, the agent owns the execution. Attendees leave with the architecture for an AI-native authoring experience, the principles behind decomposing work into scorer-sized verification units, and what it takes to scale this in production.
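
Stripped of the tooling, the loop itself is simple; a self-contained sketch with placeholder generator and scorer functions (none of these names are Autopilot’s actual API):

```python
def generate_dag(intent: str) -> str:
    return f"# DAG generated for: {intent}"  # stand-in for the generator tool


def refine_dag(code: str, hints: list) -> str:
    return code + f"\n# applied fixes: {hints}"  # stand-in for the refiner


def validate_dag(code: str):
    return "DAG" in code, "add a DAG definition"  # scorer: deterministic result + hint


def autopilot(intent: str, scorers, max_rounds: int = 10) -> str:
    code = generate_dag(intent)
    for _ in range(max_rounds):
        hints = []
        for scorer in scorers:
            passed, hint = scorer(code)
            if not passed:
                hints.append(hint)
        if not hints:
            return code  # every dimension passes; ready for a PR
        code = refine_dag(code, hints)
    raise RuntimeError("did not converge; needs human review")


print(autopilot("ingest orders from S3 to Snowflake daily", [validate_dag]))
```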

Airflow in a Box: Methodology or Madness?

by Nicholas Redd

Airflow testing today is a patchwork: you can validate code and catch obvious breakage early, but many production failures live in the seams—runtime state, persistence, serialization boundaries, API behavior, and the way a real deployment executes work across components. The fast tools are valuable, yet they don’t fully model Airflow as a system. Meanwhile, the default development posture nudges you toward single-process behavior and away from realistic concurrency and state interactions. The result is a familiar trade: quick feedback vs. meaningful confidence. “Airflow in a Box” is a step toward collapsing that trade—making deeper, more production-relevant tests accessible without requiring a full, heavyweight instance for every iteration. In this talk, we’ll discuss methodology, quantify slickness, and share real code!

Airflow orchestration in Data Mesh Architecture

by Ramesh Babu

Airflow has become the default orchestration tool in our day-to-day data engineering work. Here I want to focus on Data Mesh architecture and how we can plug Airflow orchestration into our major workflows. Data Mesh has likewise become the default architecture for building data platforms in most large organisations, and orchestration is a key enabler across data products, ingestion, and transformation.

How they work together:

1. Airflow for Data Mesh pipelines: Airflow can be used to orchestrate data pipelines within a data mesh architecture, ensuring the smooth flow of data between different domains.

2. OpenMetadata and Data Mesh: OpenMetadata can be used to provide visibility into the metadata of data products within a data mesh, helping to understand data structure, relationships, and context.

Benefits of using Airflow with Data Mesh:

1. Increased agility: Airflow allows for flexible data pipelines, enabling quick adaptation to changing business needs within a data mesh.

2. Enhanced data governance: Data Mesh promotes federated data governance which, when combined with Airflow, helps ensure data quality and compliance across the organization.

Airflow to the rescue: managing chemical emergencies

by Eloi Codina Torras

At Meteosim, Airflow is the engine for our entire decision system. It runs daily weather and air quality forecasts on schedule, but it also powers OnaChem React, software that lets users manage chemical emergencies in real time, and helps us manage consultancy projects.

This talk covers how we set up Airflow 3 to handle five very different types of workloads:

1. Daily Forecasts: Running physics simulations for weather and air quality.

2. Sensor Validation: Ingesting data from thousands of sensors and validating it.

3. Human-in-the-Loop: Managing long-running consultancy projects where Dags pause and wait for expert approval.

4. Emergency Response: Helping users manage chemical emergencies using multiple real-time toxic dispersion simulations with pre-defined workflows through our SaaS platform.

5. Training AI models: Tracking multiple experiments.

We will explain why Airflow 3 was necessary to make this work. You will see how we orchestrate physics, AI, and human decisions in a single environment.

An SQL Query is Just a DAG: Building an SQL Engine on Apache Airflow

by Hussein Awala

Ever wondered what happens between typing SELECT ... GROUP BY and getting results back? Inside every SQL engine lives a scheduler that breaks your query into a DAG of tasks — shuffling, sorting, aggregating, and parallelizing work across partitions. Sound familiar?

In this talk, I’ll demystify SQL engine internals by building one on top of Apache Airflow. We’ll take a SQL query, parse it, optimize it, and transform it into a DAG of Airflow tasks that you can watch execute step by step in the Airflow UI.
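
To make the analogy concrete, here is a toy GROUP BY expressed with Airflow’s dynamic task mapping, pre-aggregating per partition and then merging, much like a real engine’s shuffle stage (a sketch of the idea, not the talk’s engine):

```python
from collections import Counter

from airflow.sdk import dag, task


@dag
def select_sum_group_by():
    """Toy plan for `SELECT key, SUM(val) ... GROUP BY key` over two partitions."""

    @task
    def scan(partition: int) -> dict:
        # Stand-in for scanning one partition and pre-aggregating locally
        partitions = [{"a": 1, "b": 2}, {"a": 3, "c": 4}]
        return partitions[partition]

    @task
    def merge(partials: list) -> dict:
        # The final aggregation stage, fed by all mapped scan tasks
        total = Counter()
        for partial in partials:
            total.update(partial)
        return dict(total)

    merge(scan.expand(partition=[0, 1]))


select_sum_group_by()
```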

Anatomy of a Task Instance: From Scheduled to Done

by Cedrik Neumann

In this session I will provide a deep dive into a task instance’s lifetime: from the moment the scheduler decides to schedule it until it is marked as success or failed.

We will explore when in the process concepts like concurrency, pools and priority weights apply, what it means for a task to be “queued” and where things like cluster policies, operator links, callbacks and event listeners are evaluated.
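
For orientation, most of these concepts surface as ordinary task arguments; for example (the values are placeholders):

```python
from airflow.sdk import task


# The pool caps how many of these tasks hold slots concurrently;
# priority_weight orders tasks competing for slots once they are queued.
@task(pool="etl_pool", priority_weight=10, weight_rule="downstream")
def heavy_step():
    ...
```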

Architecting the Center of Excellence: A Strategic Blueprint for Federated Airflow at Scale

by Akanksha Khushboo

As Airflow becomes mission-critical, centralized data teams often become a bottleneck. This session provides a framework for building a Center of Excellence (CoE) that empowers autonomous domain teams while maintaining global standards.

We detail the shift toward “Data Platform Engineering,” treating orchestration as a product. Using case studies from large-scale organizations, we discuss a three-layer model: Strategic (governance), Tactical (platform development), and Operational (business unit execution).

Attendees will learn to design a self-service platform with guardrails that manages multiple teams without interference. We will explore using Airflow 3.0’s architecture for task isolation and conclude with a guide on aligning cross-functional teams and measuring value through consumption-based billing.

Asset Partitions: Matching Workflow to the Right Data

by Wei Lee

Asset partitions are a key building block in Expanded Data Awareness. This session explains the core semantics of partition definitions, partition mappings, and backfill behavior in AIP-76. I will show how these pieces fit together in the current design, then discuss where asset partitions can go next, including improvements in authoring ergonomics, observability, and partition-aware workflow capabilities. Attendees will leave with a clear mental model of today’s implementation and a practical view of future direction.

Beyond Containers: Securely Orchestrating AI Agents with Strong Isolation in Airflow

by Uriel Munoz

AI agents break the traditional Airflow trust model. While standard tasks are deterministic, agents execute dynamic logic and invoke external tools, meaning untrusted code is suddenly running inside standard containers sharing your host kernel. This session demonstrates how to secure AI workloads in Airflow without rewriting the orchestrator or building custom executors. We will introduce a custom, policy-driven @agent TaskFlow abstraction that leverages Kubernetes executor_config overrides (like runtimeClassName) to isolate workloads on the fly.
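
A rough sketch of the underlying mechanism (the runtime class name is a placeholder for whatever sandboxed runtime, such as gVisor or Kata, your cluster registers; the `@agent` decorator itself is the speakers’ custom layer on top of patterns like this):

```python
from kubernetes.client import models as k8s

from airflow.sdk import task

sandboxed = {
    "pod_override": k8s.V1Pod(
        spec=k8s.V1PodSpec(
            runtime_class_name="gvisor",  # placeholder RuntimeClass
            containers=[k8s.V1Container(name="base")],
        )
    )
}


@task(executor_config=sandboxed)
def run_agent(prompt: str):
    ...  # dynamic, untrusted agent logic runs inside the isolated runtime
```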

Beyond Multi-Cluster Airflow: Operating GPU Workloads at Scale

by Aleksandr Shirokov, Tarasov Alexey & Vladislav Repev

At last year’s Airflow Summit, we shared how we built a multi-cluster orchestration layer on top of Apache Airflow to run ML workloads across multiple Kubernetes GPU clusters.

Once hundreds of ML engineers started running GPU pipelines in production, we discovered that orchestration alone is not enough. Operating multi-cluster GPU infrastructure introduces new challenges: controlling GPU allocation across teams, observing pipelines across clusters, and helping users run workloads efficiently without wasting expensive GPU resources.

Breaking the Monolith: Implementing Airflow 3.x Remote Execution for Multi-Team Environments

by Kowsy Narayan

Problem Statement: As our data platform scaled, our shared Airflow 2.9 deployment became a bottleneck with critical challenges: development friction from shared repositories, custom security workarounds, release coordination complexity, data isolation concerns, and cost attribution opacity. When Airflow 3.x launched with hybrid execution support, we restructured our architecture. Following a successful proof of value, we implemented remote execution - enabling teams to run workloads in isolated Kubernetes clusters while maintaining centralized orchestration. This session shares our journey, architectural decisions, and how we leveraged agentic AI to streamline migration and developer experience.

Build AI Pipelines with Apache Airflow 3

by Kenten Danas

Apache Airflow® has long been the control plane for data pipelines. As AI workflows move into production, teams are discovering the same challenges apply: LLM calls fail, embeddings need regenerating, and agent outputs need human review. The operational discipline that Airflow brings to data pipelines is exactly what AI workflows need too.

Rather than managing data pipelines in Airflow and AI workflows in a separate system, Airflow lets you build both in one observable, reliable control plane. You get scheduling, retries, lineage, versioning, and human-in-the-loop capabilities for your LLM tasks the same way you already have them for your SQL transformations.

Building a Context-Aware Agentic Coding Platform for Airflow at Scale

by Yarden Wolf

Generic AI coding assistants like Cursor and Claude Code are powerful, but they struggle with proprietary infrastructure. At Wix, managing 7,500 active DAGs across 120 Data Engineers, we found that standard AI tools lacked the context to be truly effective - they didn’t know our custom operators, DWH modeling patterns, or strict governance rules. In this session, we introduce our internal “Agentic IDE Configuration Manager” that bridges this gap. We will demonstrate how we leverage MCPs to inject deep Airflow context into our AI agents. You will learn how we enabled our coding agents to:

1. Generate compliant code, utilizing custom Cursor rules to ensure every DAG meets production standards and naming conventions.

2. Interact with Airflow, using our custom MCPs to run DAGs locally, parse error logs, and autonomously fix pipeline failures.

3. Understand data, accessing our Data Catalog and Trino engine to validate schema logic in real time.

Whether you are trying to optimize your team’s workflows or simply curious how far coding agents can go, join us in this exciting talk.

Building a low-cost, scalable Airflow Platform for Small Teams

by Aniruddha Sengupta

Apache Airflow is often perceived as a platform best suited for large organisations with significant infrastructure budgets and dedicated platform teams. In this talk, I want to share how we built and scaled a robust Airflow platform with tight cost constraints whilst still maintaining reliability, governance and developer productivity.

Starting from a small Airflow setup, we have evolved our architecture to support multiple teams and increasingly complex workflows. This includes standardising environments and making sure best practices are adopted around observability, resource management and version control.

Building storage analytics pipelines for cloud cost optimization with Airflow

by Bao Nguyen

Storage usage is a major driver of infrastructure cost for media collaboration platforms. Understanding how storage grows across accounts, assets, and workflows requires analytics pipelines that combine product data with infrastructure metrics.

In this talk, I’ll share how we built storage analytics pipelines that model storage usage across accounts and plan tiers to help leadership understand infrastructure cost drivers. Using warehouse data models orchestrated with Airflow, we developed pipelines that track storage usage over time, identify discrepancies in legacy storage calculations, and resolve edge cases.

Cloud Composer Workshop - Managing DAGs at Scale

by Danny De Leo

During this workshop you will learn how to effectively set up CI/CD for a Cloud Composer environment and build observability for your DAGs across many Cloud Composer environments.

Common Issues When Running dbt in Airflow (and How to Fix Them)

by Tatiana Al-Chueyr Martins

In many modern data platforms, orchestration tools are combined with transformation frameworks. A common pattern is orchestrating dbt (data build tool) transformations using Apache Airflow — something reported by roughly 44% of the community.

At first glance, the integration seems straightforward: simply run dbt run inside an Airflow task. Some teams go further and use libraries that convert dbt projects into native Airflow DAGs, such as Astronomer Cosmos.

In practice, however, teams quickly run into operational and architectural challenges. Slowness, out-of-memory errors, zombie tasks, and DAGs that take minutes to appear in the UI are just a few of the issues that can emerge as projects scale.
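
For context, the Cosmos route mentioned above looks roughly like this (paths and profile names are placeholders; see the Cosmos docs for your warehouse’s profile options):

```python
from cosmos import DbtDag, ProfileConfig, ProjectConfig

dbt_dag = DbtDag(
    dag_id="jaffle_shop",
    schedule="@daily",
    # Point Cosmos at the dbt project; it renders each model as an Airflow task.
    project_config=ProjectConfig("/usr/local/airflow/dbt/jaffle_shop"),
    profile_config=ProfileConfig(
        profile_name="jaffle_shop",
        target_name="prod",
        profiles_yml_filepath="/usr/local/airflow/dbt/profiles.yml",
    ),
)
```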

Dag Versioning in Airflow: Version Proliferation and Open Questions

by Ephraim Anierobi

This session explores the next phase of Dag versioning in Airflow and the practical questions users face in real deployments. Dag versioning moved Airflow beyond a “latest only” model, but it also introduced confusion around why Dag versions keep increasing, what disabling Dag bundle versioning actually does, what creates a new version, and how users should think about clears, reruns, and backfills after a Dag changes. I will examine a common misconception: disabling bundle versioning does not stop Dag version changes. I will also connect Dag versioning to Dag delivery in Airflow 3, showing how Git-backed Dag bundles provide a more native alternative to git-sync in Helm-based deployments.

DAGs Move Robots: Closed‑Loop Orchestration for Silicon Validation Labs with Airflow

by Dheeraj Turaga, Deva Madhavan & Shubham Raj

What if your Airflow DAG could orchestrate robots, thermal chambers, and silicon tests, not just code?

Silicon validation labs rely on scarce, stateful physical resources: robotic handlers, DUT boards, thermal/power systems, instruments, and shared hardware queues. Teams often coordinate these via spreadsheets and ad hoc reservations, causing contention, idle gaps, conflicts, poor observability, and slow triage.

This talk presents a closed-loop orchestration model where Apache Airflow is the control plane for a software-defined validation lab. A central DAG coordinates robotic handling, thermal/power setup, stress and performance runs, and parametric characterization on hosts connected to silicon. It continuously ingests hardware health, measurements, and test outcomes, then feeds results into AI-assisted analysis to choose the next physical action: refine parameters, schedule follow-up experiments, or trigger mitigation.

Debugging the Undebuggable: Lessons from Real Airflow Incidents

by Pankaj Singh

Debugging Airflow failures in production can be harder than building the pipelines themselves. Engineers encounter issues such as disappearing DAGs, hanging tasks, missing logs, zombie tasks, or sudden performance degradation, often with little visibility into the root cause.

Over the past year, while supporting multiple Airflow deployments and integrations, we investigated several such incidents across different teams and environments. This session shares lessons from these real debugging cases and explains how the issues were diagnosed and resolved.

Declarative Pipelines Meet Declarative Orchestration: Spark Declarative Pipelines + Airflow 3

by Lisa Cao & Andreas Neumann

Apache Spark’s new Declarative Pipelines (SDP) let engineers define WHAT their data should look like, not HOW to build it. Apache Airflow 3 brings a declarative orchestration model. Together, they eliminate an entire category of boilerplate: the DAG that exists only to babysit a pipeline. This talk walks through building a production Spark SDP pipeline orchestrated by Airflow 3, showing how dependency graphs replace imperative task chains, how testing and recovery patterns change when your pipeline is declarative end-to-end, and what this means for the 80% of data engineering time currently spent on operational plumbing.

Designing Domain-Oriented dbt Projects and Making Them Work in Airflow

by Pankaj Koti

As analytics teams grow, monolithic dbt projects can become tightly coupled and difficult to scale. Cross-domain dependencies multiply, deployment cycles slow down, and ownership boundaries blur.

dbt Mesh proposes a domain-oriented approach with independently owned dbt projects, explicit cross-project contracts, and controlled exposure to dependencies. Applying Mesh principles is not just about splitting repositories; orchestration must also support these boundaries.

In this session, we explore how to design dbt projects according to Mesh principles and how Airflow orchestration can reinforce those architectural decisions. Using multi-project capabilities in Cosmos that leverage dbt Loom-style cross-project referencing, we demonstrate how Airflow can model domain separation while still enabling controlled cross-project dependencies.

Designing Self-Healing Airflow Platforms: Autonomous DAG Recovery at Scale

by Kumuda Sreenivasa & Sandeep Bommisetti

Most Airflow failures are still handled manually — retries, Slack alerts, and late-night debugging. This talk shows how to design Airflow as a self-healing platform that detects problems early, limits blast radius, and automatically recovers. We’ll cover practical patterns for DAG, schema, and dependency-drift detection; safe, selective backfills; predictive failure modeling using metadata; lineage-aware rollbacks; and canary deployment for DAGs. You’ll learn how to isolate unstable workloads before they impact others and how to turn Airflow into an intelligent control plane — not just a scheduler.

Developer Velocity at Scale: Production-Like Airflow Environments on Kubernetes

by Matthew Davis & Matt Koski

Teams running Airflow on Kubernetes know the trade‑off all too well: Kubernetes scales beautifully in production, but makes local development slow, brittle, and unrealistic. Engineers struggle to replicate production environments locally, forcing them into inefficient “test-in-production” cycles that slow delivery velocity, increase deployment risk, and frustrate data teams.

In this talk, we’ll walk through the architectural patterns and platform engineering approach we used to give engineers on‑demand, isolated, production‑like Airflow environments, without sacrificing the benefits of shared Kubernetes infrastructure.

Developing an AI-powered personal endurance sports coach

by Bas Harenslak

During this session, I’ll deep dive into the implementation of an AI-powered endurance sports coach using Apache Airflow as the backbone for data ingestion and processing. Beyond data pipelines, I’ll explain what’s required to build a conversational AI system, from structured data modeling to orchestration and retrieval. We’ll explore how metrics are precomputed, how vector search enables contextual memory, and which front-end patterns work best for interacting with AI agents. The result is a reproducible architecture where Airflow powers the data layer and an LLM provides the reasoning on top to help athletes perform at their best in numbers-driven endurance sports.

Enterprise-Grade Airflow Upgrade: Strategies & Deep Dive

by M Waqas Shahid

Upgrading to Apache Airflow in large, production-grade environments can be complex—especially in enterprise setups with hundreds of DAGs, custom plugins, and mission-critical pipelines. The challenge grows even more complex in decentralized setups, where platform teams are responsible for the system’s stability, but the DAG code lives across multiple teams you don’t directly control.

You will have the chance for a personalised review of your current organizational setup, to assess testing coverage, and to identify concrete ways to improve your upgrade process. This hands-on workshop will provide:

Event-Driven Orchestration Monitoring: Streaming Airflow Metadata to Kafka via CDC

by Vipin Kataria

How do you monitor Airflow across 50 teams in real-time? How do downstream systems react instantly to pipeline completions without polling APIs? How do you build custom dashboards without overloading Airflow’s database? This talk demonstrates how we use Change Data Capture to stream Airflow’s metadata to Kafka, making orchestration events consumable by any system in real-time. By capturing changes in Airflow’s Postgres database and publishing them to Kafka topics, we enable instant notifications, real-time dashboards, compliance audit trails, and cross-system orchestration without modifying Airflow code or impacting performance. You’ll learn how to set up Debezium CDC for Airflow’s metadata tables, design Kafka topics for task and DAG events, build real-time consumers for monitoring and alerting, handle schema evolution across Airflow upgrades, and implement cost attribution and SLA monitoring in real-time. Using production examples processing millions of events daily, I’ll share architecture decisions, performance optimizations, and lessons from running CDC at scale. You’ll leave with patterns for making Airflow observable to your entire organization.
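
The capture side of such a setup can be as small as one Kafka Connect registration; a hedged sketch (hostnames, credentials, and the connector name are placeholders):

```python
import requests

connector = {
    "name": "airflow-metadata-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "plugin.name": "pgoutput",
        "database.hostname": "airflow-postgres",
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "********",
        "database.dbname": "airflow",
        "topic.prefix": "airflow",  # topics become airflow.public.dag_run, ...
        "table.include.list": "public.dag_run,public.task_instance",
    },
}

# Register the connector with the Kafka Connect REST API
requests.post("http://kafka-connect:8083/connectors", json=connector, timeout=30).raise_for_status()
```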

Fixing The Token Authentication: Revocation, Scoping, and Securing the Execution Boundary

by Anish Giri

When Airflow 3 introduced JWT-based task authentication, it also introduced new attack surfaces: tokens that can’t be revoked, tasks that lose authentication while waiting in queues, and forked processes that inherit signing keys and can forge tokens for other tasks.

In this talk, I’ll walk through three security challenges at the task execution boundary and the code contributed to fix them:

Token revocation (merged, PR #61339): Airflow 3.x had no way to invalidate issued JWTs, with implications for common compliance frameworks.

From Airflow 2 to Airflow 3: Migrating 100+ DAGs Without Downtime or Developer Burden

by Goncalo Costa

Migrating a production Airflow deployment from version 2 to 3 without disrupting hundreds of DAGs across multiple teams sounds scary (and it is). In this talk I will share how we migrated versions without a big-bang cutover, without weeks of cross-team change requests, and without leaving our pipelines in a broken state.

I’ll walk through how we built a compatibility layer to make sure our code runs on both versions during the migration, how we used AI tooling to orchestrate 400+ DAG changes, and how our on-demand ephemeral environments - full k8s deployments created for each pull request - helped us experiment and test all the required changes.

From Chaos to Control: Navigating Airflow Sprawl with Centralized Observability

by Jon Hiett

As data platforms mature, organizations often experience “Airflow Sprawl”—the rapid, organic growth of isolated Airflow instances across different teams and projects. While this empowers localized control, it creates dangerous silos that hinder visibility, increase operational risk, and erode developer productivity. In this session, we will explore the critical challenges of managing a fragmented Airflow ecosystem and discuss strategies for regaining control. We will examine why centralizing execution history and establishing unified observability is essential for reducing Mean Time to Recovery (MTTR), mitigating hidden security risks, and transforming fragmented instances into a cohesive, reliable data service. Attendees will leave with a strategic framework for managing Airflow at scale.

From Experiments to Production: How We Built a Lightweight ML Platform on Airflow

by Marion Azoulai

Many data teams can build machine learning models, but operationalizing them reliably remains a challenge.

At Astronomer, our data team recently moved from exploratory modeling to running multiple production ML models powering go-to-market analytics and workflows. Rather than introducing heavy MLOps infrastructure, we integrated the full ML lifecycle directly into our Airflow-based data platform.

In this talk, we’ll share how we use Airflow to orchestrate production ML end-to-end: from feature pipelines in Snowflake, to model training and artifact promotion, to batch scoring and prediction delivery.

From Hours to Minutes: Orchestrating Local LLMs for Sensitive Data Pipelines with Apache Airflow

by Chhayank Jain

Processing unstructured data in regulated industries (healthcare, finance, legal) is one of the hardest data engineering challenges: the data is messy, privacy constraints prevent sending it to external APIs, and scale makes manual processing impossible.

In this talk, I’ll walk through how to design and deploy an Apache Airflow–orchestrated LangChain pipeline powered by LLMs to digitize unstructured documents into a unified structured platform.

I’ll cover the full architecture: how Airflow DAGs coordinate multi-step LLM inference, validation, and ingestion stages; how LoRA/PEFT fine-tuning adapted open-source LLMs for domain-specific language without leaking sensitive data; and how failure handling, retries, and data quality checks were built natively into Airflow.
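
Much of that native failure handling maps onto standard task arguments; for instance (the backoff values are placeholders):

```python
from datetime import timedelta

from airflow.sdk import task


@task(retries=3, retry_delay=timedelta(minutes=2), retry_exponential_backoff=True)
def run_inference(document_path: str) -> dict:
    ...  # call the local LLM; transient failures retry with exponential backoff
```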

Get Certified: DAG Authoring for Apache Airflow 3

by Marc Lamberti

We’re excited to offer Airflow Summit 2026 attendees an exclusive opportunity to earn their DAG Authoring certification in person, now updated to include all the latest Airflow 3 features. This certification workshop comes at no additional cost to summit attendees.

The DAG Authoring for Apache Airflow certification validates your expertise in advanced Airflow concepts and demonstrates your ability to build production-grade data pipelines. It covers TaskFlow API, Dynamic task mapping, Templating, Asset-driven scheduling, Best practices for production DAGs, and new Airflow 3.0 features and optimizations.

Gleaming the Cube: Exploring the limits of Airflow through the Rubik's Cube

by Jonathan Leek

What does solving a Rubik’s Cube have to do with Apache Airflow? More than you’d think.

In this talk, I’ll walk through a project where Airflow orchestrates the process of solving a Rubik’s Cube — not as a gimmick, but as a framework for exploring cyclic workflows, state management, and iterative computation in a system designed for DAGs. Cube-solving algorithms naturally require feedback loops, evolving state, and conditional branching — all things that challenge Airflow’s acyclic model.
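
One common workaround for cycles in an acyclic model, and a flavor of what such a design can lean on, is a Dag that re-triggers itself with evolved state (a sketch, not necessarily the speaker’s approach):

```python
from airflow.providers.standard.operators.trigger_dagrun import TriggerDagRunOperator

# Re-run the solver Dag, carrying the cube state forward until solved.
next_iteration = TriggerDagRunOperator(
    task_id="next_iteration",
    trigger_dag_id="solve_cube",  # this Dag's own id (placeholder)
    conf={"cube_state": "{{ ti.xcom_pull(task_ids='apply_move') }}"},
)
```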

Healthcare Interoperability Meets Airflow Extensibility

by Wyatt Shapiro

In healthcare data, standards are often anything but standard. Every new partner arrives with its own requirements for data exchange spanning FHIR APIs, HL7 feeds, SFTP drops, and custom vendor extracts.

The result? Integration projects that stretch from weeks into months, custom pipelines that only one engineer understands, and implementation teams who are already counting down to your next missed deadline.

This session shows how Airflow can change your approach to managing data transfer for healthcare partners.

It Works! Now What? Fast Iteration for AI Capabilities in Airflow

by Alex Guglielmone

Building an AI capability in Airflow is the easy part. The hard part is what comes next.

You want to swap a model, refactor a prompt, cut token costs, or try a local model instead of paying for cloud. How do you know it still works as expected? Without a fast feedback loop, every change is a gamble.

This talk shows practical patterns for building that feedback loop, with real examples using agent skills, MCPs, and local and cloud models. It covers the challenges too: sandboxing, observability, non-determinism, and keeping checks simple enough that people actually use them.

Migrating Airflow 2 to 3 for Infrastructure Operations at Scale

by Ethan (Tianyang) Lin & Rumeysa Ozaydin

This talk covers migrating a production Airflow platform that orchestrates a large VM fleet — provisioning, OS patching, and decommissioning at high concurrency. This is not a data pipeline — it is infrastructure operations at fleet scale. We’ll share workflow patterns that make fleet-scale orchestration possible in Airflow, then cover how we moved from an Airflow 2 monolith — all components on every node with fixed worker counts — to Airflow 3 with independently scalable services, each with its own release cycle. We’ll dig into a silent breaking change in Airflow 3’s XCom behavior: xcom_pull(key=…) without task_ids no longer searches upstream tasks, returning None with no warning. We’ll present three iterations of solving this — from O(n) DAG traversal to a custom XCom backend that restores Airflow 2 semantics with zero DAG code changes — and the design tradeoffs at each stage. Attendees will learn how Airflow powers infrastructure operations beyond data pipelines, how Airflow 3’s XCom change silently breaks Airflow 2 workflows, three approaches to the same migration problem, and lessons from running both versions in parallel.
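
The breaking change described above boils down to one call pattern; a minimal illustration (task and key names are placeholders):

```python
from airflow.sdk import task


@task
def downstream(ti=None):
    # Airflow 2 searched upstream tasks for a matching key; in Airflow 3 this
    # returns None unless task_ids is passed explicitly (per this talk).
    implicit = ti.xcom_pull(key="fleet_batch")                        # None on Airflow 3
    explicit = ti.xcom_pull(task_ids="provision", key="fleet_batch")  # works on both
    return explicit
```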

Migrating Airflow at Scale - What the Docs Don't Tell You

by Olivier Daneau

If you are migrating from self-hosted Airflow to any of the managed platforms, most migration guides you’ll find online assume one environment, one team, one version. Large organizations are never that simple.

This talk comes from four years of assisting customers with real migrations across some of the biggest Airflow deployments out there, from self-hosted open source to managed cloud platforms like MWAA, GCC, and Astro, and between major version upgrades.

Multi-Team Airflow: A Customer-Driven Journey

by Niko Oliveira & Vincent Beck

As Airflow deployments scale and the number of Dag authors increases, the question arises: how do we support many teams with different needs and requirements on a shared platform? Over the years we’ve observed many organizations building their own multi-tenant layers on top of Apache Airflow to solve this problem, and we’re now adding native support for this type of deployment. This talk explores building multi-team support in Airflow, working backwards from those real deployment challenges and community pain points we’ve observed.

One Codebase, Many Distributions: Airflow’s Modular Approach

by Jarek Potiuk & Amogh Desai

Airflow’s evolution toward a client-server architecture faced a fundamental challenge: splitting a monolithic codebase into independent distributions (airflow-core, task-sdk, providers) without triggering dependency hell. Traditional PyPI packaging and code duplication both fail at Airflow’s scale.

Airflow 3.2 solves this through modular isolation and shared libraries using in-repository symlinks. This approach ensures each distribution ships with the exact version of shared code it requires, eliminating runtime version conflicts and allowing for independent dependency management. We have already migrated 10+ critical components—including the config parser, observability, and secrets masking—into this shared model.

Optimising Airflow in Real-World Deployments: Profiling, Performance Drift, and Confident Upgrades

by Pankaj Koti & Vara Prasad Regani

Performance issues in Apache Airflow rarely appear as clear failures. Instead, they surface as subtle signals: longer task queue times, slower DAG parsing, scheduler lag, or workers hitting limits as workloads grow.

In this talk, we share lessons from profiling real production deployments across Airflow 2.x and 3.x. Combining frontline operational insights with focused technical investigation, we analysed task latency, DAG parsing time, worker behaviour, and metadata database performance under sustained load.

Orchestrating 100 ML Models using Airflow

by Ryan Stevens

Productionizing ML workflows is complicated; scaling them is harder. At Ramp, we grew from zero to nearly 100 production ML models powering systems like credit risk assessment and sales lead valuation.

This talk covers how Airflow became the backbone of our ML platform, orchestrating ETL jobs, data quality checks, and model runs. We’ll discuss how we evolved it to meet the increasing complexity of our ML systems.

Every ML system consists of feature creation and large-batch inference. We started with a few dbt models and one cloud-hosted notebook, which evolved into thousands of upstream tables and hundreds of AWS Batch inference jobs.

Orchestrating AI-Enabled Prescription Workflows with Apache Airflow: Improving Accuracy, Efficiency,

by Chandra Kiran Yelagam

Modern pharmacy enterprise systems must process high volumes of complex prescriptions while maintaining strict safety, compliance, and operational efficiency. However, traditional rule-based platforms frequently generate low-specificity alerts that contribute to alert fatigue, workflow bottlenecks, and increased manual intervention. As clinical guidelines, payer requirements, and treatment protocols evolve, static rule engines struggle to keep pace with the dynamic nature of modern pharmacy operations.

This session presents a practical architecture for AI-enabled prescription workflow automation orchestrated through Apache Airflow, enabling scalable, transparent, and auditable clinical workflows. By combining rule-based safety checks with machine learning models for classification, anomaly detection, and intelligent workflow routing, the system significantly improves routing precision, reduces false positives, and accelerates prescription verification.

Orchestrating Graph Database Workloads in Apache Airflow with Apache TinkerPop

by Ahmad Farhan

Graph databases are increasingly used for relationship-heavy data such as fraud detection, knowledge graphs and CRM systems, yet integrating them into orchestration workflows has remained difficult. This session introduces the Apache TinkerPop Provider for Airflow, enabling graph databases to be orchestrated as first-class citizens. I will demonstrate how it works with both self-hosted and managed services such as AWS Neptune and Azure Cosmos DB.

Orchestrating Streaming Data Pipelines with Airflow, Kafka, Spark, and Kubernetes on GCP

by Karan Alang

Modern data platforms rely on real-time pipelines to process and analyze large volumes of streaming events. Apache Airflow is widely used for batch orchestration, but it can also coordinate complex streaming architectures. In this session, we explore how Airflow orchestrates scalable pipelines built with Apache Kafka and Apache Spark running on Kubernetes in cloud environments.

We walk through an architecture where Kafka handles high-throughput event ingestion, Spark processes streaming data for analytics and transformation, and Kubernetes provides scalable infrastructure for distributed workloads. Airflow acts as the orchestration layer, coordinating job scheduling, pipeline dependencies, and operational visibility.

Performance Debugging in Airflow: From Symptoms to Solutions

by Rahul Vats

Airflow running slow? Memory is spiking. Tasks are queuing forever. Now what? Debugging performance issues in a distributed system like Airflow can feel overwhelming—is it the scheduler, the database, the DAG Processor, or your DAG code? This talk shares practical techniques for isolating and fixing performance problems, using real examples from the Airflow codebase.

We’ll cover:

  1. Understanding Airflow’s moving parts – Where bottlenecks typically hide (scheduler loop, DAG parsing, database queries).

Pushing to Prod on a Friday

by Ashley Gough

If the idea of pushing to production on a Friday still makes your stomach drop, you’re in good company because most data professionals know that particular flavor of dread. But that fear says more about systemic fragility than the day of the week. This talk explores how unclear ownership, hidden dependencies, and late validation create production risk in data platforms. I’ll show how data contracts clarify expectations between producers and consumers, how Behavior‑Driven Development (BDD) provides a shared language for system behavior, and how Airflow can enforce guardrails that shift validation earlier and reduce blast radius. This session focuses on the organizational and architectural decisions that shape platform reliability. Because Airflow often becomes the visible surface of upstream uncertainty, its teams feel the impact of broader design and governance choices. Attendees will learn to interpret “Friday fear” as a strategic signal, how contracts and BDD strengthen alignment and predictability, and how Airflow can act as a platform‑level safety system that builds trust and supports confident deployments - Fridays included.

Remote Control Isolation: airflowctl becomes the new default

by Bugra Ozturk

Meet airflowctl, the new default for API-driven remote operations. You will see how separating control from execution enhances security, enables isolation, and simplifies automation across different environments. I will discuss the development of airflowctl, demonstrate practical examples of secure remote execution, and provide a guide for transitioning from legacy workflows. You will learn how to easily migrate towards airflowctl and leverage the flexibility of an API-driven approach.

Scaling Airflow for Capacity Forecasting at Amazon Prime Video

by Shivam Rastogi

Amazon Prime Video uses Airflow to forecast traffic for hundreds of micro-services to deliver the best customer experience for some of the world’s biggest live events across multiple global regions. The forecasting methodology involves complex job dependencies between customer interaction metrics and geographies - translating to ~50 production DAGs with cross-DAG dependencies that process terabytes of customer activity data daily across tens of thousands of compute cores. In this talk, we’ll cover how we manage dependency complexity at scale, coordinate data flows across geographical boundaries, and keep forecasts reliable as the system grows.

Scaling to 1,000 DAGs: Idelic's Blueprint for Airflow Automation and Reliability

by Matthew Stavinga, Ray Carroll & Matt McCormack

This session details Idelic’s critical journey to a robust, scaled Astronomer Airflow environment. We’ll share technical lessons from overcoming initial orchestration challenges and successfully scaling to over 1,000 active DAGs. The session will showcase our advanced, Jenkins-integrated testing deployment for managing this scale, and the development of a standardized framework that simplifies DAG creation, eliminates code repetition, and enables configuration changes without a full deployment. This is essential for any team managing complex data pipelines, offering a blueprint for standardized Airflow development, maximum data reliability, and future growth at a large scale.

Securing Apache Airflow with Keycloak: A Deep Dive into the Keycloak Auth Manager

by Vincent Beck

As organizations scale their data platforms, managing access to Apache Airflow becomes increasingly complex. In this talk, we introduce the Keycloak Auth Manager — a pluggable authentication and authorization backend for Airflow that delegates identity management to Keycloak, a battle-tested open-source Identity and Access Management solution.

We’ll start with the big picture: what problem does the Keycloak Auth Manager solve, and why Keycloak? We’ll walk through the architecture — how Airflow’s auth manager interface works, how the Keycloak integration hooks into it, and how authentication flows (OIDC/OAuth2) and authorization (role mapping, resource-based permissions) are handled under the hood.

Self-Service DAGs: Event-Driven Design for GitHub Actions and Airflow at Lyft

by Ken Obata

At Lyft, driver pay configs on GitHub must be validated through Airflow DAGs before merging. However, Scientists and Analysts who change configs are not familiar with Airflow. How do we make such validation self-service while meeting SOX compliance?

This talk presents a design pattern for bidirectional GitHub-Airflow integration: GitHub Actions trigger DAGs, and DAGs push results back as PR status checks via the GitHub Commit Status API. We cover the event-driven push style vs the traditional polling style, and why the push style works well with Dynamic Task Mapping. This pattern aligns with Airflow 3’s event-driven scheduling vision. We also discuss how SOX requirements shaped this design.
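
The push-back half of the loop is a small piece of code; a hedged sketch of a task posting a PR status check (the repo, SHA, and token handling are placeholders; the endpoint is GitHub’s documented commit status path):

```python
import requests

from airflow.sdk import task


@task
def report_status(sha: str, passed: bool):
    response = requests.post(
        f"https://api.github.com/repos/example-org/pay-configs/statuses/{sha}",
        headers={"Authorization": "Bearer <token>"},  # placeholder credentials
        json={
            "state": "success" if passed else "failure",
            "context": "airflow/config-validation",
            "description": "Driver pay config validation",
        },
        timeout=30,
    )
    response.raise_for_status()
```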

Spark + Airflow: How Orchestration Decisions Impact Performance and Cost

by Meni Shmueli

Apache Airflow is the de facto orchestrator for modern data platforms, while Apache Spark powers large-scale data processing. But when the two meet in production, teams quickly face architectural decisions that affect reliability, performance, and cloud cost.

In this talk we explore key design questions when orchestrating Spark with Airflow:

1. Should you run a shared Spark cluster, a cluster per DAG run, or clusters per task?

2. When should Spark workloads run in parallel vs sequentially within a workflow?

3. How can teams benchmark pipeline performance in terms of both runtime and cost?

4. How do emerging features like Spark Declarative Pipelines change how Spark integrates with orchestration systems?

Spec-Driven Development for Airflow DAGs

by Kyle McCluskey

AI coding assistants have transformed software development, moving from ad hoc “vibe coding” to rigorous spec-driven development (SDD). The Airflow ecosystem has fully embraced these advancements, but different use cases demand different SDD approaches. This talk compares ETL and ML pipeline patterns, showing how each leverages Airflow’s unique capabilities differently. I then present SDD strategies along a Spec Stability Spectrum. ETL specs are stable and external — schemas, dbt models — making deterministic, template-driven approaches like DAG Factory and the cosmos-dbt-core skill the right fit. ML specs are volatile and internal, as experiments evolve, so LLM-driven hybrid approaches like the Airflow AI SDK and the airflow-hitl skill are better suited. Both approaches are demonstrated live with Claude Code. Examples draw from my work at TXI Digital generating ETL and ML pipelines for heavy industry clients, with a focus on Rail and anecdotes from Renewable Energy.

Stabilizing LinkedIn Continuous Deployment on Airflow

by Wensi Hu & Pooja Pal

Last year, we showed how LinkedIn’s continuous deployment (LCD) runs on Apache Airflow to orchestrate safe, repeatable releases across thousands of services—powering everyday deployments for 10,000+ engineers.

This year, we’ll dive into the hard‑won patterns that keep those deployments stable at scale: preserving DAG consistency during live updates; routing seamlessly across multiple clusters for graceful failover; enforcing HA guardrails on the control plane; and using dynamic task mapping to deliver faster rollbacks and reduce deployment overhead. You’ll see how we abstract Airflow for a cleaner user experience, what really moved the needle on launching tasks faster, and portable observability practices that cut on‑call toil.

Stop Being the Dag Bottleneck: How to Scale Airflow Orchestration Beyond Your Engineering Team

by Yetunde Dada

Your data platform team didn’t sign up to be a Dag factory. But when Airflow expertise is concentrated in a small group of engineers, that’s exactly what happens. Analysts wait days for simple workflows, engineers burn cycles rebuilding the same patterns, and frustrated teams start building outside the stack entirely. The real fix isn’t a better onboarding guide or a friendlier UI. It’s rethinking the abstraction layer your team exposes to the rest of the business.

Streamlining Data Pipelines Creation at Stripe with Airflow

by Jiayu Yi

At Stripe, we process petabytes of data daily across thousands of pipelines powering financial reporting, fraud detection, and merchant analytics. As our data estate grew, so did the complexity of authoring, scheduling, and operating these pipelines. Engineers spent more time wrangling Airflow DAG boilerplate and managing dependencies than writing transformation logic.

To address this, we built a declarative platform that generates Airflow DAGs from YAML and SQL definitions. Authors specify what they want — source tables, SQL transformations, incremental mode, output schema — and the platform handles the rest: generating Airflow tasks, wiring upstream sensors, registering Iceberg tables, and configuring scheduling parameters. A key piece is an in-house dataset-to-task mapping service that resolves upstream dataset dependencies to their producing Airflow tasks. When an author declares an input dataset, the platform automatically looks up which task produces it and generates the appropriate sensor — no manual DAG cross-referencing required. This eliminates an entire class of misconfigured dependency bugs common in hand-wired Airflow deployments.
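
An illustrative reduction of the pattern (the spec format and loader below are ours, not Stripe’s actual platform):

```python
import yaml

from airflow.sdk import DAG, task

SPEC = yaml.safe_load(
    """
dag_id: orders_daily
schedule: "@daily"
tasks:
  - id: extract_orders
  - id: build_report
    upstream: [extract_orders]
"""
)

with DAG(dag_id=SPEC["dag_id"], schedule=SPEC["schedule"]):

    @task
    def run_step(step_id: str):
        print(f"running {step_id}")

    # One task per spec entry; dependencies wired from the declared upstreams.
    steps = {t["id"]: run_step.override(task_id=t["id"])(t["id"]) for t in SPEC["tasks"]}
    for t in SPEC["tasks"]:
        for upstream_id in t.get("upstream", []):
            steps[upstream_id] >> steps[t["id"]]
```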

Streamlining Your Airflow Upgrade: Essential Tools for Migrating from 2.x to 3

by Ankit Chaurasia

Airflow 3 has officially arrived! If you’re considering an upgrade, this session will equip you with essential migration utilities that facilitate a smooth transition from Airflow 2.x. Attendees will learn the new CLI command, “airflow config lint”, to analyze your configuration files for any removed, deprecated, or renamed elements. This command provides comprehensive feedback and allows for filtering specific sections and options.

During the session, attendees will learn to leverage a set of rigorous Ruff rules - AIR301, AIR302, and AIR303 - crafted to detect migration issues within your codebase automatically. Notably, rule AIR301 flags DAG definitions lacking an explicit schedule argument, a critical update in Airflow 3. Rule AIR302 identifies deprecated functions and removed configuration settings, offering recommended alternatives. Rule AIR303 highlights code that references components now shifted to provider packages, ensuring your integrations are up to date.

Taming AI Workloads in Apache Airflow: Dag Patterns to Avoid Infrastructure Instability

by Zhe-You(Jason) Liu

Orchestrating AI workloads introduces a two-front battle with infrastructure instability. First, the Airflow workers themselves (e.g., Kubernetes pod evictions, Celery node scaling) can restart and lose track of active tasks. Second, the external AI cluster running the heavy compute can experience temporary network blips, API timeouts or compute rescheduling. With standard Dag designs, these transient hiccups often cause Airflow to panic, fail the task, and tragically send a kill signal to an expensive, perfectly healthy AI job.

Taming the MLOps Zoo: Orchestrating and Monitoring Models with Airflow

by Lindy Bustabad

Thanks to AI, your data scientists can build models faster than ever. The new bottleneck? Their attention. When your team maintains a zoo of ML models (dbt/SQL scoring models, Python ML on Kubernetes, and point-and-click product UI models) every new species adds feeding schedules, health checks, and habitat needs. The real question becomes: which animals need the zookeeper right now?

At Pendo, we orchestrate 10+ ML models through Airflow, each with its own dbt Cloud feature prep, Kubernetes scoring pods, and downstream monitoring. This talk covers how we keep the zoo running: DAG dependencies across heterogeneous model types, conditional execution for models that only score on certain schedules, and model-specific sub-pipelines that keep each species healthy. Then we’ll demo DS ModelGuard, an agentic monitoring system we built internally that does the morning rounds, tracking API health, output volume, likelihood drift, and feature-level input drift, so your data scientists know which enclosure to check first.

The Messy Middle: When Your Data Team Doesn't Need a Streaming Engine

by Constance Martineau

There’s a class of workload that doesn’t belong in your streaming stack. A team needs to react to data arriving in S3 or a message landing in Kafka. The SLA is minutes. Someone reaches for Flink because the orchestrator can’t trigger on events. Six months later, you’re running a streaming app for what is a bounded computation with a latency requirement.

This talk names that pattern, the “messy middle,” and argues that Airflow 3 eliminates the gap that pushed these workloads to streaming. Asset Watchers monitor external sources through async triggers, firing DAGs within minutes of event arrival. Assets turn data products into scheduling primitives. Partitions let Airflow reason about which slices of a dataset are ready.

The Rise of Abstraction in Dag Authoring: From YAML to Minecraft

by Volker Janz

In my almost 15 years as a data engineer, I’ve learned one universal truth: everyone needs orchestration. The marketing team needs daily attribution reports. The CRM team needs personalized newsletter triggers. The platform team needs cross-cloud data transfers. The analytics team needs third-party data imports. Data touches every corner of the business, and the orchestration layer is the one layer that connects it all.

This talk explores what becomes possible when we decouple pipeline logic (what happens) from definition (how it’s authored). With the right abstractions, the authoring interface can be anything: Python, declarative YAML, templates, spreadsheets, or even a video game.

The Self-Healing Pipeline: How Error Classification Eliminated 100% of Engineering Oversight

by Evgeny Nuger

What if your pipeline could tell the difference between recoverable errors and real bugs and handle both without waking anyone up? At OnsiteIQ, we process millions of construction site images monthly through Airflow with mixed AWS Batch spot and on-demand tasks. We need to handle corrupt data, spot evictions, and real bugs. Before Airflow, every failure looked the same: something broke, an engineer investigated, and the same transient infrastructure issues kept masking real bugs underneath. In a 3-month solo migration, I built custom Airflow operators that automatically classified and handled every failure via Airflow’s callbacks. Actual code bugs surface through clean, noise-free alerts directly to actionable tickets. Every genuine bug got caught exactly once and permanently fixed. Engineering oversight dropped from 20% to zero within months. This talk covers the error classification architecture, automatic fallback patterns, and the framework for turning Airflow into a self-healing system.

The State of Airflow: Momentum, Innovation, and What's Next

by Vikram Koka

Airflow 3 has been out for a year. In this keynote, we take stock of where the community stands, what we built together, and where we are headed.

We open with the data: adoption trends, community growth, and honest feedback from teams running Airflow 3 in production. What is working, what surprised us, and what the survey tells us about how the ecosystem is evolving.

The second section covers the year in Airflow. Provider discovery and distribution has been modernized. Airflow gained first-class support for AI and LLM workloads. And scheduling became more powerful, letting pipelines respond to data at a finer granularity.

Toward a Polyglot Airflow

by Tzu-ping Chung

Building on Airflow 3’s new worker structure and the foundation laid by the Go SDK, we take a look at how Airflow can support a fully cross-language Dag-authoring experience.

We will discuss how a new language SDK is built, how a task talks to Airflow, and how multiple languages may be mixed inside a Dag. To support additional languages without logic duplication, a new middle layer is required between Airflow and the task. Additional topics, such as security, distributed workload, and user interface considerations, will also be touched on.

Triggers at Datadog: What Are Trigger Queues and Why You Should Use Them

by Zach Gottesman

Datadog is a world-class data platform ingesting more than 100 trillion events a day, providing real-time insights. Since our internal adoption of Airflow following the release of 3.0.0, the number of teams relying on our internal Airflow platform has grown organically and quickly.

This internal Airflow adoption came with a number of platform challenges, requiring novel solutions which could support multi-tenancy, scalability, and bespoke runtime environments. In this talk, we will cover how we’ve expanded the functionality of Airflow triggers – via trigger queue assignment – to support multi-tenancy deployments, while contributing those solutions upstream with the broader Airflow community. We’ll cover the conceptual design and motivations for Trigger queues, and how the trigger queue pattern can benefit both multi-tenant and single-occupant Airflow systems alike.

When Airflow Meets Yunikorn: Enhancing Airflow on Kubernetes with Yunikorn for Higher Efficiency

by Xiaodong Deng

Apache Airflow’s Kubernetes integration enables flexible workload execution on Kubernetes but lacks advanced resource management features, including application queueing, tenant isolation, and gang scheduling. These features are increasingly critical for data engineering as well as AI/ML use cases, particularly GPU utilization optimization. For example, gang scheduling ensures all required resources for a job are allocated atomically, preventing partial allocations that waste resources. Apache Yunikorn, a Kubernetes-native scheduler, addresses these gaps by offering a high-performance alternative to the default Kubernetes scheduler. In this talk, we’ll demonstrate how to conveniently leverage Yunikorn’s power in Airflow, along with practical use cases and examples.
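
The handoff can be as simple as a pod override per task; a sketch following Yunikorn’s documented label conventions (queue and application names are placeholders; adapt them to your cluster’s setup):

```python
from kubernetes.client import models as k8s

from airflow.sdk import task

yunikorn_config = {
    "pod_override": k8s.V1Pod(
        metadata=k8s.V1ObjectMeta(
            labels={"applicationId": "airflow-train-job", "queue": "root.ml-team"}
        ),
        spec=k8s.V1PodSpec(
            scheduler_name="yunikorn",  # hand the pod to Yunikorn, not the default scheduler
            containers=[k8s.V1Container(name="base")],
        ),
    )
}


@task(executor_config=yunikorn_config)
def train_on_gpu():
    ...
```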

Your first Apache Airflow Contribution

by Amogh Desai, Kalyan Reddy & Phani Kumar

Ready to contribute to Apache Airflow?

In this hands-on workshop, we’ll help you jump straight into the project with real, beginner-friendly issues matched to your skills and interests.

To make the most of our time together, come with a development environment set up in advance — installing Breeze is highly recommended, but GitHub Codespaces is a great alternative if Docker isn’t an option for you.

We’ll walk through the full contribution journey step by step: exploring the codebase, picking an issue, opening your first pull request, and engaging with the community for feedback and reviews. Whether you’re interested in writing code, improving documentation, writing tests, or sharing ideas, there’s a welcoming place for you in the Airflow community.