This talk focuses on using the Task State Management (AIP-103) and Enhanced Retry Policy (AIP-105) work being released in Airflow 3.3 to improve the execution of long-running tasks, including checkpointing, sophisticated (and automated) retry policies, and intra-task observability.
The work initially focuses on Apache Spark, one of the most widely used workload frameworks among data engineers, but it extends to long-running tasks of any type, including agentic workflows.
This addresses a key pain point for Airflow users without requiring them to write custom async code.
Amogh Rajesh Desai
Airflow PMC Member & Committer | Senior Software Engineer at Astronomer