Modern data platforms generate overwhelming amounts of operational data across distributed systems. For teams running Apache Airflow at scale, incidents often bring long mean time to resolution (MTTR), constant context switching between observability tools, and a growing on-call burden.

What if your Airflow environment had an always-on, autonomous on-call engineer?

In this workshop, we’ll explore how an AI-powered DevOps agent can supercharge Airflow operations — from automated DAG failure diagnosis and intelligent log analysis to proactive prevention of recurring incidents. Whether you’re running Airflow on a managed cloud service or self-hosted, the patterns and practices covered here apply broadly to modern data pipeline operations.

Key topics covered:

  • Autonomous incident detection & response — topology-aware analysis across DAG dependencies with actionable mitigation plans
  • Real-time root cause analysis — correlating logs, metrics, and traces across your Airflow application stack
  • Collaboration integrations — automatic messaging, ticketing, and incident management workflows
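The collaboration pattern in the last bullet can be sketched with Airflow's standard `on_failure_callback` hook. This is a minimal illustration, not the workshop's implementation: the payload shape, the severity rule, and the (commented-out) webhook delivery step are all assumptions for the example.

```python
def build_incident_message(dag_id: str, task_id: str,
                           try_number: int, log_url: str) -> dict:
    """Summarize a failed task instance as a chat/ticket payload.

    The field names and the severity heuristic here are illustrative
    assumptions, not a fixed schema.
    """
    return {
        "title": f"[Airflow] {dag_id}.{task_id} failed (attempt {try_number})",
        "log_url": log_url,
        # Escalate after repeated retries -- an example policy, not a standard.
        "severity": "high" if try_number >= 3 else "medium",
    }


def on_failure_callback(context):
    """Airflow-style failure callback.

    Airflow passes a context dict whose "task_instance" entry exposes
    dag_id, task_id, try_number, and log_url. In a real deployment the
    returned payload would be POSTed to a messaging or ticketing webhook
    (hypothetical `post_to_webhook` step below).
    """
    ti = context["task_instance"]
    payload = build_incident_message(
        dag_id=ti.dag_id,
        task_id=ti.task_id,
        try_number=ti.try_number,
        log_url=ti.log_url,
    )
    # post_to_webhook(payload)  # hypothetical delivery to Slack/Jira/etc.
    return payload
```

Wiring the callback into a DAG is then a one-liner via `default_args={"on_failure_callback": on_failure_callback}`, so every task failure produces a structured incident message instead of a silent log entry.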

Join us to transform your Airflow DevOps experience.

Suba Palanisamy

Enterprise Support Lead TAM

Vinod Jayendra

Enterprise Account Engineer, AWS

Abdul Majid Mohammed

Sr. Technical Account Manager, AWS