Presented at Airflow Summit 2022

Resilient systems can recover when stressed by load, bugs in the workflow, or the failure of any task. A reliable infrastructure or platform is not, on its own, sufficient to run workflows reliably. It is critical to apply resiliency practices during the design and build phases of a workflow to improve its reliability, performance, and operational aspects.

In this session, we will go through:

  1. Architecture of Airflow through the lens of reliability
  2. Idempotency
  3. Designing for failures
  4. Applying back pressure
  5. Best practices

What we do not cover: Infrastructure/Platform/Product reliability
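
As a rough illustration of the idempotency item in the list above, here is a minimal sketch in plain Python. It does not come from the session and uses no Airflow API. The function `write_partition`, the `date=...` directory layout, and the sample date are hypothetical names chosen for this example. The idea it shows is that a task which overwrites an output keyed by its run date, instead of appending to it, can be retried without duplicating data.

```python
import json
import tempfile
from pathlib import Path


def write_partition(base_dir: Path, logical_date: str, rows: list[dict]) -> Path:
    """Write rows to a partition keyed by logical_date, replacing any prior output.

    Hypothetical helper: the partition layout is an assumption for illustration.
    """
    partition = base_dir / f"date={logical_date}"
    partition.mkdir(parents=True, exist_ok=True)
    target = partition / "data.json"
    tmp = partition / "data.json.tmp"
    # Write to a temporary file first, then rename over the target, so a
    # crash mid-write never leaves a half-written data.json behind.
    tmp.write_text(json.dumps(rows, sort_keys=True))
    tmp.replace(target)
    return target


def run_demo() -> tuple[str, str, int]:
    """Run the same write twice, as a retry would, and report the outcome."""
    with tempfile.TemporaryDirectory() as d:
        base = Path(d)
        rows = [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]
        first = write_partition(base, "2022-05-23", rows).read_text()
        # Simulated retry of the same run: same date, same input.
        second = write_partition(base, "2022-05-23", rows).read_text()
        file_count = len(list(base.rglob("*.json")))
        return first, second, file_count


if __name__ == "__main__":
    first, second, file_count = run_demo()
    print(first == second, file_count)
```

Because the output location is derived only from the run date and the write replaces the whole file, a rerun leaves the same single file with the same content. An append-based write would instead double the rows on every retry.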