Ensuring high-quality data is essential for building user trust and enabling data teams to work efficiently. In this talk, we’ll explore how the Astronomer data team leverages Airflow to uphold data quality across complex pipelines, minimizing firefighting and maximizing confidence in reported metrics.
Maintaining data quality requires a multi-faceted approach: safeguarding the integrity of source data, orchestrating pipelines reliably, writing robust code, and maintaining consistency in outputs. We’ve embedded data quality into the developer experience, so it stays at the forefront instead of languishing in the tech-debt backlog.
We’ll share how we’ve operationalized:
- Implementing data contracts to define and enforce expectations
- Differentiating between critical (pipeline-blocking) and non-critical (soft) failures
- Exposing upstream data issues to domain owners
- Tracking metrics to measure our team’s overall data quality
Join us to learn practical strategies for building scalable, trustworthy data systems powered by Airflow.
Maggie Stark
Staff Data Engineer, Astronomer
Marion Azoulai
Senior Data Scientist, Astronomer