As modern data ecosystems grow in complexity, ensuring transparency, discoverability, and governance in data workflows becomes critical. Apache Airflow, a powerful workflow orchestration tool, enables data engineers to build scalable pipelines, but without proper visibility into data lineage, ownership, and quality, teams risk operating in a black box.

In this talk, we will explore how integrating Airflow with a data catalog can bring clarity and transparency to data workflows. We will discuss how metadata-driven orchestration enhances data governance, enables lineage tracking, and improves collaboration across teams. Through real-world use cases, we will demonstrate how Airflow can automate metadata collection, update data catalogs dynamically, and enforce data quality checks at every stage of the pipeline.

Attendees will walk away with practical strategies for implementing a transparent data workflow that fosters trust, efficiency, and compliance in their data infrastructure.

John Robert

Data and ML Platform Engineer