Apache Spark’s new Declarative Pipelines (SDP) let engineers define WHAT their data should look like, not HOW to build it. Apache Airflow 3 brings a declarative orchestration model of its own. Together, they eliminate an entire category of boilerplate: the DAG that exists only to babysit a pipeline. This talk walks through building a production Spark SDP pipeline orchestrated by Airflow 3, showing how dependency graphs replace imperative task chains, how testing and recovery patterns change when your pipeline is declarative end-to-end, and what this means for the 80% of data engineering time currently spent on operational plumbing.

Lisa Cao

Staff Developer Relations

Andreas Neumann

Sr. Staff Software Engineer