Apache Spark’s new Declarative Pipelines (SDP) let engineers define WHAT their data should look like, not HOW to build it. Apache Airflow 3 brings a declarative orchestration model. Together, they eliminate an entire category of boilerplate: the DAG that exists only to babysit a pipeline. This talk walks through building a production Spark SDP pipeline orchestrated by Airflow 3, showing how dependency graphs replace imperative task chains, how testing and recovery patterns change when your pipeline is declarative end-to-end, and what this means for the roughly 80% of data engineering time currently spent on operational plumbing.
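A minimal, framework-free sketch of the core idea the talk covers: you declare WHAT each dataset depends on, and an engine derives HOW (the execution order) by resolving the dependency graph. The `table` decorator and `_registry` names here are illustrative stand-ins, not the actual SDP or Airflow APIs.

```python
# Sketch: declarative pipeline definitions resolved into a run order.
# Datasets declare their inputs; a topological sort replaces the
# hand-written, imperative task chain.
from graphlib import TopologicalSorter

# Hypothetical registry mapping each dataset to its declared inputs.
_registry: dict[str, list[str]] = {}

def table(name, depends_on=()):
    """Register a dataset and the datasets it reads from."""
    def wrap(fn):
        _registry[name] = list(depends_on)
        return fn
    return wrap

@table("raw_events")
def raw_events(): ...

@table("cleaned_events", depends_on=["raw_events"])
def cleaned_events(): ...

@table("daily_summary", depends_on=["cleaned_events"])
def daily_summary(): ...

# The "orchestrator" computes a valid execution order from the
# declarations alone -- no imperative chaining required.
order = list(TopologicalSorter(_registry).static_order())
print(order)  # ['raw_events', 'cleaned_events', 'daily_summary']
```

In real SDP and Airflow 3, the engine additionally handles incremental recomputation, retries, and recovery from the same declared graph, which is what makes the testing and recovery patterns mentioned above possible.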
Lisa Cao
Staff Developer Relations
Andreas Neumann
Sr. Staff Software Engineer