Airflow is a household name in data engineering: it is familiar to most data engineers, quick to set up, and, as proven by the millions of data pipelines it has powered since 2014, able to keep DAGs running.
But as the demands of ML grow, there is a pressing need for tools that meet data scientists where they are and address two issues: improving the developer experience and minimizing operational overhead.
In this talk, we discuss the problem space and our approach to solving it with Metaflow, the open-source framework we developed at Netflix, which now powers thousands of business-critical ML projects at Netflix and other companies. We wanted to give data scientists the best possible UX, allowing them to focus on the parts they like (e.g., modeling) while providing robust solutions for the foundational infrastructure: data, compute, orchestration (using Airflow), and versioning.
We will also demo our latest work, which builds on top of Airflow.