Apache Airflow is often perceived as a platform best suited to large organisations with significant infrastructure budgets and dedicated platform teams. In this talk, I want to share how we built and scaled a robust Airflow platform under tight cost constraints whilst still maintaining reliability, governance and developer productivity.
Starting from a small Airflow setup, we have evolved our architecture to support multiple teams and increasingly complex workflows. This includes standardising environments and ensuring best practices are adopted around observability, resource management and version control.
I want to walk through the architectural decisions we made, the trade-offs we managed and the open-source solutions we considered. I will also outline the concrete steps we took to reduce operational overhead and get a handle on our cloud spend, and share practical examples of how to enforce consistency across DAGs, scale Airflow and build a data platform that grows with the business.
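As a flavour of what "enforcing consistency across DAGs" can look like in practice, here is a minimal sketch of a policy check that scans a DAG file's source for required conventions. The specific rules (every `DAG(...)` must set `tags` and `default_args`) are illustrative assumptions, not the exact checks used on our platform; the technique is a standard lint-style test that runs in CI without needing a live Airflow installation.

```python
# Hypothetical DAG consistency check: a lint-style test that parses DAG
# source code with the stdlib `ast` module and flags missing conventions.
# The rules below (tags and default_args required) are example policies.
import ast


def check_dag_source(source: str) -> list[str]:
    """Return a list of policy violations found in a DAG file's source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        # Look for DAG(...) constructor calls, whether imported directly
        # (DAG(...)) or referenced via a module (airflow.DAG(...)).
        if isinstance(node, ast.Call):
            name = getattr(node.func, "id", getattr(node.func, "attr", None))
            if name == "DAG":
                kwargs = {kw.arg for kw in node.keywords}
                if "tags" not in kwargs:
                    violations.append("DAG defined without tags")
                if "default_args" not in kwargs:
                    violations.append("DAG defined without default_args")
    return violations


example = '''
from airflow import DAG
dag = DAG("demo", schedule="@daily")
'''
print(check_dag_source(example))
# → ['DAG defined without tags', 'DAG defined without default_args']
```

A check like this can run as an ordinary unit test in each repository's CI pipeline, so convention violations are caught before a DAG ever reaches a shared environment.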
This session is aimed at engineers and platform teams who want to run Airflow efficiently and sustainably, even with limited resources and budget constraints.
Aniruddha Sengupta
Principal Data Engineer, Insignis