Apache Airflow is an open-ended orchestration tool, which leaves room for a wide variety of customization.
While this is a good thing, it also means there are no guardrails on how the system can or cannot be used, and teams can waste a lot of time on scaling, testing, and debugging when things aren’t set up properly.
In this talk, we will go through a series of factors that data teams need to watch for when setting up an Airflow system:
Infrastructure
Teams and responsibilities
Development best practices
Pipeline best practices
Monitoring and observability
Fail-proof and recoverable systems