From Zero to Airflow: bootstrapping an ML platform

Speaker(s): Noam Elfanbaum
When: (Jul-16 05:00 UTC)
-----

At Bluevine, we use Airflow to drive our ML platform. In this talk, I’ll present the challenges we faced and the gains we made in transitioning from a single server running Python scripts with cron to a full-blown Airflow setup. This includes supporting multiple Python versions, event-driven DAGs, performance issues, and more!

Some of the points that I’ll cover are:

  • Supporting multiple Python versions
  • Event-driven DAGs
  • Airflow performance issues and how we circumvented them
  • Building Airflow plugins to enhance observability
  • Monitoring Airflow using Grafana
  • CI for Airflow DAGs (super useful!)
  • Patching the Airflow scheduler