Democratised data workflows at scale
The Financial Times is increasing its digital revenue by enabling business people to make data-driven decisions. Providing an Airflow-based platform where data engineers, data scientists, BI experts and others can run language-agnostic jobs was a major shift. One of the most successful steps in the platform's development was building our own execution environment, which runs on Kubernetes for scalability and lets stakeholders deploy their own jobs without cross-team dependencies.
In this talk, we will share how we have integrated and extended Airflow at the Financial Times. The main topics we will cover include:
- Providing team-level security isolation
- Removing cross-team dependencies
- Creating an execution environment for independently building and deploying R, Python, Java, Spark and other jobs
- Reducing latency when sharing data between task instances
- Integrating all these features on top of Kubernetes