A Game of Constant Learning & Adjustment: Orchestrating ML Pipelines at the Philadelphia Phillies

Presented at Airflow Summit 2024

By Mike Hirsch and Sophie Keith

When developing Machine Learning (ML) models, the biggest challenges are often infrastructural. How do we deploy our model and expose an inference API? How can we retrain? Can we continuously evaluate performance and monitor model drift?

In this talk, we will present how we are tackling these problems at the Philadelphia Phillies by developing a suite of tools that enable our software engineering and analytics teams to train, test, evaluate, and deploy ML models, all of which can be orchestrated entirely in Airflow. This framework abstracts away the infrastructural complexities of productionizing ML pipelines and allows our analysts to focus on developing robust baseball research for baseball operations stakeholders across player evaluation, acquisition, and development.

We’ll also look at how we use Airflow, MLflow, MLServer, cloud services, and GitHub Actions to architect a platform that supports our framework at every stage of the ML lifecycle.

Mike Hirsch

Senior ML Engineer, Philadelphia Phillies

Sophie Keith

Software Engineer II, Philadelphia Phillies