This talk is speculative: orchestration tools like Airflow have made it very easy to pull and push data from anywhere to everywhere. But we don’t know what data we are pushing around.

What if we had a schema language that we could use to describe this data? Not in terms of data type, but in terms of sensitivity and instructions on how to handle it?
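
To make the idea concrete, here is a minimal sketch of what such a description might look like in Python. Everything in it is an assumption for illustration: the `Sensitivity` and `Handling` levels, the `FieldPolicy` class, the `USERS_SCHEMA` example, and the `apply_policies` helper are all hypothetical names, not part of Airflow or of any existing schema language.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    # Hypothetical sensitivity levels; a real vocabulary would need agreement.
    PUBLIC = "public"
    INTERNAL = "internal"
    PII = "pii"


class Handling(Enum):
    # Hypothetical handling instructions attached to a field.
    NONE = "none"
    MASK = "mask"
    DROP = "drop"


@dataclass(frozen=True)
class FieldPolicy:
    name: str
    sensitivity: Sensitivity
    handling: Handling


# Hypothetical schema for a "users" dataset: no data types, only
# sensitivity and handling instructions.
USERS_SCHEMA = [
    FieldPolicy("user_id", Sensitivity.INTERNAL, Handling.NONE),
    FieldPolicy("email", Sensitivity.PII, Handling.MASK),
    FieldPolicy("ssn", Sensitivity.PII, Handling.DROP),
]


def apply_policies(record: dict, schema: list) -> dict:
    """Return a copy of record with each field handled per its policy."""
    policies = {p.name: p for p in schema}
    out = {}
    for key, value in record.items():
        policy = policies.get(key)
        # Fields with no declared policy are dropped, as are DROP fields.
        if policy is None or policy.handling is Handling.DROP:
            continue
        if policy.handling is Handling.MASK:
            out[key] = "***"
        else:
            out[key] = value
    return out
```

In this sketch a pipeline task could call `apply_policies(row, USERS_SCHEMA)` before pushing a row downstream. Dropping undeclared fields is one possible design choice: data that nobody has described does not get moved.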

This talk is about the headaches companies face day to day, and about the opportunity the Airflow community may have to help solve this problem.