
Use cases for Apache Airflow

I gave a talk at Cloudera’s Chennai office on the use cases for Apache Airflow. Drawing heavily on my team’s experiences building data platforms, I presented a sample use case and walked through a high-level system design.

We went over the guiding design principles behind the platform’s storage layer (covered in detail by my colleague here). We also looked at how the core abstraction used in all of the pipelines evolved: from its humble beginnings as a single operator running for hours to a fancy behemoth that combines sensors and operators, allowing us to run a huge number of pipelines without having to worry about Airflow’s performance.

We also talked about issues like observability, unit testing, health checks and data lineage. While the recording of the session was unfortunately never published, I’d recommend going through the slides, as they are self-explanatory. In case it wasn’t clear, I’m quite proud of how this talk turned out 😄

Slides

The copyright for these slides belongs to Sahaj Software. If you reuse or benefit from this content, please provide attribution. Click here for the slides.

This post is licensed under CC BY 4.0 by the author.