Lessons Learned From Running Apache Airflow at Scale — Data Science & Engineering (2022)
https://shopify.engineering/lessons-learned-apache-airflow-scale
Saved on 2022-05-24 [19136 edays] via shopify.engineering
Modified 2023-09-04 [19604 edays]
data programming

Apache Airflow is an orchestration platform that enables development, scheduling, and monitoring of workflows. At Shopify, we’ve been running Airflow in production for over two years for a variety of workflows, including data extractions, machine learning model training, Apache Iceberg table maintenance, and dbt-powered data modeling. At the time of writing, we are running Airflow 2.2 on Kubernetes, using the Celery executor and MySQL 8.