Idempotency is non-negotiable
Every task must be safe to re-run. If a task fails midway and is retried, it should not create duplicate rows or corrupt state. For database writes, use UPSERT (INSERT ... ON CONFLICT) over plain INSERT; for file outputs, write to a staging path, then atomically swap it into place.
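A minimal sketch of both patterns, using only the Python standard library so it can run anywhere. SQLite stands in for the warehouse, and the daily_totals table, load_daily_totals, and write_atomically are hypothetical names chosen for illustration. The ON CONFLICT clause assumes SQLite 3.24 or later.

```python
import os
import sqlite3
import tempfile


def load_daily_totals(conn, rows):
    """Idempotent load: re-running with the same rows leaves the table unchanged."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS daily_totals (
            day   TEXT PRIMARY KEY,
            total REAL NOT NULL
        )
        """
    )
    # The primary key plus ON CONFLICT turns a retry into an update, not a duplicate.
    conn.executemany(
        """
        INSERT INTO daily_totals (day, total) VALUES (?, ?)
        ON CONFLICT (day) DO UPDATE SET total = excluded.total
        """,
        rows,
    )
    conn.commit()


def write_atomically(path, data):
    """Write to a staging file in the target directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, staging_path = tempfile.mkstemp(dir=directory, suffix=".staging")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        # os.replace is atomic when source and destination share a filesystem.
        os.replace(staging_path, path)
    except BaseException:
        # Clean up the partial staging file; the previous output stays untouched.
        if os.path.exists(staging_path):
            os.unlink(staging_path)
        raise


conn = sqlite3.connect(":memory:")
rows = [("2024-01-01", 10.0), ("2024-01-02", 12.5)]
load_daily_totals(conn, rows)
load_daily_totals(conn, rows)  # simulated retry of the same task
row_count = conn.execute("SELECT COUNT(*) FROM daily_totals").fetchone()[0]
```

Readers of the output file see either the old version or the new one, never a half-written file, and the table holds one row per day no matter how many times the load runs.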
Avoid dynamic DAGs where possible
DAGs generated at parse time from database queries slow the scheduler, because the query runs again on every parse. Prefer static DAGs with dynamic task mapping (Airflow 2.3+) using expand() instead.
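The static alternative might look like the sketch below. It assumes Airflow 2.4 or later for the schedule argument (on 2.3 the equivalent argument is schedule_interval). The DAG name, task names, and table list are hypothetical. The list is produced by a task at run time, so nothing is queried while the file is parsed, and expand() creates one mapped task instance per element.

```python
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule=None, start_date=datetime(2024, 1, 1), catchup=False)
def process_tables():
    @task
    def list_tables() -> list[str]:
        # In a real pipeline this is where the database query would go.
        # It executes when the task runs, not when the scheduler parses the file.
        return ["orders", "customers", "payments"]

    @task
    def process(table: str) -> str:
        return f"processed {table}"

    # One mapped task instance is created per table name returned upstream.
    process.expand(table=list_tables())


process_tables()
```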
Backfilling
Design DAGs with an explicit start_date and use logical_date (formerly execution_date) in your queries instead of the current time. This enables clean backfills: airflow dags backfill -s 2024-01-01 -e 2024-03-01 my_dag.
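To illustrate why this matters, the sketch below derives a query window purely from the logical date, so a backfilled run for January reads January's data even when it executes in March. The events table, its event_time column, and build_daily_query are hypothetical, and the window assumes a daily schedule. Inside an Airflow task the argument would typically come from the task context (for example context["logical_date"]).

```python
from datetime import datetime, timedelta, timezone


def build_daily_query(logical_date: datetime) -> tuple[str, tuple[str, str]]:
    """Return a parameterized query covering the day that logical_date falls on."""
    start = logical_date.replace(hour=0, minute=0, second=0, microsecond=0)
    end = start + timedelta(days=1)
    # Half-open interval [start, end) so adjacent runs never overlap.
    sql = "SELECT * FROM events WHERE event_time >= ? AND event_time < ?"
    return sql, (start.isoformat(), end.isoformat())


# A run whose logical date is 2024-01-15 always selects that day's rows,
# regardless of the wall-clock time at which it actually executes.
sql, params = build_daily_query(datetime(2024, 1, 15, tzinfo=timezone.utc))
```

Because the function never calls datetime.now(), re-running it for the same logical date yields the same window, which also keeps the task idempotent.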
SLA monitoring
Set sla on tasks that have latency commitments. When a task misses its SLA, Airflow calls the sla_miss_callback defined on the DAG; wire this to Slack or PagerDuty.
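A rough sketch of such a callback, assuming Airflow 2.x, where the callback receives five positional arguments and task_list arrives as a preformatted string. send_alert and SENT_ALERTS are hypothetical stand-ins for a real Slack or PagerDuty client, and the one-hour SLA is only an example value. The Airflow wiring is left as a comment so the callback itself can be exercised without a running scheduler.

```python
from datetime import timedelta

SENT_ALERTS = []  # hypothetical in-memory sink standing in for Slack or PagerDuty


def send_alert(message):
    """Hypothetical notifier; replace with a real Slack or PagerDuty call."""
    SENT_ALERTS.append(message)


def notify_sla_miss(dag, task_list, blocking_task_list, slas, blocking_tis):
    """Callback with the five-argument signature Airflow 2.x passes on an SLA miss."""
    send_alert(f"SLA missed in DAG {dag.dag_id}:\n{task_list}")


# Applied to every task in the DAG through default_args; a single task can
# override it by passing its own sla argument.
default_args = {"sla": timedelta(hours=1)}

# In the DAG file, both pieces would be passed to the DAG constructor, e.g.
#   DAG("my_dag", default_args=default_args, sla_miss_callback=notify_sla_miss)
```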
Common pitfalls
- Top-level code in DAG files runs at every parse cycle (every 30 seconds by default), so never make DB calls at import time.
- Do not use depends_on_past=True without understanding its backfill implications.