Orchestrating Pipelines with Apache Airflow
DAGs, operators, scheduling, and production best practices for Airflow.
Updated May 21, 2026
Core concepts
- DAG: Directed Acyclic Graph. A set of tasks and the dependencies between them, defined in a Python file.
- Operator: a template for one kind of task, such as PythonOperator, BashOperator, or SQLExecuteQueryOperator.
- Task instance: one execution of one task for one DAG run.
- Scheduler: evaluates DAGs, creates DAG runs according to the schedule, and queues task instances once their upstream dependencies are met.
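The "acyclic" requirement is what makes it possible to derive a valid run order from the dependencies alone. The sketch below illustrates the idea with Python's standard graphlib module. It is not how Airflow's scheduler is implemented, and the task names and the execution_order helper are made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each task maps to the set of upstream tasks it waits on (illustrative names).
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def execution_order(deps):
    """Return one valid run order, or raise ValueError if deps contain a cycle."""
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as err:
        # The second element of args holds the nodes that form the cycle.
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err


print(execution_order(dependencies))  # ['extract', 'transform', 'load']
```

If load were also listed as upstream of extract, no valid order would exist and the helper would raise. A cycle makes a set of tasks unschedulable for the same reason.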
A minimal DAG
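from datetime import datetime

from airflow.decorators import dag, task

# fetch_data and write_to_warehouse are placeholders for your own helpers.
@dag(schedule="0 6 * * *", start_date=datetime(2024, 1, 1))  # daily at 06:00
def my_pipeline():
    @task
    def extract():
        return fetch_data()

    @task
    def load(data):
        write_to_warehouse(data)

    # Passing extract()'s output to load() makes load depend on extract.
    load(extract())

my_pipeline()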
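The DAG calls fetch_data and write_to_warehouse without defining them. One possible shape for these helpers is sketched below, assuming a local SQLite file stands in for the warehouse. The table name, columns, sample rows, and db_path default are all hypothetical. TaskFlow passes the return value of extract to load through XCom, so the sketch returns a small, JSON-serializable list.

```python
import sqlite3


def fetch_data():
    """Stand-in for a real source query; returns small, JSON-serializable rows."""
    return [{"id": 1, "amount": 9.5}, {"id": 2, "amount": 12.0}]


def write_to_warehouse(data, db_path="warehouse.db"):
    """Upsert rows by primary key so a retried task does not duplicate them."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO orders (id, amount) VALUES (:id, :amount)",
                data,
            )
    finally:
        conn.close()
    return len(data)
```

Writing with an upsert rather than a plain insert is a design choice. Airflow may retry a failed task, and rerunning load should leave the table in the same state.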
Production tips
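- Prefer the TaskFlow API (decorators) for new Python-based DAGs. Dependencies and data passing follow from ordinary function calls, which removes most of the boilerplate that classic operators need.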
- Store secrets in Airflow Connections/Variables or a secrets backend (AWS Secrets Manager, Vault).
- Set max_active_runs=1 on pipelines that are not safe to run concurrently.
- Use on_failure_callback to send alerts to Slack or PagerDuty.
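The last two tips can be sketched together. Airflow calls a failure callback with the task's context dictionary, so the callback below is plain Python with no Airflow imports. The function names, the PIPELINE_KWARGS dictionary, and the SLACK_WEBHOOK_URL environment variable are assumptions made for this example. In a real deployment the webhook URL would more likely come from a Connection or a secrets backend, as recommended above.

```python
import json
import os
import urllib.request


def build_failure_message(context):
    """Format a one-line alert from the context dict Airflow passes to callbacks."""
    ti = context["task_instance"]
    return (
        f"Airflow task failed: {ti.dag_id}.{ti.task_id} "
        f"(run {context.get('run_id')}): {context.get('exception')}"
    )


def post_json(url, payload):
    """POST a JSON payload; Slack incoming webhooks accept {"text": ...}."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def notify_failure(context, post=post_json):
    """Failure callback: send an alert if a webhook URL is configured."""
    url = os.environ.get("SLACK_WEBHOOK_URL")  # hypothetical variable name
    if not url:
        return None  # nothing configured, so skip the alert instead of raising
    return post(url, {"text": build_failure_message(context)})


# Keyword arguments intended for @dag(...) or DAG(...).
PIPELINE_KWARGS = {
    "max_active_runs": 1,  # never run two DAG runs of this pipeline at once
    "default_args": {"on_failure_callback": notify_failure},  # applies to every task
}
```

With the minimal DAG above, these settings would be applied as @dag(schedule="0 6 * * *", start_date=datetime(2024, 1, 1), **PIPELINE_KWARGS). Putting the callback in default_args attaches it to each task, so the alert names the specific task that failed.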