Why Flink over Spark Streaming

Flink is a true streaming engine — it processes one event at a time with low latency (milliseconds). Spark Streaming uses micro-batches (seconds). For strict latency requirements, Flink wins.

Event time vs processing time

  • Event time — when the event actually occurred (embedded in the payload). Correct for business logic but requires handling late-arriving events.
  • Processing time — when the event arrives at Flink. Simpler but produces wrong results for out-of-order data.

Use event time with watermarks for production pipelines. Watermarks tell Flink how late events can be before a window closes.

Windows

  • Tumbling — fixed, non-overlapping (1-minute aggregations).
  • Sliding — overlapping windows (5-min window every 1 min).
  • Session — gap-based, closes after inactivity period.

State backends

Use RocksDB state backend for large state. It spills to disk and supports incremental checkpoints to S3. Default heap-based backend is fine for small state but OOMs at scale.