Why Flink over Spark Streaming
Flink is a true streaming engine — it processes one event at a time with low latency (milliseconds). Spark Streaming uses micro-batches (seconds). For strict latency requirements, Flink wins.
Event time vs processing time
- Event time — when the event actually occurred (embedded in the payload). Correct for business logic but requires handling late-arriving events.
- Processing time — when the event arrives at Flink. Simpler but produces wrong results for out-of-order data.
Use event time with watermarks for production pipelines. Watermarks tell Flink how late events can be before a window closes.
Windows
- Tumbling — fixed, non-overlapping (1-minute aggregations).
- Sliding — overlapping windows (5-min window every 1 min).
- Session — gap-based, closes after inactivity period.
State backends
Use RocksDB state backend for large state. It spills to disk and supports incremental checkpoints to S3. Default heap-based backend is fine for small state but OOMs at scale.