What Iceberg solves
Traditional Hive tables on S3 offer no ACID transactions, only limited schema evolution, and poor performance on large datasets, because query planning must list entire partitions on object storage. Iceberg adds all of this while keeping data in open Parquet/ORC files.
Snapshots and time travel
Every write creates a new snapshot, and any historical snapshot can be queried. In Spark SQL:
SELECT * FROM orders
TIMESTAMP AS OF '2024-01-15 00:00:00';
The exact clause is engine-specific: Trino uses FOR TIMESTAMP AS OF, Flink uses FOR SYSTEM_TIME AS OF.
Snapshots also make concurrent access safe: readers see a consistent snapshot and are never blocked by writers, while writers commit via optimistic concurrency.
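The snapshot history itself is queryable. A sketch in Spark SQL, assuming an Iceberg catalog is configured; the snapshot ID below is a placeholder, and the columns come from Iceberg's snapshots metadata table:

```sql
-- List every snapshot of the table, newest last.
SELECT committed_at, snapshot_id, operation
FROM orders.snapshots;

-- Read the table exactly as it was at a specific snapshot ID
-- (substitute an ID returned by the query above).
SELECT * FROM orders VERSION AS OF 123456789;
```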
Schema evolution
Add, rename, or drop columns without rewriting data files. Iceberg tracks column IDs, not names, so renaming a column does not break readers of old files.
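In Spark SQL these are ordinary ALTER TABLE statements; a minimal sketch, with illustrative column names:

```sql
-- All three are metadata-only changes: no data files are rewritten.
ALTER TABLE orders ADD COLUMN discount_pct double;
ALTER TABLE orders RENAME COLUMN cust_id TO customer_id;
ALTER TABLE orders DROP COLUMN legacy_flag;
```

Because old data files still carry the original column IDs, queries through the renamed column resolve correctly against both old and new files.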
Partition evolution
Change the partitioning strategy without rewriting data. Old data retains its old partitioning; new data uses the new partitioning. Queries read both transparently.
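With Iceberg's Spark SQL extensions enabled, a partition spec change might look like this (assuming a hypothetical order_ts timestamp column; this only updates table metadata, and existing files keep their old layout):

```sql
-- Switch new writes from day-level to hour-level partitioning.
ALTER TABLE orders DROP PARTITION FIELD days(order_ts);
ALTER TABLE orders ADD PARTITION FIELD hours(order_ts);
```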
Compaction
Streaming writes produce many small files. Run compaction periodically to merge them into larger files for faster reads: CALL system.rewrite_data_files('orders').
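The rewrite_data_files procedure also takes tuning options. A hedged example using Iceberg's Spark procedure syntax; the target file size here is illustrative, not a recommendation:

```sql
CALL system.rewrite_data_files(
  table => 'orders',
  options => map('target-file-size-bytes', '536870912')  -- aim for ~512 MB files
);
```

Compaction runs against a snapshot and commits a new one, so it can proceed while readers and streaming writers keep working.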