What Delta Lake adds to S3

Delta Lake wraps Parquet files with a transaction log (_delta_log/). Each commit is written to the log as a JSON file that lists the data files it added and removed. This gives you ACID transactions, time travel, and optimistic concurrency control on plain object storage.
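
To make the log mechanics concrete, here is a minimal Python sketch that uses a local directory as a stand-in for S3. The helper names write_commit and live_files are hypothetical, and the actions are reduced to a path field only; a real log also carries metadata, protocol, and commit-info actions plus periodic Parquet checkpoints. The exclusive file create models the rule that only one writer may claim a given version.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_commit(log_dir: Path, version: int, actions: list) -> None:
    # One file per commit, named by zero-padded version, one JSON action per line.
    path = log_dir / f"{version:020d}.json"
    # Mode "x" fails if the file exists: a second writer racing for the same
    # version gets FileExistsError and must re-read the log and retry.
    with open(path, "x") as f:
        for action in actions:
            f.write(json.dumps(action) + "\n")


def live_files(log_dir: Path, as_of: Optional[int] = None) -> set:
    """Replay commits in version order; return the data files in that snapshot."""
    files = set()
    for commit in sorted(log_dir.glob("*.json")):
        if as_of is not None and int(commit.stem) > as_of:
            break
        for line in commit.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


log_dir = Path(tempfile.mkdtemp()) / "_delta_log"
log_dir.mkdir()
write_commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}}])
write_commit(log_dir, 1, [{"add": {"path": "part-1.parquet"}}])
# Version 2 replaces part-0 with part-2 in a single atomic commit.
write_commit(log_dir, 2, [{"remove": {"path": "part-0.parquet"}},
                          {"add": {"path": "part-2.parquet"}}])

print(sorted(live_files(log_dir)))           # latest snapshot
print(sorted(live_files(log_dir, as_of=1)))  # time travel to version 1
```

Time travel falls out of the design: reading an older version just means stopping the replay earlier, provided the removed data files have not yet been physically deleted.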

MERGE (upsert)

MERGE INTO orders USING updates
ON orders.order_id = updates.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

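On plain Parquet/Hive tables, an upsert like this meant rewriting whole partitions (or the whole table) by hand, with no atomic commit. Delta rewrites only the data files that contain matched rows and commits the change as a single transaction.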
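
The following Python sketch illustrates the copy-on-write idea behind such an upsert. It is a toy model, not Delta's implementation: files are plain dicts of rows, the function name merge_upsert and the output file names are hypothetical, and optimizations such as deletion vectors are ignored.

```python
def merge_upsert(files, updates, key="order_id"):
    """files: {path: [row dicts]}; updates: [row dicts].

    Returns (new_files, removed_paths, added_paths).
    """
    by_key = {row[key]: row for row in updates}
    matched = set()
    new_files, removed, added = {}, [], []
    for path, rows in files.items():
        hits = {r[key] for r in rows if r[key] in by_key}
        if not hits:
            new_files[path] = rows  # untouched file is carried over as-is
            continue
        matched |= hits
        rewritten = path.replace(".parquet", "-v2.parquet")  # hypothetical naming
        # WHEN MATCHED THEN UPDATE SET *: take the whole source row.
        new_files[rewritten] = [by_key.get(r[key], r) for r in rows]
        removed.append(path)
        added.append(rewritten)
    # WHEN NOT MATCHED THEN INSERT *: remaining source rows go to a new file.
    inserts = [row for k, row in by_key.items() if k not in matched]
    if inserts:
        new_files["part-new.parquet"] = inserts
        added.append("part-new.parquet")
    return new_files, removed, added


files = {
    "part-0.parquet": [{"order_id": 1, "amount": 10}, {"order_id": 2, "amount": 20}],
    "part-1.parquet": [{"order_id": 3, "amount": 30}],
}
updates = [{"order_id": 2, "amount": 25}, {"order_id": 4, "amount": 40}]
new_files, removed, added = merge_upsert(files, updates)
print(removed)  # only the file holding order_id 2 is rewritten
print(added)
```

In this model the removed and added lists are what a single log commit would record, which is why readers see either the old files or the new ones, never a mix.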

Schema enforcement and evolution

By default, Delta rejects writes whose columns or types do not match the table schema. To add columns: ALTER TABLE orders ADD COLUMNS (discount FLOAT). Existing files are not rewritten; rows read from them simply return NULL for the new column.
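
As a rough illustration of enforcement versus evolution, the Python sketch below models a schema as a dict of column name to type name. The functions validate_write and read_row and the merge_schema flag are hypothetical stand-ins, and the rules are simplified: any type difference counts as a mismatch here, whereas a real engine has more nuanced casting and nullability rules.

```python
def validate_write(table_schema, batch_schema, merge_schema=False):
    """Return the table schema after the write, or raise ValueError if rejected."""
    result = dict(table_schema)
    for column, dtype in batch_schema.items():
        if column not in table_schema:
            if not merge_schema:
                raise ValueError(f"unknown column: {column}")
            result[column] = dtype  # evolution: append the new column
        elif table_schema[column] != dtype:
            raise ValueError(
                f"type mismatch for {column}: {table_schema[column]} vs {dtype}"
            )
    return result


def read_row(stored_row, table_schema):
    # Rows from files written before a column existed yield None (NULL) for it.
    return {column: stored_row.get(column) for column in table_schema}


orders = {"order_id": "BIGINT", "amount": "DOUBLE"}
batch = {"order_id": "BIGINT", "amount": "DOUBLE", "discount": "FLOAT"}

try:
    validate_write(orders, batch)  # enforcement: extra column is rejected
except ValueError as err:
    print("rejected:", err)

evolved = validate_write(orders, batch, merge_schema=True)
print(read_row({"order_id": 1, "amount": 10.0}, evolved))
```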

OPTIMIZE and Z-ORDER

OPTIMIZE orders ZORDER BY (customer_id, order_date);

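OPTIMIZE compacts small files into larger ones, and ZORDER BY co-locates rows with similar values in the same files. Queries that filter on the ZORDER columns can then skip more files using per-file min/max statistics, which reduces the amount of data scanned. The benefit weakens as more columns are added to the ZORDER BY list.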
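
The effect on file skipping can be sketched in Python as follows. The FileStats class, the file names, and the min/max values are all made up for illustration, and the example clusters on a single column only, so it shows why co-location helps rather than the Z-order curve itself.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    min_customer_id: int
    max_customer_id: int


def files_to_scan(stats, customer_id):
    # A file can be skipped when the value lies outside its [min, max] range.
    return [
        f.path
        for f in stats
        if f.min_customer_id <= customer_id <= f.max_customer_id
    ]


# Before clustering: every file spans almost the whole key range.
before = [
    FileStats("a.parquet", 1, 990),
    FileStats("b.parquet", 5, 1000),
    FileStats("c.parquet", 2, 970),
]
# After clustering: each file covers a narrow, non-overlapping range.
after = [
    FileStats("x.parquet", 1, 330),
    FileStats("y.parquet", 331, 660),
    FileStats("z.parquet", 661, 1000),
]

print(files_to_scan(before, 500))  # all three files must be read
print(files_to_scan(after, 500))   # only one file must be read
```

The same number of rows is stored in both layouts; only the value ranges per file differ, and that is what determines how much a filter can prune.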