Why governance fails

Governance initiatives fail when they become pure bureaucracy. Effective governance is infrastructure that makes doing the right thing the easiest thing.

Data ownership

Every dataset needs an owner — a team, not an individual. The owner is responsible for quality SLAs, documentation, and access requests. Without ownership, everything drifts.

Data catalog

A searchable inventory of datasets with descriptions, owners, lineage, and sample data. Open-source options: DataHub, Apache Atlas. Managed: Alation, Collibra. Start with DataHub — it has pre-built connectors for most warehouses and has strong lineage support.

Column-level lineage

Track which source columns contribute to which output columns. Critical for GDPR compliance (where did this PII come from?) and debugging (why did this metric change?).

Access control

Role-based access at the warehouse level (Snowflake roles, BigQuery IAM). Data masking for PII — analysts see ****@****.com instead of real emails unless they have PII role. Audit logs on all access.