The six dimensions
- Completeness — are required fields populated?
- Accuracy — do values reflect reality?
- Consistency — do related records agree?
- Timeliness — is data arriving within SLA?
- Uniqueness — are there unexpected duplicates?
- Validity — do values conform to expected formats and ranges?
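Several of these dimensions translate directly into mechanical checks. As a sketch, assuming a batch of records as a list of dicts with hypothetical field names (`id`, `email`, `amount`):

```python
import re

# Hypothetical sample records; field names are illustrative only.
rows = [
    {"id": 1, "email": "a@example.com", "amount": 10.0},
    {"id": 2, "email": None,            "amount": 25.5},
    {"id": 2, "email": "c@example.com", "amount": -3.0},
]

# Completeness: what fraction of records have the required field populated?
completeness = sum(r["email"] is not None for r in rows) / len(rows)

# Uniqueness: how many unexpected duplicate ids are there?
ids = [r["id"] for r in rows]
duplicates = len(ids) - len(set(ids))

# Validity: do values conform to expected ranges and formats?
valid_amounts = sum(r["amount"] >= 0 for r in rows)
valid_emails = sum(
    r["email"] is not None
    and re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", r["email"]) is not None
    for r in rows
)
```

Accuracy and consistency are harder to automate this way: they usually require a reference dataset or a cross-table comparison rather than a per-row rule.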
Validation layers
- Source validation — check incoming data before it enters the pipeline.
- Pipeline validation — run Great Expectations suites or custom assertions mid-pipeline.
- Warehouse validation — dbt tests on modelled data.
- Dashboard validation — metric monitors that alert when KPIs move unexpectedly.
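A minimal custom mid-pipeline assertion might look like the following; the function name, required fields, and error messages are illustrative, not from any specific framework:

```python
def validate_batch(records, required_fields=("id", "amount")):
    """Raise ValueError if an incoming batch fails basic checks."""
    if not records:
        raise ValueError("empty batch: upstream extract may have failed")
    for i, rec in enumerate(records):
        missing = [f for f in required_fields if rec.get(f) is None]
        if missing:
            raise ValueError(f"record {i} missing required fields: {missing}")
    # Return the batch unchanged so the check can sit mid-pipeline.
    return records
```

Failing loudly here stops bad data before it lands in the warehouse, which is the point of the layered approach: each layer catches what the previous one missed.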
Great Expectations
Define expectations as code, e.g. expect_column_values_to_not_be_null and expect_column_values_to_be_between. Run them as checkpoint steps in your pipeline; results are stored and browsable in a Data Docs site.
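To show what such an expectation does conceptually, here is a simplified stand-in for expect_column_values_to_be_between written with the standard library only. This is not the Great Expectations API; the real library adds batching, checkpoints, and Data Docs rendering around checks like this:

```python
def expect_column_values_to_be_between(rows, column, min_value, max_value):
    """Simplified imitation of the Great Expectations check of the same name."""
    bad = [r[column] for r in rows if not (min_value <= r[column] <= max_value)]
    return {
        "success": not bad,
        "unexpected_count": len(bad),
        "unexpected_values": bad,
    }

# Hypothetical orders table: amounts should be between 0 and 10,000.
orders = [{"amount": 12.0}, {"amount": 250.0}, {"amount": -5.0}]
result = expect_column_values_to_be_between(orders, "amount", 0, 10_000)
```

Returning a result dict rather than raising lets the caller decide whether a failed expectation should halt the pipeline or only warn, which mirrors how expectation results feed into checkpoints.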
Alerting
Set up anomaly detection on row counts and null rates per table. A sudden 40% drop in row count is usually a pipeline failure, not a business event.
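A basic version of that row-count monitor can be sketched as a comparison against a trailing average; the function name and the 40% threshold are illustrative, and production monitors typically account for seasonality as well:

```python
from statistics import mean

def row_count_alert(history, today, drop_threshold=0.4):
    """Alert when today's row count falls more than drop_threshold
    below the trailing average of recent loads."""
    baseline = mean(history)
    drop = (baseline - today) / baseline
    return drop > drop_threshold

# A table that normally loads ~100k rows suddenly loads 55k: alert fires.
alert = row_count_alert([98_000, 101_000, 99_500, 100_200], 55_000)
```

The same shape works for null rates: track a trailing baseline per table and column, and alert on sudden deviation rather than on absolute values.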