Search results
21 results
Data Warehouse Modelling — Star Schema and Dimensional Design
Facts, dimensions, slowly changing dimensions, and why modelling choices matter for query performance.
Apache Iceberg — The Open Table Format Explained
Snapshots, schema evolution, partition evolution, time travel, and compaction.
PostgreSQL Performance Tuning Fundamentals
Indexing strategy, EXPLAIN ANALYZE, vacuum, and configuration settings that matter most.
Privacy-First Data Design — PII Handling Patterns
Tokenisation, pseudonymisation, encryption at rest, and right-to-deletion workflows.
Database Schema Migration Strategies
Expand-contract pattern, zero-downtime migrations, and tooling.
Data Governance — Principles and Practical Implementation
Ownership, cataloguing, lineage tracking, and access control at scale.
Multi-Tenancy Patterns — Database-per-Tenant, Schema-per-Tenant, and Row-Level
Tradeoffs for SaaS data isolation, compliance, and operational complexity.
SQL Query Optimisation — Indexes, Execution Plans, and N+1
Practical techniques for making slow queries fast.
Getting Started with dbt (data build tool)
Models, tests, documentation, and the dbt workflow for transforming warehouse data.
Implementing Data Lineage Tracking
Column-level lineage, tools, and why it is critical for debugging and compliance.
Fix 100% CPU Usage in Windows
Identify what is consuming CPU and permanently resolve the issue.
Snowflake Best Practices for Cost and Performance
Virtual warehouses, clustering, query optimization, and controlling spend.
Data Observability — Detecting Silent Pipeline Failures
Freshness, volume, distribution, schema, and lineage monitoring for data reliability.
Delta Lake — ACID Transactions for Your Data Lake
Transaction log, upserts, schema enforcement, and time travel on S3.
Amazon Redshift — Architecture and Query Optimization
Distribution styles, sort keys, VACUUM, ANALYZE, and WLM tuning.
Parquet vs CSV — Why Columnar Storage Matters
How Parquet's columnar format reduces storage costs and speeds up analytical queries.
API Pagination — Cursor, Offset, and Keyset Patterns
When each method works, performance tradeoffs, and implementation details.
Data & Platform — Service Overview
Pipelines, vector stores, governance, and privacy-first data design.
Data Platform Cost Optimization Strategies
Reducing Snowflake, S3, Spark, and Kafka spend without sacrificing performance.
Building a Data Catalog with DataHub
Ingestion, metadata, search, and making your catalog actually useful.
BigQuery Cost and Performance Optimization
Partitioned tables, clustered tables, slot usage, and avoiding full scans.