Search results
13 results
Privacy-First Data Design — PII Handling Patterns
Tokenisation, pseudonymisation, encryption at rest, and right-to-deletion workflows.
Designing a Data Lake on AWS S3
Folder structure, naming conventions, lifecycle policies, and access patterns.
Getting Started with dbt (data build tool)
Models, tests, documentation, and the dbt workflow for transforming warehouse data.
ETL vs ELT — Which Pattern Should You Use?
Understand the difference between Extract-Transform-Load and Extract-Load-Transform and when each fits.
Trino (formerly PrestoSQL) — Federated SQL Across Data Sources
Architecture, connectors, query federation, and performance tuning.
Real-Time Analytics Architecture Patterns
Lambda, Kappa, HTAP, and choosing the right pattern for sub-second analytics.
Time-Series Databases — InfluxDB vs TimescaleDB vs ClickHouse
Comparing purpose-built and general-purpose solutions for time-series data.
Snowflake Best Practices for Cost and Performance
Virtual warehouses, clustering, query optimization, and controlling spend.
DuckDB — Blazing Fast Local Analytics
When to reach for DuckDB instead of Spark, and how to use it effectively.
Parquet vs CSV — Why Columnar Storage Matters
How Parquet's columnar format reduces storage costs and speeds up analytical queries.
Building a Data Catalog with DataHub
Ingestion, metadata, search, and making your catalog actually useful.
PostgreSQL Replication — Streaming, Logical, and Read Replicas
Set up read replicas, understand WAL, and choose between streaming and logical replication.
API Documentation Best Practices
What makes documentation useful, which tooling helps, and how to keep docs accurate.