Search results
22 results
CI/CD Pipeline Design — From Commit to Production
Stages, gates, deployment strategies, and keeping pipelines fast.
How we run a one-week research spike
The exact process we use to de-risk a technical bet in five days.
Building a Data Quality Framework
Dimensions of data quality, validation layers, and monitoring in production pipelines.
Introduction to Data Pipelines
What a data pipeline is, the core stages, and when to build vs buy.
Getting Started with dbt (data build tool)
Models, tests, documentation, and the dbt workflow for transforming warehouse data.
Change Data Capture (CDC) — Debezium and Log-Based CDC
How CDC works, why it beats polling, and how to implement it with Debezium.
WebSockets — Building Real-Time Features
Connection lifecycle, heartbeats, reconnection logic, and scaling with Redis pub/sub.
Data Lake vs Data Warehouse vs Lakehouse
Practical comparison of the three architectures and how to choose.
How long does a typical project take?
Timeline expectations from kick-off to launch.
Do you sign NDAs?
Standard policy on confidentiality and IP.
Container Registry Management and Image Lifecycle
Tagging conventions, vulnerability scanning, retention policies, and registry options.
Feature Stores — Bridging Data Engineering and ML
What a feature store is, online vs offline stores, and when to build vs buy.
Monitoring and Alerting for Data Pipelines
What to monitor, SLIs/SLOs for data, and building effective alerting.
Docker Containerisation Best Practices
Writing efficient Dockerfiles, multi-stage builds, security hardening, and image size reduction.
API Testing Strategy — Unit, Integration, Contract, and E2E
Building a test pyramid that catches real bugs without slowing delivery.
Building a Data Catalog with DataHub
Ingestion, metadata, search, and making your catalog actually useful.
Data Contracts — Formalising Agreements Between Producers and Consumers
Schema, SLAs, semantics, and how to enforce data contracts in practice.
Dependency Management and Supply Chain Security
Lock files, vulnerability scanning, SBOM, and keeping dependencies up to date.
Event Sourcing and CQRS — Practical Implementation
Event store design, projection rebuilding, and operational realities.
Fixing CPU Overheating — Thermal Paste and Cooler Guide
How to diagnose thermal throttling, replace thermal paste, and improve CPU cooler performance.
Extracting Microservices from a Monolith
The strangler fig pattern, identifying seams, and avoiding the distributed monolith.
Trino (formerly PrestoSQL) — Federated SQL Across Data Sources
Architecture, connectors, query federation, and performance tuning.