Search results
35 results
Apache Iceberg — The Open Table Format Explained
Snapshots, schema evolution, partition evolution, time travel, and compaction.
Multi-Tenancy Patterns — Database-per-Tenant, Schema-per-Tenant, and Row-Level
Tradeoffs for SaaS data isolation, compliance, and operational complexity.
Graph Databases — When to Use Neo4j Over Relational
Nodes, edges, Cypher queries, and use cases where graph beats SQL.
Secure Coding — OWASP Top 10 for Backend Engineers
Injection, broken auth, XSS, IDOR, and how to prevent each.
Applied AI & ML — Service Overview
Everything included in our Applied AI engagements: RAG, agents, fine-tuning, evals, and guardrails.
Data Warehouse Modelling — Star Schema and Dimensional Design
Facts, dimensions, slowly changing dimensions, and why modelling choices matter for query performance.
PostgreSQL Performance Tuning Fundamentals
Indexing strategy, EXPLAIN ANALYZE, vacuum, and configuration settings that matter most.
REST API Versioning Strategies
URL path, header, and query-param versioning compared with real-world tradeoffs.
What is Retrieval-Augmented Generation (RAG)?
A plain-English explanation of RAG: why it beats pure LLM memory for production knowledge systems.
SQL Query Optimisation — Indexes, Execution Plans, and N+1
Practical techniques for making slow queries fast.
Data Lake vs Data Warehouse vs Lakehouse
Practical comparison of the three architectures and how to choose.
Data Platform Cost Optimization Strategies
Reducing Snowflake, S3, Spark, and Kafka spend without sacrificing performance.
DuckDB — Blazing Fast Local Analytics
When to reach for DuckDB instead of Spark, and how to use it effectively.
MongoDB Schema Design Patterns
Embedding vs referencing, the subset pattern, and indexing strategy.
Redis Caching Patterns for Production Applications
Cache-aside, write-through, TTL strategy, and cache invalidation approaches.
Data & Platform — Service Overview
Pipelines, vector stores, governance, and privacy-first data design.
Snowflake Best Practices for Cost and Performance
Virtual warehouses, clustering, query optimization, and controlling spend.
Distributed Tracing — Propagating Context Across Services
Trace context propagation, sampling strategies, and analysing traces.
Amazon Redshift — Architecture and Query Optimization
Distribution styles, sort keys, VACUUM, ANALYZE, and WLM tuning.
GraphQL vs REST — When to Use Each
Comparing query flexibility, over-fetching, tooling, and operational complexity.
Vector Embeddings — How They Work and Where They Live
From text to vectors, similarity search, and choosing the right embedding model.
Implementing Search — From Basic SQL to Elasticsearch
Full-text search progression from LIKE queries to dedicated search engines.
Materialised Views — When and How to Use Them
Incremental refresh, use cases, and implementation across Postgres, Snowflake, and dbt.
Real-Time Analytics Architecture Patterns
Lambda, Kappa, HTAP, and choosing the right pattern for sub-second analytics.
BigQuery Cost and Performance Optimization
Partitioned tables, clustered tables, slot usage, and avoiding full scans.
Designing a Data Lake on AWS S3
Folder structure, naming conventions, lifecycle policies, and access patterns.
Event-Driven Data Architecture Patterns
Event sourcing, CQRS, outbox pattern, and when event-driven beats request/response.
CDN and Edge Caching Strategy
Origin offload, cache key design, purging, and choosing a CDN.
Apache Spark — Core Concepts and When to Use It
RDDs, DataFrames, Spark SQL, and the use cases where Spark is the right tool.
Elasticsearch Indexing Strategy and Performance
Mapping, sharding, bulk indexing, and query optimization for Elasticsearch.
Event Sourcing and CQRS — Practical Implementation
Event store design, projection rebuilding, and operational realities.
Parquet vs CSV — Why Columnar Storage Matters
How Parquet's columnar format reduces storage costs and speeds up analytical queries.
Time-Series Databases — InfluxDB vs TimescaleDB vs ClickHouse
Comparing purpose-built and general-purpose solutions for time-series data.
Fine-tuning LLMs: when, why, and how
A practical guide to LoRA, QLoRA, and full fine-tuning for production use cases.
Trino (formerly PrestoSQL) — Federated SQL Across Data Sources
Architecture, connectors, query federation, and performance tuning.