Read the execution plan first
Never guess. Run EXPLAIN ANALYZE (Postgres) or EXPLAIN FORMAT=JSON (MySQL) to see what the database is actually doing. Look for: sequential scans on large tables, poor row count estimates, and nested loops with large cardinality.
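To see the difference a plan can reveal, here is a minimal runnable sketch. It uses SQLite's EXPLAIN QUERY PLAN as a stand-in for Postgres's EXPLAIN ANALYZE (the table and index names are illustrative, not from any real schema):

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)

# Without an index the planner has no choice but a full table scan.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()

conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

# With the index the same query becomes an index search.
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
).fetchall()

print(plan_before[0][3])  # e.g. a SCAN step
print(plan_after[0][3])   # e.g. a SEARCH step using idx_orders_customer
```

The last column of each plan row is the human-readable step; "SCAN" vs. "SEARCH ... USING INDEX" is the SQLite equivalent of the sequential-scan-vs-index-scan distinction in Postgres plans.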
Index the right columns
- Index every foreign key — JOINs on unindexed FKs are table scans.
- Composite index column order: equality first, range second, sort third.
- Partial indexes for filtered queries:
  CREATE INDEX ON orders (customer_id) WHERE status = 'open';
- Covering indexes: include all SELECT columns in the index so the DB never touches the table.
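The partial-index bullet can be demonstrated end to end. SQLite also supports partial indexes, so this sketch (with illustrative names) shows the planner using the index only when the query's WHERE clause implies the index's condition:

```python
import sqlite3

# Hypothetical orders table with a partial index on open orders only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)"
)
conn.execute(
    "CREATE INDEX idx_open_orders ON orders (customer_id) WHERE status = 'open'"
)

# This query's predicate implies status = 'open', so the partial index applies.
plan_open = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders "
    "WHERE customer_id = 7 AND status = 'open'"
).fetchall()

# A different status cannot use the index and falls back to a scan.
plan_closed = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders "
    "WHERE customer_id = 7 AND status = 'closed'"
).fetchall()

print(plan_open[0][3])
print(plan_closed[0][3])
```

The payoff of a partial index is that it stays small (only open orders) while fully serving the hot query path.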
The N+1 query problem
# BAD: N+1 (1 query for orders + N queries for customers)
orders = db.query("SELECT * FROM orders LIMIT 100")
for order in orders:
    order.customer = db.query("SELECT * FROM customers WHERE id = ?", order.customer_id)

# GOOD: 2 queries total
orders = db.query("SELECT * FROM orders LIMIT 100")
ids = [o.customer_id for o in orders]
# note: most drivers need one "?" placeholder per id, not a single "?"
customers = {c.id: c for c in db.query("SELECT * FROM customers WHERE id IN (?)", ids)}
for order in orders:
    order.customer = customers[order.customer_id]
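The batched pattern above can be made concrete. This runnable sketch (sample data invented for illustration) builds the placeholder list explicitly, since drivers like `sqlite3` require one `?` per id:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
""")

# Query 1: the orders.
orders = conn.execute("SELECT * FROM orders LIMIT 100").fetchall()

# Query 2: all referenced customers in one IN (...) lookup.
ids = list({o["customer_id"] for o in orders})      # deduplicated
placeholders = ", ".join("?" for _ in ids)          # "?, ?"
customers = {
    c["id"]: c
    for c in conn.execute(
        f"SELECT * FROM customers WHERE id IN ({placeholders})", ids
    )
}

names = [customers[o["customer_id"]]["name"] for o in orders]
print(names)  # ['Ada', 'Grace', 'Ada']
```

Two round trips regardless of how many orders there are, versus 1 + N in the naive loop.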
Avoid SELECT *
Fetch only the columns you use. On wide tables with TEXT or BLOB columns, SELECT * transfers kilobytes per row unnecessarily.
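A small sketch of the difference (the `articles` table and its 100 KB body are invented for illustration): the listing query needs only `id` and `title`, but `SELECT *` drags the body along with every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
# One row with a ~100 KB body column.
conn.execute(
    "INSERT INTO articles VALUES (1, 'Indexing', ?)", ("x" * 100_000,)
)

wide = conn.execute("SELECT * FROM articles").fetchone()          # includes the 100 KB body
narrow = conn.execute("SELECT id, title FROM articles").fetchone()  # only what the page shows
```

Multiply that wasted transfer by rows per query and queries per second, and explicit column lists pay for themselves quickly.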