Define explicit mappings

Never rely on dynamic mapping in production. It guesses types from the first document and often gets it wrong. Define mappings upfront:

PUT /orders
{
  "mappings": {
    "properties": {
      "order_id": { "type": "keyword" },
      "amount":   { "type": "float" },
      "created":  { "type": "date" },
      "notes":    { "type": "text", "analyzer": "english" }
    }
  }
}

keyword vs text

keyword: exact match, aggregations, sorting. text: full-text search with tokenisation. For fields you need both: use text with a keyword sub-field.

Shard sizing

Target 20–50 GB per shard. Too many small shards wastes overhead; too few large shards limits parallelism. Start with 1 primary shard per index for small datasets.

Bulk indexing

Use the Bulk API — single-document indexing has per-request overhead. Batch 5–15 MB per bulk request. Disable refresh during initial load (refresh_interval: -1), then set to 1s after.