Define explicit mappings
Never rely on dynamic mapping in production. It guesses types from the first document and often gets it wrong. Define mappings upfront:
PUT /orders
{
"mappings": {
"properties": {
"order_id": { "type": "keyword" },
"amount": { "type": "float" },
"created": { "type": "date" },
"notes": { "type": "text", "analyzer": "english" }
}
}
}
keyword vs text
keyword: exact match, aggregations, sorting. text: full-text search with tokenisation. For fields you need both: use text with a keyword sub-field.
Shard sizing
Target 20–50 GB per shard. Too many small shards wastes overhead; too few large shards limits parallelism. Start with 1 primary shard per index for small datasets.
Bulk indexing
Use the Bulk API — single-document indexing has per-request overhead. Batch 5–15 MB per bulk request. Disable refresh during initial load (refresh_interval: -1), then set to 1s after.