What an embedding is

An embedding is a dense numerical vector that represents the semantic meaning of a piece of text, an image, or any other object. Semantically similar items map to vectors that lie close together in a high-dimensional space, with closeness typically measured by cosine similarity or dot product.
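The closeness measure can be sketched directly. Below is a minimal cosine-similarity function over toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions); the vector values are made up purely for illustration.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d "embeddings"; in practice these come from an embedding model.
cat = np.array([0.9, 0.1, 0.0])
kitten = np.array([0.85, 0.15, 0.05])
invoice = np.array([0.0, 0.2, 0.95])

print(cosine_similarity(cat, kitten))   # near 1: similar meanings
print(cosine_similarity(cat, invoice))  # near 0: unrelated meanings
```

Dot product alone gives the same ranking when all vectors are normalized to unit length, which is why many vector stores normalize at write time.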

How they are generated

A neural network (BERT, text-embedding-3, e5-large, etc.) encodes an input into a fixed-size vector (typically 384–3072 dimensions). The model is trained so that semantically related inputs produce geometrically close vectors.
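The "fixed-size vector" contract can be illustrated without a trained model. The stand-in below is a deliberately non-semantic toy: it hashes text into a deterministic unit vector of a fixed dimension (384 is assumed here as a common small size), which is the interface a real encoder exposes, minus the training that makes related inputs land near each other.

```python
import hashlib
import numpy as np

DIM = 384  # assumed example size; real models range roughly 384-3072

def toy_encode(text: str, dim: int = DIM) -> np.ndarray:
    """Toy stand-in for a neural encoder: any text -> fixed-size unit vector.
    Deterministic but NOT semantic -- a real model is trained so related
    texts produce nearby vectors; this hash-seeded version is not."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "big"))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

vec = toy_encode("hello world")
print(vec.shape)  # (384,)
```

Swapping this stub for a real encoder changes only the function body; everything downstream (storage, similarity search) consumes the same fixed-size array.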

Choosing an embedding model

  • Use the MTEB leaderboard (huggingface.co/spaces/mteb/leaderboard) to compare models on your task type (retrieval, classification, clustering).
  • Larger dimensions are not always better — benchmark on your actual data.
  • For multilingual text, use a multilingual-e5 or multilingual-instructor model.

Storage and search

Embeddings are stored in a vector database (pgvector, Pinecone, Weaviate, Qdrant). At query time, the query is embedded with the same model, and an approximate nearest-neighbour (ANN) search returns the most similar stored vectors. HNSW and IVF-Flat are the dominant ANN index types.
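The retrieval pattern is easiest to see with exact (brute-force) search, which is what ANN indexes like HNSW and IVF-Flat approximate in sub-linear time at the cost of a little recall. A minimal sketch, using random unit vectors as a stand-in for a store of real embeddings:

```python
import numpy as np

def normalize(m: np.ndarray) -> np.ndarray:
    """Scale rows (or a single vector) to unit length."""
    return m / np.linalg.norm(m, axis=-1, keepdims=True)

# Toy "vector store": 1000 stored embeddings of dimension 64.
# Real rows would come from an embedding model, not a random generator.
rng = np.random.default_rng(0)
store = normalize(rng.standard_normal((1000, 64)))

def search(query: np.ndarray, k: int = 5) -> np.ndarray:
    """Exact nearest-neighbour search: score every stored vector by
    cosine similarity (a dot product, since rows are unit-norm) and
    return the indices of the top-k matches."""
    scores = store @ normalize(query)
    return np.argsort(scores)[::-1][:k]

# A query that is a lightly perturbed copy of stored row 42
query = store[42] + 0.01 * rng.standard_normal(64)
print(search(query))  # row 42 ranks first
```

This linear scan touches every row per query; an ANN index replaces it with a graph traversal (HNSW) or a coarse-then-fine cluster probe (IVF-Flat) to keep query latency flat as the store grows.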