What an embedding is
An embedding is a dense numerical vector that represents the semantic meaning of a piece of text, an image, or another object. Inputs with similar meaning map to vectors that are close together in high-dimensional space, where closeness is measured by cosine similarity or dot product.
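A minimal sketch of that closeness measure, using made-up toy vectors (not real model output) to show that related items score near 1.0 while unrelated items score much lower:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(a, b) = (a . b) / (|a| * |b|); 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for illustration only.
cat = [0.9, 0.1, 0.3]
kitten = [0.85, 0.15, 0.35]
car = [0.1, 0.9, -0.2]

print(cosine_similarity(cat, kitten))  # near 1.0: semantically close
print(cosine_similarity(cat, car))     # much lower: semantically distant
```

Real embeddings behave the same way, just in hundreds or thousands of dimensions.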
How they are generated
A neural network (BERT, text-embedding-3, e5-large, etc.) encodes input into a fixed-size vector (typically 384–3072 dimensions). The model is trained so semantically related inputs produce geometrically close vectors.
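A real embedding model is a trained transformer, but the fixed-size-output property can be illustrated with a deliberately toy encoder (hypothetical, for demonstration only): it hashes tokens into a fixed number of buckets and normalises, so inputs of any length produce a vector of the same dimensionality. It captures no semantics at all.

```python
import math

DIM = 8  # real models use 384-3072 dimensions

def toy_encode(text: str) -> list[float]:
    # Toy stand-in for a neural encoder: bucket tokens by hash, then
    # L2-normalise. Only demonstrates the fixed-size output property.
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

short = toy_encode("a short input")
long = toy_encode("a much longer input with many more tokens in it")
print(len(short), len(long))  # same dimensionality regardless of input length
```

With a real model the interface is the same idea: text in, fixed-size float vector out.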
Choosing an embedding model
- Use MTEB leaderboard (huggingface.co/spaces/mteb/leaderboard) to compare models on your task type (retrieval, classification, clustering).
- Higher dimensionality is not always better — benchmark candidate models on your actual data.
- For multilingual: use a multilingual-e5 or multilingual-instructor model.
Storage and search
Embeddings are stored in a vector database (pgvector, Pinecone, Weaviate, Qdrant). At query time, the query is embedded and an approximate nearest-neighbour (ANN) search returns the most similar stored vectors. HNSW and IVF-Flat are the dominant ANN index types.
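ANN indexes like HNSW and IVF-Flat approximate the exact search sketched below, which scores every stored vector and is therefore O(N) per query. The store and document ids are hypothetical, for illustration:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical stored embeddings keyed by document id.
store = {
    "doc1": [0.9, 0.1, 0.3],
    "doc2": [0.1, 0.9, -0.2],
    "doc3": [0.8, 0.2, 0.4],
}

def search(query_vec: list[float], k: int = 2) -> list[str]:
    # Exact (brute-force) nearest-neighbour search: score every vector,
    # return the k best. ANN indexes avoid this full scan at the cost
    # of occasionally missing a true neighbour.
    ranked = sorted(store, key=lambda doc_id: cosine(query_vec, store[doc_id]), reverse=True)
    return ranked[:k]

print(search([0.85, 0.15, 0.35]))  # returns doc1 and doc3, not doc2
```

A vector database does the embedding storage and this search for you, with the ANN index replacing the full scan once the collection grows beyond a few thousand vectors.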