What an embedding is
An embedding is a dense numerical vector that represents the semantic meaning of a piece of text, an image, or another object. Inputs with similar meaning map to vectors that are close together in high-dimensional space, where closeness is measured by cosine similarity or dot product.
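A minimal sketch of that closeness measure, using made-up toy vectors (not real model output) to show that related items score near 1.0 while unrelated items score much lower:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # cos(a, b) = (a . b) / (|a| * |b|); 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for illustration only.
cat = [0.9, 0.1, 0.3]
kitten = [0.85, 0.15, 0.35]
car = [0.1, 0.9, -0.2]

print(cosine_similarity(cat, kitten))  # near 1.0: semantically close
print(cosine_similarity(cat, car))     # much lower: semantically distant
```

Real embeddings behave the same way, just in hundreds or thousands of dimensions.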
How they are generated
A neural network (BERT, text-embedding-3, e5-large, etc.) encodes input into a fixed-size vector (typically 384–3072 dimensions). The model is trained so semantically related inputs produce geometrically close vectors.
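A real embedding model is a trained transformer, but the fixed-size-output property can be illustrated with a deliberately toy encoder (hypothetical, for demonstration only): it hashes tokens into a fixed number of buckets and normalises, so inputs of any length produce a vector of the same dimensionality. It captures no semantics at all.

```python
import math

DIM = 8  # real models use 384-3072 dimensions

def toy_encode(text: str) -> list[float]:
    # Toy stand-in for a neural encoder: bucket tokens by hash, then
    # L2-normalise. Only demonstrates the fixed-size output property.
    vec = [0.0] * DIM
    for token in text.lower().split():
        vec[hash(token) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

short = toy_encode("a short input")
long = toy_encode("a much longer input with many more tokens in it")
print(len(short), len(long))  # same dimensionality regardless of input length
```

With a real model the interface is the same idea: text in, fixed-size float vector out.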
Choosing an embedding model
- Use MTEB leaderboard (huggingface.co/spaces/mteb/leaderboard) to compare models on your task type (retrieval, classification, clustering).
- Higher dimensionality is not always better — benchmark candidate models on your actual data.
- For multilingual: use a multilingual-e5 or multilingual-instructor model.
Storage and search
Embeddings are stored in a vector database (pgvector, Pinecone, Weaviate, Qdrant). At query time, the query is embedded and an approximate nearest-neighbour (ANN) search returns the most similar stored vectors. HNSW and IVF-Flat are the dominant ANN index types.
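ANN indexes like HNSW and IVF-Flat approximate the exact search sketched below, which scores every stored vector and is therefore O(N) per query. The store and document ids are hypothetical, for illustration:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical stored embeddings keyed by document id.
store = {
    "doc1": [0.9, 0.1, 0.3],
    "doc2": [0.1, 0.9, -0.2],
    "doc3": [0.8, 0.2, 0.4],
}

def search(query_vec: list[float], k: int = 2) -> list[str]:
    # Exact (brute-force) nearest-neighbour search: score every vector,
    # return the k best. ANN indexes avoid this full scan at the cost
    # of occasionally missing a true neighbour.
    ranked = sorted(store, key=lambda doc_id: cosine(query_vec, store[doc_id]), reverse=True)
    return ranked[:k]

print(search([0.85, 0.15, 0.35]))  # returns doc1 and doc3, not doc2
```

A vector database does the embedding storage and this search for you, with the ANN index replacing the full scan once the collection grows beyond a few thousand vectors.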