

PostgreSQL pgvector Optimization: Tuning HNSW vs IVFFlat for Billion-Scale Embeddings

The "Day Two" Scaling Trap

Building a vector search prototype with PostgreSQL and pgvector is deceptively simple. You spin up a Docker container, ingest 10,000 document chunks, run a standard cosine similarity query, and get results in 15 milliseconds.

Then "Day Two" arrives. You move to production with 50 million vectors (e.g., OpenAI’s text-embedding-3-small at 1536 dimensions). Suddenly, your index build takes 14 hours, your RAM usage spikes to 100% and triggers OOM kills, and your simple SELECT query takes 2 seconds to return.

The default configuration of pgvector is not tuned for high throughput or large datasets. Scaling requires a deliberate choice between IVFFlat (Inverted File Flat) and HNSW (Hierarchical Navigable Small World) indexes and, more importantly, tuning the build-time and query-time parameters that govern the precision-performance trade-off.

Root Cause: The Geometry of Indexing

The p...
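
As a rough check on the 50-million-vector scenario above, the sketch below estimates how much space the raw embeddings alone occupy. It assumes pgvector's documented single-precision layout of 4 bytes per dimension plus 8 bytes of overhead per vector; the function name is hypothetical, and the figure ignores row, page, and index overhead, so treat it as a lower bound rather than a sizing guide.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int) -> int:
    """Approximate bytes needed to store the vectors themselves.

    Assumption: each vector takes 4 * dimensions + 8 bytes
    (float32 components plus a small per-vector header).
    Table, page, and index overhead are not included.
    """
    bytes_per_vector = 4 * dimensions + 8
    return num_vectors * bytes_per_vector


if __name__ == "__main__":
    # 50 million embeddings at 1536 dimensions, as in the scenario above.
    total = raw_vector_bytes(50_000_000, 1536)
    print(f"{total / 2**30:.1f} GiB of raw vector data")
```

Any index structure sits on top of this figure, so at this scale the working set may well exceed the RAM of a modest instance, which is consistent with the OOM kills described above.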