Posts

Showing posts with the label pgvector

How to Implement Hybrid Search (Vector + Keyword) in Supabase with pgvector

Pure vector search is transformative, but it has a glaring weakness: precision. While semantic search excels at understanding intent (e.g., mapping "guidelines for visual design" to "style guide"), it often fails miserably at specific keyword matching. If a user searches for a specific error code ("ERR-902"), a SKU, or a proper noun, vector embeddings often "hallucinate" associations or drown out the exact match with conceptually similar but irrelevant results. This is the Dense vs. Sparse vector problem.

To build a production-grade search (RAG) system, you need Hybrid Search. This technique combines the conceptual understanding of embeddings (Dense) with the precise matching of Full-Text Search (Sparse). Here is a rigorous guide to implementing Hybrid Search in Supabase using pgvector and Reciprocal Rank Fusion (RRF).

The Root Cause: Why Vector Search Isn't Enough

Embeddings work by compressing high-dimensional data (text) into a low...
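The Reciprocal Rank Fusion step named in this excerpt can be illustrated outside the database. The following is a minimal sketch, not the post's own implementation: the function name, the document IDs, and the smoothing constant k = 60 (a commonly used default) are assumptions chosen for illustration. Each document scores the sum of 1 / (k + rank) over every ranked list it appears in, so a document that both the vector and keyword searches return rises above one that only a single search found.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (assumed default of 60, not taken from the post).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists for the query "ERR-902".
vector_hits = ["style-guide", "brand-colors", "err-902-runbook"]
keyword_hits = ["err-902-runbook", "err-901-runbook"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused[0])  # err-902-runbook
```

In this example the exact keyword match ranks only third in the vector results, yet it wins after fusion because it appears in both lists.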

PostgreSQL pgvector Optimization: Tuning HNSW vs IVFFlat for Billion-Scale Embeddings

The "Day Two" Scaling Trap

Building a vector search prototype with PostgreSQL and pgvector is deceptively simple. You spin up a Docker container, ingest 10,000 document chunks, run a standard Cosine Similarity query, and get results in 15 milliseconds.

Then "Day Two" arrives. You move to production with 50 million vectors (e.g., OpenAI’s text-embedding-3-small at 1536 dimensions). Suddenly, your index build takes 14 hours, your RAM usage spikes to 100%, causing OOM kills, and your simple SELECT query takes 2 seconds to return.

The default configuration of pgvector is not tuned for high-throughput or large datasets. Scaling requires a deliberate choice between IVFFlat (Inverted File Flat) and HNSW (Hierarchical Navigable Small World) indexes, and more importantly, tuning the critical build-time and query-time parameters that govern the precision-performance trade-off.

Root Cause: The Geometry of Indexing

The p...
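A back-of-the-envelope calculation shows why the 50-million-vector scenario strains memory. The sketch below is an illustration under stated assumptions, not the post's own tuning procedure: it assumes single-precision vectors stored at 4 bytes per dimension plus 8 bytes of per-vector overhead, and it uses the starting points suggested in the pgvector README for IVFFlat (lists = rows / 1000 up to 1M rows, sqrt(rows) above that, and probes = sqrt(lists)). The function names are hypothetical, and index size comes on top of the raw storage figure.

```python
import math


def raw_vector_bytes(rows, dimensions):
    """Approximate table storage for the vectors alone.

    Assumes 4 bytes per dimension plus 8 bytes of per-vector overhead.
    """
    return rows * (4 * dimensions + 8)


def ivfflat_starting_points(rows):
    """Return (lists, probes) starting values; these are heuristics to tune from."""
    if rows <= 1_000_000:
        lists = max(1, rows // 1000)
    else:
        lists = round(math.sqrt(rows))
    probes = max(1, round(math.sqrt(lists)))
    return lists, probes


rows, dimensions = 50_000_000, 1536
total_bytes = raw_vector_bytes(rows, dimensions)
lists, probes = ivfflat_starting_points(rows)

print(f"raw vectors: {total_bytes / 2**30:.1f} GiB")
print(f"ivfflat starting point: lists={lists}, probes={probes}")
```

At roughly 300 GB of raw vector data, the working set no longer fits in the RAM of a typical single node, which is consistent with the OOM kills and slow queries the excerpt describes.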