Programming Tutorials

Showing posts with the label pgvector

How to Implement Hybrid Search (Vector + Keyword) in Supabase with pgvector

Pure vector search is transformative, but it has a glaring weakness: precision. Semantic search excels at understanding intent (e.g., mapping "guidelines for visual design" to "style guide"), yet it often fails at exact keyword matching. If a user searches for a specific error code ("ERR-902"), a SKU, or a proper noun, vector embeddings often "hallucinate" associations or drown out the exact match with conceptually similar but irrelevant results. This is the Dense vs. Sparse vector problem.

To build a production-grade search (RAG) system, you need Hybrid Search. This technique combines the conceptual understanding of embeddings (Dense) with the precise matching of Full-Text Search (Sparse). Here is a rigorous guide to implementing Hybrid Search in Supabase using pgvector and Reciprocal Rank Fusion (RRF).

The Root Cause: Why Vector Search Isn't Enough

Embeddings work by compressing high-dimensional data (text) into a low...
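The excerpt names Reciprocal Rank Fusion as the step that merges the dense and sparse result lists. The post's own implementation is not visible here, so the following is only a minimal Python sketch of the fusion arithmetic. The function name, the sample document ids, and the smoothing constant `k = 60` (a commonly used default) are assumptions for illustration, not taken from the post.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k = 60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from vector search, one from full-text search.
vector_hits = ["style-guide", "brand-colors", "err-902-runbook"]
keyword_hits = ["err-902-runbook", "release-notes"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

In this sketch the document that appears in both lists rises to the top even though vector search alone ranked it last, which is the behavior hybrid search relies on for exact-match queries such as error codes.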

PostgreSQL pgvector Optimization: Tuning HNSW vs IVFFlat for Billion-Scale Embeddings

The "Day Two" Scaling Trap

Building a vector search prototype with PostgreSQL and pgvector is deceptively simple. You spin up a Docker container, ingest 10,000 document chunks, run a standard Cosine Similarity query, and get results in 15 milliseconds.

Then "Day Two" arrives. You move to production with 50 million vectors (e.g., OpenAI’s text-embedding-3-small at 1536 dimensions). Suddenly, your index build takes 14 hours, your RAM usage spikes to 100%, causing OOM kills, and your simple SELECT query takes 2 seconds to return.

The default configuration of pgvector is not tuned for high throughput or large datasets. Scaling requires a deliberate choice between IVFFlat (Inverted File Flat) and HNSW (Hierarchical Navigable Small World) indexes, and more importantly, tuning the critical build-time and query-time parameters that govern the precision-performance trade-off.

Root Cause: The Geometry of Indexing

The p...
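To make the memory pressure in the "Day Two" scenario concrete, here is a back-of-the-envelope Python sketch of the raw storage for the 50 million vectors at 1536 dimensions mentioned above. It assumes pgvector's single-precision `vector` type, which stores 4 bytes per dimension plus an 8-byte header per vector. The function name is hypothetical, and the estimate deliberately ignores table row overhead and the index itself, both of which add to the total.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_float=4, header_bytes=8):
    """Estimate raw storage for pgvector `vector` values.

    Assumes 4 bytes per dimension plus an 8-byte per-vector header.
    Row overhead and index structures are not included.
    """
    return num_vectors * (dimensions * bytes_per_float + header_bytes)


# Figures from the scenario above: 50 million vectors at 1536 dimensions.
total = raw_vector_bytes(50_000_000, 1536)
gib = total / 2**30

print(f"{total:,} bytes, about {gib:.1f} GiB before any index is built")
```

Under these assumptions the vectors alone come to 307,600,000,000 bytes, so an index that has to keep much of this data in memory will exceed the RAM of most single database instances.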