How to Implement Hybrid Search (Vector + Keyword) in Supabase with pgvector

Pure vector search is transformative, but it has a glaring weakness: precision. While semantic search excels at understanding intent (e.g., mapping "guidelines for visual design" to "style guide"), it often fails miserably at specific keyword matching.

If a user searches for a specific error code ("ERR-902"), a SKU, or a proper noun, vector embeddings often "hallucinate" associations or drown out the exact match with conceptually similar but irrelevant results. This is the Dense vs. Sparse vector problem.

To build a production-grade search (RAG) system, you need Hybrid Search. This technique combines the conceptual understanding of embeddings (Dense) with the precise matching of Full-Text Search (Sparse).

Here is a rigorous guide to implementing Hybrid Search in Supabase using pgvector and Reciprocal Rank Fusion (RRF).

The Root Cause: Why Vector Search Isn't Enough

Embeddings work by compressing text into a fixed-length float array (1536 dimensions by default for OpenAI's embedding models). During this compression, specific, rare tokens often lose their distinctiveness.

The Semantic Gap:

  1. Dense Vectors (Embeddings): Capture context and relationships. Good for "How do I reset my password?"
  2. Sparse Vectors (Keywords): Capture exact token presence. Good for "reset_password_v2 function".

When you rely solely on cosine similarity, a search for a specific version number might return a generic versioning document because semantically, they are "close," even if the user needs the exact string match.

The challenge in Hybrid Search is mathematical: How do you combine the results? You cannot simply add cosine similarity (bounded, roughly 0.0 to 1.0 for normalized embeddings) to ts_rank (unbounded). The scales do not match.

The industry-standard solution is Reciprocal Rank Fusion (RRF). Instead of combining scores, we combine ranks.
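The fusion step itself is only a few lines of arithmetic. Here is a minimal TypeScript sketch (with hypothetical document IDs, purely for illustration) of how RRF combines two ranked ID lists:

```typescript
// Reciprocal Rank Fusion: each document's fused score is the sum,
// over the lists it appears in, of 1 / (k + rank), with 1-based ranks
// and a smoothing constant k (commonly 60).
function rrfFuse(
  semanticIds: number[],
  keywordIds: number[],
  k = 60,
): { id: number; score: number }[] {
  const scores = new Map<number, number>();
  const accumulate = (ids: number[]) =>
    ids.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  accumulate(semanticIds);
  accumulate(keywordIds);
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Doc 7 is rank 2 semantically and rank 1 by keyword; appearing in
// both lists lets it overtake doc 3 (semantic-only rank 1).
const fused = rrfFuse([3, 7, 5], [7, 9]);
```

Note that a document appearing in only one list (like doc 9 here) still gets a score from that list alone; the SQL version below handles the same case with a full outer join and coalesce.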

Step 1: Database Schema and Extensions

First, we must prepare the Supabase PostgreSQL environment. We need the vector extension for embeddings and standard Postgres features for text search.

Run the following SQL in the Supabase SQL Editor:

-- Enable the pgvector extension to work with embeddings
create extension if not exists vector;

-- Create a table for documentation/content
create table documents (
  id bigserial primary key,
  content text not null,
  metadata jsonb default '{}',
  -- OpenAI embeddings are 1536 dimensions
  embedding vector(1536)
);

-- Create a generated column for Full Text Search (FTS)
-- This automatically updates when 'content' changes.
alter table documents
add column fts tsvector generated always as (to_tsvector('english', content)) stored;

-- Create indexes for performance
-- 1. HNSW index for fast vector search (approximate nearest neighbor)
create index on documents using hnsw (embedding vector_cosine_ops);

-- 2. GIN index for fast keyword search
create index on documents using gin (fts);

Why HNSW?

We chose HNSW (Hierarchical Navigable Small World) over IVFFlat. HNSW offers superior recall and performance for high-dimensional data at the cost of slightly slower build times, which is preferable for read-heavy search workloads.

Step 2: The Hybrid Search Logic (SQL Function)

We will encapsulate the search logic in a PostgreSQL function (RPC). This allows us to execute the complex RRF logic on the server side, keeping the client clean and reducing network latency.

This function performs three steps:

  1. Vector Search: Finds the top match_count documents by semantic similarity.
  2. Keyword Search: Finds the top match_count documents by keyword frequency.
  3. Fusion (RRF): Re-ranks the combined results.

create or replace function hybrid_search(
  query_text text,
  query_embedding vector(1536),
  match_count int,
  full_text_weight float default 1.0,
  semantic_weight float default 1.0,
  rrf_k int default 60
)
returns setof documents
language sql
as $$
with semantic_search as (
  select
    id,
    rank() over (order by embedding <=> query_embedding) as rank_semantic
  from
    documents
  order by
    embedding <=> query_embedding
  limit match_count
),
keyword_search as (
  select
    id,
    rank() over (order by ts_rank_cd(fts, websearch_to_tsquery('english', query_text)) desc) as rank_keyword
  from
    documents
  where
    fts @@ websearch_to_tsquery('english', query_text)
  order by
    ts_rank_cd(fts, websearch_to_tsquery('english', query_text)) desc
  limit match_count
),
joined_results as (
  select
    coalesce(s.id, k.id) as id,
    s.rank_semantic,
    k.rank_keyword
  from
    semantic_search s
  full outer join
    keyword_search k on s.id = k.id
),
final_scoring as (
  select
    j.id,
    -- RRF Formula: 1 / (k + rank)
    (
      coalesce(1.0 / (rrf_k + j.rank_semantic), 0.0) * semantic_weight +
      coalesce(1.0 / (rrf_k + j.rank_keyword), 0.0) * full_text_weight
    ) as score
  from
    joined_results j
)
select
  d.*
from
  final_scoring s
join
  documents d on s.id = d.id
order by
  s.score desc
limit match_count;
$$;

Deconstructing the SQL

  • websearch_to_tsquery: This handles natural language input for keywords better than raw to_tsquery. It supports quotes for exact phrases and ignores punctuation.
  • <=> Operator: This is the cosine distance operator in pgvector.
  • Full Outer Join: Critical. A document might appear in the keyword search but not the vector search (or vice versa). We need to score it regardless.
  • rrf_k: A smoothing constant (usually 60). It prevents a document ranked first in one list from completely dominating the fused score when it is absent or low-ranked in the other.

Step 3: Client-Side Implementation (TypeScript)

Now we consume this function from a TypeScript application (e.g., Next.js). We assume you have the supabase-js client initialized and a way to generate embeddings (like OpenAI).

import { createClient } from '@supabase/supabase-js';
import OpenAI from 'openai';

// Initialize clients
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function searchDocs(userQuery: string) {
  // 1. Generate the embedding for the search query
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-3-small', // Ensure this matches DB dimensions (1536)
    input: userQuery,
  });

  const queryEmbedding = embeddingResponse.data[0].embedding;

  // 2. Call the Supabase RPC function
  const { data: documents, error } = await supabase.rpc('hybrid_search', {
    query_text: userQuery,
    query_embedding: queryEmbedding,
    match_count: 10,       // Retrieve top 10 results
    full_text_weight: 1.0, // Give equal weight to both methods
    semantic_weight: 1.0,
    rrf_k: 60,             // Standard RRF constant
  });

  if (error) {
    console.error('Hybrid search failed:', error);
    throw error;
  }

  return documents;
}

// Usage Example
// searchDocs("How do I configure connection pooling error 503?");

Performance Optimization Strategy

Hybrid search is more resource-intensive than simple lookups because it executes two distinct search algorithms per query. To maintain low latency, consider these optimizations:

1. Optimize the match_count

In the RPC function, notice we LIMIT the internal CTEs (Common Table Expressions) by match_count before joining.

  • Do not retrieve 10,000 rows for ranking.
  • Fetching the top 20-50 from each method (vector and keyword) is statistically sufficient to find the best intersection.

2. Tuning rrf_k

The constant k (default 60) determines how much influence low-ranking documents have.

  • Higher k: Flattens the curve. Rank 1 and Rank 10 become closer in score.
  • Lower k: Gives massive advantage to the top 1-3 results.
  • Stick to 60 unless you have empirical data suggesting otherwise.

3. Index Maintenance

Postgres indexes (especially HNSW) require maintenance. If your table sees heavy INSERT/UPDATE/DELETE traffic, dead tuples accumulate and query plans drift. Run VACUUM ANALYZE documents; periodically to reclaim dead tuples and keep the query planner statistics fresh.

Handling Edge Cases

Zero Results on FTS

If the user searches for a concept with no exact keywords (e.g., "sadness" when the text only contains "melancholy"), the keyword_search CTE will return no rows. The FULL OUTER JOIN handles this gracefully. The coalesce(..., 0.0) in the scoring logic ensures the math doesn't break, falling back entirely to the vector score.

Short Queries

Vector embeddings can be noisy for very short queries (one or two words). Fix: You might enforce a minimum query length before triggering the embedding generation, or boost full_text_weight dynamically when the query contains only a few words.
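One way to implement that heuristic on the client is a small helper that picks the weights before calling the RPC. This is a hypothetical sketch; the three-word threshold and the 2.0/0.5 weights are assumptions you would tune empirically:

```typescript
// Boost keyword matching for short queries, where embeddings are noisy.
// Returns the weights to pass to the hybrid_search RPC from Step 3.
function pickWeights(query: string): { fullTextWeight: number; semanticWeight: number } {
  const wordCount = query.trim().split(/\s+/).filter(Boolean).length;
  if (wordCount <= 3) {
    // Short query: likely an error code, SKU, or proper noun.
    // Favor exact matching over semantic similarity.
    return { fullTextWeight: 2.0, semanticWeight: 0.5 };
  }
  // Longer natural-language query: equal weighting.
  return { fullTextWeight: 1.0, semanticWeight: 1.0 };
}
```

The returned values map directly onto the full_text_weight and semantic_weight parameters of the supabase.rpc('hybrid_search', ...) call.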

Conclusion

Implementing Hybrid Search in Supabase elevates your application from a basic prototype to a production-ready tool. By combining pgvector for intent and standard Postgres FTS for precision, and fusing them with RRF, you solve the "hallucination" problem inherent in AI search.

This approach ensures that when your users search for a specific error code, they find it—and when they search for a vague concept, they find that too.