There is a specific, maddening silence that occurs only in production environments. You have built a reactive dashboard or a chat application using Supabase. On localhost:3000, data flows instantly. You open two browser windows, update a row in one, and the other updates via WebSocket immediately.
Then you deploy to Vercel, Netlify, or AWS. You trigger an update. The database records the change, but the UI remains stale. The WebSocket connection seems open, but the payload never arrives.
This is rarely a bug in the Supabase SDK. It is almost always a configuration mismatch between the default PostgreSQL settings and the strict requirements of Logical Replication in a production environment. Here is why your subscriptions are failing and how to fix them with production-grade rigor.
The Architecture of "Realtime"
To debug this effectively, you must understand the underlying mechanics. Supabase Realtime is not a simple polling mechanism. It relies on PostgreSQL's Write-Ahead Log (WAL).
When you insert, update, or delete a row, Postgres writes this event to the WAL. Supabase utilizes a feature called Logical Replication: a dedicated service (Realtime) reads from a replication slot via the pgoutput plugin. It listens for WAL changes, filters them based on your client's request, converts them to JSON, and broadcasts them over WebSockets.
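On the client, this entire pipeline is driven by a single subscription call. A minimal sketch using supabase-js v2 (the messages table and environment variable names here are illustrative assumptions, not part of your project):

import { createClient } from '@supabase/supabase-js';

// Assumption: these env vars are configured; 'messages' is an example table.
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// The filter object below is what the Realtime server matches against the
// WAL stream; matching changes are serialized to JSON and pushed over the socket.
supabase
  .channel('messages-feed')
  .on(
    'postgres_changes',
    { event: 'INSERT', schema: 'public', table: 'messages' },
    (payload) => console.log('New row from the WAL stream:', payload.new)
  )
  .subscribe();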
Why It Fails in Production
In a local development environment (using supabase start), the Supabase CLI often configures defaults permissively to reduce friction. However, in a hosted Supabase project, two critical security and performance gates prevent data from flowing by default:
- The Publication Gate: To save resources, Postgres does not broadcast changes for every table. Tables must be explicitly added to the supabase_realtime publication.
- The RLS Gate: The Realtime server respects Row Level Security. If the user subscribed to the WebSocket cannot execute a SELECT on the new row, the Realtime server will silently suppress the notification to prevent data leakage.
Solution 1: Enabling the Replication Stream
The most common root cause for "silent failure" is that the table is not part of the supabase_realtime publication.
While you can toggle this in the Supabase Dashboard (Database -> Replication), a Principal Engineer should manage this via SQL migrations to ensure infrastructure-as-code consistency.
Run the following SQL in your SQL Editor or add it to your migration files:
-- Check which tables are currently being replicated
select * from pg_publication_tables
where pubname = 'supabase_realtime';
-- Enable replication for a specific table (e.g., 'messages')
alter publication supabase_realtime add table messages;
-- OR, if you need to replicate specific schemas (use with caution regarding performance)
-- alter publication supabase_realtime add table schema_name.table_name;
Note: If you replicate every table (for example, by recreating the publication with create publication supabase_realtime for all tables), you risk performance degradation on high-throughput tables that don't actually require frontend subscriptions. Always be explicit.
Solution 2: The RLS "Select" Trap
Realtime applies RLS policies at the moment an event is emitted. A common mistake is creating an RLS policy that allows INSERT but restricts SELECT.
If a user sends a message but your RLS policy says they cannot view that message (perhaps due to a missing user_id match or pending moderation flag), the database accepts the write, but the WebSocket subscription filters out the event for that user.
The Fix: Ensure your policies explicitly allow reading the data you expect to receive.
-- Example: A correct policy for a chat application
create policy "Users can read messages in their channels"
on messages
for select
using (
  auth.uid() in (
    select user_id from channel_members where channel_id = messages.channel_id
  )
);
If you are debugging and desperate, you can temporarily disable RLS on the table to confirm it is the root cause (do not leave this in production):

-- Debugging only: disable RLS to confirm it is suppressing events,
-- then re-enable it immediately.
alter table messages disable row level security;
-- alter table messages enable row level security;

Alternatively, query the table from a trusted server environment using the service_role key, which bypasses RLS entirely (never expose this key to the client).
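A safer way to confirm the suspicion without touching RLS at all is to compare what the service role sees against what the client role sees. A minimal server-side sketch; the file and environment variable names are assumptions, and note that an anon client without a signed-in session only exercises your anon policies:

// debug-rls.ts (hypothetical) - run in a trusted server environment only.
import { createClient } from '@supabase/supabase-js';

const url = process.env.SUPABASE_URL!; // assumption: set in your server env
const anon = createClient(url, process.env.SUPABASE_ANON_KEY!);
const admin = createClient(url, process.env.SUPABASE_SERVICE_ROLE_KEY!); // bypasses RLS

async function checkVisibility(messageId: string) {
  // The service role sees the row regardless of RLS
  const { data: adminRow } = await admin.from('messages').select('*').eq('id', messageId).maybeSingle();
  // The anon/authenticated role is subject to RLS
  const { data: anonRow } = await anon.from('messages').select('*').eq('id', messageId).maybeSingle();

  if (adminRow && !anonRow) {
    console.warn('Row exists but RLS hides it - Realtime will suppress this event too.');
  }
}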
Production-Grade Implementation (React/TypeScript)
Subscriptions in production are fragile due to network flakiness, mobile backgrounding, and load balancers killing idle connections. Generic documentation examples often lack connection state handling.
Below is a robust React hook using supabase-js v2 that handles channel lifecycle management, avoiding memory leaks and "zombie" listeners.
useRealtimeSubscription.ts
import { useEffect, useState, useRef } from 'react';
import { RealtimeChannel, RealtimePostgresChangesPayload } from '@supabase/supabase-js';
import { supabase } from '@/lib/supabaseClient'; // Your singleton instance

type EventType = 'INSERT' | 'UPDATE' | 'DELETE' | '*';

interface SubscriptionConfig {
  table: string;
  schema?: string;
  event?: EventType;
  filter?: string; // e.g., 'room_id=eq.123'
}

type RealtimeStatus = 'CONNECTING' | 'SUBSCRIBED' | 'CLOSED' | 'CHANNEL_ERROR' | 'TIMED_OUT';

export function useRealtimeSubscription<T extends Record<string, any>>(
  config: SubscriptionConfig,
  callback: (payload: RealtimePostgresChangesPayload<T>) => void
) {
  const [status, setStatus] = useState<RealtimeStatus>('CONNECTING');
  const channelRef = useRef<RealtimeChannel | null>(null);

  // Keep the latest callback in a ref so a new function identity on every
  // render neither tears down the subscription nor goes stale inside it.
  const callbackRef = useRef(callback);
  callbackRef.current = callback;

  useEffect(() => {
    // Unique channel key to prevent collisions
    const channelKey = `sub:${config.schema || 'public'}:${config.table}:${config.filter || 'all'}`;

    // Cleanup previous subscription if config changes quickly
    if (channelRef.current) {
      supabase.removeChannel(channelRef.current);
    }

    const channel = supabase
      .channel(channelKey)
      .on(
        'postgres_changes',
        {
          event: config.event || '*',
          schema: config.schema || 'public',
          table: config.table,
          filter: config.filter,
        },
        (payload) => callbackRef.current(payload as RealtimePostgresChangesPayload<T>)
      )
      .subscribe((subscriptionStatus) => {
        setStatus(subscriptionStatus as RealtimeStatus);
        if (subscriptionStatus === 'CHANNEL_ERROR' || subscriptionStatus === 'TIMED_OUT') {
          console.error(`[Realtime] Subscription error for ${config.table}: ${subscriptionStatus}`);
          // Optional: Implement exponential backoff reconnection logic here
        }
      });

    channelRef.current = channel;

    // Cleanup on unmount
    return () => {
      if (channelRef.current) {
        supabase.removeChannel(channelRef.current);
        channelRef.current = null;
      }
    };
  }, [config.table, config.filter, config.schema, config.event]); // callback is read via ref, so it is deliberately omitted

  return { status };
}
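The hook flags CHANNEL_ERROR and TIMED_OUT but does not retry on its own. One possible shape for the backoff logic the comment alludes to is sketched below; subscribeWithBackoff is a hypothetical helper, and the retry count and delay constants are illustrative:

import { RealtimeChannel } from '@supabase/supabase-js';
import { supabase } from '@/lib/supabaseClient';

// Resubscribes a channel with exponential backoff. A failed channel cannot be
// reused, so buildChannel must construct a fresh one (with handlers) each attempt.
export function subscribeWithBackoff(buildChannel: () => RealtimeChannel, maxRetries = 5) {
  let attempt = 0;

  const connect = () => {
    const channel = buildChannel();
    channel.subscribe((status) => {
      if (status === 'SUBSCRIBED') {
        attempt = 0; // reset the counter once we are healthy again
      }
      if (status === 'CHANNEL_ERROR' || status === 'TIMED_OUT') {
        supabase.removeChannel(channel);
        if (attempt < maxRetries) {
          const delay = Math.min(1000 * 2 ** attempt, 30_000); // 1s, 2s, 4s... capped at 30s
          attempt += 1;
          setTimeout(connect, delay);
        }
      }
    });
  };

  connect();
}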
Usage in a Component
'use client';

import { useState } from 'react';
import { useRealtimeSubscription } from '@/hooks/useRealtimeSubscription';

export default function LiveFeed() {
  const [items, setItems] = useState<any[]>([]);

  const { status } = useRealtimeSubscription<any>(
    {
      table: 'notifications',
      event: 'INSERT',
      filter: 'user_id=eq.550e8400-e29b-41d4-a716-446655440000',
    },
    (payload) => {
      // Append new data optimistically
      setItems((prev) => [payload.new, ...prev]);
    }
  );

  return (
    <div className="p-4 border rounded-lg">
      <div className="flex items-center gap-2 mb-4">
        <h2 className="text-xl font-bold">Live Notifications</h2>
        <span
          className={`px-2 py-0.5 text-xs rounded-full ${
            status === 'SUBSCRIBED' ? 'bg-green-100 text-green-800' : 'bg-amber-100 text-amber-800'
          }`}
        >
          {status}
        </span>
      </div>
      {/* List Rendering Logic */}
      <ul>
        {items.map((item) => (
          <li key={item.id} className="py-2 border-b">
            {item.content}
          </li>
        ))}
      </ul>
    </div>
  );
}
Deep Dive: Handling Replica Identity
There is an edge case specific to UPDATE and DELETE events. By default, Postgres does not write the full "old" record to the WAL; the old portion of the event carries only the table's replica identity, which is the primary key by default.
If your frontend logic relies on payload.old to identify which item to remove or update in a list, you might find payload.old is empty or only contains an ID.
To receive the full previous record, you must change the Replica Identity of the table.
-- Set replica identity to FULL so 'old' record contains all columns
-- Warning: Increases WAL volume and CPU usage on the database
alter table messages replica identity full;
Use replica identity full sparingly. For most applications, replica identity default (which sends only the primary key) is sufficient: update your frontend state by finding the item by its ID rather than relying on the full record being present in payload.old, as in the sketch below.
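A minimal state-updater sketch that works under replica identity default; the Item shape is illustrative:

type Item = { id: string; content: string };

// Handles all three event types by matching on the primary key. With replica
// identity DEFAULT, payload.old is only guaranteed to carry the id.
function applyChange(
  items: Item[],
  payload: { eventType: string; new: Partial<Item>; old: Partial<Item> }
): Item[] {
  switch (payload.eventType) {
    case 'INSERT':
      return [payload.new as Item, ...items];
    case 'UPDATE':
      return items.map((item) => (item.id === (payload.new as Item).id ? (payload.new as Item) : item));
    case 'DELETE':
      // Only payload.old.id is reliable here
      return items.filter((item) => item.id !== payload.old.id);
    default:
      return items;
  }
}

With the hook above, the subscription handler then reduces to setItems((prev) => applyChange(prev, payload)).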
Troubleshooting Checklist
If you are still seeing silence, follow this rigorous checklist:
- Network Inspection: Open Chrome DevTools > Network > WS (WebSockets). Find the connection to wss://...supabase.co and open the "Frames" tab. Do you see a heartbeat (Pings/Pongs)? If the connection closes immediately (Code 1006), check your API keys and project URL.
- Supabase Inspector: Go to the Supabase Dashboard > Realtime > Inspector. Create a listener for your table. If the Dashboard sees the event but your app doesn't, the issue is client-side (likely filters or RLS). If the Dashboard doesn't see it, the issue is database configuration (Replication).
- Date/Time Sync: WebSockets rely on JWTs. If your server or client system time is significantly drifted, token verification may fail silently; see the sketch after this list.
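To rule out drift, you can decode the access token locally (without verifying its signature) and compare its timestamps against the device clock. A minimal browser-side sketch; the 60-second tolerance is an arbitrary illustrative threshold:

// Decodes a JWT payload without verifying it - enough to inspect timestamps.
function decodeJwtPayload(token: string): { exp?: number; iat?: number } {
  let base64 = token.split('.')[1].replace(/-/g, '+').replace(/_/g, '/');
  while (base64.length % 4) base64 += '='; // restore base64 padding
  return JSON.parse(atob(base64));
}

function checkClockDrift(accessToken: string) {
  const { exp, iat } = decodeJwtPayload(accessToken);
  const nowSec = Math.floor(Date.now() / 1000);
  if (exp && exp < nowSec) {
    console.warn('Token already expired per the local clock: refresh it or check system time.');
  }
  if (iat && iat > nowSec + 60) {
    // The token appears to be issued "in the future": this device's clock is behind.
    console.warn('Token issued in the future: local clock is drifted.');
  }
}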
Conclusion
Realtime features are often the "wow" factor of an application, but they introduce stateful complexity into a stateless web environment. The difference between a localhost prototype and a production-ready application usually lies in the database configuration—specifically Logical Replication slots and RLS policies. By explicitly defining your publications and handling connection states with strict typing, you transform "flaky" features into resilient infrastructure.