Skip to main content

Fixing SQLite 'SQLITE_BUSY: database is locked' in Concurrent Edge Environments

 If you are deploying SQLite in a Node.js environment—whether on a distinct VPS, inside a Docker container, or via a stateful edge provider like Fly.io—you have likely encountered this stack trace during load testing or production spikes:

SqliteError: SQLITE_BUSY: database is locked
    at Statement.run (node_modules/better-sqlite3/lib/methods/wrappers.js:5:11)
    ...

This error usually creates a false dichotomy for developers: "SQLite isn't production-ready" or "I need to migrate to Postgres immediately." Neither is necessarily true. The issue is rarely SQLite itself, but rather the default configuration regarding concurrency and file locking.

Here is the root cause analysis and the definitive implementation to handle high-concurrency writes in Node.js.

The Root Cause: Rollback Journals and Exclusive Locks

By default, SQLite uses a Rollback Journal. To write to the database in this mode, SQLite must acquire an EXCLUSIVE lock on the database file.

  1. The Lock: When Process A writes, it locks the entire file.
  2. The Block: If Process B tries to read or write while A holds the lock, B is blocked.
  3. The Panic: In a Node.js event loop, if the driver doesn't have a retry strategy, it immediately throws SQLITE_BUSY.

In an edge environment, this is exacerbated. Your Node.js application is asynchronous and non-blocking, often handling hundreds of concurrent HTTP requests. However, standard SQLite writes are synchronous file system operations. If 50 requests hit your endpoint simultaneously, 49 of them might bounce off the locked file instantly if you rely on the default settings.

The Solution: WAL Mode and Native C-Level Retries

To fix this, we need to change how SQLite handles atomicity and how the driver handles lock contention. We do not use user-land JavaScript try/catch/retry loops, which are slow and pollute the event loop. We use SQLite's native features.

The Strategy

  1. Enable Write-Ahead Logging (WAL): This creates a separate -wal file. Writers append to the WAL, while readers read from the database file and the WAL. Crucially, readers no longer block writers, and writers do not block readers.
  2. Set a Busy Timeout: We instruct the underlying C library to busy-wait (retry) for a specified duration before throwing an error.
  3. Relax Synchronization: We adjust fsync behavior to optimize for typical edge server reliability vs. performance.

The Implementation

The following code uses better-sqlite3, the fastest and most rigorous SQLite driver for Node.js. The concepts apply equally to node-sqlite3 or libsql, provided you execute the PRAGMAs correctly.

Create a robust database instantiation file (e.g., db.ts or db.js):

// db.ts
import Database from 'better-sqlite3';
import path from 'node:path';

// Singleton pattern to prevent multiple connections in hot-reload/serverless environments
let dbInstance: Database.Database | null = null;

interface DbConfig {
  path: string;
  timeout?: number;
}

export function getDatabase(config: DbConfig = { path: './data.db' }) {
  if (dbInstance) return dbInstance;

  const dbPath = path.resolve(process.cwd(), config.path);

  // Initialize DB
  dbInstance = new Database(dbPath, {
    // Log queries in dev, silent in prod
    verbose: process.env.NODE_ENV === 'development' ? console.log : undefined,
  });

  /**
   * CRITICAL OPTIMIZATIONS FOR CONCURRENCY
   */
  
  // 1. WAL Mode
  // Moves the rollback journal to a write-ahead log. 
  // Writers append to WAL; they don't lock the main DB file.
  // Readers check WAL first, then DB.
  dbInstance.pragma('journal_mode = WAL');

  // 2. Busy Timeout
  // Instead of throwing SQLITE_BUSY immediately, SQLite will retry
  // to acquire the lock for 5000ms. This is done in C, blocking the 
  // worker thread but saving the request.
  // 5000ms is usually enough to weather a massive spike in writes.
  dbInstance.pragma('busy_timeout = 5000');

  // 3. Synchronous NORMAL
  // In WAL mode, synchronous=NORMAL is safe. It syncs the WAL less often
  // than FULL, but still ensures consistency. The only risk is losing 
  // the very last transaction on a hard power failure (OS crash), 
  // but DB corruption is prevented.
  dbInstance.pragma('synchronous = NORMAL');
  
  // 4. Cache Size
  // Adjust based on your Edge container's RAM.
  // -64000 = 64MB (negative value means kilobytes)
  dbInstance.pragma('cache_size = -64000');

  return dbInstance;
}

// Utility for atomic transactions
// Wraps multiple write operations into a single lock acquisition
export const runTransaction = <T>(fn: (...args: any[]) => T) => {
  const db = getDatabase();
  return db.transaction(fn);
};

Usage Example

Here is how you use this in an API route (e.g., Express, Fastify, or Next.js API route). Notice we don't need manual retry logic.

// api/orders.ts
import { getDatabase, runTransaction } from './db.js';

const db = getDatabase();

const insertOrder = db.prepare(`
  INSERT INTO orders (id, customer_id, total, created_at)
  VALUES (?, ?, ?, ?)
`);

const insertItems = db.prepare(`
  INSERT INTO order_items (order_id, product_id, quantity)
  VALUES (?, ?, ?)
`);

export function createOrder(orderData: any) {
  // Use the transaction helper.
  // This acquires the write lock ONCE for the whole batch.
  const createOrderTransaction = runTransaction((data) => {
    const orderId = crypto.randomUUID();
    
    insertOrder.run(orderId, data.customerId, data.total, new Date().toISOString());

    for (const item of data.items) {
      insertItems.run(orderId, item.productId, item.quantity);
    }

    return { orderId, status: 'success' };
  });

  try {
    // If the DB is locked by another process, 'busy_timeout' handles the wait.
    // We only catch the error if the timeout (5s) is actually exceeded.
    return createOrderTransaction(orderData);
  } catch (err: any) {
    if (err.code === 'SQLITE_BUSY') {
      // Real monitoring log
      console.error('Database saturated: writes exceeding 5000ms timeout');
      throw new Error('Service temporarily overloaded');
    }
    throw err;
  }
}

Why This Works

1. journal_mode = WAL

In standard mode, a write operation requires reading the page, modifying it, writing the original to a rollback journal, and writing the new page to the database file. It locks everything.

In WAL mode, the new data is simply appended to the end of the -wal file.

  • Concurrency: Multiple readers can read the DB while a writer appends to the WAL.
  • Performance: Appending is faster than random-access writes.
  • Checkpointing: SQLite automatically merges the WAL back into the main DB (checkpointing) periodically (default is when WAL reaches 1000 pages).

2. busy_timeout = 5000

This is the "magic" fix. Without this, if two Node.js requests try to write at the exact same microsecond, one fails instantly.

With busy_timeout, the SQLite driver registers a callback. If it encounters a lock, it sleeps (usually for increasing intervals) and retries until the timer expires. Because SQLite writes are incredibly fast (microseconds), a 5-second timeout effectively queues your writes sequentially without the overhead of a Redis queue.

3. Transaction Batching

By wrapping your logic in db.transaction(), you ensure that the application acquires the write lock once, performs all updates, and releases it.

If you ran insertOrder.run() then insertItems.run() separately outside a transaction, you would acquire the lock, release it, and try to acquire it again. This creates gaps where other writers can intervene, increasing contention and the probability of SQLITE_BUSY.

Conclusion

SQLite is capable of handling significant concurrency if configured correctly. The defaults are designed for backward compatibility and embedded devices, not modern server-side environments.

By enforcing WAL mode and utilizing the native busy timeout, you transform SQLite from a fragile file store into a robust, high-throughput database engine suitable for the majority of Node.js edge applications. Don't migrate to Postgres until you have actually tuned the engine you are already using.