DynamoDB Single Table Design: Common Pitfalls and Hot Partitions

You followed the tutorials. You modeled your access patterns in Excel or NoSQL Workbench. You successfully implemented the Single Table Design (STD) pattern, stuffing your Users, Orders, and Inventory into one efficient table.

Then, production traffic hit.

Suddenly, your CloudWatch metrics are bleeding red with ProvisionedThroughputExceededException. Your latency spikes, but your table’s total provisioned capacity is barely touched. You aren't running out of total capacity; you are hitting a Hot Partition.

Single Table Design is often marketed as the "one true way" to use DynamoDB, but it introduces tight coupling and physical limitations that are rarely discussed in "Hello World" tutorials. This post dissects the mechanics of hot partitions, why blind adoption of STD causes them, and how to implement Write Sharding to solve high-concurrency contention.

The Root Cause: Physical vs. Logical Partitions

To fix a hot partition, you must understand what DynamoDB is doing under the hood. When you define a Partition Key (PK), you are defining a logical grouping of data. However, AWS stores this data on physical partitions (SSD-backed storage nodes).

DynamoDB passes your PK through a hashing algorithm. The resulting hash determines which physical partition holds the item.
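
Conceptually, the routing looks like the sketch below. This is purely illustrative: DynamoDB does not expose its internal hash function or partition map, so the MD5-and-modulo logic here is an assumption used only to show the idea of "key → hash → partition".

import { createHash } from "crypto";

// Illustrative only: DynamoDB's real hash and partition map are internal.
function pickPhysicalPartition(partitionKey: string, physicalPartitionCount: number): number {
  const digest = createHash("md5").update(partitionKey).digest();
  // Interpret the first 4 bytes of the digest as an unsigned integer
  return digest.readUInt32BE(0) % physicalPartitionCount;
}

// Every request for "ORG#123" hashes to the same value, so it always lands on
// the same physical partition, no matter how much capacity the table has.
console.log(pickPhysicalPartition("ORG#123", 10));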

The Limits of Physics

Regardless of your table settings (On-Demand or Provisioned), a single physical partition has hard limits:

  1. Throughput: Approximately 3,000 Read Capacity Units (RCUs) and 1,000 Write Capacity Units (WCUs) per second.
  2. Storage: Approximately 10GB of data.

The "Hot Key" Phenomenon

In a Single Table Design, developers often group related data under one PK to facilitate single-query retrieval (e.g., PK: ORG#123 to get all organization data).

If ORG#123 becomes a viral account or a high-frequency event logger, all read/write requests target the same hash. This directs all traffic to a single physical partition. Even if you provision 100,000 WCUs for the table, that specific partition allows only 1,000 WCUs/sec.

DynamoDB's Adaptive Capacity can eventually isolate a hot key onto its own partition, but it is reactive, not instant, and it never lets a single key exceed the per-partition limits. It cannot absorb a sudden 10x burst of traffic on one key.

The Solution: Write Sharding

If your access pattern requires writing to a single logical entity (like a global counter, a live event stream, or a voting system) at a rate higher than roughly 1,000 writes per second, you must decouple the logical Partition Key from the physical partition it maps to.

We achieve this via Write Sharding.

Instead of writing to CANDIDATE#A, we write to CANDIDATE#A#0, CANDIDATE#A#1, and so on. We append a random suffix to the Partition Key, distributing the traffic across multiple physical partitions.

The Code: TypeScript & AWS SDK v3

Below is a production-ready implementation of a Write Sharding strategy using the Scatter-Gather pattern for reads.

We will simulate a high-velocity voting system where thousands of users vote for a candidate concurrently.

Prerequisites

npm install @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb

1. The Write Path (Sharded)

We append a random integer (0 to N) to the partition key.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const client = new DynamoDBClient({ region: "us-east-1" });
const docClient = DynamoDBDocumentClient.from(client);

const TABLE_NAME = "ElectionData";
const SHARD_COUNT = 10; // Distributes load across roughly 10 physical partitions

/**
 * Increments a vote count using a random shard.
 * 
 * PK: CANDIDATE#{candidateId}#{0-9}
 * SK: METADATA
 */
async function addVote(candidateId: string): Promise<void> {
  // 1. Generate a random shard suffix (0 to SHARD_COUNT - 1)
  const shardId = Math.floor(Math.random() * SHARD_COUNT);
  
  // 2. Construct the sharded Partition Key
  const pk = `CANDIDATE#${candidateId}#${shardId}`;

  try {
    const command = new UpdateCommand({
      TableName: TABLE_NAME,
      Key: {
        PK: pk,
        SK: "METADATA"
      },
      UpdateExpression: "ADD votes :inc",
      ExpressionAttributeValues: {
        ":inc": 1,
      },
    });

    await docClient.send(command);
    // In a real app, you might emit a metric here
  } catch (error) {
    console.error(`Failed to record vote for ${candidateId} on shard ${shardId}`, error);
    throw error;
  }
}

2. The Read Path (Scatter-Gather)

To get the total votes, we cannot simply GetItem. We must query all shards in parallel and sum the results.

import { GetCommand } from "@aws-sdk/lib-dynamodb";

interface CandidateStats {
  candidateId: string;
  totalVotes: number;
}

/**
 * Retrieves votes from all shards and aggregates them.
 * Uses Promise.all for parallel execution (Scatter-Gather).
 */
async function getTotalVotes(candidateId: string): Promise<CandidateStats> {
  // 1. Create an array of promises for every shard
  const shardPromises = Array.from({ length: SHARD_COUNT }).map(async (_, index) => {
    const pk = `CANDIDATE#${candidateId}#${index}`;
    
    try {
      const command = new GetCommand({
        TableName: TABLE_NAME,
        Key: {
          PK: pk,
          SK: "METADATA"
        }
      });
      
      const response = await docClient.send(command);
      return (response.Item?.votes as number) || 0;
    } catch (error) {
      // Log error but return 0 to allow partial degradation, or throw depending on requirements
      console.warn(`Failed to read shard ${index} for ${candidateId}`, error);
      return 0;
    }
  });

  // 2. Wait for all reads to complete
  const results = await Promise.all(shardPromises);

  // 3. Aggregate the results
  const totalVotes = results.reduce((sum, count) => sum + count, 0);

  return {
    candidateId,
    totalVotes
  };
}
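
To tie the two paths together, here is a minimal usage sketch; the candidate ID and vote count are made up for illustration.

// Minimal usage sketch: fire a burst of concurrent votes, then aggregate them.
async function simulateBurst(candidateId: string, voteCount: number): Promise<void> {
  // Each addVote call picks its own random shard, so the burst spreads
  // across all SHARD_COUNT partition keys.
  await Promise.all(Array.from({ length: voteCount }, () => addVote(candidateId)));

  const stats = await getTotalVotes(candidateId);
  console.log(`Candidate ${stats.candidateId}: ${stats.totalVotes} votes`);
}

simulateBurst("A", 1_000).catch(console.error);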

Deep Dive: Why This Works

Breaking the Hash

By appending #0 through #9 to the PK, the hashing algorithm sees 10 distinct keys. Statistically, these keys will land on different physical partitions (or at least be spread out enough to utilize the throughput of the entire table rather than a single node).

Throughput Math

If one partition supports 1,000 WCUs, splitting the traffic 10 ways theoretically raises your ceiling to 10,000 WCUs for that specific logical entity.

Cost Implications

This pattern is not free.

  • Write Cost: Remains roughly the same (1 WCU per vote).
  • Read Cost: Increases significantly. Reading the total count now consumes at least 1 RCU per shard. With 20 shards, a single "Get Count" operation costs 20 RCUs for strongly consistent reads (or 10 for eventually consistent reads).

Common Pitfalls and Edge Cases

While Write Sharding solves the throughput issue, it introduces complexity that leads to other common STD errors.

1. The GSI Hotspot (Hidden Error)

You might shard your main table PK, but what about your Global Secondary Indexes (GSI)? If your item looks like this:

  • PK: CANDIDATE#A#3 (Sharded - Good)
  • GSI1_PK: ELECTION#2024 (Not Sharded - Bad)

Every time you write to the main table, DynamoDB asynchronously updates the GSI. If all 10,000 votes map to GSI1_PK = ELECTION#2024, your GSI partition will throttle.

The Catch: GSI throttling can cause write rejections on your main table. You must shard high-velocity GSI keys as well.
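
A sketch of what this could look like on the write path, reusing the building blocks from addVote above. The GSI1_PK / GSI1_SK attribute names and the electionId parameter are assumptions for illustration.

/**
 * Sketch: maintain a sharded GSI key alongside the sharded table key.
 * Assumes a GSI whose partition key attribute is GSI1_PK (with a GSI1_SK sort key).
 */
async function addVoteWithShardedGsi(candidateId: string, electionId: string): Promise<void> {
  const shardId = Math.floor(Math.random() * SHARD_COUNT);

  await docClient.send(new UpdateCommand({
    TableName: TABLE_NAME,
    Key: {
      PK: `CANDIDATE#${candidateId}#${shardId}`,
      SK: "METADATA",
    },
    // Shard the GSI key too, so the index partition does not become the new bottleneck
    UpdateExpression: "ADD votes :inc SET GSI1_PK = :gsiPk, GSI1_SK = :gsiSk",
    ExpressionAttributeValues: {
      ":inc": 1,
      ":gsiPk": `ELECTION#${electionId}#${shardId}`,
      ":gsiSk": `CANDIDATE#${candidateId}`,
    },
  }));
}

Note that queries against the GSI now need the same scatter-gather treatment as the main table.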

2. Large Item Sizes

Single Table Design encourages "pre-joining" data (storing complex objects). If your aggregated item size exceeds 400KB, DynamoDB rejects the write. But even items approaching 10KB cause hot partitions faster, because write throughput is consumed in 1KB increments: a 10KB item burns 10 WCUs per write, so a partition's 1,000 WCU ceiling is reached at just 100 writes per second.

Fix: Keep items meant for high-velocity writes small. Offload large payloads to S3 and store the URL, or break the item into multiple rows (Item Collections).
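
A sketch of the S3 offload approach, assuming an @aws-sdk/client-s3 dependency; the bucket name, key layout, and attribute name are hypothetical.

import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" });
const PAYLOAD_BUCKET = "election-payloads"; // hypothetical bucket name

/**
 * Sketch: offload a large payload to S3 and store only a small pointer in DynamoDB.
 */
async function savePayloadPointer(candidateId: string, payload: object): Promise<void> {
  const s3Key = `candidates/${candidateId}/profile.json`;

  // 1. Write the heavy payload to S3
  await s3.send(new PutObjectCommand({
    Bucket: PAYLOAD_BUCKET,
    Key: s3Key,
    Body: JSON.stringify(payload),
    ContentType: "application/json",
  }));

  // 2. Store only the pointer item in DynamoDB, keeping the item tiny
  await docClient.send(new UpdateCommand({
    TableName: TABLE_NAME,
    Key: { PK: `CANDIDATE#${candidateId}`, SK: "PROFILE" },
    UpdateExpression: "SET payloadS3Key = :key",
    ExpressionAttributeValues: { ":key": s3Key },
  }));
}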

3. Calculating Shard Count

How many shards do you need? Formula: estimated peak writes per second / 1,000. If you expect 50,000 writes/sec, you need at least 50 shards. Always add a safety buffer (e.g., 60-70 shards) to account for uneven hash distribution.
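
As a quick sanity check, the sizing logic fits in a few lines. The 30% buffer below is an arbitrary example, not a recommendation.

// Helper for sizing shards, assuming ~1,000 WCU/sec per physical partition
// and a configurable safety buffer for uneven hash distribution.
function calculateShardCount(expectedWritesPerSecond: number, bufferFactor = 1.3): number {
  const PARTITION_WCU_LIMIT = 1000;
  return Math.ceil((expectedWritesPerSecond / PARTITION_WCU_LIMIT) * bufferFactor);
}

console.log(calculateShardCount(50_000)); // => 65 shards with a 30% buffer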

Conclusion

Single Table Design is powerful, but it is not a "set it and forget it" architecture. It shifts complexity from the query language (SQL joins) to the application code (Sharding and Scatter-Gather).

Blindly grouping data under a single PK works for low-traffic enterprise apps, but it is catastrophic for high-scale, event-driven architectures. Use observability tools (CloudWatch Contributor Insights) to identify hot keys early. When you see the heat, apply Write Sharding to break through the physical limits of the partition.