
Event Sourcing Schema Evolution: Implementing Upcasters for Breaking Changes

The immutable nature of an Event Store is its greatest strength and its most significant operational liability. When you write UserRegistered to your append-only log, it is written in stone. Six months later, when business requirements force you to split fullName into firstName and lastName, you create a mismatch: your historical data adheres to Schema A, but your current codebase expects Schema B.

When you redeploy and trigger a replay (projection rebuild), your system will crash. The deserializer, expecting the new field structure, will encounter the old JSON payload and throw a strict mapping exception.

You cannot migrate the data in the database (that violates the immutability principle of Event Sourcing). You must migrate the data on-the-fly as it flows from the storage engine to your application code. This pattern is called Upcasting.

The Root Cause: Deserialization Asymmetry

The problem stems from the decoupling of data storage and data interpretation.

  1. Time T1: Code version 1.0 serializes an object to JSON: {"name": "John Doe", "v": 1}.
  2. Time T2: Code version 2.0 changes the class definition to require firstName and lastName.
  3. Replay: The deserializer reads the T1 JSON. It looks for firstName in the JSON, finds nothing, and throws an exception (assuming strict validation) or sets fields to null (corrupting domain state).

Standard serialization libraries (Jackson in Java, Newtonsoft/System.Text.Json in C#) attempt to map JSON fields directly to POJO/POCO properties. When the structural contract breaks, the mapping fails. To fix this, we need to intervene within the deserialization pipeline, operating on the Intermediate Representation (IR)—the raw JSON tree—before it is mapped to a strongly typed object.
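This failure mode is easy to reproduce. The sketch below (a hypothetical demo class, assuming Jackson's default configuration, where FAIL_ON_UNKNOWN_PROPERTIES is enabled) shows a V2-shaped record choking on a V1 payload:

```java
import com.fasterxml.jackson.databind.ObjectMapper;

// The V2 shape the current codebase expects
record UserRegisteredV2(String id, String firstName, String lastName) {}

public class AsymmetryDemo {
    public static void main(String[] args) {
        // A V1 payload as it sits in the append-only log
        String v1Json = "{\"id\":\"user-123\",\"fullName\":\"John Doe\"}";
        try {
            // Default Jackson rejects the unknown legacy field "fullName"
            new ObjectMapper().readValue(v1Json, UserRegisteredV2.class);
        } catch (Exception e) {
            System.out.println("Replay failed: " + e.getClass().getSimpleName());
        }
    }
}
```

Note that disabling FAIL_ON_UNKNOWN_PROPERTIES merely trades the crash for null firstName/lastName fields, which is the state-corruption case described above.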

The Solution: A Chain of Upcasters

An Upcaster is a pure function that transforms a raw generic document (like JsonNode or JObject) representing an event of revision N into a document representing revision N+1.

By chaining these upcasters, you can take an event generated in 2020 (v1) and run it through a pipeline (v1 -> v2 -> v3) until it matches the class definition in your 2024 codebase.

The Implementation

We will implement a robust Upcasting engine using Java 21 and Jackson. We assume the Event Store provides an envelope containing metadata (the event type and its schema revision) and the payload.

1. The Scenario

We are refactoring a UserRegistered event.

  • Revision 1 (Legacy): Single fullName field.
  • Revision 2 (Current): Split firstName and lastName.

2. The Upcasting Contract

First, define the contract for an Upcaster. It does not work on the Java class; it works on the JsonNode.

package com.example.eventsourcing.upcasting;

import com.fasterxml.jackson.databind.JsonNode;

/**
 * Transforms a raw JSON event payload from one revision to the next.
 */
public interface EventUpcaster {
    
    // The event type this upcaster applies to (e.g., "UserRegistered")
    String supportsType();

    // The revision this upcaster transitions FROM (e.g., "1.0")
    String sourceRevision();

    // The revision this upcaster transitions TO (e.g., "2.0")
    String targetRevision();

    // The transformation logic
    JsonNode upcast(JsonNode input);
}

3. The Concrete Upcaster

Here is the logic to handle the breaking change. We parse the old field, transform the data, and modify the JSON structure safely.

package com.example.eventsourcing.upcasting.impl;

import com.example.eventsourcing.upcasting.EventUpcaster;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class UserRegisteredSplitNameUpcaster implements EventUpcaster {

    @Override
    public String supportsType() {
        return "UserRegistered";
    }

    @Override
    public String sourceRevision() {
        return "1.0";
    }

    @Override
    public String targetRevision() {
        return "2.0";
    }

    @Override
    public JsonNode upcast(JsonNode input) {
        if (!(input instanceof ObjectNode objectNode)) {
            return input;
        }

        // 1. Extract the old data
        if (objectNode.has("fullName")) {
            String fullName = objectNode.get("fullName").asText("");
            
            // 2. Perform the logic (Naively splitting for demonstration)
            String[] parts = fullName.split(" ", 2);
            String firstName = parts.length > 0 ? parts[0] : "";
            String lastName = parts.length > 1 ? parts[1] : "";

            // 3. Mutate the structure to match the new Schema
            objectNode.put("firstName", firstName);
            objectNode.put("lastName", lastName);
            
            // 4. Remove the obsolete field to ensure strict schema compliance
            objectNode.remove("fullName");
        }

        return objectNode;
    }
}
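A quick sanity check confirms the transformation on a raw tree. The snippet below is a hypothetical smoke test, with the split-name logic from Section 3 inlined so it is self-contained:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class UpcasterSmokeTest {

    // Condensed copy of the split-name logic from Section 3
    static JsonNode upcast(JsonNode input) {
        if (input instanceof ObjectNode obj && obj.has("fullName")) {
            String[] parts = obj.get("fullName").asText("").split(" ", 2);
            obj.put("firstName", parts.length > 0 ? parts[0] : "");
            obj.put("lastName", parts.length > 1 ? parts[1] : "");
            obj.remove("fullName");
        }
        return input;
    }

    public static void main(String[] args) {
        // Build a V1 payload as a raw tree
        ObjectNode v1 = new ObjectMapper().createObjectNode()
                .put("id", "user-123")
                .put("fullName", "Alice Wonderland");

        JsonNode v2 = upcast(v1);

        System.out.println(v2.get("firstName").asText()); // Alice
        System.out.println(v2.get("lastName").asText());  // Wonderland
        System.out.println(v2.has("fullName"));           // false
    }
}
```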

4. The Upcaster Chain (The Engine)

We need a service that sits between the database fetch and the deserializer. This service looks at the incoming event's revision and applies the necessary upcasters in sequence.

package com.example.eventsourcing.serialization;

import com.example.eventsourcing.upcasting.EventUpcaster;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class EventSerializer {

    private final ObjectMapper objectMapper;
    private final Map<String, List<EventUpcaster>> upcasters;

    public EventSerializer(ObjectMapper objectMapper, List<EventUpcaster> upcasterList) {
        this.objectMapper = objectMapper;
        // Group upcasters by Event Type for O(1) lookup
        this.upcasters = upcasterList.stream()
                .collect(Collectors.groupingBy(EventUpcaster::supportsType));
    }

    /**
     * Deserializes a raw payload, applying upcasters if necessary.
     *
     * @param rawJson The JSON string from the DB.
     * @param eventType The event type name.
     * @param currentRevision The revision of the event as stored in DB.
     * @param targetClass The class we want to instantiate.
     */
    public <T> T deserialize(String rawJson, String eventType, String currentRevision, Class<T> targetClass) {
        try {
            // 1. Parse to Intermediate Representation (IR)
            JsonNode node = objectMapper.readTree(rawJson);
            
            // 2. Determine if upcasting is needed
            List<EventUpcaster> chain = upcasters.getOrDefault(eventType, List.of());
            
            String activeRevision = currentRevision;
            
            // 3. Apply chain sequentially
            // Note: In a real system, you would topologically sort these or 
            // ensure the list is ordered by sourceRevision -> targetRevision.
            for (EventUpcaster upcaster : chain) {
                if (upcaster.sourceRevision().equals(activeRevision)) {
                    node = upcaster.upcast(node);
                    activeRevision = upcaster.targetRevision();
                }
            }

            // 4. Map IR to POJO
            return objectMapper.treeToValue(node, targetClass);

        } catch (Exception e) {
            throw new RuntimeException("Failed to deserialize and upcast event", e);
        }
    }
}
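One way to remove the ordering caveat from step 3 is to index upcasters by their source revision and walk the chain until no further step applies, rather than relying on list order. A minimal, self-contained sketch of that walk (using a stand-in Step record instead of the full EventUpcaster contract):

```java
import java.util.HashMap;
import java.util.Map;

public class RevisionWalk {

    // Stand-in for one upcasting step: sourceRevision -> targetRevision
    record Step(String source, String target) {}

    static String walk(String startRevision, Map<String, Step> bySource) {
        String active = startRevision;
        while (bySource.containsKey(active)) {
            Step step = bySource.get(active);
            // In the real engine: node = step.upcast(node);
            active = step.target();
        }
        return active;
    }

    public static void main(String[] args) {
        // Index by source revision; registration order no longer matters
        Map<String, Step> bySource = new HashMap<>();
        bySource.put("2.0", new Step("2.0", "3.0")); // deliberately registered out of order
        bySource.put("1.0", new Step("1.0", "2.0"));

        System.out.println(walk("1.0", bySource)); // 3.0
    }
}
```

A production version should also guard against revision cycles (e.g., cap the number of iterations), since a misconfigured chain would otherwise loop forever.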

5. Usage Example

Here is how the modern code uses the serializer. Note that the UserRegistered V2 record has no knowledge of fullName.

package com.example.domain;

import com.example.eventsourcing.serialization.EventSerializer;
import com.example.eventsourcing.upcasting.impl.UserRegisteredSplitNameUpcaster;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

// The Modern Schema (V2)
record UserRegistered(String id, String firstName, String lastName) {}

public class App {
    public static void main(String[] args) {
        // Setup
        ObjectMapper mapper = new ObjectMapper();
        var upcaster = new UserRegisteredSplitNameUpcaster();
        var serializer = new EventSerializer(mapper, List.of(upcaster));

        // Simulated Historical Data (V1 Schema stored in DB)
        String historicalEventJson = """
            {
                "id": "user-123",
                "fullName": "Alice Wonderland"
            }
        """;
        String historicalRevision = "1.0";

        // Execution
        UserRegistered event = serializer.deserialize(
            historicalEventJson, 
            "UserRegistered", 
            historicalRevision, 
            UserRegistered.class
        );

        // Verification
        System.out.println("First Name: " + event.firstName()); // Output: Alice
        System.out.println("Last Name: " + event.lastName());   // Output: Wonderland
    }
}

Why This Works

  1. Deferred Binding: We manipulate the JSON tree (JsonNode), which is cheap compared to instantiating full objects. The heavy reflection of the final treeToValue binding happens only after the structure has been corrected.
  2. Encapsulation: The domain model (UserRegistered record) remains pure. It contains no annotations or logic regarding legacy formats. It represents the current truth.
  3. Intermediate Representation: By targeting the IR (JSON) rather than the object, we can handle deletions, renames, and type changes (e.g., String to Integer) that would otherwise make object instantiation impossible.
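To illustrate point 3, here is a hedged sketch of an upcaster handling a type change. It assumes a hypothetical event whose amount field was stored as a JSON string in revision 1 but is expected as a number from revision 2 onward; attempting this at the object level would fail before any conversion code could run:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class AmountTypeChangeDemo {

    // Revision 1 stored "amount" as a JSON string; revision 2 expects a number
    static JsonNode upcast(JsonNode input) {
        if (input instanceof ObjectNode obj
                && obj.has("amount")
                && obj.get("amount").isTextual()) {
            obj.put("amount", Long.parseLong(obj.get("amount").asText()));
        }
        return input;
    }

    public static void main(String[] args) throws Exception {
        JsonNode v1 = new ObjectMapper().readTree("{\"amount\": \"4200\"}");
        JsonNode v2 = upcast(v1);
        System.out.println(v2.get("amount").isNumber()); // true
    }
}
```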

Conclusion

Schema evolution in Event Sourcing is not optional; it is a lifecycle requirement. While strategies like "Weak Schema" (making everything nullable) might seem tempting, they push technical debt into your domain logic.

Implementing the Upcaster pattern creates an anti-corruption layer between your immutable history and your evolving code. The database stays immutable, the domain model stays clean, and the complexity of time-traveling data is contained entirely within the serialization infrastructure.