
OpenAI to Anthropic Migration Guide: SDK Refactors & Key Gotchas

The migration from GPT-4 to Claude 3.5 Sonnet (or Opus) is becoming a common architectural shift for engineering teams seeking better code generation performance and reduced "laziness." However, treating Claude as a drop-in replacement for OpenAI is a recipe for runtime errors.

While the conceptual logic of Large Language Models (LLMs) remains similar, the SDK implementations diverge significantly. Anthropic’s API enforces stricter validation on prompt structure and token limits compared to OpenAI’s more permissive implementation.

This guide provides a rigorous technical breakdown of the differences between the openai and anthropic Python SDKs, specifically focusing on the Chat Completions (Messages) API. We will analyze the root causes of incompatibility and implement a unified adapter pattern.

The Core Architecture Divergence

The primary friction point isn't just parameter naming—it is the topology of the message payload.

OpenAI's Chat Completions API treats the system prompt as just another message in the history list. You can place system messages anywhere in the list, even interleaved with the conversation (though index 0 is best practice), and the model attempts to adhere to them.

Anthropic’s Messages API treats the system prompt as a distinct, top-level parameter. It enforces a strict alternation of user and assistant roles within the messages array. Injecting a system role into the messages list in Anthropic’s SDK will result in a 400 BadRequestError.

Furthermore, Anthropic requires an explicit max_tokens definition, whereas OpenAI infers a default if omitted.

1. SDK Initialization and Client Setup

Both libraries now adhere to modern Python standards, utilizing type hints and Pydantic under the hood. However, they read their API keys from different environment variables.

OpenAI Setup

OpenAI looks for OPENAI_API_KEY.

import os
from openai import OpenAI

# OpenAI automatically looks for os.environ.get("OPENAI_API_KEY")
openai_client = OpenAI()

Anthropic Setup

Anthropic looks for ANTHROPIC_API_KEY. Note that the import structure is distinct.

import os
from anthropic import Anthropic

# Anthropic automatically looks for os.environ.get("ANTHROPIC_API_KEY")
anthropic_client = Anthropic()

2. Refactoring the Request Payload

This is where most migration errors occur. Below, we map a standard OpenAI request to its Anthropic equivalent.

The OpenAI Approach (Legacy & Current)

In the OpenAI SDK, the system instruction is part of the messages list.

response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a JSON-only parser."},
        {"role": "user", "content": "Extract data from this email."}
    ],
    temperature=0.7,
    # max_tokens is optional here
)

The Anthropic Approach (The Fix)

To migrate this, you must extract the system message. Additionally, you must provide max_tokens. While the raw API historically used max_tokens_to_sample, the modern Python SDK normalizes this to max_tokens, but it remains a mandatory argument.

response = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,  # REQUIRED field, unlike OpenAI
    system="You are a JSON-only parser.", # Top-level parameter
    messages=[
        # NO "system" role allowed here.
        {"role": "user", "content": "Extract data from this email."}
    ],
    temperature=0.7,
)

3. Handling Response Objects

The structure of the returned object differs significantly. OpenAI nests content deeply within choices, reflecting its history of returning multiple completion candidates. Anthropic returns a cleaner, flat content list.

OpenAI Response Parsing

# OpenAI returns a ChatCompletion object
content = response.choices[0].message.content
print(content)

Anthropic Response Parsing

Anthropic returns a Message object containing a list of ContentBlock items. This allows for multi-modal responses (text mixed with tool use), but for standard chat, you access the first block's text.

# Anthropic returns a Message object
content = response.content[0].text
print(content)
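
If the model ever returns more than one content block (for example, when tool use is enabled), indexing content[0] can silently drop text. A defensive sketch, assuming each block exposes a type attribute as in the current SDK:

# Join only the text blocks; other block types (e.g., tool_use) are skipped
text = "".join(block.text for block in response.content if block.type == "text")
print(text)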

4. Implementation: A Unified LLM Adapter

To prevent vendor lock-in and simplify your codebase, do not sprinkle if/else statements throughout your application. Implement a unified Adapter pattern.

The following code is a compact abstraction that handles the conversion logic automatically.

from typing import List, Dict
from openai import OpenAI
from anthropic import Anthropic

class UnifiedLLMClient:
    def __init__(self):
        self.openai = OpenAI()
        self.anthropic = Anthropic()

    def generate(
        self, 
        provider: str, 
        model: str, 
        messages: List[Dict[str, str]], 
        max_tokens: int = 1024,
        temperature: float = 0.7
    ) -> str:
        """
        Normalized generation method that handles SDK discrepancies.
        """
        if provider == "openai":
            return self._call_openai(model, messages, max_tokens, temperature)
        elif provider == "anthropic":
            return self._call_anthropic(model, messages, max_tokens, temperature)
        else:
            raise ValueError(f"Unsupported provider: {provider}")

    def _call_openai(self, model, messages, max_tokens, temperature) -> str:
        # OpenAI handles system messages inside the messages list natively
        response = self.openai.chat.completions.create(
            model=model,
            messages=messages,
            max_tokens=max_tokens,
            temperature=temperature
        )
        return response.choices[0].message.content

    def _call_anthropic(self, model, messages, max_tokens, temperature) -> str:
        # 1. Extract System Message
        system_prompt = None
        filtered_messages = []
        
        for msg in messages:
            if msg["role"] == "system":
                system_prompt = msg["content"]
            else:
                filtered_messages.append(msg)

        # 2. Call Anthropic API
        # Note: 'system' is passed as a named argument, not in the list
        kwargs = {
            "model": model,
            "max_tokens": max_tokens,
            "messages": filtered_messages,
            "temperature": temperature
        }
        
        if system_prompt:
            kwargs["system"] = system_prompt

        response = self.anthropic.messages.create(**kwargs)
        
        # 3. Parse text content
        return response.content[0].text

# --- Usage Example ---

client = UnifiedLLMClient()

# Shared message format (OpenAI Style)
chat_history = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Hello World in Python."}
]

# Switch providers instantly
try:
    print("--- OpenAI Output ---")
    print(client.generate("openai", "gpt-4o", chat_history))
    
    print("\n--- Anthropic Output ---")
    print(client.generate("anthropic", "claude-3-5-sonnet-20240620", chat_history))
except Exception as e:
    print(f"Error: {e}")

Deep Dive: The "Prefill" Feature (Anthropic Exclusive)

One distinct advantage of the Anthropic SDK is "Prefilling." This is a feature OpenAI does not natively support in the same way.

In the Anthropic messages list, if the last message has the role assistant, the model will continue generating from that point. This is incredibly powerful for forcing output formats (e.g., forcing a { to start a JSON response).

# Forcing JSON output in Anthropic via Prefill
response = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "List 3 colors in JSON."},
        {"role": "assistant", "content": "{"} # The Prefill
    ]
)

# Output will effectively attach to the prefill
# Result: "colors": ["red", "blue", "green"] }
full_json = "{" + response.content[0].text

If you attempt this with OpenAI, the model will typically treat the trailing assistant message as a completed turn rather than continuing it, because OpenAI expects every assistant message in the history to be a finished response.

Common Pitfalls and Edge Cases

1. max_tokens vs. max_tokens_to_sample

In older versions of the Anthropic API documentation and raw HTTP requests, the parameter was named max_tokens_to_sample.

  • The Fix: Use max_tokens with the Messages API in the current Python SDK. Do not rely on outdated curl examples that still show max_tokens_to_sample.
  • The Gotcha: If you omit this parameter, OpenAI defaults to the model's remaining context window; Anthropic rejects the request, as the sketch below illustrates.
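
The failure mode is easy to reproduce. Depending on your SDK version, the missing argument either fails client-side (the method treats max_tokens as required) or is rejected by the API with a 400; the hypothetical snippet below catches both cases.

import anthropic

try:
    anthropic_client.messages.create(
        model="claude-3-5-sonnet-20240620",
        messages=[{"role": "user", "content": "Hello"}],  # max_tokens omitted on purpose
    )
except TypeError as exc:
    # Raised client-side when the SDK treats max_tokens as a required argument
    print(f"max_tokens is mandatory: {exc}")
except anthropic.BadRequestError as exc:
    # Raised if the request reaches the API without max_tokens
    print(f"API rejected the request: {exc}")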

2. Strict User/Assistant Alternation

OpenAI is lenient if you send multiple user messages in a row. Anthropic is strict: the conversation must begin with a user message, and roles must alternate user -> assistant -> user.

If you have a system where a user sends two messages before the bot replies, you must concatenate them into a single user message content block before sending the request to Anthropic.
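
A minimal pre-processing helper for that case (a sketch that assumes each message carries plain string content):

def merge_consecutive_roles(messages):
    """Collapse consecutive messages that share a role into a single message."""
    merged = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append(dict(msg))
    return merged

# Two consecutive user turns become one before the request is sent to Anthropic
history = [
    {"role": "user", "content": "Here is the email thread."},
    {"role": "user", "content": "Extract the sender's address."},
]
print(merge_consecutive_roles(history))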

3. Error Handling Hierarchy

The exception classes live in different namespaces, even though their names are parallel. If you have try/except blocks wrapping your calls, you need to update the imports.

  • OpenAI: openai.APIConnectionError, openai.RateLimitError
  • Anthropic: anthropic.APIConnectionError, anthropic.RateLimitError

It is best practice to import the specific exceptions from the respective libraries to avoid silent failures during network blips.
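
As a minimal sketch of the updated handling on the Anthropic side (the model name and message are placeholders):

import anthropic

try:
    response = anthropic_client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Ping"}],
    )
except anthropic.RateLimitError as exc:
    # 429: back off and retry according to your own retry policy
    print(f"Rate limited: {exc}")
except anthropic.APIConnectionError as exc:
    # Network-level failure: log and surface it rather than swallowing it
    print(f"Connection error: {exc}")

# The OpenAI-side block is structurally identical, catching
# openai.RateLimitError and openai.APIConnectionError instead.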

Conclusion

Migrating to the Anthropic SDK requires more than changing the endpoint URL. It requires respecting a stricter separation of concerns between system instructions and conversation history. By implementing an Adapter pattern and handling the mandatory max_tokens requirement, you can leverage Claude's capabilities without destabilizing your production environment.