Half my consulting calls this quarter started the same way: "We have been on the OpenAI SDK for two years. Leadership wants to evaluate Anthropic. What does the migration actually look like?"
This is the answer. SDK-level. Honest. Including the parts that take longer than the marketing material says.
The auto-translator that handles 80 percent of the simple cases is at /tools/openai-to-anthropic.
The five concrete differences
These are the only things you actually have to change. Everything else is the same.
1. Client construction
```python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After
from anthropic import Anthropic
client = Anthropic(api_key="sk-ant-...")
```
Two lines. Done.
2. Endpoint call
```python
# Before
client.chat.completions.create(model="gpt-5", messages=[...])

# After
client.messages.create(model="claude-opus-4-7", max_tokens=1024, messages=[...])
```
Different method, same shape. Mostly.
3. System prompt placement
This is the one that breaks naive translators.
```python
# OpenAI: system goes inside messages
messages = [
    {"role": "system", "content": "You are senior."},
    {"role": "user", "content": "..."},
]

# Anthropic: system is a top-level parameter
client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system="You are senior.",
    messages=[
        {"role": "user", "content": "..."},
    ],
)
```
The auto-translator handles this. Hand-rolled migrations forget it about half the time.
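If you are hand-rolling the migration anyway, the fix is a small shim at the boundary. This is a minimal sketch; `split_system` is a hypothetical helper name, and it assumes the OpenAI-style message dicts shown above.

```python
def split_system(messages):
    """Split OpenAI-style messages into (system, messages) for Anthropic.

    Hypothetical helper: pulls any role == "system" entries out of the
    list and joins them into the top-level system string, leaving the
    rest of the conversation untouched.
    """
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), rest
```

Call it once per request at the translation layer, then pass `system=` and `messages=` separately to `client.messages.create`.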
4. max_tokens is required
```python
# OpenAI: optional
client.chat.completions.create(model="gpt-5", messages=[...])

# Anthropic: required
client.messages.create(model="claude-opus-4-7", messages=[...], max_tokens=1024)
```
Anthropic will reject the call if max_tokens is missing. Pick a sane default, write it down, ship.
5. Response shape
```python
# OpenAI
text = response.choices[0].message.content

# Anthropic
text = response.content[0].text
```
If you have any code that walks the response object (logging, tracing, telemetry), this is the line you have to update.
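During the shadow and feature-flag weeks, both response shapes will flow through the same logging code. A sketch of a transition-period adapter (`response_text` is a hypothetical name; it keys off the attribute shapes shown above):

```python
def response_text(response):
    """Return the completion text from either SDK's response object.

    Hypothetical adapter for the transition period, when both SDKs
    are live behind a feature flag. Assumes single-choice, text-only
    responses; tool-use content blocks need their own handling.
    """
    if hasattr(response, "choices"):           # OpenAI shape
        return response.choices[0].message.content
    return response.content[0].text            # Anthropic shape
```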
Model mapping
Not all GPT models map cleanly to a single Claude model. This is the table I use:
| OpenAI model | Anthropic equivalent | When this is wrong |
|---|---|---|
| gpt-5 | claude-opus-4-7 | If you need pure speed over quality, drop to Sonnet |
| gpt-5-mini | claude-sonnet-4-6 | Almost always right |
| gpt-5-nano | claude-haiku-4-5 | Almost always right |
| gpt-4o | claude-sonnet-4-6 | If you used 4o for vision, Sonnet has vision too |
| gpt-4o-mini | claude-haiku-4-5 | Almost always right |
| o1 | claude-opus-4-7 + extended thinking | Anthropic does not have a separate reasoning model; use extended thinking |
| o1-mini | claude-sonnet-4-6 + extended thinking | Same |
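The table translates directly into a lookup for the routing layer. A sketch, using the default column only; the "when this is wrong" column is a judgment call your code cannot make for you:

```python
# Default model mapping from the table above. Adjust per task type.
MODEL_MAP = {
    "gpt-5": "claude-opus-4-7",
    "gpt-5-mini": "claude-sonnet-4-6",
    "gpt-5-nano": "claude-haiku-4-5",
    "gpt-4o": "claude-sonnet-4-6",
    "gpt-4o-mini": "claude-haiku-4-5",
    "o1": "claude-opus-4-7",         # enable extended thinking
    "o1-mini": "claude-sonnet-4-6",  # enable extended thinking
}

# Reasoning models have no 1:1 equivalent; flag them so the call
# site knows to turn on extended thinking.
NEEDS_EXTENDED_THINKING = {"o1", "o1-mini"}
```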
The benchmark I ran across 30 production tasks is in this post. Short version: the models win different jobs. Map by task type, not name.
The harder parts
These are the parts the auto-translator cannot fix. Budget time.
Tool use
OpenAI's tools and Anthropic's tools look similar but use different schemas. OpenAI uses JSON Schema directly. Anthropic uses an input_schema field with similar but not identical conventions. Tool result handling is also structurally different: OpenAI returns tool_calls on the assistant message; Anthropic returns tool_use content blocks.
If you have non-trivial tool use, expect a half-day to rewrite the orchestration layer.
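The schema conversion itself is mechanical: OpenAI nests the JSON Schema under `function.parameters`, Anthropic puts it at the top level as `input_schema`. A minimal sketch (`convert_tool` is a hypothetical name; it covers the common case, not every OpenAI-specific option such as strict mode):

```python
def convert_tool(openai_tool):
    """Convert one OpenAI function-tool definition to Anthropic's shape.

    Sketch only: moves function.parameters to top-level input_schema
    and drops the OpenAI-only wrapper fields.
    """
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "input_schema": fn["parameters"],
    }
```

The orchestration loop (executing tools, feeding results back) is the half-day part; this conversion is the easy five minutes of it.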
Streaming
Streaming response shapes are different. OpenAI streams chunk.choices[0].delta.content. Anthropic streams content block deltas with explicit start, delta, and stop events. Your UI code will need updates.
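To make the event structure concrete, here is a sketch that accumulates text from Anthropic-style event dicts. It operates on plain dicts so the shape is visible; in practice the SDK's streaming helper wraps this for you:

```python
def accumulate_text(events):
    """Accumulate streamed text from Anthropic-style event dicts.

    Anthropic emits explicit content_block_start / content_block_delta /
    content_block_stop events; text arrives in text_delta payloads.
    Sketch over plain dicts to show the shape your UI code consumes.
    """
    parts = []
    for event in events:
        if (event["type"] == "content_block_delta"
                and event["delta"]["type"] == "text_delta"):
            parts.append(event["delta"]["text"])
    return "".join(parts)
```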
Structured outputs
OpenAI has a response_format parameter that accepts a JSON Schema. Anthropic does not. You get structured outputs via tool use (cleanest) or by instructing the model in the prompt (less reliable). If you have anything in production using response_format, plan a full rewrite for those paths.
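The tool-use route looks like this: define a single tool whose `input_schema` is your desired JSON Schema, then force the model to call it with `tool_choice`. A sketch that builds the request kwargs (`structured_request` and `emit_result` are hypothetical names; the parsed object comes back in the `tool_use` block's `input` field):

```python
def structured_request(model, prompt, schema, tool_name="emit_result"):
    """Build Anthropic request kwargs that force a JSON-shaped answer.

    Sketch: wraps the target JSON Schema as a tool's input_schema and
    forces the model to call that tool via tool_choice. max_tokens is
    a placeholder default.
    """
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "name": tool_name,
            "description": "Return the result in the required structure.",
            "input_schema": schema,
        }],
        "tool_choice": {"type": "tool", "name": tool_name},
    }
```

Then `client.messages.create(**structured_request(...))` and read the structured data off the `tool_use` content block.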
Caching
Anthropic's prompt caching is good, but the API is different. You mark a content block with cache_control: {type: "ephemeral"} and the system caches everything up to that block. If you were not using caching on OpenAI, this is a free 30 to 70 percent cost reduction for repeated context. Worth setting up.
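In practice that means turning the system prompt into a list of content blocks and marking the last stable one. A sketch of the request shape (`LONG_STABLE_INSTRUCTIONS` is a placeholder for your real prompt):

```python
# Placeholder: in production this is your long, rarely-changing prefix
# (instructions, reference docs, few-shot examples).
LONG_STABLE_INSTRUCTIONS = "You are senior. <long reference material here>"

# Everything up to and including the marked block is cached; only the
# short user turn is reprocessed on each call.
request = {
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_STABLE_INSTRUCTIONS,
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "..."}],
}
```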
The migration plan that works
For a team with a serious OpenAI integration, this is the four-week plan I recommend.
Week 1: shadow mode
Add the Anthropic SDK alongside the OpenAI SDK. For each call site, run both and log the outputs. Do not switch traffic. The goal is to surface diffs.
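The shadow wrapper can be very small. A sketch, assuming `openai_fn` and `anthropic_fn` are zero-arg closures over the same logical request and `log` is whatever sink you already have; only the OpenAI result is returned, so traffic is unaffected:

```python
def shadow_call(openai_fn, anthropic_fn, log):
    """Run the live OpenAI call plus a shadow Anthropic call; log the diff.

    Hypothetical wrapper for week 1. The shadow path must never be able
    to break production, so its failures are caught and logged.
    """
    primary = openai_fn()
    try:
        shadow = anthropic_fn()
        log({"primary": primary, "shadow": shadow, "match": primary == shadow})
    except Exception as exc:  # shadow failures are data, not outages
        log({"shadow_error": repr(exc)})
    return primary
```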
Week 2: tool use and streaming
Rewrite the tool orchestration layer. Rewrite the streaming handler. Test both in isolation; these are the two most failure-prone parts of the migration.
Week 3: cutover with feature flag
Route 5 percent of traffic to Anthropic. Monitor latency, error rate, output quality, and cost. Ramp to 50 percent if the numbers hold.
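The flag itself is a one-liner. A minimal sketch (`use_anthropic` is a hypothetical name; a production flag should hash on a stable key such as user id so the same caller always gets the same path):

```python
import random

def use_anthropic(rollout_pct):
    """Route a request to Anthropic with probability rollout_pct / 100.

    Sketch only: random per-request routing. Replace with a stable
    hash of user or request id before shipping.
    """
    return random.random() * 100 < rollout_pct
```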
Week 4: full cutover plus cleanup
100 percent traffic on Anthropic. Remove the OpenAI SDK dependency. Update the runbooks. Document the model mapping in your CLAUDE.md or AGENTS.md.
What I would not migrate
Some workloads are not worth moving. Specifically:
- Extremely cost-sensitive embeddings. OpenAI's embedding pricing is competitive. Anthropic does not currently offer a direct equivalent.
- Anything using the OpenAI Assistants API heavily. Anthropic does not have a 1:1 replacement; you would need to reimplement the orchestration.
- Production code where the LLM is incidental. If the LLM is 5 percent of your stack and migration cost is two engineer-weeks, the math probably does not work this quarter.
Receipts
- 4 client migrations done in 2026 so far.
- Median time from "decision to migrate" to "100 percent traffic on Anthropic": 22 days.
- Median cost reduction post-migration: 18 percent on identical workloads, after enabling prompt caching.
- Most common single mistake: forgetting that max_tokens is required.