Minimal request

{
  "model": "claude-sonnet-4-20250514",
  "max_tokens": 4096,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 2048
  },
  "messages": [
    {
      "role": "user",
      "content": [{ "type": "text", "text": "Compare three cache architectures and recommend one." }]
    }
  ]
}
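
For ordinary requests, budget_tokens is expected to be at least 1024 and strictly less than max_tokens, since the thinking budget is spent out of the same output allowance. The following is a minimal pre-flight sketch of that check; check_thinking_request and MIN_THINKING_BUDGET are hypothetical names introduced here for illustration, not part of any SDK.

```python
MIN_THINKING_BUDGET = 1024  # assumed minimum for thinking.budget_tokens


def check_thinking_request(request: dict) -> list[str]:
    """Return a list of problems found in a request body (hypothetical helper)."""
    problems: list[str] = []
    thinking = request.get("thinking")
    if not thinking or thinking.get("type") != "enabled":
        return problems  # thinking is off, so there is nothing to check

    budget = thinking.get("budget_tokens")
    max_tokens = request.get("max_tokens")
    if not isinstance(budget, int):
        problems.append("thinking.budget_tokens must be an integer")
        return problems
    if budget < MIN_THINKING_BUDGET:
        problems.append(
            f"budget_tokens {budget} is below the minimum of {MIN_THINKING_BUDGET}"
        )
    if not isinstance(max_tokens, int) or budget >= max_tokens:
        problems.append(
            f"budget_tokens {budget} must be less than max_tokens {max_tokens}"
        )
    return problems
```

Calling check_thinking_request(body) before sending and refusing to send when the returned list is non-empty catches this class of mistake locally instead of through an API error.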

cURL example

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 4096,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 2048
    },
    "messages": [
      {
        "role": "user",
        "content": [{ "type": "text", "text": "Compare three cache architectures and recommend one." }]
      }
    ]
  }'

Python example

from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,  # must be greater than thinking.budget_tokens
    thinking={
        "type": "enabled",
        "budget_tokens": 2048,
    },
    messages=[
        {
            "role": "user",
            "content": [{"type": "text", "text": "Compare three cache architectures and recommend one."}],
        }
    ],
)

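# With thinking enabled, the first content block is typically a "thinking"
# block with no .text attribute, so print only the text blocks.
for block in response.content:
    if block.type == "text":
        print(block.text)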

Node.js example

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 4096, // must be greater than thinking.budget_tokens
  thinking: {
    type: "enabled",
    budget_tokens: 2048
  },
  messages: [
    {
      role: "user",
      content: [{ type: "text", text: "Compare three cache architectures and recommend one." }]
    }
  ]
});

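// With thinking enabled, the first content block is typically a "thinking"
// block with no text field, so print only the text blocks.
for (const block of response.content) {
  if (block.type === "text") {
    console.log(block.text);
  }
}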

Best practices

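  • Enable thinking only for tasks that genuinely need multi-step reasoning; thinking tokens count toward output and add latency
  • Start with a small budget_tokens value (the minimum is 1024) and raise it only if answer quality improves; keep it below max_tokens
  • If you stream responses or add tools later, verify that your stream parser handles thinking events (thinking_delta, signature_delta) as well as text events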
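
The point about stream parsers can be illustrated with a small sketch. It assumes streaming events have already been decoded into plain dicts and that they use the content_block_start and content_block_delta event types with thinking_delta, text_delta, and signature_delta payloads. collect_stream is a hypothetical helper, and the sample events are hand-written for illustration, not captured API output.

```python
from collections import defaultdict


def collect_stream(events):
    """Fold decoded streaming events into one thinking string and one text string."""
    block_types = {}            # content block index -> "thinking" / "text" / ...
    chunks = defaultdict(list)  # content block index -> list of string fragments

    for event in events:
        etype = event.get("type")
        if etype == "content_block_start":
            block_types[event["index"]] = event["content_block"]["type"]
        elif etype == "content_block_delta":
            delta = event["delta"]
            dtype = delta.get("type")
            if dtype == "thinking_delta":
                chunks[event["index"]].append(delta["thinking"])
            elif dtype == "text_delta":
                chunks[event["index"]].append(delta["text"])
            # Other delta types (signature_delta, input_json_delta, unknown
            # future types) are skipped here rather than raising.

    result = {"thinking": "", "text": ""}
    for index in sorted(chunks):
        kind = block_types.get(index)
        if kind in result:
            result[kind] += "".join(chunks[index])
    return result


# Hand-written sample events (illustrative only).
sample = [
    {"type": "content_block_start", "index": 0,
     "content_block": {"type": "thinking", "thinking": ""}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "Weigh latency "}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "thinking_delta", "thinking": "against cost."}},
    {"type": "content_block_delta", "index": 0,
     "delta": {"type": "signature_delta", "signature": "abc"}},
    {"type": "content_block_stop", "index": 0},
    {"type": "content_block_start", "index": 1,
     "content_block": {"type": "text", "text": ""}},
    {"type": "content_block_delta", "index": 1,
     "delta": {"type": "text_delta", "text": "Use a write-through cache."}},
    {"type": "content_block_stop", "index": 1},
]

print(collect_stream(sample))
```

A parser that only looks for text_delta would either crash on or silently drop the first block in this sample. A production parser that adds tools would likely also need to keep the signature value, which this sketch ignores.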