Recommended endpoint
Minimal request
cURL example
Python example
Node.js example
Best practices
- Control reasoning.effort explicitly for cost and latency
- Ask for decision-ready outputs rather than raw chain-of-thought
- Track latency and token cost separately from normal chat traffic
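
The practices above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the documented client: the payload shape (`model`, `input`, and a nested `reasoning` object carrying `effort`), the allowed effort values, the `DECISION_READY_SUFFIX` wording, and the `build_payload` and `UsageTracker` names are all hypothetical. No network call is made, so the endpoint itself is left out.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed set of effort levels; check the API reference for the real values.
ALLOWED_EFFORTS = {"low", "medium", "high"}

# Hypothetical instruction asking for a conclusion instead of raw reasoning.
DECISION_READY_SUFFIX = (
    "\n\nReturn a recommendation and the key facts behind it. "
    "Do not include step-by-step reasoning."
)


def build_payload(model: str, prompt: str, effort: str) -> dict:
    """Build a request body that always sets the reasoning effort explicitly."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    return {
        "model": model,
        "input": prompt + DECISION_READY_SUFFIX,
        "reasoning": {"effort": effort},  # assumed nesting for reasoning.effort
    }


@dataclass
class TrafficStats:
    calls: int = 0
    total_seconds: float = 0.0
    total_tokens: int = 0


class UsageTracker:
    """Keep latency and token totals per traffic class (e.g. 'reasoning', 'chat')."""

    def __init__(self) -> None:
        self._stats = defaultdict(TrafficStats)

    def record(self, traffic_class: str, seconds: float, tokens: int) -> None:
        stats = self._stats[traffic_class]
        stats.calls += 1
        stats.total_seconds += seconds
        stats.total_tokens += tokens

    def stats(self, traffic_class: str) -> TrafficStats:
        return self._stats[traffic_class]

    def mean_latency(self, traffic_class: str) -> float:
        stats = self._stats[traffic_class]
        return stats.total_seconds / stats.calls if stats.calls else 0.0
```

In use, the caller would time each request (for example with `time.perf_counter()`), read the token count from the response, and call `tracker.record("reasoning", elapsed, tokens)`, keeping ordinary chat requests under a separate class such as `"chat"` so the two cost profiles are never averaged together.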
