This page keeps only the HTTP endpoint notes plus direct cURL, Python, and Node.js request examples.
Endpoint Summary
| Endpoint | Summary |
|---|---|
| POST /v1/chat/completions | Standard chat completion endpoint for multi-turn conversation, tool calling, and optional streaming. |
| POST /v1/responses | Newer unified response endpoint for structured output, multimodal input, and future capability expansion. |
POST /v1/chat/completions
Standard chat completion endpoint for multi-turn conversation, tool calling, and optional streaming.
Request Notes
- Authenticate with an Authorization: Bearer <API_KEY> header; the core payload fields are model and messages.
- The messages array should preserve the system, user, and assistant turns in order; set stream=true to receive output as server-sent events (SSE).
- For OpenAI-compatible gateways, this is usually the safest default text endpoint to start with.
Response Notes
- Synchronous output is typically read from choices[0].message.content.
- When tool calling is enabled, handle tool_calls and the follow-up tool exchange together.
- Streaming mode returns SSE chunks rather than one complete JSON response.
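As a sketch of the streaming case: assuming the common OpenAI-compatible SSE framing (each event line prefixed with "data: ", stream terminated by a "[DONE]" sentinel, partial text under choices[0].delta.content), the chunks can be accumulated like this. The exact delta shape may vary by gateway.

```python
import json

def parse_sse_chunks(lines):
    """Accumulate assistant text from chat.completions SSE lines.

    Each streamed line looks like 'data: {...json...}'; the stream
    ends with 'data: [DONE]'. Role/stop chunks carry no content.
    """
    text = []
    for raw in lines:
        if not raw.startswith("data: "):
            continue  # skip keep-alives and blank lines
        body = raw[len("data: "):]
        if body == "[DONE]":
            break
        delta = json.loads(body)["choices"][0].get("delta", {})
        if "content" in delta:
            text.append(delta["content"])
    return "".join(text)

# In a real client these lines come from the HTTP response stream;
# here is a canned example of the framing described above:
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print(parse_sse_chunks(sample))  # -> Hello
```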
Examples
cURL
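A minimal request sketch; the base URL (api.example.com) and model name (MODEL_ID) are placeholders, so substitute your gateway's values.

```shell
# Minimal chat.completions request; placeholder host and model.
curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MODEL_ID",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
```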
Python (requests)
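The same request with the third-party requests package; URL and model name are again placeholders, and the response is read from choices[0].message.content as described in the Response Notes.

```python
import requests  # third-party: pip install requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder

# Core payload: model plus the ordered message history.
payload = {
    "model": "MODEL_ID",  # placeholder model name
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}

def complete(api_key: str) -> str:
    """POST the payload and return the assistant's reply text."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,  # also sets Content-Type: application/json
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

Call it as complete("YOUR_API_KEY") once the placeholders are filled in.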
Node.js (fetch)
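The equivalent call using the built-in fetch available in Node.js 18+; host and model name are placeholders.

```javascript
// Minimal chat.completions call with global fetch (Node 18+).
const API_URL = "https://api.example.com/v1/chat/completions"; // placeholder

const payload = {
  model: "MODEL_ID", // placeholder model name
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello!" },
  ],
};

async function complete(apiKey) {
  const res = await fetch(API_URL, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```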
Response Example (200)
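An illustrative (not verbatim) 200 response body; the field values are made up, but the choices[0].message.content path matches the Response Notes.

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "MODEL_ID",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 9,
    "total_tokens": 27
  }
}
```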
POST /v1/responses
Newer unified response endpoint for structured output, multimodal input, and future capability expansion.
Request Notes
- It still uses Bearer auth, but the main payload shape is centered on input and instructions rather than messages.
- If you want one endpoint shape for text and structured output, prefer this over legacy chat.completions.
- Newer response-format and multimodal features usually show up here first.
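A minimal request sketch for this endpoint, assuming an OpenAI-compatible shape where input and instructions replace the messages array; the base URL and model name are placeholders.

```python
import requests  # third-party: pip install requests

API_URL = "https://api.example.com/v1/responses"  # placeholder

# input and instructions replace chat.completions' messages array.
payload = {
    "model": "MODEL_ID",  # placeholder model name
    "instructions": "You are a helpful assistant.",
    "input": "Hello!",
}

def respond(api_key: str) -> dict:
    """POST the payload and return the parsed JSON response body."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json=payload,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```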
Response Notes
- Consumers usually read from output[] or output_text instead of choices[0].message.
- When the workflow becomes async or tool-driven, this endpoint usually exposes richer status fields.
- Migration work should include field mapping, retries, and server-side logging.
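For the field-mapping side of a migration, a tolerant extractor might look like the sketch below. It assumes a response shape where output_text, when present, is a convenience concatenation, and output[] holds items with typed content parts; the exact field names vary by gateway and should be verified against your provider.

```python
def extract_text(data: dict) -> str:
    """Prefer output_text; otherwise join text parts from output[]."""
    if "output_text" in data:
        return data["output_text"]
    parts = []
    for item in data.get("output", []):
        for piece in item.get("content", []):
            if piece.get("type") == "output_text":
                parts.append(piece.get("text", ""))
    return "".join(parts)

# Assumed response shape for illustration:
sample = {
    "output": [
        {"content": [{"type": "output_text", "text": "Hello!"}]}
    ]
}
print(extract_text(sample))  # -> Hello!
```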
