Recommended endpoint
POST https://maas.apigo.ai/v1/responses
Minimal request
{
  "model": "o4-mini",
  "input": "Compare three cache architectures and recommend one.",
  "reasoning": {
    "effort": "medium"
  },
  "max_output_tokens": 1200
}
cURL example
curl https://maas.apigo.ai/v1/responses \
  -H "Authorization: Bearer $YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "o4-mini",
    "input": "Compare three cache architectures and recommend one.",
    "reasoning": {
      "effort": "medium"
    },
    "max_output_tokens": 1200
  }'
Python example
from openai import OpenAI

client = OpenAI(
    base_url="https://maas.apigo.ai/v1",
    api_key="<YOUR API KEY>",
)

response = client.responses.create(
    model="o4-mini",
    input="Compare three cache architectures and recommend one.",
    reasoning={"effort": "medium"},
    max_output_tokens=1200,
)

print(response.output_text)
Node.js example
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://maas.apigo.ai/v1",
  apiKey: process.env.YOUR_API_KEY,
});

const response = await client.responses.create({
  model: "o4-mini",
  input: "Compare three cache architectures and recommend one.",
  reasoning: { effort: "medium" },
  max_output_tokens: 1200,
});

console.log(response.output_text);
Best practices
- Control reasoning.effort explicitly for cost and latency
- Ask for decision-ready outputs rather than raw chain-of-thought
- Track latency and token cost separately from normal chat traffic
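
The practices above can be combined in a small wrapper. The following is a minimal sketch, not part of the documented API: `build_request`, `timed_call`, and `CallMetrics` are hypothetical helper names, and the sketch assumes the response carries a `usage` object with `input_tokens` and `output_tokens` fields, as the OpenAI Responses API does. If the gateway omits usage data, the token fields are simply left as `None`.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass
class CallMetrics:
    """Per-call numbers to log separately from normal chat traffic."""
    effort: str
    latency_s: float
    input_tokens: Optional[int]
    output_tokens: Optional[int]


def build_request(question: str, effort: str = "medium",
                  max_output_tokens: int = 1200) -> Dict[str, Any]:
    """Build a request body with reasoning.effort always set explicitly."""
    # Ask for a decision-ready answer instead of raw chain-of-thought.
    prompt = (
        f"{question}\n"
        "Give a final recommendation and the key trade-offs behind it."
    )
    return {
        "model": "o4-mini",
        "input": prompt,
        "reasoning": {"effort": effort},
        "max_output_tokens": max_output_tokens,
    }


def timed_call(create: Callable[..., Any],
               request: Dict[str, Any]) -> Tuple[Any, CallMetrics]:
    """Call `create(**request)` and record wall-clock latency and token usage."""
    start = time.perf_counter()
    response = create(**request)
    latency_s = time.perf_counter() - start

    # Assumption: usage exposes input_tokens / output_tokens; fall back to None.
    usage = getattr(response, "usage", None)
    metrics = CallMetrics(
        effort=request["reasoning"]["effort"],
        latency_s=latency_s,
        input_tokens=getattr(usage, "input_tokens", None),
        output_tokens=getattr(usage, "output_tokens", None),
    )
    return response, metrics
```

With the `client` from the Python example, a call would look like `response, metrics = timed_call(client.responses.create, build_request("Compare three cache architectures and recommend one.", effort="low"))`. Logging `metrics` per effort level makes it possible to compare cost and latency before settling on a default.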