Recommended endpoint: POST /v1/responses
Minimal request
{
  "model": "o4-mini",
  "input": "Compare the trade-offs of three caching architectures and give a recommendation.",
  "reasoning": {
    "effort": "medium"
  },
  "max_output_tokens": 1200
}
cURL example
curl https://mass.apigo.ai/v1/responses \
  -H "Authorization: Bearer $TIDEMIND_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "o4-mini",
    "input": "Compare the trade-offs of three caching architectures and give a recommendation.",
    "reasoning": {
      "effort": "medium"
    },
    "max_output_tokens": 1200
  }'
Python example
from openai import OpenAI

client = OpenAI(
    base_url="https://mass.apigo.ai/v1",
    api_key="<TIDEMIND_API_KEY>",
)

response = client.responses.create(
    model="o4-mini",
    input="Compare the trade-offs of three caching architectures and give a recommendation.",
    reasoning={"effort": "medium"},
    max_output_tokens=1200,
)

print(response.output_text)
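To see how much of the budget went to reasoning versus the visible answer, you can split the usage object from the response. A minimal sketch, assuming the usage JSON follows the OpenAI Responses API shape (`input_tokens`, `output_tokens`, `output_tokens_details.reasoning_tokens`); verify the exact field names against your gateway's responses:

```python
def summarize_usage(usage: dict) -> dict:
    """Split total token usage into reasoning vs. visible output.

    `usage` is the `usage` object from a /v1/responses reply; the
    field names are assumptions based on the OpenAI Responses API
    and may differ on other gateways.
    """
    details = usage.get("output_tokens_details", {})
    reasoning = details.get("reasoning_tokens", 0)
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "reasoning_tokens": reasoning,
        "visible_output_tokens": usage.get("output_tokens", 0) - reasoning,
    }

# Example with a stubbed usage payload:
print(summarize_usage({
    "input_tokens": 40,
    "output_tokens": 900,
    "output_tokens_details": {"reasoning_tokens": 600},
}))
```

This kind of breakdown is what makes it possible to track reasoning spend separately, as suggested under Best practices below.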
Node.js example
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://mass.apigo.ai/v1",
  apiKey: process.env.TIDEMIND_API_KEY,
});

const response = await client.responses.create({
  model: "o4-mini",
  input: "Compare the trade-offs of three caching architectures and give a recommendation.",
  reasoning: { effort: "medium" },
  max_output_tokens: 1200,
});

console.log(response.output_text);
Best practices
- For reasoning workloads, set reasoning.effort explicitly rather than relying on the default.
- Don't ask the model to expose its full chain of thought; constrain the format of the final answer instead.
- Track latency, token usage, and success rate for reasoning traffic separately; don't fold it into regular chat metrics.
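The monitoring point can be sketched as a thin wrapper that records latency and success per call into a dedicated sink. The helper name and the plain-list sink are illustrative, not part of any API; `fn` stands in for whatever callable performs the actual request (e.g. `client.responses.create`):

```python
import time

def call_with_metrics(fn, sink, **kwargs):
    """Run one reasoning request via `fn` and append a latency/success
    record to `sink` (here a plain list standing in for your metrics
    backend), so reasoning traffic is measured apart from chat traffic."""
    record = {"ok": False, "latency_s": 0.0}
    start = time.perf_counter()
    try:
        result = fn(**kwargs)
        record["ok"] = True
        return result
    finally:
        # Record even on failure, so error latency is captured too.
        record["latency_s"] = time.perf_counter() - start
        sink.append(record)

# Usage with a stub in place of client.responses.create:
reasoning_metrics = []
call_with_metrics(lambda **kw: {"output_text": "ok"}, reasoning_metrics,
                  model="o4-mini")
```

Keeping a separate sink per traffic class is the whole point: reasoning calls have very different latency and token profiles from chat calls, and averaging them together hides regressions in either.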
