
Recommended endpoint

Minimal request

{
  "model": "o4-mini",
  "input": "比较三种缓存架构的取舍,并给出推荐。",
  "reasoning": {
    "effort": "medium"
  },
  "max_output_tokens": 1200
}
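On OpenAI-compatible endpoints such as this one, reasoning.effort conventionally accepts "low", "medium", or "high". A minimal sketch of a helper that builds the request body above and rejects unknown effort values before sending (the function name and defaults are illustrative, not part of the API):

```python
# Effort levels conventionally accepted by reasoning-capable models on
# OpenAI-compatible Responses endpoints.
ALLOWED_EFFORTS = {"low", "medium", "high"}


def build_request(prompt: str, effort: str = "medium",
                  max_output_tokens: int = 1200) -> dict:
    """Build a request body like the minimal example, validating effort."""
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORTS)}")
    return {
        "model": "o4-mini",
        "input": prompt,
        "reasoning": {"effort": effort},
        "max_output_tokens": max_output_tokens,
    }
```

Validating client-side keeps a typo like "max" from producing a confusing server-side error.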

cURL example

curl https://mass.apigo.ai/v1/responses \
  -H "Authorization: Bearer $TIDEMIND_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "o4-mini",
    "input": "比较三种缓存架构的取舍,并给出推荐。",
    "reasoning": {
      "effort": "medium"
    },
    "max_output_tokens": 1200
  }'

Python example

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://mass.apigo.ai/v1",
    api_key=os.environ["TIDEMIND_API_KEY"],  # read the key from the environment
)

response = client.responses.create(
    model="o4-mini",
    input="Compare the trade-offs of three caching architectures and give a recommendation.",
    reasoning={"effort": "medium"},
    max_output_tokens=1200,
)

print(response.output_text)

Node.js example

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://mass.apigo.ai/v1",
  apiKey: process.env.TIDEMIND_API_KEY,
});

const response = await client.responses.create({
  model: "o4-mini",
  input: "Compare the trade-offs of three caching architectures and give a recommendation.",
  reasoning: { effort: "medium" },
  max_output_tokens: 1200,
});

console.log(response.output_text);

Best practices

  • For reasoning workloads, prefer explicit control of reasoning.effort rather than relying on the default
  • Don't ask the model to expose its full chain of thought; constrain the format of the final answer instead
  • Monitor latency, token usage, and success rate for reasoning traffic separately; don't lump it in with regular chat traffic
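The last point can be sketched as a small helper that pulls token counts out of the usage block a Responses-style API returns. Field names follow the OpenAI Responses API convention (input_tokens, output_tokens, output_tokens_details.reasoning_tokens); the sample payload is illustrative, not a real response:

```python
def summarize_usage(usage: dict) -> dict:
    """Extract per-request token counts for separate reasoning-traffic metrics."""
    reasoning = usage.get("output_tokens_details", {}).get("reasoning_tokens", 0)
    output = usage.get("output_tokens", 0)
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "output_tokens": output,
        "reasoning_tokens": reasoning,
        # Share of output spent on hidden reasoning rather than the final answer.
        "reasoning_share": reasoning / output if output else 0.0,
    }


# Illustrative payload, not a real API response.
sample = {
    "input_tokens": 42,
    "output_tokens": 800,
    "output_tokens_details": {"reasoning_tokens": 600},
}
print(summarize_usage(sample)["reasoning_share"])  # → 0.75
```

Tracking reasoning_share over time makes it easy to spot when an effort setting is burning most of the output budget on hidden reasoning instead of the answer itself.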