
Recommended endpoint

POST models/{model}:streamGenerateContent?alt=sse

Minimal request body

For REST calls the model is named in the URL path, so the body only needs contents:

{
  "contents": [
    {
      "role": "user",
      "parts": [{ "text": "Explain SSE streaming output as you generate it." }]
    }
  ]
}

cURL example

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{ "text": "Explain SSE streaming output as you generate it." }]
      }
    ]
  }'
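Each event on the `?alt=sse` stream is a `data:` line carrying one serialized response chunk. If you consume the raw stream without an SDK, a minimal parsing helper might look like the sketch below (`extract_text` is a hypothetical name; the field layout follows the response shape the SDKs expose above):

```python
import json

def extract_text(sse_line: str) -> str:
    """Return the text carried by one `data:` line of the SSE stream.

    Non-data lines (comments, keep-alives, blank lines) yield "".
    """
    if not sse_line.startswith("data: "):
        return ""
    payload = json.loads(sse_line[len("data: "):])
    # Text lives in candidates[0].content.parts[*].text.
    candidates = payload.get("candidates") or [{}]
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(p.get("text", "") for p in parts)

# Example frame, shaped like one streamed chunk:
line = 'data: {"candidates": [{"content": {"parts": [{"text": "Hel"}]}}]}'
print(extract_text(line))
```

In real use you would feed this helper each line read from the HTTP response body.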

Python example

import os

from google import genai

# Read the key from the environment rather than hard-coding it.
client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

stream = client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Explain SSE streaming output as you generate it.",
)

for chunk in stream:
    # chunk.text may be None for chunks without text; flush so output appears live.
    print(chunk.text or "", end="", flush=True)
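If you need the complete reply as well as the live stream, collect the chunk texts as they arrive. A small sketch (`join_stream` is a hypothetical helper, not part of the SDK):

```python
def join_stream(texts):
    """Collect streamed text fragments into the final answer,
    skipping None/empty chunks that carry no text."""
    collected = []
    for t in texts:
        if t:
            collected.append(t)  # in a real loop, also print/forward t here
    return "".join(collected)

print(join_stream(["Hel", None, "lo"]))
```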

Node.js example

import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const stream = await ai.models.generateContentStream({
  model: "gemini-2.5-flash",
  contents: "Explain SSE streaming output as you generate it."
});

for await (const chunk of stream) {
  process.stdout.write(chunk.text || "");
}

Best practices

  • Accumulate parts[].text chunk by chunk; do not assume each chunk is a complete sentence
  • Streaming and structured output can be combined, but it is best to let the server assemble the partial JSON
  • For latency-sensitive workloads, prefer Flash-class models
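The second point can be sketched as follows: buffer the streamed text and attempt a parse only once the buffer forms a complete document (`try_parse` is a hypothetical helper for illustration, not an SDK function):

```python
import json

def try_parse(chunks):
    """Join streamed text chunks and parse them as JSON.

    Returns the parsed value once the accumulated text is complete,
    and None while it is still a partial (unparseable) document.
    """
    text = "".join(chunks)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(try_parse(['{"answer":']))         # partial document
print(try_parse(['{"answer":', ' 42}'])) # complete document
```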