
Recommended endpoint

Minimal request

{
  "model": "gemini-2.5-flash",
  "contents": [
    {
      "role": "user",
      "parts": [
        { "text": "Describe the UI structure in this image." },
        {
          "inlineData": {
            "mimeType": "image/png",
            "data": "<base64>"
          }
        }
      ]
    }
  ]
}

cURL example

curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [
          { "text": "Describe the UI structure in this image." },
          {
            "inlineData": {
              "mimeType": "image/png",
              "data": "<BASE64_IMAGE>"
            }
          }
        ]
      }
    ]
  }'

Python example

import base64
import os

import requests

# Read the image and base64-encode it for the inlineData part.
with open("ui.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent",
    headers={
        # Read the key from the environment, as the cURL example does.
        "x-goog-api-key": os.environ["GEMINI_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"text": "Describe the UI structure in this image."},
                    {
                        "inlineData": {
                            "mimeType": "image/png",
                            "data": image_base64,
                        }
                    },
                ],
            }
        ]
    },
    timeout=60,
)
# Raise on HTTP 4xx/5xx before reading the body.
response.raise_for_status()

# Print the text of the first part of the first candidate.
print(response.json()["candidates"][0]["content"]["parts"][0]["text"])
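
The print call above reads only the first part of the first candidate. The following is a minimal sketch, assuming the response keeps the shape used above (candidates, then content, then parts, each part optionally carrying text). It joins every text part and fails with a clear error when no candidate comes back. The name extract_text is a hypothetical helper, not part of any SDK.

```python
def extract_text(payload):
    """Join the text of every part in the first candidate.

    Assumes the response shape used in the example above:
    payload["candidates"][0]["content"]["parts"][i]["text"].
    """
    candidates = payload.get("candidates") or []
    if not candidates:
        raise ValueError("response contains no candidates")
    parts = candidates[0].get("content", {}).get("parts", [])
    # Skip parts that carry no text (for example, non-text parts).
    return "".join(part["text"] for part in parts if "text" in part)
```

With the request above, this could replace the final line as print(extract_text(response.json())).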

Node.js example

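// Run as an ES module: the import syntax and top-level await require it.
import { readFileSync } from "node:fs";

// Read the image and base64-encode it for the inlineData part.
const imageBase64 = readFileSync("ui.png").toString("base64");

const response = await fetch(
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent",
  {
    method: "POST",
    headers: {
      "x-goog-api-key": process.env.GEMINI_API_KEY,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      contents: [
        {
          role: "user",
          parts: [
            { text: "Describe the UI structure in this image." },
            {
              inlineData: {
                mimeType: "image/png",
                data: imageBase64
              }
            }
          ]
        }
      ]
    })
  }
);

// fetch does not reject on HTTP errors, so check the status explicitly.
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

// Print the text of the first part of the first candidate.
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);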

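Best practices

  • Inline small images directly; for large images or assets reused across requests, prefer the Files API.
  • When mixing media types, order the entries in parts deliberately so the model reads the context in the intended sequence.
  • Send text, images, and files through the same request model so the integration is easy to extend later.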
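
The first two practices can be sketched as a pair of helpers. This is an illustration under stated assumptions, not a definitive implementation: the names image_part and build_contents are hypothetical, the 4 MiB cutoff is an arbitrary example value rather than a documented limit, and the fileData / fileUri part shape is assumed to be how a previously uploaded file is referenced. Uploading through the Files API is not shown.

```python
import base64
import mimetypes
from pathlib import Path

# Assumption: an illustrative cutoff, not a documented limit.
INLINE_LIMIT_BYTES = 4 * 1024 * 1024


def image_part(path, file_uri=None, inline_limit=INLINE_LIMIT_BYTES):
    """Return one entry for `parts`: a file reference or inline base64 data."""
    mime_type = mimetypes.guess_type(str(path))[0] or "application/octet-stream"
    if file_uri is not None:
        # Assumed shape for referencing a file uploaded earlier.
        return {"fileData": {"mimeType": mime_type, "fileUri": file_uri}}
    raw = Path(path).read_bytes()
    if len(raw) > inline_limit:
        raise ValueError(f"{path} is {len(raw)} bytes; upload it and pass file_uri")
    data = base64.b64encode(raw).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": data}}


def build_contents(prompt, media_parts):
    """Place the prompt first, then media parts in the order given."""
    return [{"role": "user", "parts": [{"text": prompt}, *media_parts]}]
```

Calling build_contents("Describe the UI structure in this image.", [image_part("ui.png")]) yields the same contents value as the minimal request, so text, inline images, and file references all flow through one request shape.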