Recommended endpoint
Minimal request
cURL example
Python example
Node.js example
Best practices
- Start with a Flash model for lower-latency chat
- Keep the
partsmodel intact so media can be added later without redesign - If you do not need extra reasoning cost, set
thinkingBudgetto0on Gemini 2.5 Flash
