Recommended endpoint
Minimal request
cURL example
Python example
Node.js example
Best practices
- Accumulate
parts[].textincrementally instead of assuming complete sentences per chunk - Structured output can also be streamed, but parse partial JSON server-side
- Flash-class models are usually the best fit for latency-sensitive streaming chat
