This page covers the throughput and throttling constraints to account for when calling the API.

What to check

  • request rate per time window
  • concurrency limits
  • whether different models or capabilities have separate quotas
  • whether free, test, and production environments differ
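
Once the request-rate limit for a given window is known, it can be mirrored on the client so that calls are held back before the platform rejects them. The following is a minimal sketch of a token bucket, assuming a limit expressed as requests per second with a small burst allowance. The class name `TokenBucket` and its parameters are illustrative and not part of the API; substitute the actual limits for your account and environment.

```python
import time


class TokenBucket:
    """Client-side limiter (illustrative): averages `rate` requests per
    second and allows bursts of up to `capacity` requests."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second (assumed limit)
        self.capacity = capacity      # maximum burst size (assumed limit)
        self.clock = clock            # injectable clock, useful for tests
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self):
        """Return True if a request may be sent now, False otherwise."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

If different models or capabilities turn out to have separate quotas, one bucket per quota keeps a busy model from consuming the budget of another.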

Engineering guidance

  • centralize retry, backoff, and circuit breaking on the server side, rather than repeating them in each client
  • use caching or queueing for high-frequency flows
  • decouple business traffic spikes from model invocation spikes, so a burst of user traffic does not become an equal burst of model calls
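
The retry and backoff part of this guidance can be sketched as a single server-side helper that every model call goes through. This is a minimal sketch, not the platform's prescribed method: `ThrottledError` is a hypothetical exception that your own API wrapper would raise on a throttled response (for example an HTTP 429, if that is what the platform returns), and `call_with_backoff` with its default values is likewise illustrative. Circuit breaking is not shown.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised by your own API wrapper when throttled."""

    def __init__(self, retry_after=None):
        super().__init__("throttled")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on ThrottledError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            if err.retry_after is not None:
                delay = err.retry_after   # prefer an explicit server hint
            else:
                delay = rng() * ceiling   # random delay spreads out retries
            sleep(delay)
```

Keeping this logic in one place means clients do not each add their own retries on top, which would multiply the load during a throttling event.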

Suggested debugging order

  1. confirm whether you hit a platform-level throttle
  2. confirm whether the specific model or capability has its own rate cap
  3. inspect whether the client is retrying or resubmitting unexpectedly
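
For step 3, one way to spot unexpected retries or resubmissions is to scan your own request log for identical payloads sent close together. The sketch below assumes a log of `(timestamp_seconds, payload_dict)` pairs that you record yourself. The function name `find_resubmissions` and the 60-second window are illustrative choices, not platform behavior.

```python
import hashlib
import json
from collections import defaultdict


def find_resubmissions(records, window_seconds=60.0):
    """Return {fingerprint: [timestamps]} for payloads that were sent
    more than once within `window_seconds` of each other."""
    seen = defaultdict(list)
    for ts, payload in records:
        # Canonical JSON so that key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        fingerprint = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
        seen[fingerprint].append(ts)

    suspicious = {}
    for fingerprint, stamps in seen.items():
        stamps.sort()
        # Flag the payload if any two consecutive sends fall inside the window.
        if any(b - a <= window_seconds for a, b in zip(stamps, stamps[1:])):
            suspicious[fingerprint] = stamps
    return suspicious
```

A usage example with a small in-memory log:

```python
log = [
    (0.0, {"model": "m", "prompt": "hi"}),
    (1.5, {"prompt": "hi", "model": "m"}),   # same payload, different key order
    (2.0, {"model": "m", "prompt": "bye"}),
    (500.0, {"model": "m", "prompt": "bye"}),  # repeated, but outside the window
]
duplicates = find_resubmissions(log)
print(len(duplicates))  # 1
```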