https://api.llmgrid.ai/v1Auth Header:
Authorization: Bearer <LLMGRID_API_KEY>
Quick Triage
Use the status code to decide your next step:- 4xx (except 429) → request issue (auth, permissions, invalid params). Fix the request.
- 429 → throttling / quota / budget issue. Back off, retry, and/or reduce throughput.
- 5xx → transient failure (gateway or upstream provider). Retry with backoff.
400 — Bad Request
What it means
The request payload is invalid or contains unsupported parameters.Common causes
- Invalid JSON (malformed body)
- Wrong parameter names or types (e.g., string where number expected)
- Unsupported parameter for the selected model/provider
- Invalid
messagesstructure (missingroleorcontent)
How to fix
- Validate your JSON and parameter types.
- Remove provider-unsupported params, or enable “drop unsupported params” if your SDK supports it.
- Ensure
messagesis an array of objects withrole+content.
401 — Unauthorized
What it means
Missing or invalid API key.Common causes
- Missing
Authorizationheader - Wrong key format
- Expired/revoked Virtual Key
How to fix
- Set
Authorization: Bearer $LLMGRID_API_KEY - Regenerate/rotate the Virtual Key if needed.
403 — Forbidden
What it means
You are authenticated, but not authorized to perform the requested action.Common causes
- Virtual Key does not have access to the model/provider
- Tenant policy blocks the request
- Environment restrictions (e.g., IP allowlist, security policy)
How to fix
- Confirm the Virtual Key is assigned permissions for the model/provider.
- Check tenant-level policies and access controls.
404 — Not Found
What it means
The endpoint or resource does not exist.Common causes
- Wrong base URL or path
- Incorrect endpoint (
/chat/completionvs/chat/completions) - Invalid model name (or model not enabled for your tenant)
How to fix
- Confirm base URL:
https://api.llmgrid.ai/v1 - Confirm endpoint path and model name spelling.
409 — Conflict
What it means
The request conflicts with the current state of the resource.Common causes
- Attempting to create a resource that already exists
- Concurrent updates to the same configuration/key
How to fix
- Retry if the operation is safe and idempotent.
- Resolve duplicates (use “update” instead of “create” where applicable).
422 — Unprocessable Entity
What it means
The request is syntactically valid JSON, but fails schema/validation rules.Common causes
- Tool/function schema invalid (missing required fields)
- Message format values invalid (unsupported role/content structure)
- Parameter values out of range (e.g.,
top_pnot between 0 and 1)
How to fix
- Validate tool definitions and message objects.
- Check parameter bounds and supported values for the chosen model/provider.
429 — Too Many Requests (Rate Limit / Quota / Budget)
What it means
You are sending too many requests and/or tokens, or you exceeded allowed quotas/budgets.Common causes
- Burst traffic (RPM/TPM exceeded)
- Concurrency too high
- Quota/budget limits reached (tenant/project/key)
How to fix
- Implement retries with exponential backoff + jitter.
- Add client-side throttling (queue + concurrency limits).
- Reduce
max_tokensor request frequency. - Verify budgets and rate limit settings in LLMGrid.
500 — Internal Server Error
What it means
A server-side error occurred (gateway or upstream provider).Common causes
- Transient provider instability
- Temporary routing issues
- Internal service errors
How to fix
- Retry with backoff.
- If persistent, log request context (model, time, tenant/key) and contact support.
503 — Service Unavailable
What it means
The service is temporarily unavailable or overloaded.Common causes
- Provider overload
- Temporary maintenance window
- Capacity constraints during high traffic
How to fix
- Retry with longer backoff than for 500s.
- Reduce concurrency and request rate.
- Use fallbacks if configured.
Timeouts / Connection Errors
Symptoms
- Requests hang, then fail
- Streaming disconnects mid-response
Common causes
- Network instability
- Too-low client timeout for long generations
- Very large prompts / high
max_tokens

