This page lists common errors you may encounter when calling LLMGrid’s API and how to resolve them quickly.

Base URL: https://api.llmgrid.ai/v1
Auth Header: Authorization: Bearer <LLMGRID_API_KEY>

Quick Triage

Use the status code to decide your next step:
  • 4xx (except 429) → request issue (auth, permissions, invalid params). Fix the request.
  • 429 → throttling / quota / budget issue. Back off, retry, and/or reduce throughput.
  • 5xx → transient failure (gateway or upstream provider). Retry with backoff.
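
The triage rules above can be sketched as a small helper. This is an illustrative sketch, not part of any LLMGrid SDK: the function name triage and the returned labels are hypothetical.

```python
def triage(status: int) -> str:
    """Map an HTTP status code to the next step from the triage list.

    The function name and the returned labels are illustrative only.
    """
    if status == 429:
        return "back_off"            # throttling / quota / budget
    if 400 <= status < 500:
        return "fix_request"         # auth, permissions, invalid params
    if 500 <= status < 600:
        return "retry_with_backoff"  # transient gateway or upstream failure
    return "ok" if 200 <= status < 300 else "unknown"
```

Checking 429 before the general 4xx range keeps the "except 429" rule explicit.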

400 — Bad Request

What it means

The request payload is invalid or contains unsupported parameters.

Common causes

  • Invalid JSON (malformed body)
  • Wrong parameter names or types (e.g., string where number expected)
  • Unsupported parameter for the selected model/provider
  • Invalid messages structure (missing role or content)

How to fix

  • Validate your JSON and parameter types.
  • Remove provider-unsupported params, or enable “drop unsupported params” if your SDK supports it.
  • Ensure messages is an array of objects with role + content.
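
A pre-flight check along these lines can catch malformed bodies before they are sent. It is a minimal sketch that covers only the rules stated above (valid JSON, messages as an array of objects with role and content). The helper name check_messages is hypothetical, and the check that the body is a JSON object is an assumption.

```python
import json

def check_messages(body: str) -> list[str]:
    """Return a list of problems found in a JSON request body (empty if none)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return ["body must be a JSON object"]  # assumption about the body shape
    messages = payload.get("messages")
    if not isinstance(messages, list):
        return ["messages must be an array"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] must be an object")
            continue
        for field in ("role", "content"):
            if field not in msg:
                problems.append(f"messages[{i}] is missing {field}")
    return problems
```

An empty list means the body passed these checks. The server may still reject it for other reasons, such as provider-unsupported parameters.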

401 — Unauthorized

What it means

Missing or invalid API key.

Common causes

  • Missing Authorization header
  • Wrong key format
  • Expired/revoked Virtual Key

How to fix

  • Set Authorization: Bearer $LLMGRID_API_KEY
  • Regenerate/rotate the Virtual Key if needed.
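
One way to avoid a missing or malformed header is to build it in a single place and fail early when the key is absent. The sketch below assumes the key is stored in an environment variable named LLMGRID_API_KEY, as the header example suggests. The helper name auth_headers is hypothetical.

```python
import os

def auth_headers(env=os.environ) -> dict:
    """Build the Authorization header, failing early if the key is missing."""
    # Strip stray whitespace or newlines, a common side effect of copy/paste.
    key = env.get("LLMGRID_API_KEY", "").strip()
    if not key:
        raise RuntimeError("LLMGRID_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}
```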

403 — Forbidden

What it means

You are authenticated, but not authorized to perform the requested action.

Common causes

  • Virtual Key does not have access to the model/provider
  • Tenant policy blocks the request
  • Environment restrictions (e.g., IP allowlist, security policy)

How to fix

  • Confirm the Virtual Key is assigned permissions for the model/provider.
  • Check tenant-level policies and access controls.

404 — Not Found

What it means

The endpoint or resource does not exist.

Common causes

  • Wrong base URL or path
  • Incorrect endpoint (/chat/completion vs /chat/completions)
  • Invalid model name (or model not enabled for your tenant)

How to fix

  • Confirm base URL: https://api.llmgrid.ai/v1
  • Confirm endpoint path and model name spelling.
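
Path mistakes often come from how the URL is assembled. A minimal sketch, assuming endpoints are appended to the base URL shown above (the helper name endpoint is hypothetical):

```python
BASE_URL = "https://api.llmgrid.ai/v1"

def endpoint(path: str) -> str:
    """Join the base URL and an endpoint path with exactly one slash between them."""
    return BASE_URL.rstrip("/") + "/" + path.lstrip("/")
```

Plain string joining is used on purpose: urllib.parse.urljoin would replace the /v1 segment of this base URL, because the base has no trailing slash, and the result would point at a path that does not exist.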

409 — Conflict

What it means

The request conflicts with the current state of the resource.

Common causes

  • Attempting to create a resource that already exists
  • Concurrent updates to the same configuration/key

How to fix

  • Retry if the operation is safe and idempotent.
  • Resolve duplicates (use “update” instead of “create” where applicable).

422 — Unprocessable Entity

What it means

The request is syntactically valid JSON, but fails schema/validation rules.

Common causes

  • Invalid tool/function schema (missing required fields)
  • Invalid message format (unsupported role or content structure)
  • Parameter values out of range (e.g., top_p not between 0 and 1)

How to fix

  • Validate tool definitions and message objects.
  • Check parameter bounds and supported values for the chosen model/provider.
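
A bounds check for the top_p example above might look like the following. This is a sketch: it assumes the range is inclusive of 0 and 1, which the text does not state, and the helper name top_p_problem is hypothetical.

```python
def top_p_problem(params: dict):
    """Return a description of the top_p problem, or None if the value looks valid."""
    value = params.get("top_p")
    if value is None:
        return None  # parameter not set; nothing to validate
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return "top_p must be a number"
    if not 0 <= value <= 1:  # assumption: bounds are inclusive
        return "top_p must be between 0 and 1"
    return None
```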

429 — Too Many Requests (Rate Limit / Quota / Budget)

What it means

You are sending too many requests and/or tokens, or you have exceeded an allowed quota or budget.

Common causes

  • Burst traffic (RPM/TPM exceeded)
  • Concurrency too high
  • Quota/budget limits reached (tenant/project/key)

How to fix

  • Implement retries with exponential backoff + jitter.
  • Add client-side throttling (queue + concurrency limits).
  • Reduce max_tokens or request frequency.
  • Verify budgets and rate limit settings in LLMGrid.
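
Retries with exponential backoff and jitter can be sketched as below. This is a generic pattern, not LLMGrid-specific code: RetryableError, send, and the default delays are all hypothetical, and the caller is assumed to raise RetryableError when a response has status 429 (or 5xx).

```python
import random
import time

class RetryableError(Exception):
    """Hypothetical error the caller's request function raises on 429/5xx."""

def backoff_delays(retries=5, base=0.5, cap=30.0, rng=random.random):
    """Yield "full jitter" delays: a random value up to min(cap, base * 2**attempt)."""
    for attempt in range(retries):
        yield rng() * min(cap, base * 2 ** attempt)

def call_with_retries(send, retries=5, sleep=time.sleep):
    """Call send(); on RetryableError, sleep and try again.

    Makes up to retries + 1 attempts; the last error propagates to the caller.
    """
    for delay in backoff_delays(retries):
        try:
            return send()
        except RetryableError:
            sleep(delay)
    return send()  # final attempt
```

The jitter spreads retries from many clients over time, so they do not all hit the rate limiter again at the same moment.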

500 — Internal Server Error

What it means

A server-side error occurred (gateway or upstream provider).

Common causes

  • Transient provider instability
  • Temporary routing issues
  • Internal service errors

How to fix

  • Retry with backoff.
  • If persistent, log request context (model, time, tenant/key) and contact support.
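
Capturing the request context at the moment of failure makes a support ticket easier to act on. A minimal sketch using the standard logging module follows. The field names and the key_id argument are hypothetical, and it assumes you log an identifier for the key rather than the secret itself.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("llmgrid.client")  # logger name is illustrative

def failure_context(model: str, key_id: str, status: int) -> dict:
    """Collect the details named above: model, time, and a key identifier."""
    return {
        "time": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "key_id": key_id,  # an identifier, never the API key value
        "status": status,
    }

def log_failure(model: str, key_id: str, status: int) -> None:
    """Log the failure context as one JSON line."""
    logger.error("request failed: %s", json.dumps(failure_context(model, key_id, status)))
```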

503 — Service Unavailable

What it means

The service is temporarily unavailable or overloaded.

Common causes

  • Provider overload
  • Temporary maintenance window
  • Capacity constraints during high traffic

How to fix

  • Retry with longer backoff than for 500s.
  • Reduce concurrency and request rate.
  • Use fallbacks if configured.
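
Reducing concurrency on the client can be as simple as capping the number of in-flight requests. The sketch below uses a thread pool from the standard library. The helper name run_limited and the default limit of 4 are hypothetical, and each task is assumed to be a zero-argument callable that performs one request.

```python
from concurrent.futures import ThreadPoolExecutor

def run_limited(tasks, max_concurrency=4):
    """Run zero-argument callables with at most max_concurrency in flight.

    Results are returned in the same order as the input tasks.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Lowering max_concurrency while the service is overloaded trades throughput for a lower chance of further 503 responses.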

Timeouts / Connection Errors

Symptoms

  • Requests hang, then fail
  • Streaming disconnects mid-response

Common causes

  • Network instability
  • Too-low client timeout for long generations
  • Very large prompts / high max_tokens