Documentation Index
Fetch the complete documentation index at: https://docs.llmgrid.ai/llms.txt
Use this file to discover all available pages before exploring further.
General
What is LLMGrid?
LLMGrid is an enterprise AI gateway and orchestration platform that provides centralized access, governance, routing, and observability for large language models, tools, and agents.
Is LLMGrid tied to a single model or provider?
No. LLMGrid is provider‑agnostic and designed to support multiple model providers, search tools, vector stores, and integrations behind a single, consistent API surface.
Does LLMGrid require changes to existing OpenAI‑based applications?
Minimal changes are required. Applications using OpenAI‑compatible SDKs typically only need to update the base_url to point to the LLMGrid proxy and use an LLMGrid API key.
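As a concrete sketch, here is what that change amounts to at the HTTP level. The proxy URL and key value below are placeholders, not real endpoints; only the base URL and the API key differ from a direct provider call.

```python
import json
import urllib.request

# Placeholders: use the proxy URL and Virtual Key issued by your deployment.
LLMGRID_BASE_URL = "https://llmgrid.example.com/v1"
LLMGRID_API_KEY = "lg-virtual-key-example"

def build_chat_request(model: str, messages: list) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request aimed at the proxy.

    The request body and headers keep the OpenAI wire format; the only
    changes versus a direct provider call are the host and the key.
    """
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=f"{LLMGRID_BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {LLMGRID_API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("gpt-4o", [{"role": "user", "content": "Hello"}])
```

With an OpenAI-compatible SDK, the same change is simply pointing the client's base_url at the proxy and passing the LLMGrid key.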
Access & Authentication
What is a Virtual Key?
A Virtual Key is an API key managed by LLMGrid that controls authentication, access scope, budgets, rate limits, routing behavior, and observability for requests.
Can Virtual Keys be rotated or revoked?
Yes. Virtual Keys can be rotated or revoked at any time without requiring application redeployment.
How is access controlled?
Access is controlled using a combination of:
- Virtual Keys
- Teams and organizations
- Budgets and limits
- Guardrails
- Routing policies
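Conceptually, these layers compose into a single authorization decision per request. The sketch below is illustrative only; the class and field names are assumptions for this example, not LLMGrid's actual data model, and the real checks happen server-side.

```python
from dataclasses import dataclass

# Illustrative only: field names are assumptions, not LLMGrid's data model.
@dataclass
class VirtualKey:
    key_id: str
    team: str
    allowed_models: frozenset
    budget_usd: float
    spent_usd: float = 0.0

def authorize(key: VirtualKey, model: str, estimated_cost_usd: float) -> tuple:
    """Return (allowed, reason) by checking model scope, then budget."""
    if model not in key.allowed_models:
        return (False, "model outside key scope")
    if key.spent_usd + estimated_cost_usd > key.budget_usd:
        return (False, "budget exceeded")
    return (True, "ok")

key = VirtualKey("vk-1", "search-team", frozenset({"gpt-4o"}), budget_usd=10.0)
```

The point of the sketch is the layering: scope is checked before spend, so a revoked or out-of-scope key never consumes budget.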
Models & Routing
How does model routing work?
Requests are routed based on configured routing strategies, access rules, and fallback policies. Routing can consider availability, performance, and governance constraints.
Can I use model aliases?
Yes. Model aliases allow applications to reference stable identifiers while the underlying models or providers change.
What happens if a model is unavailable?
If fallback models are configured, LLMGrid automatically routes requests to the next available option. Otherwise, an error is returned.
Guardrails & Safety
What are Guardrails?
Guardrails are policy enforcement mechanisms that inspect and control inputs, outputs, and tool execution to ensure safety, compliance, and governance.
When do Guardrails run?
Guardrails can run:
- Before a model call
- During execution
- After a response is generated
- Before tool or MCP execution
- In logging‑only (audit) mode
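These stages can be pictured as hooks around the model call. The following is a minimal illustration with hypothetical guardrails and function names; it is not LLMGrid's internal design, only a way to see how pre-call, post-response, and audit-only enforcement relate.

```python
import re

def block_secrets(text: str) -> str:
    """Hypothetical pre-call guardrail: reject prompts containing key material."""
    if "BEGIN PRIVATE KEY" in text:
        raise PermissionError("guardrail: secret material in prompt")
    return text

def redact_emails(text: str) -> str:
    """Hypothetical post-response guardrail: mask email addresses in output."""
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[redacted]", text)

def guarded_call(prompt: str, model_fn, pre=(), post=(), audit_only=False):
    """Run pre-guardrails, the model call, then post-guardrails.

    In audit-only mode, pre-guardrail failures are recorded instead of
    enforced, mirroring a logging-only deployment.
    """
    events = []
    for g in pre:
        try:
            prompt = g(prompt)
        except PermissionError as exc:
            events.append(str(exc))
            if not audit_only:
                raise
    response = model_fn(prompt)
    for g in post:
        response = g(response)
    return response, events

def echo_model(p: str) -> str:
    """Stand-in for a real model call."""
    return f"Contact admin@example.com about: {p}"
```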
Can Guardrails be scoped?
Yes. Guardrails can be enforced tenant‑wide or scoped to specific keys, teams, routes, or test scenarios.
Usage, Cost & Budgets
How is usage tracked?
Usage is tracked at the request level, including tokens, execution time, routing outcomes, and metadata like keys, tags, and agents.
How do Budgets work?
Budgets define usage limits and rate limits that are enforced automatically. When a budget is exceeded, requests may be throttled or rejected.
Can costs be adjusted or discounted?
Yes. Cost Tracking supports applying percentage‑based discounts to provider costs, which are reflected in usage metrics and headers.
Caching & Performance
What does caching do?
Caching stores responses for repeat or deterministic requests, reducing latency and avoiding repeated model calls.
What cache backends are supported?
LLMGrid supports Redis‑based caching with multiple deployment modes, including single‑node, cluster, sentinel, and semantic‑aware caching.
Does caching affect model behavior?
Caching short‑circuits model execution for cache hits but does not alter model output semantics.
Search Tools & Vector Stores
What are Search Tools?
Search Tools allow models and agents to retrieve external or live information to ground responses and enable retrieval‑augmented generation (RAG).
What are Vector Stores used for?
Vector Stores store and retrieve embeddings for semantic search and RAG workflows. They are referenced by ID and managed centrally.
Can vector stores be tested?
Yes. Vector Stores include a test mode to validate connectivity and availability without impacting production traffic.
Observability & Logging
What types of logs are available?
LLMGrid provides:
- Request logs
- Execution and model logs
- Guardrail enforcement logs
- Audit logs for administrative actions
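Once log records are exported and available programmatically, filtering them can be as simple as matching on record fields. The field names in this sketch are assumptions for illustration, not LLMGrid's actual log schema.

```python
# Filter exported request-log records by arbitrary field criteria.
# The field names below are illustrative, not LLMGrid's actual schema.
logs = [
    {"model": "gpt-4o", "team": "search", "status": "ok", "tokens": 812},
    {"model": "gpt-4o", "team": "search", "status": "error", "tokens": 0},
    {"model": "claude-3", "team": "billing", "status": "ok", "tokens": 120},
]

def filter_logs(records: list, **criteria) -> list:
    """Keep records whose fields match every given criterion."""
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]

errors = filter_logs(logs, team="search", status="error")
```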
Can usage and logs be filtered?
Yes. Logs and metrics can be filtered by time, model, key, team, organization, tag, agent, and status.
Can observability data be exported?
Yes. Observability data can be accessed programmatically and integrated with external analytics or monitoring systems.
Security & Compliance
How is data protected?
LLMGrid enforces secure transport, masked credentials, scoped access, and policy‑based controls through Guardrails and access management.
Is LLMGrid suitable for regulated environments?
LLMGrid supports common enterprise compliance requirements through configuration, observability, and governance controls rather than hard‑coded logic.
Who is responsible for compliance?
LLMGrid provides security and governance tooling, while customers remain responsible for application‑level data handling and regulatory obligations.
Troubleshooting
My requests are failing—where should I look first?
Start with:
- Request Logs
- Guardrail enforcement events
- Budget or rate‑limit violations
- Model availability and routing status
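That checklist amounts to a triage order, sketched below with hypothetical record fields; it checks each distinct failure cause and falls back to the request logs when no specific cause is flagged.

```python
# A sketch of the triage order above. Field names are hypothetical,
# standing in for whatever your log export actually contains.
def triage(request_record: dict) -> str:
    """Return the first likely cause of failure for a request record."""
    if request_record.get("guardrail_blocked"):
        return "guardrail enforcement"
    if request_record.get("budget_exceeded"):
        return "budget or rate limit"
    if not request_record.get("model_available", True):
        return "model availability / routing"
    return "inspect request logs for details"
```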
Cache hit ratio is low—why?
Common reasons include:
- Non‑deterministic prompts
- Missing or inconsistent routing keys
- Semantic caching not enabled where appropriate
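The first two causes come down to cache-key construction: semantically identical requests must normalize to the same key, or every small variation is a miss. The following is a client-side illustration of that idea, not LLMGrid's cache implementation.

```python
import hashlib
import json

def cache_key(model: str, messages: list, temperature: float = 0.0) -> str:
    """Derive a deterministic cache key from the request payload.

    Sorting dict keys and pinning parameter defaults keeps semantically
    identical requests on the same key; a timestamp, request ID, or
    varying sampling parameter in the payload would defeat caching.
    """
    payload = json.dumps(
        {"model": model, "messages": messages, "temperature": temperature},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Same content, different dict ordering: normalizes to the same key.
k1 = cache_key("gpt-4o", [{"role": "user", "content": "What is RAG?"}])
k2 = cache_key("gpt-4o", [{"content": "What is RAG?", "role": "user"}])
```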
A Guardrail is blocking traffic unexpectedly—what should I do?
Review Guardrail logs in logging‑only mode first, adjust scope or thresholds, and validate changes using the Test Playground.
Getting Help
Where can I find more documentation?
Refer to:
- API Reference
- Guardrails
- Routing Settings
- Observability
- Security & Compliance
How do I request a feature?
Use the provided support or issue‑tracking links in the UI to submit feature requests or feedback.
If you need help beyond these FAQs, consult the relevant feature documentation or contact your platform administrator.

