Glossary
A consolidated definition list for the platform-specific vocabulary used across the documentation. For a narrative introduction to the same concepts, see Key Concepts. Product and marketing materials on tetrate.io use some alternate names for the same concepts: for example, control plane for management plane, token budgets for the platform's per-key rate limits, and AI Gateway for the data-plane proxy. This glossary uses the names that appear in these docs; cross-references note the marketing equivalents where they differ.
Active fallback chain
The ordered, priority-ranked list of backends that the gateway walks when a request fails with a retryable error (connection failure, 5xx, or 429). The gateway tries the first backend; on failure it moves to the next until one succeeds or the chain is exhausted, the standard provider fallback pattern used by AI gateways. In this platform, the chain is the priority-ordered portion of an API key's Routing configuration; it is active only when that configuration's Active toggle is on (inactive configurations are saved but not enforced). See Fallback policy, Routing chain, and Improve Resilience with Fallbacks.
Admin Dashboard
The operator-facing application. Where models, providers, users, budgets, SSO, audit logs, and instance settings are managed.
Advanced routing rules
Routing decisions that depend on attributes of the request itself (a custom header, the requested model name, or request metadata) rather than only on static priority or weight. See Apply Advanced Routing Rules.
Agent Router Console
The developer-facing application. Where API keys, routing configurations, MCP profiles, integrations, the Playground, request logs, and personal usage analytics are accessed.
Agent Router Enterprise
The full platform: data plane, management plane, Console, and Admin Dashboard, considered as a single system. Often shortened to "the platform" after first mention.
Agent Router Service
The self-serve product tier: sign up with a GitHub or Google account and route through a Tetrate-managed service without a dedicated Enterprise instance. Includes multi-model routing, MCP access, and per-key usage logs. Agent Router Enterprise adds cross-team cost attribution, admin access controls, runtime AI Guardrails, enterprise SSO, and distributed data-plane deployment. This documentation set focuses on Agent Router Enterprise; tier comparison is on the welcome page.
AI Gateway
The model-routing component of the data plane: the proxy that receives LLM requests from applications, applies routing configuration, attaches credentials, and forwards traffic to upstream providers. On tetrate.io this is one of three product pillars (alongside MCP Gateway and AI Guardrails). In these docs it is also called the Gateway; older materials may use LLM Gateway. Built on Envoy AI Gateway. See Architecture overview.
AI Guardrails
Runtime policy enforcement on model and MCP traffic: PII detection and redaction, prompt and response filtering, and request blocking before traffic reaches upstream providers. Implemented inline in the gateway through Dynamic modules. An Agent Router Enterprise capability; configured in the Admin Dashboard. See Protect Requests with Guardrails.
API key
The credential an application presents to the gateway. Each key is associated with a Console user, a routing configuration, and optional per-key rate limits.
Backend
In routing terms, a model-on-a-provider combination plus the credentials needed to reach it. Routing decisions resolve to a choice of backend.
Bring your own key (BYOK)
The mechanism by which a consumer's own upstream provider credentials are used in place of platform-managed credentials. BYOK credentials are configured at the Console account level and are slotted into routing chains alongside platform-managed credentials. See Use Your Own Provider Credentials.
Budget
An operator-facing spending control implemented through Rate limits on API keys plus Usage analytics for visibility. There is no separate "budget object" in the platform; budgets are the combination of per-key token ceilings (enforced inline with HTTP 429) and regular review of usage by key or team. Tetrate marketing often calls the same mechanism token budgets. See Working with budgets.
Canary deployment
A routing pattern in which a new model version is rolled out gradually, by traffic splitting from a small weight on the new version to progressively larger weights as confidence grows.
Console
Short form for the Agent Router Console.
Control plane
Synonym for Management plane. Tetrate-hosted; stores routing rules, policies, user records, API key metadata, audit history, and analytics rollups. Governs one or more customer-managed data planes. Used on tetrate.io and in the public FAQ; these docs prefer management plane.
Controller
The data-plane component that bridges the Data plane and Management plane. Polls the management plane for configuration updates and translates them into gateway configuration, keeping routing rules, policies, and provider credentials in sync without inbound connections from the internet.
Correlation ID
A UUID attached by the gateway to every request, exposed as the X-Request-ID response header and emitted as a span attribute on the corresponding OpenTelemetry trace. The primary identifier for joining application logs with Request Logs and traces.
Cost attribution
The practice of tying AI spend and usage to a team, application, or agent. In Agent Router Enterprise, attribution relies on per-purpose API keys, Usage analytics, optional export to an observability stack, and (with SSO) authenticated identity on every request. Supports showback and chargeback workflows. An Enterprise-tier capability.
Data plane
The request-path component of the platform: a customer-managed Kubernetes deployment containing the Controller and an AI Gateway proxy built on Envoy AI Gateway. All AI traffic flows through it. Prompt and response payloads stay inside the customer's infrastructure; only configuration and telemetry cross to the management plane.
Dynamic module
A high-performance proxy extension (written in Rust or Go and compiled to a shared library) that runs inline in the gateway's filter chain. Dynamic modules implement the platform's routing logic, credential handling, and provider-specific translation.
Endpoint picker
A pluggable component (Endpoint Picker Provider, EPP) that selects the best upstream inference endpoint per request from a pool of candidates, using real-time signals rather than static weights or round-robin. Standard signals include KV-cache utilisation, queue depth, and prefix-cache affinity. Defined by the Kubernetes Gateway API Inference Extension; integrated in Envoy AI Gateway via InferencePool + Endpoint Picker Provider. In this platform, the Endpoint Picker operates within the eligible set defined by the routing configuration. It does not override policy boundaries. See Apply Advanced Routing Rules and Load Balance Across Regional Deployments.
Envoy AI Gateway
The open-source, CNCF-backed AI gateway co-created and maintained by Tetrate and Bloomberg. The data-plane proxy technology underneath Agent Router Enterprise. Tetrate Agent Router adds the Management plane, model and MCP catalogues, cost attribution, AI Guardrails, SSO, and multi-gateway governance on top. See the Tetrate FAQ for the build-vs-buy comparison.
Fallback policy
An ordered list of backends per API key. The gateway tries the first backend; if it fails with a retryable error, the gateway walks to the next backend, continuing until a backend succeeds or the chain is exhausted. See Improve Resilience with Fallbacks.
Gateway
The proxy component of the data plane that receives application requests, applies routing rules, attaches provider credentials, and translates between API formats. In architecture diagrams and on tetrate.io this component is also called the AI Gateway (formerly LLM Gateway in some older materials). See AI Gateway and Envoy AI Gateway.
Logical model name
A name exposed by a routing configuration that the gateway resolves to a specific provider model identifier. Decouples application code from provider model version strings. Configured through model-name overrides; see Apply Advanced Routing Rules.
Management plane
The Tetrate-hosted control surface. Stores routing rules, policies, user records, API key metadata, audit history, and analytics rollups. Communicates with the data plane through a small, well-defined interface; no application payloads cross the boundary. Tetrate marketing and the public FAQ refer to the same component as the control plane; the terms are interchangeable.
Model context protocol (MCP)
An emerging standard for exposing tools, data sources, and context to AI clients. MCP servers act as adapters between AI clients (Claude Code, Cursor, VS Code, and others) and the systems those clients need to reach.
MCP catalogue
The set of MCP servers registered on the platform by operators. The catalogue is the universe from which developers assemble MCP profiles in the Console. Managed in the Admin Dashboard under MCP Servers.
MCP Gateway
The MCP-routing and governance component of the platform: operators curate an MCP catalogue centrally, developers assemble MCP profiles in the Console, and agents reach approved tools through governed endpoints. On tetrate.io this is a distinct product pillar alongside the AI Gateway. Tool calls receive the same attribution and logging as LLM requests.
MCP profile
A named collection of MCP servers exposed through a single Agent Router URL. AI clients connect to the profile URL once and receive access to every server included in the profile. See Aggregate MCP Servers into a Profile.
OAuth client (MCP)
A configuration record holding the credentials and endpoints needed for OAuth-authenticated MCP servers. Managed in the Admin Dashboard under MCP OAuth Clients.
OpenTelemetry (OTel / OTLP)
The open observability standard the platform uses for trace and metric export. OTLP is the wire protocol. The gateway exports traces over OTLP; metrics are exposed on a Prometheus scrape endpoint that can be forwarded as OTLP by an external collector.
Persona
The audience profile used to organise documentation. The platform serves two primary personas: Developer (Console) and Platform operator (Admin Dashboard), with the Installer treated as a phase of the operator persona during initial setup.
Platform capabilities
The catalogue of features active on a deployment. Visible on the Instances surface under Enabled Features.
Platform operator
The persona responsible for running the platform: provisioning models and providers, governing access, configuring SSO, auditing usage, and operating the system across environments.
Playground
An interactive testing surface inside the Console. Sends messages to any enabled model and renders responses, token counts, and latency in real time. Playground traffic flows through the same gateway as application traffic. See Test Prompts in the Playground.
Provider
An upstream AI service (OpenAI, Anthropic, Google, Azure OpenAI, Mistral, and others) that the gateway routes requests to. Provider configurations carry credentials and connectivity details.
Provider translation
The transparent conversion the gateway performs between its OpenAI-compatible request surface and the provider-specific APIs of upstream services. Includes header mutations, body mutations, path rewriting, model-name override, and response normalisation. See Supported APIs.
Rate limit
A per-API-key cap on token consumption within a rolling one-hour window. Available on input tokens, output tokens, total tokens, or any combination. Enforced inline by the gateway; requests that would exceed an active limit receive HTTP 429. On tetrate.io and in operator guides, the same mechanism is often described as a token budget or budget; see Budget.
Request logs
The per-request record in the Console. Captures the resolved model, provider, token counts, latency, cost, and full request and response payloads for every gateway call under the user's API keys.
Resolved model
The model that actually served a request, after the routing configuration has been evaluated. May differ from the model the application asked for (the requested_model) due to fallback walks, traffic splits, or logical-name overrides.
Role
An Agent Router-level permission set. The role model is small: super_admin, model_admin, provider_admin, mcp_admin, user_admin, billing_admin, and user. Roles can be assigned manually or driven by SSO claim mapping.
Role mapping
The mechanism by which OIDC claims (app roles, group memberships) are translated into Agent Router roles on every login. Configured on the SSO provider in the Admin Dashboard. See the SSO role mapping guide.
Routing chain
The ordered or weighted list of backends attached to an API key. May be a fallback chain (priority-based), a traffic split (weight-based), or a combination.
Routing configuration
The settings that govern how requests on a given API key are dispatched: routing strategy, backend list, weights or priorities, advanced rules, and active/inactive state.
Routing policy
Synonym for Routing configuration. Used in guardrail scoping ("per routing policy"), quickstarts, and some operator guides. The canonical term in the Console UI and developer guides is routing configuration; both refer to the settings on an API key that govern dispatch strategy, backends, weights, and advanced rules.
Service provider (SP) metadata
The SAML metadata produced by the platform that has to be registered on the identity provider for trust to be established. Surfaced on the SSO Settings page after SAML configuration is saved.
Session affinity
The gateway property that keeps a multi-turn conversation, agent workflow, or MCP session pinned to a consistent processing path for its entire lifetime. Applied automatically; no configuration required.
Single sign-on (SSO)
Delegated authentication through a corporate identity provider, over SAML 2.0 or OIDC. Configured in the Admin Dashboard under Settings → SSO. See Configure Single Sign-On. In Agent Router Enterprise, SSO also means every gateway request can carry authenticated user and team identity, enabling per-team cost attribution and audit; this is an Enterprise-tier capability.
Streaming
Server-sent events (SSE) delivery of incremental response tokens. Supported across all three API formats; see Supported APIs for the per-format SSE shape.
Tetrate Agent Router
The umbrella product name for Tetrate's AI gateway offering. Includes Agent Router Service (self-serve) and Agent Router Enterprise (dedicated instance with full governance). Sits in front of existing agent frameworks and provider SDKs via an OpenAI-compatible API; built on Envoy AI Gateway. This documentation set documents Agent Router Enterprise unless a page explicitly compares tiers.
Traffic splitting
A routing strategy in which all backends sit at the same priority and share traffic by weight. Used for cost reduction, A/B evaluation, and gradual migration. See Reduce Cost with Traffic Splitting.
Usage analytics
The aggregated traffic surface. Two variants exist: the Console version, scoped to the signed-in user's own data, and the Admin Dashboard version, aggregated across all users and API keys on the platform.
Where to go next