Configure vendor guardrails
A guardrail is a content-filtering rule enforced inline by the gateway: it inspects a request or a response as it passes through, and it can block, redact, or flag traffic before the caller ever sees it. Tetrate Agent Router distinguishes two kinds. Vendor guardrails are the safety and moderation controls provided natively by an AI provider, such as a provider's own content-moderation pass or built-in safety settings, enabled and configured through the platform rather than re-implemented inside it. Custom guardrails are organisation-specific rules, such as personally identifiable information (PII) detection or a regex blocklist, that the platform enforces itself.
This guide covers the vendor side. Enabling vendor guardrails leans on the safety work the provider has already done, with no extra filtering infrastructure to run inside the data plane. The provider's model performs the moderation; the platform's role is to switch that behaviour on, decide where it applies, and surface the result consistently to callers and to the audit trail. For organisation-specific rules that no provider can know about, the Configure custom guardrails guide is the right surface, and the two are designed to be used together.
Persona: Platform operator working in the Admin Dashboard.
Estimated time: 10 to 20 minutes for an initial pass, depending on how many models and providers are in scope.
When this guide applies
This guide is the right starting point in any of these situations:
| Situation | What it covers |
|---|---|
| Turning on a provider's native safety controls for the first time | Enabling vendor guardrails and choosing the scope at which they apply |
| Deciding where a vendor guardrail should bind | Comparing per-model, per-provider, and per-routing-policy scope |
| Understanding what a vendor guardrail can and cannot catch | The content categories vendor guardrails typically cover |
| Explaining a blocked request to a developer | How a block is surfaced to the caller and recorded for review |
| Layering provider-native safety with organisation-specific rules | How vendor guardrails relate to custom guardrails |
For the developer-side view (how a blocked response appears in application code and how a request is composed against a guardrail-protected key), see Protect requests with guardrails. That guide assumes the configuration this one establishes.
Outcomes
By the end of this guide:
- At least one vendor guardrail is enabled and bound to a defined scope.
- The trade-offs between per-model, per-provider, and per-routing-policy scope are understood.
- The behaviour a caller sees when a request is blocked is known, so it can be explained to developers.
- The relationship between vendor guardrails and custom guardrails is clear, and the two are used deliberately rather than interchangeably.
Prerequisites
- Administrator access to the Admin Dashboard, typically the super_admin role, or a role granting guardrail configuration.
- At least one provider configured with a healthy connection, and at least one of its models enabled. The provisioning steps are covered in Provision models and providers.
- An understanding of which of the in-scope providers expose native safety or moderation controls. Not every provider does, and the categories each one supports vary; the provider's own documentation is the authoritative source.
Step 1: confirm which providers expose native safety controls
Vendor guardrails depend entirely on capability the upstream provider supplies. The first task is therefore to establish which of the configured providers offer safety or moderation controls and what those controls cover, because nothing can be enabled in the platform that the provider does not support.
- For each provider in scope, consult the provider's own documentation for its safety, moderation, or content-filtering features. Naming varies between providers, and so does the granularity of control.
- Note that a provider without native safety controls cannot be protected by a vendor guardrail at all. Coverage for those providers comes from custom guardrails instead, described in Configure custom guardrails.
- Record, per provider, the categories the controls address and whether thresholds are adjustable. This record drives the scope decision in Step 3.
Treating the provider documentation as the source of truth avoids the most common mistake: assuming a guardrail is active for traffic that flows through a provider with no native moderation, where the platform has nothing to enable.
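The per-provider record described above can be kept in any form. As a minimal sketch, assuming a hand-maintained inventory rather than anything the platform exposes, the following Python models one row per provider and separates the providers eligible for a vendor guardrail from those that need custom coverage. The class, field names, and provider names are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProviderCapability:
    """One row of the Step 1 record. Every field name here is illustrative."""
    provider: str
    has_native_controls: bool
    categories: frozenset = field(default_factory=frozenset)
    thresholds_adjustable: bool = False


def eligible_for_vendor_guardrail(inventory):
    """Providers whose native controls a vendor guardrail could enable."""
    return [c.provider for c in inventory if c.has_native_controls]


def needs_custom_coverage(inventory):
    """Providers with no native controls; coverage must come from custom guardrails."""
    return [c.provider for c in inventory if not c.has_native_controls]


# Hypothetical providers, filled in from each provider's own documentation.
inventory = [
    ProviderCapability("provider-a", True, frozenset({"hate", "violence"}), True),
    ProviderCapability("provider-b", False),
]

print(eligible_for_vendor_guardrail(inventory))
print(needs_custom_coverage(inventory))
```

Keeping the record explicit makes the gap visible: any provider in the second list carries traffic that no vendor guardrail can protect.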
Step 2: understand what vendor guardrails cover
Vendor guardrails operate on the categories the provider's safety model recognises. The exact taxonomy differs between providers, but the controls typically address content such as the following:
- Harmful or unsafe content: categories such as hate, harassment, self-harm, violence, and sexual content, as defined by the provider.
- Safety thresholds: where supported, a sensitivity level that governs how aggressively borderline content is acted upon, rather than a simple on/off switch.
Two properties of vendor guardrails follow from the fact that the provider owns the logic:
- The category names, definitions, and threshold semantics are the provider's, not the platform's. A category labelled the same way by two providers may behave differently.
- The platform does not redefine these categories. It enables the provider's controls and applies them to the traffic in scope, then surfaces the outcome through a single, consistent interface regardless of which provider produced it.
Vendor guardrails are well suited to broad, provider-native safety. They are not the right tool for organisation-specific rules (a list of confidential project codenames, an internal PII policy, or a regex pattern unique to one team), which no provider can be expected to know. Those belong to custom guardrails, covered in Step 6.
Step 3: choose the scope at which the guardrail applies
A vendor guardrail binds to a scope that determines which traffic it inspects. The platform supports three scopes, listed here from narrowest to broadest:
| Scope | What it protects | When to use it |
|---|---|---|
| Per model | Traffic to one enabled model | A single model needs safety controls that others do not, or a model is being trialled before wider rollout |
| Per provider | All traffic to every model from one provider | A provider's native controls should apply uniformly to everything it serves |
| Per routing policy | All traffic governed by one routing policy, across whichever backends that policy targets | Safety behaviour should follow a use case or an audience rather than a specific model or provider |
The choice is a governance decision rather than a technical one. Per-provider scope is the simplest to reason about when a provider's controls should apply everywhere it is used. Per-model scope suits trials and exceptions. Per-routing-policy scope is the most expressive, because it ties safety behaviour to the policy that already expresses an audience or a use case, and it continues to apply as the underlying backends in that policy change.
Where scopes overlap (a per-provider guardrail and a per-model guardrail both touching the same traffic), the more specific binding governs that traffic. Keeping the bindings deliberate, rather than enabling the same control at several scopes at once, keeps the resulting behaviour easy to audit.
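The overlap rule can be illustrated with a short sketch. This is not the platform's implementation: the types and names below are hypothetical, and ranking a routing-policy binding as the least specific is an assumption drawn from the narrowest-to-broadest ordering of the scope table.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Scope(IntEnum):
    # Lower value means more specific. Placing ROUTING_POLICY last is an
    # assumption based on the table's narrowest-to-broadest ordering.
    MODEL = 0
    PROVIDER = 1
    ROUTING_POLICY = 2


@dataclass(frozen=True)
class Binding:
    guardrail: str
    scope: Scope
    target: str  # name of the model, provider, or routing policy


@dataclass(frozen=True)
class Request:
    model: str
    provider: str
    routing_policy: str


def matches(binding: Binding, request: Request) -> bool:
    """True when the binding's target covers this request."""
    value = {
        Scope.MODEL: request.model,
        Scope.PROVIDER: request.provider,
        Scope.ROUTING_POLICY: request.routing_policy,
    }[binding.scope]
    return binding.target == value


def governing_binding(bindings, request: Request) -> Optional[Binding]:
    """Pick the most specific matching binding, or None if nothing matches."""
    candidates = [b for b in bindings if matches(b, request)]
    return min(candidates, key=lambda b: b.scope, default=None)


bindings = [
    Binding("provider-wide-safety", Scope.PROVIDER, "provider-a"),
    Binding("trial-model-safety", Scope.MODEL, "model-x"),
]
trial = Request(model="model-x", provider="provider-a", routing_policy="default")
other = Request(model="model-y", provider="provider-a", routing_policy="default")
```

Here `governing_binding(bindings, trial)` selects the per-model binding, while `governing_binding(bindings, other)` falls back to the per-provider one.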
Step 4: enable the vendor guardrail
With the provider capability confirmed and the scope chosen, the guardrail is enabled from the Admin Dashboard.
- Sign in to the Admin Dashboard.
- Open the guardrails surface from the sidebar.
- Create a guardrail and select the vendor type, indicating that the provider's native safety controls are to be used rather than a platform-enforced rule.
- Select the provider whose controls the guardrail will use. Only providers that expose native safety controls are eligible.
- Configure the categories and, where the provider supports it, the safety threshold for each. The available categories and thresholds reflect what the provider offers, as established in Step 1.
- Bind the guardrail to the scope chosen in Step 3: a specific model, a provider, or a routing policy.
- Save the configuration.
The guardrail takes effect on subsequent requests within its scope. Requests already in flight complete under the configuration that was active when they were admitted; there is no service restart and no downtime window.
Step 5: understand how a blocked request is surfaced and recorded
When a vendor guardrail acts on a request, the outcome reaches the caller and the audit trail through the same path regardless of which provider made the safety determination.
- To the caller, a blocked request returns an error response rather than a model completion. The response indicates that the request was stopped by a guardrail rather than failing for an unrelated reason such as an authentication or an availability error, so that application code and developers can distinguish a policy block from a transport failure. The developer-side handling of this response is covered in Protect requests with guardrails.
- In the audit trail, a guardrail action is captured as an event: which guardrail acted, the scope it was bound to, and the disposition of the request. These events are reviewable alongside the platform's other administrative and traffic events, described in Audit platform activity.
Because the platform normalises the outcome, a developer does not need to know which provider's safety model produced a block in order to handle it, and an operator does not need to consult each provider's own logs to audit guardrail activity across the platform.
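As a rough illustration of why a normalised outcome helps, the sketch below classifies a failed response and models the three facts an audit event records. The response shape is an assumption: the `error.type` field and the `guardrail_blocked` value are hypothetical placeholders, not the platform's documented format, which is described in Protect requests with guardrails.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class GuardrailEvent:
    """The three facts the guide says an event captures; field names are illustrative."""
    guardrail: str
    scope: str
    disposition: str


def classify_failure(status: int, body: Mapping[str, Any]) -> str:
    """Separate a policy block from authentication and availability failures."""
    error = body.get("error")
    error_type = error.get("type") if isinstance(error, Mapping) else None
    # Check the guardrail marker first, so the result does not depend on
    # whichever HTTP status accompanies a block.
    if error_type == "guardrail_blocked":  # hypothetical marker value
        return "policy_block"
    if status in (401, 403):
        return "auth_error"
    if status >= 500:
        return "availability_error"
    return "other_error"


event = GuardrailEvent(
    guardrail="provider-wide-safety",
    scope="provider:provider-a",
    disposition="blocked",
)
```

The classifier never inspects which provider made the determination, which is the point of the normalisation: one code path handles a block from any provider.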
Step 6: combine vendor and custom guardrails
Vendor and custom guardrails are complementary, and a typical deployment uses both:
- Vendor guardrails provide broad, provider-native safety (harmful-content categories and safety thresholds maintained by the provider) with no filtering infrastructure to run inside the data plane.
- Custom guardrails enforce organisation-specific rules the platform applies itself, such as PII detection, keyword and regex blocklists, and other policies unique to the organisation. These are described in Configure custom guardrails.
The two operate in the same inline filter path, and both must permit a request for it to reach the model. Vendor guardrails are the right choice where a provider already maintains the relevant safety logic; custom guardrails cover everything provider-native controls cannot know about. Deciding category by category which layer owns a given concern, rather than duplicating the same intent across both, keeps the overall policy coherent and auditable.
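The requirement that both layers permit a request can be pictured as a chain of checks in which any single refusal stops the request. The sketch below is illustrative only. The vendor check is a stub standing in for the provider's moderation, the codename pattern is invented, and the evaluation order is an assumption, since the guide does not state which layer runs first.

```python
import re
from typing import Callable, Iterable, NamedTuple, Optional, Tuple


class Verdict(NamedTuple):
    allowed: bool
    blocked_by: Optional[str] = None


# A check returns True when it permits the text.
Check = Callable[[str], bool]


def evaluate(text: str, guardrails: Iterable[Tuple[str, Check]]) -> Verdict:
    """Allow the text only if every guardrail in the chain permits it."""
    for name, permits in guardrails:
        if not permits(text):
            return Verdict(False, name)
    return Verdict(True)


def vendor_moderation_stub(text: str) -> bool:
    """Stand-in for the provider's own moderation; the real logic is the provider's."""
    return "unsafe-example" not in text


# Hypothetical organisation-specific rule: a confidential project codename.
CODENAME = re.compile(r"\bproject-falcon\b", re.IGNORECASE)


def custom_blocklist(text: str) -> bool:
    return CODENAME.search(text) is None


chain = [
    ("vendor-moderation", vendor_moderation_stub),
    ("custom-blocklist", custom_blocklist),
]
```

With this chain, `evaluate("Status of Project-Falcon?", chain)` is refused by the custom layer even though the vendor stub permits it, which is the kind of concern provider-native controls cannot know about.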
What to do next
- Configure custom guardrails: add organisation-specific rules such as PII detection and regex blocklists alongside the provider-native controls established here. See Configure custom guardrails.
- Protect requests with guardrails: review the developer-side handling of guardrail-blocked responses. See Protect requests with guardrails.
- Audit platform activity: review the events generated when a guardrail acts on a request. See Audit platform activity.
- Glossary: confirm the definitions of guardrail and related terms. See Glossary.
The vendor guardrails configured in this guide remain in place for subsequent guides.