
Configure vendor guardrails

A guardrail is a content-filtering rule enforced inline by the gateway: it inspects a request or a response as it passes through, and it can block, redact, or flag traffic before the caller ever sees it. Tetrate Agent Router distinguishes two kinds. Vendor guardrails are the safety and moderation controls provided natively by an AI provider, such as a provider's own content-moderation pass or built-in safety settings, enabled and configured through the platform rather than re-implemented inside it. Custom guardrails are organisation-specific rules, such as personally identifiable information (PII) detection or a regex blocklist, that the platform enforces itself.


This guide covers the vendor side. Enabling vendor guardrails leans on the safety work the provider has already done, with no extra filtering infrastructure to run inside the data plane. The provider's model performs the moderation; the platform's role is to switch that behaviour on, decide where it applies, and surface the result consistently to callers and to the audit trail. For organisation-specific rules that no provider can know about, the Configure custom guardrails guide is the right surface, and the two are designed to be used together.

Persona: Platform operator working in the Admin Dashboard.

Estimated time: 10 to 20 minutes for an initial pass, depending on how many models and providers are in scope.

When this guide applies

This guide is the right starting point in any of these situations:

| Situation | What it covers |
| --- | --- |
| Turning on a provider's native safety controls for the first time | Enabling vendor guardrails and choosing the scope at which they apply |
| Deciding where a vendor guardrail should bind | Comparing per-model, per-provider, and per-routing-policy scope |
| Understanding what a vendor guardrail can and cannot catch | The content categories vendor guardrails typically cover |
| Explaining a blocked request to a developer | How a block is surfaced to the caller and recorded for review |
| Layering provider-native safety with organisation-specific rules | How vendor guardrails relate to custom guardrails |

For the developer-side view (how a blocked response appears in application code and how a request is composed against a guardrail-protected key), see Protect requests with guardrails. That guide assumes the configuration this one establishes.

Outcomes

By the end of this guide:

  • At least one vendor guardrail is enabled and bound to a defined scope.
  • The trade-offs between per-model, per-provider, and per-routing-policy scope are understood.
  • The behaviour a caller sees when a request is blocked is known, so it can be explained to developers.
  • The relationship between vendor guardrails and custom guardrails is clear, and the two are used deliberately rather than interchangeably.

Prerequisites

  • Administrator access to the Admin Dashboard, typically the super_admin role, or a role granting guardrail configuration.
  • At least one provider configured with a healthy connection, and at least one of its models enabled. The provisioning steps are covered in Provision models and providers.
  • An understanding of which of the in-scope providers expose native safety or moderation controls. Not every provider does, and the categories each one supports vary; the provider's own documentation is the authoritative source.

Step 1: confirm which providers expose native safety controls

Vendor guardrails depend entirely on capability the upstream provider supplies. The first task is therefore to establish which of the configured providers offer safety or moderation controls and what those controls cover, because nothing can be enabled in the platform that the provider does not support.

  • For each provider in scope, consult the provider's own documentation for its safety, moderation, or content-filtering features. Naming varies between providers, and so does the granularity of control.
  • Note that a provider without native safety controls cannot be protected by a vendor guardrail at all. Coverage for those providers comes from custom guardrails instead, described in Configure custom guardrails.
  • Record, per provider, the categories the controls address and whether thresholds are adjustable. This record drives the scope decision in Step 3 and the category and threshold configuration in Step 4.

Treating the provider documentation as the source of truth avoids the most common mistake: assuming a guardrail is active for traffic that flows through a provider with no native moderation, where the platform has nothing to enable.
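The per-provider record described above can be kept in any form. As a minimal sketch, assuming a hypothetical `ProviderCapability` record and placeholder provider names (none of these are platform schema or API names), the inventory can be split into providers a vendor guardrail can protect and providers that need custom guardrails instead:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderCapability:
    """One row of the Step 1 record (hypothetical shape, not a platform schema)."""
    provider: str
    has_native_controls: bool
    categories: tuple[str, ...] = ()
    thresholds_adjustable: bool = False


def split_by_coverage(inventory):
    """Separate providers eligible for a vendor guardrail from those that are not."""
    eligible = [c.provider for c in inventory if c.has_native_controls]
    needs_custom = [c.provider for c in inventory if not c.has_native_controls]
    return eligible, needs_custom


# Placeholder data; real values come from each provider's own documentation.
inventory = [
    ProviderCapability("provider-a", True, ("hate", "violence"), thresholds_adjustable=True),
    ProviderCapability("provider-b", False),
]
eligible, needs_custom = split_by_coverage(inventory)
print("vendor guardrail possible:", eligible)
print("custom guardrails only:", needs_custom)
```

Keeping the record explicit makes the gap visible: any provider in the second list carries traffic that no vendor guardrail inspects.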

Step 2: understand what vendor guardrails cover

Vendor guardrails operate on the categories the provider's safety model recognises. The exact taxonomy differs between providers, but the controls typically address content such as the following:

  • Harmful or unsafe content: categories such as hate, harassment, self-harm, violence, and sexual content, as defined by the provider.
  • Safety thresholds: where supported, a sensitivity level that governs how aggressively borderline content is acted upon, rather than a simple on/off switch.

Two properties of vendor guardrails follow from the fact that the provider owns the logic:

  • The category names, definitions, and threshold semantics are the provider's, not the platform's. A category labelled the same way by two providers may behave differently.
  • The platform does not redefine these categories. It enables the provider's controls and applies them to the traffic in scope, then surfaces the outcome through a single, consistent interface regardless of which provider produced it.

Vendor guardrails are well suited to broad, provider-native safety. They are not the right tool for organisation-specific rules (a list of confidential project codenames, an internal PII policy, or a regex pattern unique to one team), which no provider can be expected to know. Those belong to custom guardrails, covered in Step 6.

Step 3: choose the scope at which the guardrail applies

A vendor guardrail binds to a scope that determines which traffic it inspects. The platform supports three, from narrowest to broadest:

| Scope | What it protects | When to use it |
| --- | --- | --- |
| Per model | Traffic to one enabled model | A single model needs safety controls that others do not, or a model is being trialled before wider rollout |
| Per provider | All traffic to every model from one provider | A provider's native controls should apply uniformly to everything it serves |
| Per routing policy | All traffic governed by one routing policy, across whichever backends that policy targets | Safety behaviour should follow a use case or an audience rather than a specific model or provider |

The choice is a governance decision rather than a technical one. Per-provider scope is the simplest to reason about when a provider's controls should apply everywhere it is used. Per-model scope suits trials and exceptions. Per-routing-policy scope is the most expressive, because it ties safety behaviour to the policy that already expresses an audience or a use case, and it continues to apply as the underlying backends in that policy change.

Where scopes overlap (for example, a per-provider guardrail and a per-model guardrail both touching the same traffic), the more specific binding governs that traffic. Keeping the bindings deliberate, rather than enabling the same control at several scopes at once, keeps the resulting behaviour easy to audit.
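The overlap rule can be illustrated with a short sketch. It assumes that specificity follows the narrowest-to-broadest order of the scope table (model, then provider, then routing policy); the guide states only that the more specific binding governs, so that ordering, along with the `Binding`, `Request`, and `governing_binding` names, is an assumption made for illustration and not the platform's implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed precedence: lower number means more specific (narrowest scope first).
SPECIFICITY = {"model": 0, "provider": 1, "routing_policy": 2}


@dataclass(frozen=True)
class Binding:
    guardrail: str
    scope: str   # "model", "provider", or "routing_policy"
    target: str  # name of the model, provider, or routing policy


@dataclass(frozen=True)
class Request:
    model: str
    provider: str
    routing_policy: str


def governing_binding(request: Request, bindings: list[Binding]) -> Optional[Binding]:
    """Return the most specific binding that matches the request, if any."""
    # A binding matches when the request attribute named by its scope equals its target.
    matches = [b for b in bindings if getattr(request, b.scope) == b.target]
    return min(matches, key=lambda b: SPECIFICITY[b.scope], default=None)


request = Request(model="model-x", provider="provider-a", routing_policy="support-bot")
bindings = [
    Binding("provider-wide-safety", "provider", "provider-a"),
    Binding("model-x-trial", "model", "model-x"),
]
winner = governing_binding(request, bindings)
print(winner.guardrail if winner else "no guardrail in scope")
```

Here both bindings touch the request, and the per-model binding is selected because it is the narrower of the two.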

Step 4: enable the vendor guardrail

With the provider capability confirmed and the scope chosen, the guardrail is enabled from the Admin Dashboard.

  1. Sign in to the Admin Dashboard.
  2. Open the guardrails surface from the sidebar.
  3. Create a guardrail and select the vendor type, indicating that the provider's native safety controls are to be used rather than a platform-enforced rule.
  4. Select the provider whose controls the guardrail will use. Only providers that expose native safety controls are eligible.
  5. Configure the categories and, where the provider supports it, the safety threshold for each. The available categories and thresholds reflect what the provider offers, as established in Step 1.
  6. Bind the guardrail to the scope chosen in Step 3: a specific model, a provider, or a routing policy.
  7. Save the configuration.

The guardrail takes effect on subsequent requests within its scope. Requests already in flight complete under the configuration that was active when they were admitted; there is no service restart and no downtime window.
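One way to picture this behaviour is a configuration store that replaces its snapshot on save instead of mutating it, so a request keeps the snapshot it was admitted under. The sketch below is a conceptual model only; `GuardrailConfigStore` and its methods are hypothetical names, and nothing here describes how the platform is actually built.

```python
import threading


class GuardrailConfigStore:
    """Conceptual model: each save swaps in a new snapshot; old snapshots are never edited."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._current = dict(initial)

    def save(self, new_config):
        # Replace the snapshot wholesale so earlier references stay unchanged.
        with self._lock:
            self._current = dict(new_config)

    def admit(self):
        # A request captures the snapshot active at admission and uses it until it completes.
        with self._lock:
            return self._current


store = GuardrailConfigStore({"vendor_guardrail_enabled": False})
in_flight = store.admit()                         # admitted before the change
store.save({"vendor_guardrail_enabled": True})    # operator saves the guardrail
later = store.admit()                             # admitted after the change

print(in_flight["vendor_guardrail_enabled"], later["vendor_guardrail_enabled"])
```

The in-flight request still sees the earlier configuration, while the next request sees the saved one, with no restart in between.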

Step 5: understand how a blocked request is surfaced and recorded

When a vendor guardrail acts on a request, the outcome reaches the caller and the audit trail through the same path regardless of which provider made the safety determination.

  • To the caller, a blocked request returns an error response rather than a model completion. The response indicates that the request was stopped by a guardrail rather than failing for an unrelated reason such as an authentication or an availability error, so that application code and developers can distinguish a policy block from a transport failure. The developer-side handling of this response is covered in Protect requests with guardrails.
  • For the record, a guardrail action is captured as an event: which guardrail acted, the scope it was bound to, and the disposition of the request. These events are reviewable alongside the platform's other administrative and traffic events, described in Audit platform activity.

Because the platform normalises the outcome, a developer does not need to know which provider's safety model produced a block in order to handle it, and an operator does not need to consult each provider's own logs to audit guardrail activity across the platform.
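To show why a normalised record simplifies auditing, the sketch below models a guardrail event with the three facts named above (which guardrail acted, the scope it was bound to, and the disposition) and counts blocks per guardrail across providers. The `GuardrailEvent` shape, the disposition labels, and the sample data are illustrative assumptions, not the platform's event schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardrailEvent:
    """Hypothetical normalised event; field names are assumptions for illustration."""
    guardrail: str
    scope: str
    disposition: str  # assumed labels: "blocked", "redacted", "flagged"


def blocks_per_guardrail(events):
    """Count blocked requests per guardrail, regardless of the provider behind each one."""
    return Counter(e.guardrail for e in events if e.disposition == "blocked")


# Placeholder events standing in for entries from the audit trail.
events = [
    GuardrailEvent("provider-a-safety", "provider", "blocked"),
    GuardrailEvent("provider-a-safety", "provider", "flagged"),
    GuardrailEvent("model-x-trial", "model", "blocked"),
    GuardrailEvent("provider-a-safety", "provider", "blocked"),
]
summary = blocks_per_guardrail(events)
print(dict(summary))
```

Because every event has the same shape, the same query works whichever provider's safety model made the determination.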

Step 6: combine vendor and custom guardrails

Vendor and custom guardrails are complementary, and a typical deployment uses both:

  • Vendor guardrails provide broad, provider-native safety (harmful-content categories and safety thresholds maintained by the provider) with no filtering infrastructure to run inside the data plane.
  • Custom guardrails enforce organisation-specific rules the platform applies itself, such as PII detection, keyword and regex blocklists, and other policies unique to the organisation. These are described in Configure custom guardrails.

The two operate in the same inline filter path, and both must permit a request for it to reach the model. Vendor guardrails are the right choice where a provider already maintains the relevant safety logic; custom guardrails cover everything provider-native controls cannot know about. Deciding category by category which layer owns a given concern, rather than duplicating the same intent across both, keeps the overall policy coherent and auditable.
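The rule that both layers must permit a request amounts to a logical AND over the checks in the filter path. The following sketch uses a stand-in function for the vendor verdict (the real decision is made by the provider's model) and a hypothetical codename pattern for the custom rule; every name in it is an assumption made for illustration.

```python
import re
from typing import Callable, Iterable

Check = Callable[[str], bool]  # returns True when the text is permitted

# Hypothetical organisation-specific pattern for confidential codenames.
CODENAME = re.compile(r"\bPROJECT-[A-Z]+\b")


def vendor_check(text: str) -> bool:
    """Stand-in for the provider's moderation verdict, which is decided upstream."""
    return "unsafe-example" not in text


def custom_check(text: str) -> bool:
    """Platform-enforced rule: reject text that mentions a confidential codename."""
    return CODENAME.search(text) is None


def permitted(text: str, checks: Iterable[Check]) -> bool:
    """A request proceeds only if every guardrail in the path permits it."""
    return all(check(text) for check in checks)


CHECKS = (vendor_check, custom_check)
for prompt in ("summarise this ticket", "status of PROJECT-ORION"):
    print(prompt, "->", "allowed" if permitted(prompt, CHECKS) else "blocked")
```

The second prompt passes the stand-in vendor check but is stopped by the custom rule, which is the kind of concern a provider cannot know about.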

What to do next

  • Configure custom guardrails: add organisation-specific rules such as PII detection and regex blocklists alongside the provider-native controls established here. See Configure custom guardrails.
  • Protect requests with guardrails: review the developer-side handling of guardrail-blocked responses. See Protect requests with guardrails.
  • Audit platform activity: review the events generated when a guardrail acts on a request. See Audit platform activity.
  • Glossary: confirm the definitions of guardrail and related terms. See Glossary.

The vendor guardrails configured in this guide remain in place for subsequent guides.