Manage configuration as code
Platform configuration (the policies, budgets, model catalogue, pricing, and API keys that govern how requests are routed and what they cost) has to live somewhere, and the choice of where determines who can change it, how a change is reviewed, and whether the change can be reproduced on a second instance. Agent Router Enterprise exposes the same configuration through three surfaces: the Admin Dashboard UI, the admin API, and infrastructure as code (IaC) via Terraform and Helm. Most deployments use the three together, each for the kind of change it suits, rather than choosing one.
Persona: Platform operator or platform engineer responsible for how the platform is configured and how that configuration is reviewed and promoted across environments.
Estimated time: 20 to 30 minutes to read; longer to design and adopt an IaC workflow for a specific deployment.
When this guide applies
This guide is relevant in any of these situations:
| Situation | What it covers |
|---|---|
| A non-engineer needs to read or adjust budgets and policies | The Admin Dashboard UI surfaces for budgets, policies, the model catalogue, pricing, and API keys |
| The baseline configuration has to be reviewed and version-controlled before it reaches production | The IaC path via Terraform and Helm, and why it is preferred for the baseline |
| The same configuration has to be reproduced across staging, production, and regional instances | Promotion of versioned configuration across environments |
| A routing or policy change has to take effect without taking the platform down | The dynamic configuration model |
| An auditor or security reviewer asks who changed what and when | The audit record that covers every administrative change regardless of surface |
| UI edits and IaC runs are starting to overwrite each other | The recommended split of ownership between the two |
Outcomes
By the end of this guide:
- The three configuration surfaces (the Admin Dashboard UI, the admin API, and IaC via Terraform and Helm) are understood, along with the kind of change each suits.
- The reasons to manage the baseline configuration as code (review, version control, and reproducibility) are clear.
- It is clear that policy and routing changes take effect dynamically, without a redeploy of the data plane.
- It is clear that every administrative change is versioned and recorded in the audit log, regardless of which surface made it.
- A division of ownership between UI-driven and IaC-driven changes has been chosen, with a rule for keeping the two from overwriting each other.
Prerequisites
- Administrator access to the Admin Dashboard (typically the `super_admin` role) for the UI and audit surfaces.
- For the admin API: an API credential with administrative scope, issued from the Admin Dashboard.
- For the IaC path: a Terraform or Helm toolchain, a version-control repository, and, for Enterprise deployments, the Kubernetes context that hosts the data plane. The Kubernetes resources the installer renders are covered in Retrieving data plane resources.
- A working model catalogue, so that policies and budgets have something to act on. Provisioning is covered in Provision models and providers.
Step 1: choose the right surface for the change
The first decision for any configuration change is which surface to make it through. The three surfaces read and write the same underlying configuration, so the choice is about review, reproducibility, and who is making the change, not about capability.
| Surface | Best for | Audience |
|---|---|---|
| Admin Dashboard UI | Day-to-day adjustments: reading a budget, tightening a policy, enabling a model, rotating a key | Operators, and non-engineers who need to read or adjust budgets and policies without touching code |
| Admin API | Scripted or automated changes, integration with internal tooling, bulk operations | Platform engineers and automation |
| IaC (Terraform and Helm) | The baseline configuration that has to be reviewed, versioned, and reproduced across instances | Platform engineers |
The UI is the surface a non-engineer can use. Budgets and policies are presented as forms rather than as code, so a finance owner can read what a budget is set to, and a security owner can read what a policy enforces, without reading a manifest. Where their permissions allow it, the same forms let them adjust those values in place. The day-to-day adjustments that follow a single decision (raising one team's budget, adding one model to the catalogue, or rotating one key) are fastest through the UI.
The admin API exposes the same operations programmatically. It is the surface for automation: a script that provisions a new team's keys and budget in one pass, an integration that syncs the model catalogue from an internal source of truth, or a bulk policy update across many keys. The API is also what the Terraform provider calls underneath.
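As a rough sketch of the "one pass" automation described above, the following Python builds the full set of changes a provisioning script would submit for a new team. The `Operation` type, the operation names (`create_key`, `create_budget`), the field names, and the `plan_team_onboarding` helper are all hypothetical and do not reflect the actual admin API schema; a real script would send each operation through the endpoints documented for the deployment's release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """One hypothetical admin change; not the real admin API schema."""
    kind: str      # assumed names: "create_key" or "create_budget"
    target: str    # the team the change applies to
    payload: dict


def plan_team_onboarding(team: str, key_count: int, limit_usd: int) -> list[Operation]:
    """Build every change a new team needs so they can be submitted together."""
    if key_count < 1:
        raise ValueError("a team needs at least one key")
    ops = [
        Operation("create_key", f"team:{team}", {"name": f"{team}-key-{i}"})
        for i in range(1, key_count + 1)
    ]
    ops.append(
        Operation(
            "create_budget",
            f"team:{team}",
            {"name": f"{team}-monthly", "limit_usd": limit_usd, "period": "monthly"},
        )
    )
    return ops


plan = plan_team_onboarding("research", key_count=2, limit_usd=5000)
for op in plan:
    print(op.kind, op.target, op.payload["name"])
```

Building the plan as data before sending anything keeps the script easy to review and to dry-run, which matters for bulk operations.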
IaC is the surface for the baseline: the configuration that defines what the platform is before any day-to-day adjustment. The next step covers why.
Step 2: understand why the baseline is managed as code
Any single configuration value can be set through the UI. The reason teams move the baseline into IaC is not capability; it is the discipline that version control brings.
- Review. An IaC change is a pull request. The proposed change to a policy, a budget, or a routing rule is visible as a diff, can be commented on, and is approved by a second person before it is applied. A UI change has no equivalent gate; it takes effect when the operator selects Save.
- Version control. The repository is the history of how the configuration reached its current state. Every change has an author, a timestamp, and a commit message explaining why. Reverting a change is a revert of a commit, not a reconstruction from memory.
- Reproducibility. The same configuration definition can be applied to staging, then to production, then to a new regional instance, and produce the same result each time. Reproducing a UI-built configuration on a second instance means clicking through the same forms again and hoping nothing was missed, which is how the configuration drift that the multi-instance guide warns about begins. See Run multiple platform instances.
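To make the review and reproducibility points concrete, here is a minimal sketch in Python of what a reviewer gains from a text-based definition. It is not how Terraform computes a plan; it only uses the standard library's `difflib` to show a proposed change as a diff. The dictionary layout and the `render` and `review_diff` helpers are assumptions made for the illustration.

```python
import difflib
import json


def render(config: dict) -> list[str]:
    """Serialize deterministically so the same config always yields the same lines."""
    return json.dumps(config, indent=2, sort_keys=True).splitlines()


def review_diff(current: dict, proposed: dict) -> list[str]:
    """Return unified-diff lines, the form a reviewer sees in a pull request."""
    return list(
        difflib.unified_diff(
            render(current),
            render(proposed),
            fromfile="baseline (current)",
            tofile="baseline (proposed)",
            lineterm="",
        )
    )


current = {"budgets": {"research-team-monthly": {"limit_usd": 5000, "period": "monthly"}}}
proposed = {"budgets": {"research-team-monthly": {"limit_usd": 7500, "period": "monthly"}}}

for line in review_diff(current, proposed):
    print(line)
```

The deterministic rendering is the reproducibility half of the argument: the same definition produces the same text on every run, so an empty diff means the two states match.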
Two toolchains cover the IaC path:
- Helm renders and applies the Kubernetes resources for the data plane itself: the Controller, the Agent Router gateway, and their supporting objects. This is the installation-time layer, covered in Retrieving data plane resources, where the rendered manifest can be committed to a GitOps repository and applied by Argo CD or Flux rather than applied directly.
- Terraform manages the platform-level configuration (providers, the model catalogue, policies, budgets, and keys) as declarative resources that call the admin API underneath. This is the layer that defines what the running platform does, as opposed to how it is deployed.
The examples below are illustrative. They show the shape of an IaC workflow, not the exact resource schema; the authoritative resource names and arguments are published with the Terraform provider and the Helm chart for the deployment's release.
A Terraform definition of a routing policy and a budget reads roughly as follows:
    # Illustrative only; consult the provider documentation for exact resource names and arguments.
    resource "tare_policy" "default_routing" {
      name = "default-routing"

      # An ordered fallback chain: the gateway walks the list on failure.
      fallback = [
        "anthropic-claude",
        "bedrock-claude",
      ]
    }

    resource "tare_budget" "research_team" {
      name       = "research-team-monthly"
      limit_usd  = 5000
      period     = "monthly"
      applies_to = "team:research"
    }
A Helm values file pins the data plane release and its registry:
    # Illustrative values for the data plane chart.
    image:
      registry: us-central1-docker.pkg.dev/acme-prod/tare
      tag: "1.14.2"
    controller:
      replicas: 2
Both files belong in version control, where a change to either is reviewed as a pull request before it is applied.
Step 3: rely on dynamic configuration for policy and routing changes
A configuration change does not require the data plane to be rebuilt or redeployed. Policies, routing rules, fallback chains, traffic-splitting weights, budgets, and the model catalogue are read by the gateway from the management plane and take effect dynamically.
The mechanism is the same regardless of which surface made the change:
- The change is written through the UI, the admin API, or a Terraform apply.
- The management plane stores the new configuration.
- The Controller in the data plane reconciles the change and pushes the updated configuration to the Agent Router gateway.
- The gateway begins enforcing the new configuration after a brief propagation period, without dropping in-flight requests and without a restart.
The practical consequence is that a routing change (shifting traffic from one backend to another, adding a fallback, or tightening a budget) is a configuration operation, not a deployment. A Helm change that alters the data plane release itself is the exception: changing the gateway image or its replica count is a deployment, because it changes the running pods rather than the configuration they read. The line is the same as the deployment-mode boundary described in Run multiple platform instances: configuration that lives in the management plane is dynamic, and infrastructure that lives in the data plane manifests is deployed.
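The behaviour described in this step, where new configuration takes effect without a restart and without disturbing in-flight requests, can be pictured with a small Python model. This is a conceptual sketch only, not the gateway's implementation: the `GatewayConfig` class, its `push` and `current` methods, and the `route` helper are hypothetical names. The idea it illustrates is snapshot swapping, in which each request keeps the configuration it started with while later requests see the update.

```python
import threading


class GatewayConfig:
    """Hypothetical stand-in for the configuration a gateway reads; not product code."""

    def __init__(self, initial: dict) -> None:
        self._lock = threading.Lock()
        self._version = 1
        self._snapshot = dict(initial)

    def push(self, new_config: dict) -> int:
        """Replace the whole snapshot at once, as a pushed update would."""
        with self._lock:
            self._snapshot = dict(new_config)
            self._version += 1
            return self._version

    def current(self) -> tuple[int, dict]:
        """Hand out the snapshot in force right now."""
        with self._lock:
            return self._version, self._snapshot


def route(config: dict) -> str:
    """Pick the first backend in the fallback chain."""
    return config["fallback"][0]


gateway = GatewayConfig({"fallback": ["anthropic-claude", "bedrock-claude"]})

# An in-flight request captures its snapshot when it starts.
in_flight_version, in_flight_config = gateway.current()

# A routing change arrives while that request is still running.
gateway.push({"fallback": ["bedrock-claude", "anthropic-claude"]})

new_version, new_config = gateway.current()
print(in_flight_version, route(in_flight_config))
print(new_version, route(new_config))
```

The in-flight request still routes to `anthropic-claude` under version 1, while a request that starts after the push sees version 2 and routes to `bedrock-claude`. No process was restarted in between, which is the distinction the step draws between a configuration operation and a deployment.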
Step 4: confirm that every change is versioned and audited
Two independent records cover configuration change, and a complete picture uses both.
- The IaC repository is the version history for everything managed as code. The diff, the author, the timestamp, and the review are all in the version-control system. This record exists only for changes made through the IaC path.
- The audit log records every administrative change to the platform regardless of the surface that made it: a UI edit, an API call, or a Terraform apply all produce an audit entry. Each entry captures who made the change, what was changed, and when. This is the record that answers an auditor's question for changes that did not go through IaC, and it is the cross-check that confirms an IaC apply did what its diff claimed.
Reading and exporting the audit log is covered in Audit platform activity. For deployments that need a single timeline across both records, exporting the audit stream to a central SIEM, alongside the IaC repository history, gives one place to answer "who changed this policy, and why".
The two records are complementary, not redundant. The repository explains the intent behind a change and carries the review; the audit log proves the change reached the running platform and catches anything changed outside the IaC path.
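One way to use the two records together is to look for audit entries that have no corresponding commit, which surfaces changes made outside the IaC path. The Python below is a sketch under stated assumptions: the `AuditEntry` and `Commit` shapes, the resource naming, and the one-hour matching window are all hypothetical, and real exports from the audit log and the version-control system will have their own formats.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AuditEntry:
    """Assumed shape: who made the change, what was changed, and when."""
    actor: str
    resource: str
    at: datetime


@dataclass(frozen=True)
class Commit:
    """Assumed shape of one merged change in the IaC repository."""
    author: str
    resource: str
    merged_at: datetime
    message: str


def changes_outside_iac(audit, commits, window=timedelta(hours=1)):
    """Return audit entries with no commit for the same resource merged shortly before."""
    unmatched = []
    for entry in audit:
        matched = any(
            c.resource == entry.resource
            and timedelta(0) <= entry.at - c.merged_at <= window
            for c in commits
        )
        if not matched:
            unmatched.append(entry)
    return unmatched


t0 = datetime(2024, 1, 10, 9, 0)
commits = [Commit("alice", "budget/research-team-monthly", t0, "Raise research budget")]
audit = [
    AuditEntry("terraform-ci", "budget/research-team-monthly", t0 + timedelta(minutes=5)),
    AuditEntry("bob", "policy/default-routing", t0 + timedelta(hours=3)),
]

for entry in changes_outside_iac(audit, commits):
    print(entry.actor, entry.resource, entry.at.isoformat())
```

In this sample the budget change is explained by a commit merged five minutes earlier, so only the policy edit is reported. An unmatched entry is not necessarily a problem; under the ownership split in Step 5 it may be a legitimate hand-managed adjustment.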
Step 5: divide ownership between the UI and IaC
The UI and IaC write the same configuration, so a value set in one can be overwritten by the other. On its next run, a Terraform apply that defines a budget resets that budget to its declared value and discards any interim UI adjustment. Avoiding this is a matter of deciding, per category of configuration, which surface owns it.
A workable split for most deployments:
- IaC owns the baseline. Providers, the model catalogue structure, the standing set of policies, the default budgets, and the data plane release are defined in code and promoted through review. These are the things that have to match across instances and that benefit most from version control.
- The UI owns day-to-day adjustments that are deliberately not in the baseline: a one-off budget increase for a team running an experiment, an urgent policy tightening during an incident, a key rotation. These are the changes that need to happen in minutes, by whoever is on hand, and that do not need to be reproduced on another instance.
The rule that keeps the two from fighting is to not manage the same value in both places. A budget that Terraform defines should be changed in Terraform, not in the UI; otherwise the next apply silently reverts the UI change. A budget that the team has decided to manage by hand should be left out of the Terraform definition entirely, so that no apply touches it. When a value that started as a hand-managed UI adjustment becomes permanent, it is promoted into the IaC baseline by a deliberate change to the code, and from that point it is owned by IaC.
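The ownership rule can be checked mechanically before an apply. The following Python is a deliberately simplified model, not Terraform's actual state handling: configuration is reduced to a flat dictionary, and the `apply_declared` and `overwritten_by_apply` helpers and the key names are hypothetical. It shows which live values the next apply would reset, and that values left out of the code are untouched.

```python
def apply_declared(live: dict, declared: dict) -> dict:
    """Reset every declared key to its declared value; leave other keys alone."""
    result = dict(live)
    result.update(declared)
    return result


def overwritten_by_apply(live: dict, declared: dict) -> dict:
    """Map each drifted key to (live value, declared value) that an apply would reset."""
    return {
        key: (live[key], declared[key])
        for key in declared
        if key in live and live[key] != declared[key]
    }


# Owned by IaC: defined in code.
declared = {"budget/research-team-monthly": 5000}

live = {
    "budget/research-team-monthly": 8000,  # interim UI edit to an IaC-owned value
    "budget/experiment-oneoff": 1200,      # hand-managed, deliberately not in code
}

print(overwritten_by_apply(live, declared))
print(apply_declared(live, declared))
```

The first line of output flags the research budget, whose UI value of 8000 would be reset to 5000. The second shows the state after the apply: the research budget is back at its declared value and the one-off experiment budget is unchanged because no definition touches it.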
For Enterprise deployments running a GitOps reconciler such as Argo CD or Flux, the reconciler will actively revert drift from the committed state on its own schedule. Anything under the reconciler's management must not be edited in the UI at all, because the reconciler will undo it; the UI's scope on those deployments is limited to the categories deliberately excluded from the GitOps repository.
What to do next
- Provision models and providers: the model catalogue and provider connections are the baseline that an IaC workflow manages first. See Provision models and providers.
- Work with budgets: budgets are a primary candidate for the UI-versus-IaC ownership split; designing them is covered in Working with budgets.
- Audit platform activity: the audit log that records every administrative change across all three surfaces. See Audit platform activity.
- Retrieve data plane resources: the Helm-rendered manifest that the GitOps path commits and reconciles. See Retrieving data plane resources.
- Run multiple platform instances: where reproducibility across instances pays off, and where configuration drift is most costly. See Run multiple platform instances.