Installation guide for Azure

This guide installs the Agent Router data plane on Azure Kubernetes Service (AKS).

Architecture

The Agent Router platform uses a split-plane model. A Management Plane hosted by Tetrate holds configuration, license data, and the web dashboard. A Data Plane runs in a customer-managed Kubernetes cluster and handles all AI traffic. The two planes communicate over a single outbound HTTPS connection initiated by the data plane. No inbound connections from the internet are required.

The data plane needs a stable public hostname so applications can reach it; TLS certificates are issued for hostnames, not IP addresses. DNS and TLS are configured in Step 8 and Appendix B.

The procedure produces:

An Azure Container Registry mirroring the Agent Router images
An AKS cluster running the data plane in the tars-system and tars-dataplane namespaces
An Azure Application Gateway fronting the data plane at a customer-owned DNS name
A registered data plane URL on the management plane

Plan for 30–60 minutes of installation time, plus DNS propagation.

Prepare for the installation: obtain the data plane credential and install the CLI
- Step 1: Obtain the data plane credential
- Step 2: Install the tare CLI
Cluster setup: create or reuse a Kubernetes cluster
- Step 3: Provision the AKS cluster
Registry setup: mirror Agent Router images so the cluster can pull them
- Step 4: Create an Azure Container Registry
- Step 5: Sync Agent Router images to ACR
Data Plane installation: deploy the data plane
- Step 6: Install the Agent Router data plane
Ingress setup: expose the data plane externally
- Step 7: Expose the data plane via AGIC
DNS configuration: wire the hostname to the ingress and register the URL on the management plane
- Step 8: Wire DNS and register the URL
Testing the installation: verify the install works end-to-end
- Step 9: Verify provider routes
- Step 10: Verify the install
- Step 11: Smoke tests
Appendices

Prerequisites

Dashboard and Router app access

Agent Router exposes two web surfaces. Both URLs are provided during onboarding:

Dashboard (admin): https://dashboard.<your-tenant>.tetrate.ai. Used in Step 1 and Step 8.
Router app (end-user): https://router.<your-tenant>.tetrate.ai. Used in Step 11 for creating API keys and MCP profiles.

Required tools

Install the following on the workstation used for this guide:

Tool	Install
`az` CLI	`https://learn.microsoft.com/cli/azure/install-azure-cli`
`kubectl`	`https://kubernetes.io/docs/tasks/tools`
`helm` (3+)	`https://helm.sh/docs/intro/install`
`docker`	`https://docs.docker.com/get-docker`
`curl`	Preinstalled on macOS and most Linux distributions
`tare` CLI	Covered in Step 2

A data-plane-credentials.json file is also required. See Step 1.

Infrastructure

A dedicated workload cluster must be provisioned before starting the installation. The cluster requires at least three nodes. See Cluster sizing for more detail.

warning

Tetrate support does not cover client-side infrastructure provisioning or Kubernetes issues. The instructions for creating clusters and related infrastructure components are provided as a courtesy and should be carefully evaluated before executing them.

Azure permissions

The following role assignments are required on the subscription used for deployment:

Role	Scope	Required for
Contributor	The resource group	AKS, ACR, App Gateway creation
Azure Kubernetes Service Contributor Role	The AKS cluster	Enabling the AGIC addon
AcrPush	The container registry	Pushing synced images
User Access Administrator	The resource group	Attaching ACR to AKS
Network Contributor	The AKS managed RG (`MC_*`)	Letting AGIC manage App Gateway state
Log Analytics Contributor	The linked Log Analytics RG	Required only when Container Insights is enabled

Cluster sizing

The default chart installs multiple always-on components: egress envoy (minimum 2 replicas), AI gateway controller and ext_proc, controller and worker, Redis, and rate-limit services. A single-node footprint is not sufficient.

The egress envoy is the dominant resource consumer. Its CPU and memory usage scale with the configuration size held in memory: the count of AIGatewayRoute and AIServiceBackend resources, header-mutation rules, and per-route features. The AI gateway team's control-plane scaling benchmark shows roughly linear CPU and memory growth as routes are added. Plan capacity for route counts that grow as providers, models, and projects are added. General-purpose VM sizes from the Standard_D*s_v5 family provide a balanced default.

Size	Use case	Recommended node pool	Approximate allocatable target
Small	Dev / test / low traffic	3 × `Standard_D2s_v5`	≥ 6 vCPU, ≥ 20 GiB RAM
Medium	Staging / light production	3 × `Standard_D4s_v5`	≥ 12 vCPU, ≥ 40 GiB RAM
High	Production with burst headroom	3 × `Standard_D8s_v5` (or split into system + data plane pools)	≥ 24 vCPU, ≥ 80 GiB RAM

Maintain a minimum of three nodes to tolerate node upgrades and evictions. Demo installs may start at Small; production installations should start at Medium.
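As a rough cross-check of a node pool against the table above, the raw pool capacity is the node count multiplied by the per-node size. The sketch below assumes the published size of Standard_D4s_v5 (4 vCPU, 16 GiB per node); pool_capacity is a hypothetical helper, not part of any CLI. Allocatable capacity is lower than the raw total because AKS reserves resources on each node.

```shell
# Raw (not allocatable) capacity of a uniform node pool.
# Usage: pool_capacity <node-count> <vcpu-per-node> <gib-per-node>
pool_capacity() {
  nodes=$1 vcpu=$2 gib=$3
  echo "$((nodes * vcpu)) vCPU, $((nodes * gib)) GiB"
}

# 3 x Standard_D4s_v5 (assumed 4 vCPU and 16 GiB per node)
pool_capacity 3 4 16
```

For the Medium tier this prints 12 vCPU, 48 GiB, which leaves headroom above the 40 GiB allocatable target.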

note

AKS Automatic is not supported.

Conventions

All commands assume the environment variables defined in Step 3 are exported in the current shell. Re-export them when opening a new terminal.

Replace every <placeholder> value with a site-specific one.
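Because later commands silently misbehave when a variable is empty, a small guard can be run at the top of each new terminal session. This is a minimal sketch; require_vars is a hypothetical helper name, and the variable names are the ones defined in Step 3.1 and Step 4.

```shell
# Fail early when a required variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"          # look up the variable by name
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   require_vars RESOURCE_GROUP LOCATION AKS_CLUSTER_NAME ACR_NAME || exit 1
```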

Step 1: obtain the data plane credential

In the dashboard, navigate to System → Settings → Data plane and click Generate Data plane credential.

Save the downloaded file as data-plane-credentials.json on the workstation used for installation. This file is the long-lived identity the data plane uses to authenticate to the management plane.

note

Some parts of the product still use the older "service account" naming for this file. The dashboard is standardizing on "data plane credential"; the file is the same.

Each data plane uses its own credential. Credentials can be revoked from the dashboard, and additional credentials can be generated (for example, one per environment) at any time.

Step 2: install the `tare` CLI

Run the installer script:

curl -sSL https://tare.tetrate.ai/tools/install.sh | bash

Output:

==> tare installer
==> channel: stable

==> Detected platform: darwin-arm64

==> Installing tare for darwin-arm64...
==> Downloading from: https://tare.tetrate.ai/tools/tags/v0.1.0-beta.4/tare-darwin-arm64.tar.gz
ok Installed tare to /Users/johndoe/.tare/bin/tare

==> tare version: tare version v0.1.0-beta.4

ok Installation directory is already in your PATH

==> Get started:
    tare install identity.json --serve-url https://proxy.acme.com
    tare install --help

The installer prints the install path (typically ~/.tare/bin/tare). Add it to PATH and verify the version:

export PATH="$PATH:$HOME/.tare/bin"
echo 'export PATH="$PATH:$HOME/.tare/bin"' >> ~/.zshrc   # or ~/.bashrc

$ tare --version
tare version v0.1.0-beta.4
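Running the export line repeatedly appends the same directory to PATH each time. One way to avoid that is an idempotent helper such as the sketch below; add_to_path is a hypothetical function name, and the directory is the default install path printed by the installer.

```shell
# Append a directory to PATH only when it is not already present.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
  export PATH
}

add_to_path "$HOME/.tare/bin"
```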

Step 3: provision the AKS cluster

Step 7 uses the AGIC addon (Application Gateway Ingress Controller). The az aks create flags below configure the cluster networking for AGIC compatibility.

tip

Reusing an existing AKS cluster. Check AGIC compatibility before continuing:

az aks show -n <cluster-name> -g <resource-group> \
  --query 'networkProfile.{plugin: networkPlugin, mode: networkPluginMode, dataplane: networkDataplane}'

If the output is {plugin: "azure", mode: null, dataplane: "azure"}, AGIC is supported. Skip to Step 4.
If mode is overlay or dataplane is cilium, AGIC is not supported. Use Appendix A: AGC for ingress.
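The two rules above can be expressed as a small decision helper for use in scripts. This is a sketch only: agic_compat is a hypothetical function, and it reports unknown for any combination this guide does not describe.

```shell
# Classify a cluster network profile for AGIC using the two rules above.
# Usage: agic_compat <plugin> <mode> <dataplane>
agic_compat() {
  plugin=$1 mode=$2 dataplane=$3
  case "$mode" in null|None) mode="" ;; esac   # treat null spellings as unset
  if [ "$mode" = "overlay" ] || [ "$dataplane" = "cilium" ]; then
    echo unsupported
  elif [ "$plugin" = "azure" ] && [ -z "$mode" ] && [ "$dataplane" = "azure" ]; then
    echo supported
  else
    echo unknown       # combination not covered by this guide
  fi
}

# Example (one query per field):
#   agic_compat \
#     "$(az aks show -n <cluster-name> -g <resource-group> --query networkProfile.networkPlugin -o tsv)" \
#     "$(az aks show -n <cluster-name> -g <resource-group> --query networkProfile.networkPluginMode -o tsv)" \
#     "$(az aks show -n <cluster-name> -g <resource-group> --query networkProfile.networkDataplane -o tsv)"
```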

Step 3.1: set environment variables

RESOURCE_GROUP=<resource-group>
LOCATION=<region>
AKS_CLUSTER_NAME=<cluster-name>

# Use the regional default Kubernetes version to avoid an unexpected end-of-support situation.
K8S_VERSION=$(az aks get-versions --location "${LOCATION}" --query "values[?isDefault].version | [0]" -o tsv)

# Optional resource tags; adapt or remove as needed.
TAGS=(
  owner="<your-name>"
  team="<your-team>"
  purpose=development
)

Step 3.2: sign in to Azure

az login
az account set --subscription <subscription-id>

Step 3.3: create the resource group

az group create \
  --name "${RESOURCE_GROUP}" \
  --location "${LOCATION}" \
  --tags "${TAGS[@]}"

Step 3.4: create the AKS cluster

az aks create \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${AKS_CLUSTER_NAME}" \
  --location "${LOCATION}" \
  --kubernetes-version "${K8S_VERSION}" \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-dataplane azure \
  --enable-managed-identity \
  --generate-ssh-keys \
  --tags "${TAGS[@]}"

Provisioning takes approximately five minutes. AKS includes a default CSI driver, so no additional configuration is required for the persistent volumes used by the data plane's Redis state.

The --node-count and --node-vm-size values above correspond to the Medium tier in Cluster sizing. Adjust as needed.

Step 3.5: fetch the kubeconfig

az aks get-credentials \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${AKS_CLUSTER_NAME}" \
  --file ~/kubeconfig-${AKS_CLUSTER_NAME}

export KUBECONFIG=~/kubeconfig-${AKS_CLUSTER_NAME}
kubectl get nodes

Expected output:

NAME                                STATUS   ROLES    AGE   VERSION
aks-nodepool1-xxxxxxxx-vmss000000   Ready    <none>   2m    v1.34.x
aks-nodepool1-xxxxxxxx-vmss000001   Ready    <none>   2m    v1.34.x
aks-nodepool1-xxxxxxxx-vmss000002   Ready    <none>   2m    v1.34.x

Step 4: create an Azure Container Registry

Create the registry that the next step syncs Agent Router images into:

ACR_NAME=<globally-unique-ACR-name>   # 5–50 lowercase alphanumeric chars

az acr create \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${ACR_NAME}" \
  --sku Standard \
  --tags "${TAGS[@]}"
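The naming rule in the comment above (5–50 lowercase alphanumeric characters) can be checked locally before calling az acr create. The helper below is a hypothetical sketch that checks only the format; global uniqueness still has to be confirmed against Azure, for example with az acr check-name.

```shell
# Succeeds when the name is 5-50 lowercase letters or digits.
valid_acr_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{5,50}$'
}

# Example:
#   valid_acr_name "${ACR_NAME}" || echo "invalid ACR name: ${ACR_NAME}" >&2
#   az acr check-name --name "${ACR_NAME}"    # availability check against Azure
```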

Step 5: sync Agent Router images to ACR

Step 5.1: authenticate Docker to the ACR

az acr login --name "${ACR_NAME}"

This refreshes local Docker credentials for ${ACR_NAME}.azurecr.io for roughly three hours. Re-run this command if image sync fails with "unauthorized: authentication required".

Step 5.2: sync images

Copy the container images from Tetrate's registry into the registry created above. The tare CLI authenticates to the source registry automatically. Only the destination ACR requires a local login.

tare install /path/to/data-plane-credentials.json \
  --image-sync ${ACR_NAME}.azurecr.io/tare \
  --sync-only

The sync produces no progress output and takes several minutes. After it completes, verify the images:

az acr repository list --name "${ACR_NAME}" -o tsv

The output should list ten repositories under the tare/ prefix, including ai-gateway-controller, envoy-tars, gateway, liaison, ratelimit, redis, tare-doctor, and valet.
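Because the sync is silent, a scripted count of the mirrored repositories can serve as a quick sanity check. The sketch below assumes the tsv output lists one repository per line; count_tare_repos is a hypothetical helper.

```shell
# Count repository names on stdin that sit under the tare/ prefix.
count_tare_repos() {
  grep -c '^tare/' || true     # grep exits 1 on zero matches; still print 0
}

# Example:
#   az acr repository list --name "${ACR_NAME}" -o tsv | count_tare_repos
```

A result lower than ten suggests the sync did not finish; re-run Step 5.2.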

Step 5.3: grant AKS pull access to ACR

Attach the ACR to AKS so the cluster's managed identity can pull images:

az aks update \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${AKS_CLUSTER_NAME}" \
  --attach-acr "${ACR_NAME}"

No image-pull secret is required; AKS handles authentication via its managed identity.

tip

If the account lacks User Access Administrator on the resource group, the command above fails with "Could not create a role assignment for ACR". Fall back to the admin-user flow:

az acr update --name "${ACR_NAME}" --admin-enabled true
ACR_USERNAME=$(az acr credential show --name "${ACR_NAME}" --query "username" -o tsv)
ACR_PASSWORD=$(az acr credential show --name "${ACR_NAME}" --query "passwords[0].value" -o tsv)

Pipe these credentials to tare install in Step 6 using --image-pull-secret-stdin.

Step 6: install the Agent Router data plane

tare install /path/to/data-plane-credentials.json \
  --image-sync ${ACR_NAME}.azurecr.io/tare

The tare install command performs the following actions:

Creates the tars-system and tars-dataplane namespaces.
Installs the Agent Router data plane via Helm.

tip

If the admin-user fallback from Step 5.3 was used, pipe the credentials so tare install creates the image-pull secret:

echo "${ACR_USERNAME}:${ACR_PASSWORD}" | \
  tare install /path/to/data-plane-credentials.json \
    --image-sync ${ACR_NAME}.azurecr.io/tare \
    --image-pull-secret-stdin

Step 7: expose the data plane via AGIC

The data plane terminates external traffic on a single in-cluster service: egress in tars-dataplane on port 10080. It serves both LLM API requests (/v1/*) and MCP traffic (/mcp/*, /.well-known/*); no separate routes are needed.

Step 7.1: enable AGIC on the cluster

This provisions an Azure Application Gateway (Standard_v2) and wires it to AKS. Provisioning takes approximately five minutes.

az aks enable-addons \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${AKS_CLUSTER_NAME}" \
  --addons ingress-appgw \
  --appgw-name "${AKS_CLUSTER_NAME}-appgw" \
  --appgw-subnet-cidr 10.225.0.0/24

note

The --appgw-subnet-cidr must not overlap any existing subnet in the VNet (the default AKS subnet is 10.224.0.0/16). Keep the size at /24; this works for any cluster networking and is required for clusters that ever used Overlay.
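When reusing a VNet with custom address ranges, the non-overlap requirement can be checked before enabling the addon. The sketch below is a plain IPv4 calculation written for bash or sh (it relies on word splitting, so it does not run unmodified in zsh); ip_to_int and cidr_overlap are hypothetical helpers.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1                      # split the address on dots
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds (exit 0) when two CIDR blocks overlap.
cidr_overlap() {
  a_ip=$(ip_to_int "${1%/*}"); a_len=${1#*/}
  b_ip=$(ip_to_int "${2%/*}"); b_len=${2#*/}
  # Two blocks overlap exactly when they agree on the shorter prefix.
  len=$a_len
  if [ "$b_len" -lt "$len" ]; then len=$b_len; fi
  shift_bits=$((32 - len))
  [ $((a_ip >> shift_bits)) -eq $((b_ip >> shift_bits)) ]
}

# Example: the proposed App Gateway subnet against the default AKS subnet
if cidr_overlap 10.225.0.0/24 10.224.0.0/16; then
  echo "overlap: choose a different --appgw-subnet-cidr"
else
  echo "no overlap"
fi
```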

Verify the controller is running:

kubectl get pods -n kube-system -l app=ingress-appgw

A single ingress-appgw-deployment-* pod should be Running. A small number of restarts during the first few minutes is normal while the controller reconciles against the in-progress ARM provisioning.

Step 7.2: create the ingress

cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tars-ingress
  namespace: tars-dataplane
  annotations:
    # AGIC's default health probe is GET / on the backend, but the egress envoy
    # only serves /v1/* and /mcp/* paths and returns 404 for /. Without these
    # two annotations, AGIC marks the backend unhealthy and every request
    # returns 502 Bad Gateway.
    appgw.ingress.kubernetes.io/health-probe-path: /
    appgw.ingress.kubernetes.io/health-probe-status-codes: "200-499"
spec:
  ingressClassName: azure-application-gateway
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: egress
            port:
              number: 10080
EOF

Step 7.3: retrieve the public IP

kubectl get ingress tars-ingress -n tars-dataplane

Expected output:

NAME           CLASS                       HOSTS   ADDRESS         PORTS   AGE
tars-ingress   azure-application-gateway   *       40.x.x.x        80      30s

Record the ADDRESS value; it is used in Step 8.
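For scripted installs, the address can be read with a jsonpath query instead of being copied by hand. The sketch below polls until the Ingress reports an IP; wait_for_ingress_ip and the APPGW_IP variable are hypothetical names, and the retry count and delay are arbitrary defaults.

```shell
# Poll until the Ingress reports an address.
# Usage: wait_for_ingress_ip [tries] [delay-seconds]
wait_for_ingress_ip() {
  tries=${1:-30} delay=${2:-10}
  while [ "$tries" -gt 0 ]; do
    ip=$(kubectl get ingress tars-ingress -n tars-dataplane \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null) || true
    if [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "ingress has no address yet" >&2
  return 1
}

# Example:
#   APPGW_IP=$(wait_for_ingress_ip) && echo "App Gateway IP: ${APPGW_IP}"
```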

note

The Ingress resource persists across tare install re-runs and does not need to be re-applied.

warning

TLS is required for production.

The Ingress above listens on HTTP/80 only, which is acceptable for local testing but not for any customer-facing deployment. Configure TLS on the Ingress before going to production. Appendix B describes two example mechanisms: bring-your-own certificate and cert-manager.

Step 8: wire DNS and register the URL

Step 8.1: add the DNS A record

Create an A record pointing the data plane hostname to the App Gateway address from Step 7.3:

<your-data-plane-hostname>.   A   <appgw-ip-from-step-7-3>

The App Gateway listens on port 80 by default; no port suffix is required.

DNS propagation typically takes one to two minutes. Verify with:

dig +short <your-data-plane-hostname>
# Should return the App Gateway IP
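In a script, the same check can gate the next step until the record is live. This is a minimal sketch; dns_points_to is a hypothetical helper, and APPGW_IP stands for the address recorded in Step 7.3.

```shell
# Succeeds once the hostname resolves to the expected IP.
# Usage: dns_points_to <hostname> <expected-ip>
dns_points_to() {
  host=$1 expected=$2
  dig +short "$host" | grep -Fxq "$expected"   # exact whole-line match
}

# Example:
#   until dns_points_to "<your-data-plane-hostname>" "${APPGW_IP}"; do
#     sleep 10
#   done
```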

Step 8.2: register the URL on the management plane

In the dashboard, navigate to System → Settings → Profile → Proxy URL and set it to https://<your-data-plane-hostname> (or http://... if TLS was skipped for local testing).

Changes propagate to the data plane within approximately 30 seconds.

tip

For demo installations without a domain, <appgw-ip>.nip.io resolves automatically. Use it as a placeholder, then switch to a real hostname before going to production.

Step 9: verify provider routes

Providers (OpenAI, Anthropic, and others) and their upstream API keys are configured during Agent Router onboarding, not during this install. The data plane retrieves that configuration automatically once it is connected to the management plane.

Verify the data plane received the provider routes:

kubectl get aigatewayroutes -A
kubectl get aiservicebackends -A

Both should show Accepted resources within a minute of the data plane coming up. If the lists are empty, consult the Agent Router onboarding guide to confirm providers are configured.

Step 10: verify the install

Run tare doctor:

tare doctor /path/to/data-plane-credentials.json --verbose

Pass criteria: all in-cluster checks report Status: OK (or Healthy) with 0 errors, 0 warnings, and the final line confirms the health-report bundle was accepted (Sending health report ... OK (bundle <id>)).

Expected output (abridged):

CHECKS PERFORMED:
- Namespace existence (system, dataplane)
- CRD presence (Gateway API, AI Gateway, RouteDeployment)
- Controller deployments ready (TARS, AI Gateway, Envoy Gateway)
- Proxy deployment ready
- GatewayClass and Gateway accepted/programmed
- EnvoyPatchPolicy acceptance (per instance)
- EnvoyProxy acceptance (per instance)
- Egress EnvoyProxy image uses envoy-tars
- Identity Secret and ConfigMap present
- AIServiceBackend acceptance
- Envoy Gateway Backend acceptance
- BackendTrafficPolicy acceptance
- BackendSecurityPolicy acceptance
- BackendTLSPolicy acceptance
- ClientTrafficPolicy acceptance
- HTTPRouteFilter presence/acceptance
- ReferenceGrant presence
- RouteDeployment status conditions
- AIGatewayRoute acceptance/resolution
- HTTPRoute parent acceptance/resolution
- MCPRoute parent acceptance/resolution
- Proxy admin and forward endpoints
- Pod CrashLoopBackOff (excluding tars-config-monitor)

Sending health report to https://api.<your-tenant>.tetrate.ai/v1/dataplane-status... OK (bundle <bundle-id>)

Send a request with an invalid token to confirm auth is enforced:

curl -sS -o /dev/null -w "HTTP %{http_code}\n" \
  "http://<your-data-plane-hostname>/v1/chat/completions" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer NotREAL" \
  -d '{"model":"gpt-5-mini","messages":[{"role":"user","content":"hi"}]}'
# Expected: HTTP 401

Auth is enforced automatically on every install, so an HTTP 200 response here is unexpected; if it occurs, contact support. If tare doctor reports Status: Broken, see Troubleshooting.
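The invalid-token check can also be wrapped as a pass/fail test for CI or post-install scripts. The sketch below reuses the request shown above; auth_is_enforced is a hypothetical helper name.

```shell
# Succeeds only when an invalid bearer token is rejected with HTTP 401.
# Usage: auth_is_enforced <data-plane-hostname>
auth_is_enforced() {
  code=$(curl -sS -o /dev/null -w '%{http_code}' \
    "http://$1/v1/chat/completions" \
    -X POST -H "Content-Type: application/json" \
    -H "Authorization: Bearer NotREAL" \
    -d '{"model":"gpt-5-mini","messages":[{"role":"user","content":"hi"}]}') || true
  [ "$code" = "401" ]
}

# Example:
#   auth_is_enforced "<your-data-plane-hostname>" || echo "unexpected response" >&2
```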

Step 11: smoke tests

In the router app (https://router.<your-tenant>.tetrate.ai), select API Keys in the sidebar and create a key. Applications (and the tests below) use this key as Authorization: Bearer ....

Set the host and key once:

export DP_HOST=<your-data-plane-hostname>
export TARS_API_KEY="<your-api-key-from-router-app>"

Chat Completions (OpenAI shape)

curl -s "http://${DP_HOST}/v1/chat/completions" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{
    "model": "gpt-5-mini",
    "messages": [{"role": "user", "content": "hello, what are you?"}]
  }'

Anthropic Messages (native shape)

curl -s "http://${DP_HOST}/v1/messages" \
  -X POST -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{
    "model": "claude-haiku-4-5",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "hello"}]
  }'

List available models

curl -s "http://${DP_HOST}/v1/models" \
  -H "Authorization: Bearer ${TARS_API_KEY}" | jq '.data[].id' | sort -u

Streaming

curl -sN "http://${DP_HOST}/v1/chat/completions" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{
    "model": "gpt-5-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "count from 1 to 5"}]
  }'

MCP

Create an MCP profile in the router app: MCP Profiles (sidebar) → Create profile. The profile ID is used in the URL below.

curl -s "http://${DP_HOST}/mcp/<profile-id>" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'

To register the MCP profile with Claude Code:

claude mcp add --transport http <profile-name> \
  http://${DP_HOST}/mcp/<profile-id> \
  --header "Authorization: Bearer ${TARS_API_KEY}"

Upgrading

To upgrade the data plane to a newer Agent Router release:

Install the new tare CLI version (re-run Step 2).
Re-run image sync (Step 5.2) and tare install (Step 6).

Cluster state (namespaces, ingress, DNS, and dashboard configuration) persists across upgrades.
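Once the new CLI is installed, the remaining upgrade steps can be chained so a failure stops the sequence. The sketch below uses only commands from Step 5.1, Step 5.2, and Step 6; upgrade_data_plane is a hypothetical wrapper, and it assumes the managed-identity pull path rather than the admin-user fallback.

```shell
# Re-sync images and re-install, stopping at the first failure.
# Usage: upgrade_data_plane /path/to/data-plane-credentials.json
upgrade_data_plane() {
  cred=$1
  registry="${ACR_NAME}.azurecr.io/tare"
  az acr login --name "${ACR_NAME}" || return 1                            # Step 5.1
  tare install "$cred" --image-sync "$registry" --sync-only || return 1   # Step 5.2
  tare install "$cred" --image-sync "$registry"                           # Step 6
}
```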

Cleanup

To remove the deployment, delete resources in dependency order:

# 1. Ingress + AGIC addon (also releases the App Gateway)
kubectl delete ingress tars-ingress -n tars-dataplane --ignore-not-found
az aks disable-addons \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${AKS_CLUSTER_NAME}" \
  --addons ingress-appgw 2>/dev/null || true

# 2. Helm release
helm uninstall tars -n tars-system 2>/dev/null || true

# 3. ACR
az acr delete --resource-group "${RESOURCE_GROUP}" --name "${ACR_NAME}" --yes

# 4. AKS cluster
az aks delete \
  --resource-group "${RESOURCE_GROUP}" \
  --name "${AKS_CLUSTER_NAME}" \
  --yes --no-wait

# 5. Resource group (catches anything left behind)
az group delete --name "${RESOURCE_GROUP}" --yes --no-wait

# 6. Kubeconfig
rm -f ~/kubeconfig-${AKS_CLUSTER_NAME}

note

If helm uninstall tars hangs on finalizers (for example, GatewayClass//tars-egress still exists), see the finalizer cleanup procedure in Troubleshooting.

Troubleshooting

Issues are grouped by the step where they are most likely to occur.

Image synchronization issues

Symptom	Cause	Fix
`401 Unauthorized` on `HEAD https://registry.tetrate.ai/v2/...`	The credential is not authorized to pull from `registry.tetrate.ai`.	Regenerate the credential from Dashboard → System → Settings → Data plane → Credentials and retry.
`unauthorized: authentication required` from the destination registry	The `az acr login` token has expired (~3h default).	Re-run `az acr login --name "${ACR_NAME}"` and retry the sync.

ACR pull access

Symptom	Cause	Fix
Pods in `ImagePullBackOff` with `403 Forbidden` from ACR	The image-pull secret was not created or is missing from the pod's namespace.	Confirm in both namespaces: `kubectl get secret tars-image-pull-secret -n tars-dataplane` and `kubectl get secret tars-image-pull-secret -n tars-system`. If missing, re-run `tare install` with the `--image-pull-secret-stdin` flow from the Step 6 tip.

AGIC

Symptom	Cause	Fix
`az aks enable-addons` fails with `AuthorizationFailed: ... managedClusters/write`	Caller lacks AKS write permission.	Grant `Azure Kubernetes Service Contributor Role` on the AKS resource.
`LinkedAuthorizationFailed: ... Microsoft.OperationalInsights/workspaces/sharedkeys/read`	Container Insights is enabled and AGIC requires read access on the linked Log Analytics workspace.	Grant `Log Analytics Contributor` on the linked workspace RG, or disable Container Insights: `az aks disable-addons -n <cluster> -g <rg> --addons monitoring`.
Ingress has no `ADDRESS` after five minutes; AGIC log reports `App Gateway in stopped state`	AGIC reconciled too early.	Restart the AGIC controller: `kubectl delete pod -n kube-system -l app=ingress-appgw`. The replacement pod re-reads state and programs the gateway.
AGIC log loops on `Waiting for overlay extension config to be ready`	Cluster uses Cilium dataplane or Azure CNI Overlay; AGIC does not support either.	Switch to Appendix A: AGC, or recreate the cluster with traditional Azure CNI.
Ingress has an `ADDRESS` but `curl` returns `502 Bad Gateway`	AGIC's default health probe is `GET /` and the egress envoy returns 404 there.	The Ingress YAML in Step 7.2 sets the `health-probe-path` and `health-probe-status-codes: "200-499"` annotations. Add them if missing; AGIC reconciles within ~30 seconds.

Testing

Symptom	Cause	Fix
HTTP 404 with body `No matching route found. It is likely because the model specified in your request is not configured in the Gateway.`	The requested model name is not configured, or no providers are configured.	Verify `kubectl get aigatewayroutes -A` shows `Accepted` rows. If empty, consult the Agent Router onboarding guide. Changes propagate to the data plane within ~30 seconds.
HTTP 404 with empty body	No `AIGatewayRoute` resources exist in the cluster.	Check the data plane is connected to the management plane: `kubectl logs -n tars-system deployment/controller-worker --tail=50`. The log entry `No secret found for provider` indicates the provider key did not reach the data plane; contact the Agent Router onboarding team.
HTTP 401 with a valid bearer	The API key was issued against a different management plane than this data plane is registered to.	Issue a new key from the router app for this tenant (`https://router.<your-tenant>.tetrate.ai` → API Keys → Create).
HTTP 502 from the dashboard playground (but not from direct `curl`)	The URL registered on the management plane does not match what the App Gateway serves. Most common cause: registered `https://<host>` but the App Gateway only listens on HTTP/80.	Either enable TLS on the App Gateway (see Appendix B) and keep the `https://` URL, or set the registered URL in Dashboard → System → Settings → Data planes to `http://<host>` to match.

Cleanup: Helm uninstall hangs

Symptom	Cause	Fix
`helm uninstall tars` times out with `resource GatewayClass//tars-egress still exists. status: Terminating`	Custom resource finalizers block namespace deletion when controllers exit before the finalizer drains.	Force-clear finalizers in two passes. (1) Clear gateway-related CRs: `kubectl patch gatewayclass tars-egress --type=merge -p '{"metadata":{"finalizers":[]}}'` and repeat for `aiservicebackends`, `backendsecuritypolicies`, `mcproutes`, `tarsroutedeployments`. (2) Once namespaces start terminating, do the same for `aigatewayroutes`. Then `kubectl delete ns tars-system tars-dataplane`.
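The "repeat for" part of the fix above can be scripted as a loop over the namespaced resource kinds. This is a sketch under the assumption that every listed kind is namespaced; clear_finalizers is a hypothetical helper. The cluster-scoped GatewayClass is still patched with the single command shown in the table.

```shell
# Clear finalizers on every object of one namespaced resource kind.
# Usage: clear_finalizers <kind>
clear_finalizers() {
  kind=$1
  kubectl get "$kind" -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{"\n"}{end}' |
  while read -r ns name; do
    kubectl patch "$kind" "$name" -n "$ns" --type=merge \
      -p '{"metadata":{"finalizers":[]}}' </dev/null
  done
}

# Pass 1:
#   for kind in aiservicebackends backendsecuritypolicies mcproutes tarsroutedeployments; do
#     clear_finalizers "$kind"
#   done
# Pass 2, once the namespaces start terminating:
#   clear_finalizers aigatewayroutes
```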

Appendix A: alternative ingress (AGC)

Use AGC when the AKS cluster runs the Cilium dataplane or Azure CNI Overlay, neither of which is supported by AGIC, and recreating the cluster is impractical. AGC works on any CNI.

note

AGC is easiest to plan at cluster creation time: the AKS cluster needs --enable-oidc-issuer and --enable-workload-identity. Existing clusters without these flags can be updated using az aks update; no rebuild is required.

High-level steps:

Register the resource provider: az provider register --namespace Microsoft.ServiceNetworking
Create a user-assigned managed identity for the ALB controller.
Grant the identity AppGw for Containers Configuration Manager on the cluster's node resource group and Network Contributor on the cluster's VNet.
Federate the identity with the AKS OIDC issuer.
Install the ALB controller via Helm (oci://mcr.microsoft.com/application-lb/charts/alb-controller).
Create a delegated subnet for AGC and an ApplicationLoadBalancer CR.
Create a Gateway (Gateway API) and HTTPRoute instead of an Ingress.

For the full walkthrough, see the Microsoft documentation.

Replace Step 7 with the AGC setup. Step 8 and all subsequent steps are unchanged; only the ingress provisioning differs.
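The provider registration and the two cluster flags named in this appendix can be applied to an existing cluster as sketched below. prepare_cluster_for_agc is a hypothetical wrapper; the identity, federation, ALB controller, and subnet steps are deliberately omitted and should follow the Microsoft documentation.

```shell
# Register the AGC resource provider and enable the required cluster features.
prepare_cluster_for_agc() {
  az provider register --namespace Microsoft.ServiceNetworking || return 1
  az aks update \
    --resource-group "${RESOURCE_GROUP}" \
    --name "${AKS_CLUSTER_NAME}" \
    --enable-oidc-issuer \
    --enable-workload-identity
}
```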

Appendix B: enable TLS

The main flow uses HTTP-only on port 80 so the install can complete without a certificate. Production deployments require TLS on the App Gateway. AGIC supports any certificate delivery mechanism that produces a kubernetes.io/tls Secret in the cluster. Two example flows are described below. Existing TLS provisioning workflows (corporate CA, Azure Key Vault, internal PKI) can be used by delivering the resulting certificate and key as a tls Secret named in the Ingress.

B.1: bring your own certificate

Create the secret from an existing fullchain and key:

kubectl create secret tls tars-ingress-tls \
  --cert=path/to/fullchain.pem \
  --key=path/to/privkey.pem \
  -n tars-dataplane

Update the Ingress to use the secret, with host scoping and HTTP-to-HTTPS redirect:

metadata:
  annotations:
    appgw.ingress.kubernetes.io/health-probe-path: /
    appgw.ingress.kubernetes.io/health-probe-status-codes: "200-499"
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: azure-application-gateway
  tls:
  - hosts: [proxy.example.com]
    secretName: tars-ingress-tls
  rules:
  - host: proxy.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: egress
            port:
              number: 10080

B.2: cert-manager and Let's Encrypt

cert-manager auto-issues and auto-renews certificates from Let's Encrypt. The HTTP-01 challenge runs through the same AGIC ingress configured in Step 7, so no additional infrastructure is required. This option suits sites without an existing certificate workflow.

Prerequisites:

The DNS A record from Step 8.1 must be live (dig +short <your-host> returns the App Gateway IP). For the HTTP-01 challenge, Let's Encrypt resolves the hostname through public DNS and fetches the challenge token over HTTP.
The App Gateway must listen on HTTP/80 (default from Step 7).
An email address for Let's Encrypt expiry notices.

Step B.2.1: install cert-manager

One-time, cluster-wide:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --version v1.18.2 \
  --set crds.enabled=true

Verify the install:

kubectl get pods -n cert-manager
# Expect 3 pods Running: cert-manager-*, cert-manager-cainjector-*, cert-manager-webhook-*

Step B.2.2: create the ClusterIssuer

cat <<'EOF' | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: <your-email-address>      # Replace with a real address
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
    - http01:
        ingress:
          ingressClassName: azure-application-gateway
EOF

Confirm the issuer reached the Ready state:

kubectl get clusterissuer letsencrypt-prod
# NAME              READY   AGE
# letsencrypt-prod  True    20s

tip

For testing, point server: at https://acme-staging-v02.api.letsencrypt.org/directory. The staging issuer has higher rate limits and a separate root, preserving the production quota. Switch to production once the certificate issues cleanly on staging.

Step B.2.3: update the Ingress with TLS and cert-manager annotations

Replace the Ingress from Step 7.2 with the version below. It adds a tls: block referencing a Secret that cert-manager will create, the cert-manager.io/cluster-issuer annotation that triggers issuance, the ssl-redirect annotation, and a host: on the rule.

cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tars-ingress
  namespace: tars-dataplane
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    appgw.ingress.kubernetes.io/ssl-redirect: "true"
    appgw.ingress.kubernetes.io/health-probe-path: /
    appgw.ingress.kubernetes.io/health-probe-status-codes: "200-499"
spec:
  ingressClassName: azure-application-gateway
  tls:
  - hosts: [proxy.example.com]       # Replace with the data plane hostname
    secretName: tars-ingress-tls
  rules:
  - host: proxy.example.com          # Must match tls.hosts
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: egress
            port:
              number: 10080
EOF

important

The host: on the rule must match the hostname in tls.hosts and the hostname registered for the data plane on the management plane. A mismatch causes either AGIC to reject the rule or Let's Encrypt to fail the HTTP-01 challenge.

Step B.2.4: wait for cert-manager to issue the certificate

kubectl get certificate -n tars-dataplane -w
# Initial:   READY=False (cert-manager solves the HTTP-01 challenge)
# After ~1m: READY=True

If the certificate remains READY=False for more than a couple of minutes, inspect the order and challenge status:

kubectl describe certificate tars-ingress-tls -n tars-dataplane
kubectl get challenge -n tars-dataplane
kubectl describe challenge -n tars-dataplane     # Reports the precise solver error

Common challenge failures and fixes:

Failure message	Fix
`Self-check failed: ... acme: server returned a non-2xx HTTP status` (404)	AGIC has not yet programmed the solver path. Wait ~30 seconds; cert-manager creates a solver Ingress and AGIC reconciles.
`dns: NXDOMAIN` or `no IP for hostname`	DNS A record has not propagated. Confirm with `dig +short <host>`.
`urn:ietf:params:acme:error:rateLimited`	Let's Encrypt quota exceeded. Switch to the staging issuer (see Step B.2.2 tip) and retry.

Step B.2.5: Verify HTTPS end-to-end

curl -v https://<your-data-plane-hostname>/v1/models \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  2>&1 | grep -E '^([<>] (HTTP|x-amz|Authorization)|\* +(expire date|issuer))' | head -10

Expected: a clean TLS handshake (no certificate errors) and HTTP/2 200. Verify the certificate chain:

echo | openssl s_client -servername <host> -connect <host>:443 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
# subject= CN = proxy.example.com
# issuer=  C = US, O = Let's Encrypt, CN = R10
# notAfter=... (~90 days from issue)

Step B.2.6: Update the registered URL

If an http://... URL was registered in Step 8.2, update it to https://.... The data plane URL on the management plane must match the protocol the App Gateway serves.

Renewal: cert-manager renews automatically at two-thirds of the certificate's lifetime (approximately 60 days for Let's Encrypt's 90-day certificates). No manual action is required.
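
The two-thirds rule is plain arithmetic on the certificate's validity window. The sketch below reproduces only that arithmetic; renewal_epoch is a hypothetical helper, and cert-manager's actual schedule can differ if renewBefore is set on the Certificate.

```shell
# Print the epoch second at which two-thirds of a certificate's lifetime has elapsed.
# Arguments: <notBefore-epoch> <notAfter-epoch>
renewal_epoch() {
  echo $(( $1 + ( ($2 - $1) * 2 / 3 ) ))
}

# A 90-day certificate issued at epoch 0 is due for renewal 60 days later.
renewal_epoch 0 $((90 * 86400))   # prints 5184000, which is 60 * 86400
```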

Appendix C: Forward observability data to an OpenTelemetry Collector

The data plane can stream Envoy HTTP access logs and router_* application metrics to a customer-managed OpenTelemetry Collector, which forwards them to any backend (for example Azure Monitor, Datadog, Grafana Cloud, or SigNoz).

The two streams are configured in separate fields on the EnvoyProxy resource:

Stream	EnvoyProxy field	Contents
Access logs (per-request HTTP metadata)	`accessLog.sinks[]`	Method, status, path, latency, MCP headers, downstream/upstream addresses
Metrics (`router_*` and Envoy native stats)	`metrics.sinks[]`	`router_requests_total`, `router_model_requests_total`, plus Envoy cluster/listener counters

Either stream can be configured independently; the instructions below cover both in order.

Available metrics

Name	Type	Labels
`router_requests_total`	Counter	method, endpoint, status_code
`router_request_duration_ms`	Histogram	method, endpoint, status_code
`router_errors_total`	Counter	type, endpoint, status, model, provider
`router_streaming_requests_total`	Counter	model, provider, endpoint
`router_model_requests_total`	Counter	model, provider, endpoint, byok
`router_auth_attempts_total`	Counter	result, auth_mode
`router_balance_checks_total`	Counter	result
`router_overrun_protections_total`	Counter	reason

Two equivalent scrape paths are available:

Direct Envoy admin (<egress-pod>:19001/stats/prometheus): metric names appear as listed above. This is the lightest setup for Prometheus-only consumers that do not need access logs.
OpenTelemetry Collector (configured below): exposes the same router_* names on otel-collector.tars-dataplane.svc:9464/metrics, plus an OTLP-gRPC receiver for the access-log stream. Recommended when a single collection point fans out to multiple backends.

note

If router_* metrics do not appear after sending traffic, ask the MP operator to verify PROXY_CONFIG is set on the management plane; these metrics require it.

Filter probe and scanner noise before building dashboards

When the data plane is exposed on a public address, two sources contribute noise to router_errors_total and router_auth_attempts_total:

Health probes (AGIC, AGC, or any load balancer): the probe pings the backend every few seconds. The egress Envoy applies its auth filter before path matching, so unauthenticated probes register as authentication failures on the probe path (default /).
Internet bot and scanner traffic: any public IP attracts opportunistic scans targeting paths such as /wiki, /favicon.ico, /SDK/webLanguage, and /invoker/EJBInvokerServlet. Each scan increments the auth-failure counter.

These counters reach the thousands within a few hours. Unfiltered charts make a healthy service appear to be failing.

Filter to the data plane's real endpoints (/v1/* and /mcp/*):

# Prometheus: keep only real customer traffic
router_requests_total{endpoint=~"^/v1/.*|^/mcp/.*"}
router_errors_total{endpoint=~"^/v1/.*|^/mcp/.*"}

// Azure Log Analytics / Application Insights equivalent
| extend endpoint = tostring(customDimensions.endpoint)
| where endpoint startswith "/v1/" or endpoint startswith "/mcp/"
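
The same filter can be applied to raw Prometheus text output when inspecting a scrape by hand. This is a minimal sketch: filter_real_endpoints is a hypothetical helper, and the sample lines are made-up values in the label layout listed under Available metrics.

```shell
# Keep only samples whose endpoint label starts with /v1/ or /mcp/.
filter_real_endpoints() {
  grep -E 'endpoint="/(v1|mcp)/'
}

# Made-up scrape lines: one real request, one health probe, one scanner hit.
printf '%s\n' \
  'router_requests_total{method="POST",endpoint="/v1/chat/completions",status_code="200"} 42' \
  'router_requests_total{method="GET",endpoint="/",status_code="401"} 3187' \
  'router_requests_total{method="GET",endpoint="/favicon.ico",status_code="401"} 12' \
  | filter_real_endpoints
```

Only the /v1/chat/completions sample survives the filter. Against a live scrape, pipe the curl output from either scrape path into the function.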

Step C.1: Deploy the OpenTelemetry Collector

Deploy the collector into the tars-dataplane namespace; any other namespace fails with unknown namespace for the cache. The ConfigMap below wires the metrics pipeline, exposes router_* on a Prometheus scrape endpoint (:9464), and is ready to fan out to additional backends. The logs pipeline is not included; it is added separately when access-log forwarding is enabled. See Send to a real observability backend.

cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
  namespace: tars-dataplane
data:
  config.yaml: |
    receivers:
      otlp:
        protocols:
          http: { endpoint: 0.0.0.0:4318 }
          grpc: { endpoint: 0.0.0.0:4317 }
    processors:
      batch: { timeout: 5s }
      # Strip envoy's internal dynamic-modules scope prefix so router_* metrics
      # ship with their canonical names (router_requests_total, etc.) instead
      # of dynamicmodulescustom.router_requests_total.
      transform/strip_scope:
        metric_statements:
          - context: metric
            statements:
              - replace_pattern(name, "^dynamicmodulescustom\\.", "")
    exporters:
      # In-cluster Prometheus scrape target. Names land clean as router_*.
      prometheus:
        endpoint: 0.0.0.0:9464
        namespace: ""
        send_timestamps: true
        metric_expiration: 30m
        resource_to_telemetry_conversion: { enabled: true }
    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [transform/strip_scope, batch]
          exporters: [prometheus]
        # Add a 'logs' pipeline when forwarding envoy HTTP access logs
        # (per-request method, status, path, MCP headers) to a backend
        # such as Azure Monitor or Datadog. See "Send to a real
        # observability backend" below for an example.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: tars-dataplane
spec:
  replicas: 1
  selector: { matchLabels: { app: otel-collector } }
  template:
    metadata: { labels: { app: otel-collector } }
    spec:
      containers:
      - name: collector
        image: otel/opentelemetry-collector-contrib:0.98.0
        ports:
        - { containerPort: 4317, name: otlp-grpc }
        - { containerPort: 4318, name: otlp-http }
        volumeMounts:
        - { name: config, mountPath: /etc/otelcol-contrib }
        resources:
          requests: { cpu: 100m, memory: 128Mi }
          limits:   { cpu: 250m, memory: 256Mi }
      volumes:
      - name: config
        configMap: { name: otel-collector-config }
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: tars-dataplane
spec:
  selector: { app: otel-collector }
  ports:
  - { name: otlp-grpc,   port: 4317, targetPort: 4317 }
  - { name: otlp-http,   port: 4318, targetPort: 4318 }
  - { name: prometheus,  port: 9464, targetPort: 9464 }
EOF

Once the manifest is applied, an in-cluster Prometheus can scrape http://otel-collector.tars-dataplane.svc:9464/metrics and find clean router_requests_total, router_model_requests_total, and similar names. The transform/strip_scope processor removes Envoy's internal scope prefix before export, so dashboards and alerts work without dealing with the OTel encoding.

Step C.2: Add the metrics sink to the EnvoyProxy

Push router_* and Envoy native stats from the egress Envoy into the collector. The access-log sink is a separate, opt-in step described in Forwarding access logs.

kubectl patch envoyproxy tars-egress-proxy -n tars-system --type=merge -p '{
  "spec":{"telemetry":{"metrics":{"sinks":[
    {"type":"OpenTelemetry","openTelemetry":{"backendRefs":[
      {"group":"","kind":"Service","name":"otel-collector","namespace":"tars-dataplane","port":4317,"weight":1}
    ]}}
  ]}}}
}'

Step C.3: Restart the egress Envoy

kubectl rollout restart -n tars-dataplane deployment/egress
kubectl rollout status -n tars-dataplane deployment/egress --timeout=120s

Step C.4: Verify

After running the smoke tests, scrape the collector's Prometheus endpoint to confirm router_* metrics are flowing with clean names:

POD=$(kubectl get pods -n tars-dataplane -l app=otel-collector -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward -n tars-dataplane pod/$POD 9464:9464 &
PF_PID=$!
sleep 2
curl -s http://localhost:9464/metrics | grep '^router_' | head
kill "${PF_PID}"     # Stop the background port-forward

Expected: router_requests_total, router_model_requests_total, router_auth_attempts_total, router_request_duration_ms_bucket, and similar names with non-zero counts matching the traffic sent.
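
To avoid scanning the scrape by eye, the expected names can be checked mechanically. The sketch below is illustrative: missing_router_metrics is a hypothetical helper, the name list is taken from the expected output above, and the sample input is made up.

```shell
# Print each expected metric name that is absent from the scrape on stdin.
missing_router_metrics() {
  scrape=$(cat)
  for name in router_requests_total router_model_requests_total \
              router_auth_attempts_total router_request_duration_ms_bucket; do
    # A sample line starts with the metric name followed by "{" or a space.
    printf '%s\n' "$scrape" | grep -q "^${name}[{ ]" || echo "missing: ${name}"
  done
}

# Made-up scrape containing only two of the four expected names.
printf '%s\n' \
  'router_requests_total{endpoint="/v1/models",method="GET",status_code="200"} 3' \
  'router_auth_attempts_total{auth_mode="api_key",result="success"} 3' \
  | missing_router_metrics
```

With the port-forward from this step active, the live equivalent is curl -s http://localhost:9464/metrics | missing_router_metrics; no output means all four names are present.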

If router_* metrics do not appear after traffic, ask the MP operator to check PROXY_CONFIG on the management plane.

For deeper diagnostics, the collector's self-metrics report pipeline throughput:

kubectl port-forward -n tars-dataplane pod/$POD 8888:8888 &
PF_PID=$!
sleep 2
curl -s http://localhost:8888/metrics | grep otelcol_exporter_sent
kill "${PF_PID}"     # Stop the background port-forward

Non-zero otelcol_exporter_sent_metric_points confirms metrics are leaving the collector toward each configured exporter. The _log_records counter appears only after a logs pipeline is added (see Forwarding access logs).

note

The EnvoyProxy metrics-sink patch resets on every tare install re-run. Re-apply after each reinstall, or script it as a post-install hook.
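
One way to script the re-apply is to generate the patch body from a small function and call it after each install. This is a sketch under assumptions: metrics_sink_patch is a hypothetical helper that reproduces the Step C.2 merge patch, and the kubectl commands are shown only as comments because they need cluster access.

```shell
# Build the Step C.2 merge patch for a given collector Service.
# Arguments: <service-name> <namespace> <port>
metrics_sink_patch() {
  printf '{"spec":{"telemetry":{"metrics":{"sinks":[{"type":"OpenTelemetry","openTelemetry":{"backendRefs":[{"group":"","kind":"Service","name":"%s","namespace":"%s","port":%s,"weight":1}]}}]}}}}\n' "$1" "$2" "$3"
}

# Hypothetical post-install usage, run after tare install completes:
#   kubectl patch envoyproxy tars-egress-proxy -n tars-system --type=merge \
#     -p "$(metrics_sink_patch otel-collector tars-dataplane 4317)"
#   kubectl rollout restart -n tars-dataplane deployment/egress
metrics_sink_patch otel-collector tars-dataplane 4317
```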

Send to a real observability backend

The starter configuration exports router_* metrics to an in-cluster Prometheus endpoint only. Two common extensions:

Fan out metrics to a managed backend (Azure Monitor, Datadog, Grafana Cloud): add a backend exporter alongside prometheus.
Forward HTTP per-request access logs: opt-in. Requires both an EnvoyProxy patch (to make egress emit access logs) and a logs pipeline in the collector.

The Azure Monitor walkthrough below shows both.

Example: Azure Monitor / Application Insights

Create the Application Insights resource (workspace-based, reusing the AKS Log Analytics workspace):

az monitor app-insights component create \
  --app tars-dp-insights \
  --location "${LOCATION}" \
  --kind web \
  --resource-group "${RESOURCE_GROUP}" \
  --workspace "/subscriptions/<sub-id>/resourceGroups/<workspace-rg>/providers/Microsoft.OperationalInsights/workspaces/<workspace-name>"

CONN_STR=$(az monitor app-insights component show \
  --app tars-dp-insights -g "${RESOURCE_GROUP}" \
  --query connectionString -o tsv)

Store the connection string in a secret and inject it as an environment variable into the collector pod:

kubectl create secret generic otel-azure-creds -n tars-dataplane \
  --from-literal=APP_INSIGHTS_CONN_STR="${CONN_STR}"

kubectl set env deploy/otel-collector -n tars-dataplane --from secret/otel-azure-creds

Add the azuremonitor exporter (ships with otel/opentelemetry-collector-contrib) alongside the default prometheus exporter. Keep the transform/strip_scope processor in the metrics pipeline so names appear in Azure Monitor as clean router_*. The logs pipeline carries Envoy HTTP per-request access logs and does not include prometheus, since that exporter handles only metrics:

exporters:
  prometheus:
    endpoint: 0.0.0.0:9464
    namespace: ""
    send_timestamps: true
    metric_expiration: 30m
    resource_to_telemetry_conversion: { enabled: true }
  azuremonitor:
    connection_string: ${env:APP_INSIGHTS_CONN_STR}
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [transform/strip_scope, batch]
      exporters: [prometheus, azuremonitor]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [azuremonitor]

tip

Environment variable syntax. The collector requires ${env:VAR_NAME} (with the env: prefix). Plain ${VAR_NAME} silently fails to substitute and the exporter does not load. The collector log shows no error, so check the otelcol_exporter_sent_log_records and _metric_points self-metrics on :8888 to confirm.
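
Because the failure is silent, it can help to scan the collector configuration for references that lack the env: prefix before applying it. A minimal sketch, assuming variable names contain only letters, digits, and underscores; find_bare_env_refs is a hypothetical helper.

```shell
# Print numbered lines that use ${VAR} without the env: prefix.
# Exit status 1 means no bare references were found.
find_bare_env_refs() {
  grep -nE '\$\{[A-Za-z_][A-Za-z0-9_]*\}'
}

# The first line uses the required syntax; the second would be flagged.
printf '%s\n' \
  'connection_string: ${env:APP_INSIGHTS_CONN_STR}' \
  'connection_string: ${APP_INSIGHTS_CONN_STR}' \
  | find_bare_env_refs
```

The pattern cannot match ${env:...} because the colon is outside the allowed character class. To check a saved configuration, redirect the file into the function.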

Restart and verify both exporters are shipping:

kubectl rollout restart deploy/otel-collector -n tars-dataplane
kubectl rollout status deploy/otel-collector -n tars-dataplane --timeout=120s
# After a curl to /v1/chat/completions:
POD=$(kubectl get pods -n tars-dataplane -l app=otel-collector -o jsonpath='{.items[0].metadata.name}')
kubectl port-forward -n tars-dataplane pod/$POD 8888:8888 &
PF_PID=$!
sleep 2
curl -s http://localhost:8888/metrics | grep otelcol_exporter_sent
kill "${PF_PID}"     # Stop the background port-forward

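Both the prometheus and azuremonitor exporters should show non-zero otelcol_exporter_sent_metric_points. Only azuremonitor reports otelcol_exporter_sent_log_records, and only after access-log forwarding is enabled (see Forwarding access logs), because the prometheus exporter handles metrics only.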

Where to view data: Azure Portal → Application Insights tars-dp-insights → Logs (for KQL) or Workbooks (for custom dashboards).

note

Default Application Insights panes will not populate. The Overview, Performance, Failures, and Application Map panes require AI-native event types (requests, dependencies, exceptions). The azuremonitor OTel exporter does not translate Envoy access logs into those types; the data resides in customMetrics (Envoy and router_*) and traces (Envoy access logs). Build a Workbook with the queries below for a usable dashboard.

Useful queries (paste into the Logs pane of tars-dp-insights):

// Recent envoy access logs
traces
| where timestamp > ago(15m)
| extend method = tostring(customDimensions["method"]),
         status = tostring(customDimensions["response_code"]),
         route  = tostring(customDimensions["route_name"])
| project timestamp, method, status, route, duration=customDimensions["duration"]
| order by timestamp desc

// router_* metrics: latest cumulative values per metric
customMetrics
| where timestamp > ago(1h)
| where name startswith "router_"
| extend endpoint = tostring(customDimensions.endpoint)
| where endpoint startswith "/v1/" or endpoint startswith "/mcp/"  // Exclude probe and scanner noise
| summarize total = max(valueSum) by name
| order by name asc

// Per-minute request rate by status code
customMetrics
| where timestamp > ago(1h)
| where name == "router_requests_total"
| extend status = tostring(customDimensions.status_code)
| summarize cum = max(valueSum) by bin(timestamp, 1m), status
| order by status, timestamp asc
| serialize
// Compute the delta only within one status series; the first row of each series has no predecessor
| extend per_min = iff(prev(status, 1, "") == status, cum - prev(cum, 1, 0), real(null))
| where per_min >= 0
| render timechart

important

Counter aggregation: use max, not sum. Every router_*_total is a cumulative counter. OTel re-ships the current value on every flush (~5s default), so customMetrics rows accumulate by hundreds per hour. sum(valueSum) inflates the result by orders of magnitude (for example, 1.8M when the real cumulative count is ~2,300).

For cumulative totals: max(valueSum) (latest snapshot).
For rates over time: compute deltas with serialize | extend ... = cum - prev(cum, 1, 0).
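
The difference between the aggregations is easy to see on a handful of snapshots. The sketch below uses awk on made-up values for a single cumulative counter; counter_summary is a hypothetical helper, not part of any tooling in this guide.

```shell
# Read one cumulative snapshot per line and print the per-flush deltas,
# followed by the (misleading) sum and the (correct) max.
counter_summary() {
  awk '
    NR > 1 { printf "delta=%d\n", $1 - prev }
    { sum += $1; if ($1 > max) max = $1; prev = $1 }
    END { printf "sum=%d max=%d\n", sum, max }'
}

# Five made-up snapshots of the same counter, one per flush.
printf '%s\n' 100 150 150 210 230 | counter_summary
```

The final line reads sum=840 max=230: the counter's real value is 230, while summing the snapshots counts earlier requests again on every flush. The deltas (50, 0, 60, 20) are what a rate chart should plot.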

Forwarding access logs (optional)

Adds Envoy HTTP per-request access logs (method, status, path, latency, MCP headers, downstream and upstream addresses) on top of the metrics. Two changes are required, followed by an egress restart; all three steps can be applied incrementally without re-running tare install.

1. Patch the EnvoyProxy to emit access logs to the collector. The default accessLog block varies between tare builds; check the existing shape before patching:

kubectl get envoyproxy tars-egress-proxy -n tars-system \
  -o jsonpath='{.spec.telemetry.accessLog}'
# Non-empty: use Path A. Empty: use Path B.

Path A: JSON-patch (default accessLog already present, append a sink):

kubectl patch envoyproxy tars-egress-proxy -n tars-system --type=json -p '[
  {
    "op": "add",
    "path": "/spec/telemetry/accessLog/settings/0/sinks/-",
    "value": {
      "type": "OpenTelemetry",
      "openTelemetry": {
        "backendRefs": [
          {"group":"","kind":"Service","name":"otel-collector","namespace":"tars-dataplane","port":4317,"weight":1}
        ]
      }
    }
  }
]'

Path B: merge-patch (no default accessLog, create the entire block):

kubectl patch envoyproxy tars-egress-proxy -n tars-system --type=merge -p '{
  "spec":{"telemetry":{"accessLog":{"settings":[
    {"sinks":[
      {"type":"OpenTelemetry","openTelemetry":{"backendRefs":[
        {"group":"","kind":"Service","name":"otel-collector","namespace":"tars-dataplane","port":4317,"weight":1}
      ]}}
    ]}
  ]}}}
}'

2. Add a logs pipeline to the collector ConfigMap, pointing at the chosen backend exporter (for example, azuremonitor). The Azure Monitor walkthrough above shows the full ConfigMap diff; the relevant addition is:

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [azuremonitor]      # Or datadog, otlphttp, and similar

3. Restart egress so the new accessLog sink loads:

kubectl rollout restart -n tars-dataplane deployment/egress

Verify access logs are flowing:

# Azure Monitor: open Application Insights → Logs and run
#   traces | where timestamp > ago(15m) | take 10
#
# Datadog / Grafana Cloud / others: check the corresponding Logs explorer.
#
# Real-time verification on the collector side: temporarily add 'debug' to
# the logs pipeline exporters list and grep:
#   kubectl logs -n tars-dataplane deployment/otel-collector --tail=100 \
#     | grep otel_envoy_accesslog

note

The EnvoyProxy accessLog patch resets on every tare install re-run. Re-apply after each reinstall.

Other backends

Platform	Exporter	Reference
Datadog	`datadog`	`https://docs.datadoghq.com/opentelemetry/otel_collector_datadog_exporter/`
Grafana Cloud	`otlphttp` to a grafana.net endpoint	`https://grafana.com/docs/grafana-cloud/send-data/otlp/`
SigNoz Cloud	`otlphttp` with the `signoz-access-token` header	`https://signoz.io/docs/instrumentation/opentelemetry-collector/`
Splunk Observability	`signalfx`	`https://docs.splunk.com/observability/en/gdi/opentelemetry/exporters/signalfx-exporter.html`
In-cluster SigNoz, Jaeger, or Grafana	`otlp` or `otlphttp` to the local service	Platform-specific

The pattern is consistent across backends: define the exporter in the collector's ConfigMap, add it to the relevant pipelines, then restart the collector. The EnvoyProxy patches applied above remain unchanged.

note

Both EnvoyProxy patches (metrics and accessLog) reset on every tare install re-run. Re-apply after each reinstall.

Appendix D: Use an existing private registry

Use this path when an organization-wide private container registry (Nexus, Harbor, JFrog Artifactory, or another enterprise registry) is already in place. This appendix replaces Step 4 and Step 5 in the main flow.

Two separate registry credentials are involved:

Operator credentials: used by the workstation running tare install --image-sync to push images into the private registry.
Kubernetes pull credentials: stored as an image-pull secret so AKS nodes can pull images from the private registry.

The data plane credential is still required. tare uses it to authenticate to Tetrate's source registry while syncing images. The private registry username and password are used only for the destination registry and Kubernetes image pulls.

Step D.1: Set variables for the private registry

DP_CREDENTIAL=./data-plane-credentials.json
PRIVATE_REGISTRY_HOST=registry.acme.example.com
PRIVATE_IMAGE_REGISTRY=registry.acme.example.com/tare
PULL_SECRET=acme-registry-pull
REGISTRY_USERNAME=<registry-user>
REGISTRY_PASSWORD=<registry-password-or-token>
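
To keep the two registry variables from drifting apart, the host can be derived from the image path with POSIX parameter expansion. A minimal sketch, assuming the image path starts with the registry host and carries no scheme prefix:

```shell
PRIVATE_IMAGE_REGISTRY=registry.acme.example.com/tare
# Everything before the first "/" is the registry host.
PRIVATE_REGISTRY_HOST=${PRIVATE_IMAGE_REGISTRY%%/*}
echo "${PRIVATE_REGISTRY_HOST}"   # prints registry.acme.example.com
```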

Step D.2: Sync images into the private registry

Authenticate Docker to the destination registry:

printf '%s' "${REGISTRY_PASSWORD}" | \
  docker login "${PRIVATE_REGISTRY_HOST}" \
    --username "${REGISTRY_USERNAME}" \
    --password-stdin

Copy the pinned Agent Router images into the registry:

tare install "${DP_CREDENTIAL}" \
  --image-sync "${PRIVATE_IMAGE_REGISTRY}" \
  --sync-only

tare authenticates to the Tetrate source registry using the data plane credential. The local Docker login authenticates to the destination registry.

Step D.3: Create Kubernetes pull secrets

Create the same pull secret in both namespaces:

kubectl create namespace tars-system --dry-run=client -o yaml | kubectl apply -f -
kubectl create namespace tars-dataplane --dry-run=client -o yaml | kubectl apply -f -

kubectl create secret docker-registry "${PULL_SECRET}" \
  --docker-server="${PRIVATE_REGISTRY_HOST}" \
  --docker-username="${REGISTRY_USERNAME}" \
  --docker-password="${REGISTRY_PASSWORD}" \
  --namespace tars-system \
  --dry-run=client -o yaml | kubectl apply -f -

kubectl create secret docker-registry "${PULL_SECRET}" \
  --docker-server="${PRIVATE_REGISTRY_HOST}" \
  --docker-username="${REGISTRY_USERNAME}" \
  --docker-password="${REGISTRY_PASSWORD}" \
  --namespace tars-dataplane \
  --dry-run=client -o yaml | kubectl apply -f -

Step D.4: Install from the private registry

Install Agent Router with --image-registry pointing at the private registry and --image-pull-secret-name referencing the existing secret:

tare install "${DP_CREDENTIAL}" \
  --image-registry "${PRIVATE_IMAGE_REGISTRY}" \
  --image-pull-secret-name "${PULL_SECRET}" \
  --wait

This does not create the secret. It tells the Helm install to use the secret that already exists in tars-system and tars-dataplane.

note

If a platform team mirrors Agent Router images into the private registry before the install, skip the --image-sync step from Step D.2 and run only the tare install command above.

After the install, continue with Step 7 or the organization's preferred ingress path.

Where to go next

Gateway installation

Install the data plane gateway components that manage inbound access and configure request routing.

Console quickstart

Issue an API key and make a first routed AI request once the gateway is running.
