Agent Router data plane installation for Google Cloud Platform

This guide installs the Agent Router data plane on Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE).

Architecture overview

The Agent Router platform uses a split-plane model. A Management Plane hosted by Tetrate holds configuration and serves the web UI. A Data Plane runs in a customer-managed GKE cluster and handles all AI traffic. The two planes communicate over a single outbound HTTPS connection initiated by the data plane; the management plane never opens a connection into the customer cluster.

The data plane requires a stable public hostname so applications can call it; TLS certificates are issued for hostnames, not IPs. The gateway-install step provisions a Google-managed certificate against that hostname automatically.

By the end of this guide you will have:

An Artifact Registry repository mirroring the Agent Router images
A GKE cluster running the data plane in the tars-system and tars-dataplane namespaces
A Google Cloud Application Load Balancer fronting the data plane at a DNS name you own
A registered data plane URL on the management plane

Plan for 30–45 minutes of installation time, plus DNS propagation.

Table of contents

Prepare for the installation: obtain the data plane credential and install the CLI
- Step 1: Obtain your data plane credential
- Step 2: Install the tare CLI
Cluster setup: create or reuse a Kubernetes cluster
- Step 3: Provision the GKE cluster
Registry setup: mirror Agent Router images so the cluster can pull them
- Step 4: Create an Artifact Registry repository
- Step 5: Sync Agent Router images to the registry
Data Plane Installation: deploy the data plane
- Step 6: Install the Agent Router data plane
Gateway setup: provision the public gateway, certificate, and DNS authorization
- Step 7: Install the gateway
DNS configuration: point a domain at the gateway and register the URL
- Step 8: Wire DNS and register the URL
Testing the installation: verify the install works end-to-end
- Step 9: Verify provider routes
- Step 10: Verify the install
- Step 11: Smoke tests

Appendices
- Appendix A: Cross-project and external-registry pull access
- Appendix B: Forward observability data to an OpenTelemetry Collector

Prerequisites

Dashboard and Router app access

Agent Router exposes two web surfaces. Both URLs are provided during onboarding:

Dashboard (admin): https://dashboard.<your-tenant>.tetrate.ai. Used in Step 1 and Step 8.
Router app (end-user): https://router.<your-tenant>.tetrate.ai. Used in Step 11 for creating API keys and MCP profiles.

Required tools

Install the following on the workstation used to run this guide:

Tool	Install
`gcloud`	cloud.google.com/sdk/docs/install
`kubectl`	`gcloud components install kubectl` (or kubernetes.io/docs/tasks/tools)
`curl`	Preinstalled on macOS and most Linux distributions
`jq`	jqlang.github.io/jq/download
`tare` CLI	Installed in Step 2

A data-plane-credentials.json file is also required. See Step 1.
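A quick preflight check can confirm that the tools from the table are on PATH before starting. The sketch below is illustrative only; the function name missing_tools is hypothetical and not part of the tare CLI.

```shell
#!/bin/sh
# Print each required tool that is not on PATH. The exit status is
# non-zero when at least one tool is missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example (tare is installed in Step 2, so it is not checked here):
#   missing_tools gcloud kubectl curl jq
```

The function prints nothing when every tool is found, so it can gate a setup script with a simple `missing_tools ... || exit 1`.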

Infrastructure

A dedicated workload cluster must be provisioned before starting the installation. The cluster should have at least three nodes. See Cluster sizing for details.

warning

Tetrate support does not cover client-side infrastructure provisioning or Kubernetes issues. The instructions for creating clusters and related infrastructure components are provided as a courtesy and should be carefully evaluated before executing them.

GCP IAM roles

The following roles are required on the project that will host the data plane:

Role	Scope	Required for
`roles/container.admin`	The GCP project	Creating and managing the GKE cluster
`roles/artifactregistry.admin`	The GCP project (or specific repo)	Creating Artifact Registry and pushing images
`roles/iam.serviceAccountUser`	Attached service accounts	Cluster node SA, gateway provisioner SA
`roles/compute.networkAdmin`	The GCP project	Reserving static IPs for the gateway
`roles/certificatemanager.editor`	The GCP project	Provisioning Google-managed certificates
`roles/dns.admin`	The DNS-managing project (often separate)	Adding A and CNAME records when the IAM principal manages DNS directly; not required when records are handed off to a DNS team

Cluster type: Standard or Autopilot

GKE Standard is the recommended default: full control over node pools, predictable scheduling, and easier cost and performance tuning. This guide assumes Standard.

GKE Autopilot is supported, but its placement and security constraints may reject the chart's resource and security assumptions on the first run. For Autopilot, validate the install against the cluster's policy before applying:

# Dry-run validation against Autopilot policy
tare install /path/to/data-plane-credentials.json --print-resources | \
  kubectl apply --dry-run=server -f -

Size pod requests and limits to match the policy, and add headroom since Autopilot scaling is request-driven.

Cluster sizing

The default chart installs multiple always-on components: egress envoy (minimum 2 replicas), AI gateway controller and ext_proc, controller and worker, Redis, and rate-limit services. This is not a single-node footprint.

The egress envoy is the dominant resource consumer. Its CPU and memory usage scale with the configuration size held in memory: the number of AIGatewayRoute and AIServiceBackend resources, header-mutation rules, and per-route features. Plan capacity for route counts that grow as providers, models, and projects are added. General-purpose machine types (the n2-* family) offer a balanced CPU-to-RAM ratio and are a reasonable default.

Size	Use case	Recommended node pool	Approximate allocatable target
Small	Dev / test / low traffic	3 × `e2-standard-2`	≥ 6 vCPU, ≥ 20 GiB RAM
Medium	Staging / light production	3 × `n2-standard-4`	≥ 12 vCPU, ≥ 40 GiB RAM
High	Production with burst headroom	3 × `n2-standard-8` (or split into system + data plane pools)	≥ 24 vCPU, ≥ 80 GiB RAM

Maintain a minimum of three nodes to tolerate node upgrades and evictions. Demo installs may start at Small; production installations should start at Medium.
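As a worked example of the arithmetic behind the table, the sketch below multiplies node count by the published per-node shape of each machine type (e2-standard-2: 2 vCPU / 8 GiB, n2-standard-4: 4 vCPU / 16 GiB, n2-standard-8: 8 vCPU / 32 GiB). The function name pool_capacity is hypothetical. The results are nominal totals; the allocatable values a running cluster reports are somewhat lower because GKE reserves part of each node for system components.

```shell
#!/bin/sh
# Nominal capacity of a node pool: nodes x per-node vCPU and memory.
# Usage: pool_capacity <nodes> <vcpu-per-node> <gib-per-node>
pool_capacity() {
  nodes=$1 vcpu=$2 mem_gib=$3
  echo "$((nodes * vcpu)) vCPU, $((nodes * mem_gib)) GiB"
}

pool_capacity 3 2 8    # Small:  3 x e2-standard-2 -> 6 vCPU, 24 GiB
pool_capacity 3 4 16   # Medium: 3 x n2-standard-4 -> 12 vCPU, 48 GiB
pool_capacity 3 8 32   # High:   3 x n2-standard-8 -> 24 vCPU, 96 GiB
```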

Conventions

All commands assume the environment variables defined in Step 3 and Step 4 are exported in the current shell. Re-export them when opening a new terminal.

Replace any <placeholder> value with one of your own.
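Because later commands silently expand an unset variable to an empty string, it can help to check the variables before running a step. The following is a minimal sketch; require_vars is a hypothetical helper, and the variable names in the example are the ones defined in Step 3 and Step 4.

```shell
#!/bin/sh
# Report every variable from the argument list that is unset or empty in
# the current shell. Returns non-zero if any is missing.
require_vars() {
  rc=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "unset: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example:
#   require_vars PROJECT_ID REGION ZONE CLUSTER_NAME AR_REGION AR_REPO
```

The helper uses eval for the indirect lookup, so it should only be called with literal variable names, never with untrusted input.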

Step 1: obtain your data plane credential

In the dashboard, navigate to System → Settings → Data plane credentials and click + Generate Data plane credential.

Save the downloaded file as data-plane-credentials.json on the workstation used for installation. This is the long-lived identity the data plane uses to authenticate to the management plane.

note

Some parts of the product still use the older "service account" naming for this file. The dashboard is standardizing on "data plane credential"; the file is the same.

Each data plane uses its own credential. Credentials can be revoked from the dashboard, and additional credentials can be generated (for example, one per environment) at any time.

Step 2: install the `tare` CLI

Run the installer script:

curl -sSL https://tare.tetrate.ai/tools/install.sh | bash

Output:

==> tare installer
==> channel: stable

==> Detected platform: darwin-arm64

==> Installing tare for darwin-arm64...
==> Downloading from: https://tare.tetrate.ai/tools/tags/v0.1.0-beta.2/tare-darwin-arm64.tar.gz
ok Installed tare to /Users/johndoe/.tare/bin/tare

==> tare version: tare version v0.1.0-beta.2

ok Installation directory is already in your PATH

==> Get started:
    tare install identity.json --serve-url https://proxy.acme.com
    tare install --help

The installer prints the install path (typically ~/.tare/bin/tare). If the installer does not report that the directory is already in PATH, add it, then verify the version:

export PATH="$PATH:$HOME/.tare/bin"
echo 'export PATH="$PATH:$HOME/.tare/bin"' >> ~/.zshrc   # or ~/.bashrc

$ tare --version
tare version v0.1.0-beta.2

Step 3: provision the GKE cluster

Step 3.1: set environment variables

export PROJECT_ID=<your-project>
export REGION=us-central1
export ZONE=us-central1-a
export CLUSTER_NAME=tare-dp

Step 3.2: select the project

gcloud config set project "${PROJECT_ID}"

Step 3.3: create the cluster

gcloud container clusters create "${CLUSTER_NAME}" \
  --zone "${ZONE}" \
  --num-nodes 3 \
  --machine-type n2-standard-4 \
  --release-channel regular \
  --enable-ip-alias \
  --workload-pool "${PROJECT_ID}.svc.id.goog" \
  --gateway-api=standard

--gateway-api=standard enables the GKE Gateway controller and installs the GCPBackendPolicy and HealthCheckPolicy CRDs that the gateway Helm chart depends on. Without it, Step 7 fails with resource mapping not found for ... GCPBackendPolicy.

The --num-nodes and --machine-type values above correspond to the Medium tier in Cluster sizing. Adjust as needed.

Provisioning takes approximately 5–10 minutes.

tip

Already have a GKE cluster? Reuse it after confirming two things:

Workload Identity is enabled:

gcloud container clusters describe <name> --zone <zone> \
  --format='value(workloadIdentityConfig.workloadPool)'

Gateway API is enabled:

kubectl get crd gateways.gateway.networking.k8s.io
kubectl get crd gcpbackendpolicies.networking.gke.io

If either CRD is missing, enable Gateway API:

gcloud container clusters update <name> --zone <zone> --gateway-api=standard
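The two checks in this tip can be combined into one pass. The sketch below is an illustration, not part of the tare CLI: check_cluster is a hypothetical name, and the kubectl calls assume the current kubeconfig context already points at the cluster being checked.

```shell
#!/bin/sh
# Check an existing cluster for Workload Identity and the Gateway API CRDs.
# Usage: check_cluster <name> <zone>
check_cluster() {
  name=$1 zone=$2 rc=0
  # An empty workload pool means Workload Identity is not enabled.
  pool=$(gcloud container clusters describe "$name" --zone "$zone" \
    --format='value(workloadIdentityConfig.workloadPool)')
  if [ -z "$pool" ]; then
    echo "Workload Identity: not enabled"; rc=1
  else
    echo "Workload Identity: $pool"
  fi
  for crd in gateways.gateway.networking.k8s.io \
             gcpbackendpolicies.networking.gke.io; do
    if kubectl get crd "$crd" >/dev/null 2>&1; then
      echo "CRD present: $crd"
    else
      echo "CRD missing: $crd"; rc=1
    fi
  done
  return "$rc"
}
```

When the workload pool is empty, Workload Identity can be enabled on the cluster with gcloud container clusters update and the --workload-pool flag; existing node pools may need a separate update before pods can use it.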

Step 3.4: fetch the kubeconfig

gcloud container clusters get-credentials "${CLUSTER_NAME}" --zone "${ZONE}"
kubectl get nodes

Expected output:

NAME                                          STATUS   ROLES    AGE   VERSION
gke-tare-dp-default-pool-xxxxxxxx-xxxx        Ready    <none>   5m    v1.34.x
gke-tare-dp-default-pool-xxxxxxxx-yyyy        Ready    <none>   5m    v1.34.x
gke-tare-dp-default-pool-xxxxxxxx-zzzz        Ready    <none>   5m    v1.34.x

Step 4: create an Artifact Registry repository

export AR_REGION=us-central1
export AR_REPO=tare

Create the repository:

gcloud artifacts repositories create "${AR_REPO}" \
  --repository-format=docker \
  --location="${AR_REGION}" \
  --description="Agent Router data plane images"

The repository is addressable as ${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}.

Step 5: sync Agent Router images to the registry

Step 5.1: authenticate Docker to Artifact Registry

gcloud auth configure-docker "${AR_REGION}-docker.pkg.dev"

Step 5.2: sync images

Copy the container images from Tetrate's registry into the registry created above. The tare CLI authenticates to the source registry automatically.

tare install /path/to/data-plane-credentials.json \
  --image-sync "${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}" \
  --sync-only

The sync produces no progress output and takes several minutes. After it completes, verify the images:

gcloud artifacts docker images list \
  "${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}" \
  --include-tags --limit 20

The output should list ten repositories under the tare/ prefix, including ai-gateway-controller, envoy-tars, gateway, liaison, ratelimit, redis, tare-doctor, and valet.
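The listing can also be checked mechanically. The sketch below reads image paths on stdin and reports which of the eight component names mentioned above are absent. missing_images is a hypothetical helper, and the --format='value(package)' projection in the example is an assumption about the listing's field name; adjust it if the output differs.

```shell
#!/bin/sh
# Read image paths (one per line) on stdin and report which of the
# component names named in this step do not appear as a final path segment.
missing_images() {
  listing=$(cat)
  rc=0
  for name in ai-gateway-controller envoy-tars gateway liaison \
              ratelimit redis tare-doctor valet; do
    if ! printf '%s\n' "$listing" | grep -q "/$name\$"; then
      echo "not found: $name"
      rc=1
    fi
  done
  return "$rc"
}

# Example, assuming the variables from Step 3 and Step 4 are set:
#   gcloud artifacts docker images list \
#     "${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}" \
#     --format='value(package)' | sort -u | missing_images
```

Matching on the final path segment keeps gateway from being satisfied by ai-gateway-controller.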

tip

To preview which images would sync without pulling them, replace --sync-only with --print-images.

Step 6: install the Agent Router data plane

Select the hostname used to expose the data plane externally and pass it as --serve-url. DNS for this hostname is configured in Step 8 after the gateway is up.

tare install /path/to/data-plane-credentials.json \
  --image-sync "${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}" \
  --serve-url https://<your-data-plane-hostname>

tare install performs the following actions:

Creates the tars-system and tars-dataplane namespaces.
Installs the Agent Router data plane via Helm.

note

The --serve-url flag is currently required by the CLI (legacy behavior). The URL does not need to resolve at install time; tare install records it on the management plane. The URL can be changed later through Dashboard → System → Settings → Data planes.

note

When the cluster and Artifact Registry reside in the same project, GKE nodes pull images using the node service account automatically; no image-pull secret is required. For cross-project or external-registry setups, see Appendix A.

Step 7: install the gateway

tare gateway install reads a JSON config and creates the global static IP, the Google-managed certificate (via Certificate Manager), the DNS authorization, and the Kubernetes Gateway and HTTPRoute that wire it all together.

tip

Already have GCP infrastructure? tare gateway install is opinionated. It assumes a public global L7 load balancer, a Google-managed certificate, and DNS records published by the operator. Two escape hatches are available when these assumptions do not fit the environment:

Use --ack-prereqs instead of --apply-prereqs when the platform team has pre-provisioned the static IP, certificate, DNS authorization, and certificate map via Terraform or other IaC. Reference the existing names in gcp-gateway.json; tare will use them without creating or modifying anything.
Skip tare gateway install entirely and author a Gateway and HTTPRoute against a preferred GatewayClass (for example, gke-l7-rilb for an internal load balancer, or cert-manager-managed certs through any GatewayClass). The data plane only requires an Ingress or Gateway that routes to the egress service in tars-dataplane on port 10080. Everything else in this section is a convenience wrapper around the GCP-native flow.

Step 7.1: configure the gateway

Create a file named gcp-gateway.json in the same directory as the credential file:

{
  "projectId": "<your-project>",
  "serveDomain": "<your-data-plane-hostname>",
  "serveUrl": "<your-data-plane-hostname>",
  "customer": "<your-customer-id>",
  "environment": "production",
  "certificateMap": { "name": "tare-cert-map" },
  "certificate":    { "name": "tare-gateway-cert" },
  "dnsAuthorization": { "name": "tare-dns-auth" },
  "namespaces": {
    "gateway":   "tars-gateway",
    "system":    "tars-system",
    "dataplane": "tars-dataplane"
  },
  "gcloud": { "skipExisting": true }
}

Field reference:

projectId: the project hosting the gateway infrastructure.
serveDomain / serveUrl: the fully-qualified hostname clients will use.
customer: the customer identifier (visible in the credential or on the dashboard).
environment: a free-form label used as a Helm value.
certificateMap, certificate, dnsAuthorization: resource names. Prefix as needed.
gcloud.skipExisting: true: makes the command idempotent across retries.
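One way to avoid typos between serveDomain and serveUrl is to render the file from a template. The sketch below prints the same structure shown above with the hostname substituted into both fields. render_gateway_config is a hypothetical helper, and the hostname and customer ID in the example are placeholders.

```shell
#!/bin/sh
# Print a gcp-gateway.json body. The same hostname is written to
# serveDomain and serveUrl; resource names and namespaces are the ones
# used in this guide.
# Usage: render_gateway_config <project-id> <hostname> <customer-id>
render_gateway_config() {
  project=$1 host=$2 customer=$3
  cat <<EOF
{
  "projectId": "${project}",
  "serveDomain": "${host}",
  "serveUrl": "${host}",
  "customer": "${customer}",
  "environment": "production",
  "certificateMap": { "name": "tare-cert-map" },
  "certificate":    { "name": "tare-gateway-cert" },
  "dnsAuthorization": { "name": "tare-dns-auth" },
  "namespaces": {
    "gateway":   "tars-gateway",
    "system":    "tars-system",
    "dataplane": "tars-dataplane"
  },
  "gcloud": { "skipExisting": true }
}
EOF
}

# Example:
#   render_gateway_config "${PROJECT_ID}" dp.example.com acme > gcp-gateway.json
```

The rendered file should still go through the lint step before it is used.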

tip

Reserve a static IP up front to allow DNS coordination to start before the gateway finishes provisioning:

gcloud compute addresses create tare-gateway-ip --global --project "${PROJECT_ID}"
gcloud compute addresses describe tare-gateway-ip --global --project "${PROJECT_ID}" \
  --format='value(address)'

Then add "gateway": { "staticIpName": "tare-gateway-ip" } to gcp-gateway.json. Without a reserved IP, the gateway receives an ephemeral address that can change during maintenance.

Step 7.2: validate the config

tare gateway config lint --config gcp-gateway.json

Errors block the install; warnings include remediation guidance and can be configured to fail CI runs.

Step 7.3: preview the plan

tare gateway install /path/to/data-plane-credentials.json \
  --type gcp \
  --config gcp-gateway.json \
  --plan-only

The output is a deployment plan: resource diffs against the current GCP and Kubernetes state, with each value traced back to its source flag or config field. Review before applying.

Step 7.4: apply

tare gateway install /path/to/data-plane-credentials.json \
  --type gcp \
  --config gcp-gateway.json \
  --apply-prereqs \
  --wait

--apply-prereqs instructs tare to create the cloud resources directly. When the platform team has pre-provisioned these via Terraform or similar tools, use --ack-prereqs instead; tare will reference but not modify the assets.

On success, the command prints the static IP and the CNAME record required for certificate validation. Record both.

note

Certificate activation takes approximately 5–20 minutes after DNS propagates (see Step 8). Track activation with:

gcloud certificate-manager certificates describe <name> \
  --format='yaml(managed.state,managed.authorizationAttemptInfo)'
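Instead of re-running the describe command by hand, activation can be polled in a loop. This is a minimal sketch: wait_for_cert is a hypothetical helper, and the default of 40 attempts at 30-second intervals is simply chosen to cover the 20-minute upper bound mentioned above.

```shell
#!/bin/sh
# Poll a Certificate Manager certificate until managed.state is ACTIVE.
# Usage: wait_for_cert <certificate-name> [attempts] [delay-seconds]
wait_for_cert() {
  cert=$1 attempts=${2:-40} delay=${3:-30}
  i=1
  while [ "$i" -le "$attempts" ]; do
    state=$(gcloud certificate-manager certificates describe "$cert" \
      --format='value(managed.state)')
    echo "attempt $i: ${state:-unknown}"
    [ "$state" = "ACTIVE" ] && return 0
    sleep "$delay"
    i=$((i + 1))
  done
  # Still not ACTIVE after the last attempt.
  return 1
}

# Example: wait_for_cert tare-gateway-cert
```

The certificate cannot become ACTIVE until the DNS authorization CNAME resolves, so a loop that never succeeds usually points at DNS rather than at the certificate itself.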

Step 8: wire DNS and register the URL

After the gateway is up, two DNS records are required: an A record for traffic and a CNAME for certificate validation. The exact values are printed by the gateway-install output and can be fetched at any time.

Step 8.1: retrieve the DNS values

# Static IP for the A record
gcloud compute addresses describe tare-gateway-ip --global --format='value(address)'

# DNS authorization values for the CNAME record
gcloud certificate-manager dns-authorizations describe tare-dns-auth \
  --project "${PROJECT_ID}" \
  --format='value(dnsResourceRecord.name,dnsResourceRecord.type,dnsResourceRecord.data)'

Step 8.2: add the DNS records

Add the following records in the DNS provider (Cloud DNS, Route 53, Cloudflare, a registrar, or similar):

# Traffic
<your-data-plane-hostname>.                  A       <static-ip>                TTL 300

# Certificate authorization
_acme-challenge.<your-data-plane-hostname>.  CNAME   <google-managed-target>    TTL 300

Verify propagation:

dig +short <your-data-plane-hostname>
# Should return the static IP

dig +short _acme-challenge.<your-data-plane-hostname>
# Should return the Google-managed CNAME target

Once both records resolve, the certificate's managed.state moves to ACTIVE (typically 5–20 minutes after propagation) and the gateway begins serving HTTPS.
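The propagation check can be scripted by comparing what dig returns with the value retrieved in Step 8.1. The sketch below is illustrative; check_record is a hypothetical helper, and it compares only the first answer line, which for a CNAME is the target name.

```shell
#!/bin/sh
# Compare the first answer a resolver returns for a record with the
# expected value.
# Usage: check_record <record-name> <expected-value>
check_record() {
  name=$1 expected=$2
  # dig prints CNAME targets with a trailing dot; strip it on both sides.
  actual=$(dig +short "$name" | head -n 1 | sed 's/\.$//')
  if [ "$actual" = "${expected%.}" ]; then
    echo "ok: $name -> $actual"
  else
    echo "mismatch: $name -> ${actual:-<no answer>} (expected ${expected%.})"
    return 1
  fi
}

# Example, with values from Step 8.1:
#   check_record <your-data-plane-hostname> <static-ip>
#   check_record _acme-challenge.<your-data-plane-hostname> <google-managed-target>
```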

Step 8.3: register the URL on the management plane

In the dashboard, navigate to System → Settings → Profile → Proxy URL and set it to https://<your-data-plane-hostname>.

Changes propagate to the data plane within approximately 30 seconds.

Step 9: verify provider routes

Providers (OpenAI, Anthropic, and others) and their upstream API keys are configured during Agent Router onboarding, not during this install. The data plane retrieves that configuration automatically once it is connected to the management plane.

Verify the data plane received the provider routes:

kubectl get aigatewayroutes -A
kubectl get aiservicebackends -A

Both should show Accepted resources within a minute of the data plane coming up. If the lists are empty, consult the Agent Router onboarding guide to confirm providers are configured.

Step 10: verify the install

Run tare doctor:

tare doctor /path/to/data-plane-credentials.json --verbose

Pass criteria: all in-cluster checks report Status: OK (or Healthy) with 0 errors, 0 warnings, and the final line confirms the health-report bundle was accepted (Sending health report ... OK (bundle <id>)).

Expected output (abridged):

CHECKS PERFORMED:
- Namespace existence (system, dataplane)
- CRD presence (Gateway API, AI Gateway, RouteDeployment)
- Controller deployments ready (TARS, AI Gateway, Envoy Gateway)
- Proxy deployment ready
- GatewayClass and Gateway accepted/programmed
- EnvoyPatchPolicy acceptance (per instance)
- EnvoyProxy acceptance (per instance)
- Egress EnvoyProxy image uses envoy-tars
- Identity Secret and ConfigMap present
- AIServiceBackend acceptance
- Envoy Gateway Backend acceptance
- BackendTrafficPolicy acceptance
- BackendSecurityPolicy acceptance
- BackendTLSPolicy acceptance
- ClientTrafficPolicy acceptance
- HTTPRouteFilter presence/acceptance
- ReferenceGrant presence
- RouteDeployment status conditions
- AIGatewayRoute acceptance/resolution
- HTTPRoute parent acceptance/resolution
- MCPRoute parent acceptance/resolution
- Proxy admin and forward endpoints
- Pod CrashLoopBackOff (excluding tars-config-monitor)

Sending health report to https://api.<your-tenant>.tetrate.ai/v1/dataplane-status... OK (bundle <bundle-id>)

Then send a request with an invalid token to confirm auth is enforced:

curl -sS -o /dev/null -w "HTTP %{http_code}\n" \
  "https://<your-data-plane-hostname>/v1/chat/completions" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer NotREAL" \
  -d '{"model":"gpt-5-mini","messages":[{"role":"user","content":"hi"}]}'
# Expected: HTTP 401

If the response is HTTP 200, contact Tetrate Support; auth is enforced automatically on every install.
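For repeated runs (for example in a post-install pipeline), the same request can be wrapped so that the status code is interpreted automatically. The sketch below reuses the request shown above; check_auth_enforced is a hypothetical helper name.

```shell
#!/bin/sh
# Send a request with a bogus bearer token and interpret the status code.
# Usage: check_auth_enforced <data-plane-hostname>
check_auth_enforced() {
  host=$1
  code=$(curl -sS -o /dev/null -w '%{http_code}' \
    "https://${host}/v1/chat/completions" \
    -X POST -H "Content-Type: application/json" \
    -H "Authorization: Bearer NotREAL" \
    -d '{"model":"gpt-5-mini","messages":[{"role":"user","content":"hi"}]}')
  case "$code" in
    401) echo "auth enforced (HTTP 401)" ;;
    200) echo "auth NOT enforced (HTTP 200)"; return 1 ;;
    *)   echo "unexpected status: HTTP ${code}"; return 2 ;;
  esac
}
```

curl reports a status of 000 when no HTTP response was received, which falls into the last branch and usually indicates a DNS or TLS problem rather than an auth problem.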

Step 11: smoke tests

In the router app (https://router.<your-tenant>.tetrate.ai), select API Keys in the sidebar and create a key. This is the key applications (and the tests below) use as Authorization: Bearer ....

Set the host and key once:

export DP_HOST=<your-data-plane-hostname>
export TARS_API_KEY="<your-api-key-from-router-app>"

Chat Completions (OpenAI shape)

curl -s "https://${DP_HOST}/v1/chat/completions" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{
    "model": "gpt-5-mini",
    "messages": [{"role": "user", "content": "hello, what are you?"}]
  }'

Anthropic Messages (native shape)

curl -s "https://${DP_HOST}/v1/messages" \
  -X POST -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{
    "model": "claude-haiku-4-5",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "hello"}]
  }'

List available models

curl -s "https://${DP_HOST}/v1/models" \
  -H "Authorization: Bearer ${TARS_API_KEY}" | jq '.data[].id' | sort -u

Streaming

curl -s "https://${DP_HOST}/v1/chat/completions" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{
    "model": "gpt-5-mini",
    "stream": true,
    "messages": [{"role": "user", "content": "count from 1 to 5"}]
  }'
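A streaming response in the OpenAI shape arrives as server-sent events: one data: line per chunk, ending with data: [DONE]. The sketch below strips that framing so each chunk can be handled as plain JSON. sse_data is a hypothetical helper, and the jq filter in the example assumes the standard OpenAI chunk layout (choices[0].delta.content).

```shell
#!/bin/sh
# Keep only the payload of each "data:" line and stop at the [DONE] marker.
sse_data() {
  cr=$(printf '\r')
  while IFS= read -r line; do
    line=${line%"$cr"}            # tolerate CRLF line endings
    case "$line" in
      "data: [DONE]"*) break ;;
      "data: "*) printf '%s\n' "${line#data: }" ;;
    esac
  done
}

# Example (-N disables curl's output buffering so chunks print as they arrive):
#   curl -sN "https://${DP_HOST}/v1/chat/completions" \
#     -X POST -H "Content-Type: application/json" \
#     -H "Authorization: Bearer ${TARS_API_KEY}" \
#     -d '{"model":"gpt-5-mini","stream":true,"messages":[{"role":"user","content":"count from 1 to 5"}]}' \
#     | sse_data | jq -rj '.choices[0].delta.content // empty'
```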

MCP

note

MCP is licensed separately from inference and guardrails. Contact your Tetrate sales representative for more information.

Create an MCP profile in the router app: MCP Profiles (sidebar) → Create profile. The profile ID is used in the URL below.

curl -s "https://${DP_HOST}/mcp/<profile-id>" \
  -X POST -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TARS_API_KEY}" \
  -d '{"jsonrpc":"2.0","method":"tools/list","id":1}'

To register the MCP profile with Claude Code:

claude mcp add --transport http <profile-name> \
  https://${DP_HOST}/mcp/<profile-id> \
  --header "Authorization: Bearer ${TARS_API_KEY}"

Upgrading

To upgrade the data plane to a newer Agent Router release:

Install the new tare CLI version (re-run Step 2).
Re-run Step 5.2 and Step 6.
The gateway, certificate, and DNS authorizations persist; tare gateway install does not need to be re-run unless its config has changed.

note

Cluster state (namespaces, gateway, DNS authorization, certificate, and registered URL) persists across upgrades.

Cleanup

To remove the deployment, delete resources in dependency order:

# 1. Helm release
helm uninstall tars -n tars-system 2>/dev/null || true

# 2. Gateway resources (cert, DNS auth, static IP, namespaces)
tare gateway uninstall /path/to/data-plane-credentials.json \
  --type gcp \
  --config gcp-gateway.json \
  --wait

# 3. Remove any GCP assets that survived the uninstall (idempotent).
#    The map goes first because it references the certificate; a map that
#    still has entries must have them deleted before it can be removed.
gcloud certificate-manager maps delete tare-cert-map --quiet || true
gcloud certificate-manager certificates delete tare-gateway-cert --quiet || true
gcloud certificate-manager dns-authorizations delete tare-dns-auth --quiet || true
gcloud compute addresses delete tare-gateway-ip --global --quiet || true

# 4. Artifact Registry repository
gcloud artifacts repositories delete "${AR_REPO}" --location="${AR_REGION}" --quiet

# 5. GKE cluster (only when not required for other workloads)
gcloud container clusters delete "${CLUSTER_NAME}" --zone "${ZONE}" --quiet

# 6. Local kubeconfig
kubectl config delete-context "gke_${PROJECT_ID}_${ZONE}_${CLUSTER_NAME}" || true

note

DNS records (the A record and the _acme-challenge CNAME) live in the DNS provider and must be removed manually.

Troubleshooting

Issues are grouped by the step where they are most likely to occur.

Image synchronization issues

Symptom	Cause	Fix
`401 Unauthorized` on `HEAD https://registry.tetrate.ai/v2/...`	The credential is not authorized to pull from `registry.tetrate.ai`.	Regenerate the credential from Dashboard → System → Settings → Data plane credentials and retry.
`unauthenticated` / `permission denied` from the destination registry	The `gcloud auth configure-docker` token is invalid for this registry.	Re-run `gcloud auth configure-docker ${AR_REGION}-docker.pkg.dev` and retry the sync.

Image pull from the cluster fails

Symptom	Cause	Fix
Pods in `ImagePullBackOff` with `permission denied` from Artifact Registry	The cluster and registry reside in different projects, or the node service account lacks `roles/artifactregistry.reader`.	Grant the node SA reader access on the cross-project repo, or use the image-pull-secret approach in Appendix A.

Gateway installation issues

Symptom	Cause	Fix
Wait timeout while gateway provisions	Google Cloud Load Balancer provisioning is in flight; first-run provisioning takes 5–15 minutes.	Watch `kubectl get gateway -n tars-gateway -o wide` and look for an `ADDRESS`. If still empty after 20 minutes, inspect Gateway events for the underlying error.
`Failed to load dynamic module: composer` in egress envoy logs	The auth filter wiring did not program. On current builds this indicates the management plane is missing the proxy-settings config.	Contact support; this should not appear on a correctly configured tenant.
Certificate stays `PROVISIONING` indefinitely	The DNS authorization CNAME does not resolve, or the A record points at the wrong IP.	Verify with `dig +short _acme-challenge.<host>` and `dig +short <host>`. Both must return the values shown by `tare gateway install`.

Failed tests

Symptom	Cause	Fix
HTTP 404 with body `No matching route found.`	The requested model is not configured for any provider, or no providers are configured.	Verify `kubectl get aigatewayroutes -A` shows `Accepted` rows. If empty, consult the Agent Router onboarding guide.
HTTP 401 with a valid bearer	The API key was issued against a different management plane than this data plane is registered to.	Issue a new key from the router app for this tenant.
HTTPS connect failure (`SSL_ERROR`, `unable to verify the first certificate`)	The certificate is not yet `ACTIVE`, or DNS has not propagated.	Check `gcloud certificate-manager certificates describe <name> --format='value(managed.state)'` and wait until `ACTIVE`.

Appendix A: cross-project and external-registry pull access

The main flow assumes the GKE cluster and the Artifact Registry reside in the same GCP project. In that setup, GKE uses the node service account automatically; no image-pull secret is required.

For other arrangements:

A.1: cross-project Artifact Registry (no pull secret)

Grant the GKE cluster's node service account roles/artifactregistry.reader on the destination project or repository:

# Node service account of the cluster
NODE_SA=$(gcloud container clusters describe "${CLUSTER_NAME}" \
  --zone "${ZONE}" --format='value(nodeConfig.serviceAccount)')
# Grant read access on the project that hosts the Artifact Registry repository
gcloud projects add-iam-policy-binding "<registry-project-id>" \
  --member="serviceAccount:${NODE_SA}" \
  --role="roles/artifactregistry.reader"

After the binding, tare install works exactly as in the main flow; no additional flags are required.
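One detail to watch when building the IAM member: GKE reports the node service account as default when nodes run as the Compute Engine default service account, and that literal value is not a valid member. The sketch below maps it to the full email, which has the form <project-number>-compute@developer.gserviceaccount.com. resolve_node_sa is a hypothetical helper.

```shell
#!/bin/sh
# Map the value of nodeConfig.serviceAccount to a full service account email.
# Usage: resolve_node_sa <nodeConfig.serviceAccount value> <cluster-project-number>
resolve_node_sa() {
  sa=$1 project_number=$2
  if [ "$sa" = "default" ]; then
    # Compute Engine default service account of the cluster's project.
    echo "${project_number}-compute@developer.gserviceaccount.com"
  else
    echo "$sa"
  fi
}

# Example; the project number comes from the project that hosts the cluster:
#   PROJECT_NUMBER=$(gcloud projects describe "${PROJECT_ID}" \
#     --format='value(projectNumber)')
#   NODE_SA=$(resolve_node_sa "${NODE_SA}" "${PROJECT_NUMBER}")
```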

A.2: cross-project via impersonated service account (pull secret)

When the node service account cannot be added to the destination project, pipe a short-lived OAuth token to tare install:

echo "oauth2accesstoken:$(gcloud auth print-access-token --impersonate-service-account=<SA_EMAIL>)" | \
tare install /path/to/data-plane-credentials.json \
  --image-registry "${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}" \
  --image-pull-secret-stdin \
  --serve-url https://<your-data-plane-hostname> \
  --wait

note

--image-pull-secret-stdin creates the tars-image-pull-secret Secret in both Agent Router namespaces.

A.3: external registry (not Artifact Registry)

Pipe a registry username and password directly:

echo "<username>:<password>" | \
tare install /path/to/data-plane-credentials.json \
  --serve-url https://<your-data-plane-hostname> \
  --image-sync <your-registry-host>/<repo> \
  --image-pull-secret-stdin \
  --image-pull-secret-name tars-image-pull-secret \
  --wait

Appendix B: forward observability data to an OpenTelemetry Collector

The data plane can stream envoy HTTP access logs and router_* application metrics to a customer-managed OpenTelemetry Collector, which can then forward to any backend (Google Cloud Monitoring via the googlecloud exporter, Datadog, Grafana Cloud, SigNoz, and others).

The two streams are configured in separate fields on the EnvoyProxy resource:

Stream	EnvoyProxy field	Contents
Access logs (per-request HTTP metadata)	`accessLog.sinks[]`	Method, status, path, latency, MCP headers, downstream and upstream addresses
Metrics (`router_*` and envoy native stats)	`metrics.sinks[]`	`router_requests_total`, `router_model_requests_total`, plus envoy cluster and listener counters

Either stream can be configured independently.

Inline at install time (recommended)

tare install accepts --otel-collector-endpoint and --otel-exporter-auth-headers so the data plane ships telemetry from the first request:

tare install /path/to/data-plane-credentials.json \
  --image-sync "${AR_REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}" \
  --serve-url https://<your-data-plane-hostname> \
  --otel-collector-endpoint 'https://<your.otel.grpc.endpoint>' \
  --otel-exporter-auth-headers 'Bearer <token>'

Alternative: deploy a collector in-cluster

The collector ConfigMap and EnvoyProxy patch shapes are documented in Appendix C of the Azure installation guide. The data plane resources are Kubernetes-native, so the manifests apply identically on GCP.

For GCP-specific backends:

Backend	Exporter	Reference
Google Cloud Monitoring (Stackdriver)	`googlecloud`	GoogleCloudPlatform/opentelemetry-operations-collector
Google Cloud Trace / Logging	`googlecloud`	Same as above
Datadog	`datadog`	docs.datadoghq.com/opentelemetry/otel_collector_datadog_exporter
Grafana Cloud	`otlphttp` to a grafana.net endpoint	grafana.com/docs/grafana-cloud/send-data/otlp

note

The EnvoyProxy patches reset on every tare install re-run. Re-apply after each reinstall, or use the inline --otel-collector-endpoint flags so the install owns the configuration.

Where to go next

Gateway installation

Install the data plane gateway components that manage inbound access and configure request routing.

Console quickstart

Issue an API key and make a first routed AI request once the gateway is running.
