Control what request data leaves the cluster
Prompt and response bodies routed through the gateway routinely contain the most sensitive data an organisation handles: customer records, source code, internal documents, regulated personal data. For many operators, the question is not whether that content is useful in the dashboard (it plainly is) but whether it is permitted to leave the data-plane cluster at all. Data-residency rules, privacy commitments, and compliance regimes frequently require prompt and completion content to stay inside the customer's own infrastructure, while usage and cost still need to be visible centrally for billing and capacity planning.
The Request logs setting governs exactly that boundary. It controls what reaches the Tetrate management plane in request_log records, so the amount of request detail that leaves the cluster becomes a deliberate operator decision rather than a fixed default. The richest setting forwards full prompt and response content for inspection in the dashboard; the most restrictive setting keeps that content out of the management plane entirely. The setting accepts three values:
| Mode | What reaches the management plane | What's dropped |
|---|---|---|
| Full (default) | Envelope, counters, costs, headers, llm_parameters, request body, response body. | Nothing. |
| Metadata only | Envelope, counters, costs, headers, llm_parameters. | request_body, response_body. |
| Off | Nothing. | The entire request_log record. |
Selecting Request logs mode:
Each mode is a strict superset of the one below it. Full forwards the complete record, including request and response bodies, which gives the dashboard's per-request views their full detail. Metadata only forwards the envelope, counters, costs, headers, and llm_parameters but omits the two body fields, so usage and cost stay visible while prompt and completion content remains inside the cluster. Off forwards nothing, suppressing the request_log record at the management plane altogether.
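The effect of each mode on a single record can be sketched as a small filter. This is an illustration only, not the gateway's implementation: the field names come from the mode table above, while the flat dictionary shape, the `Mode` enum, and the `forwarded_record` helper are assumptions made for the example.

```python
from copy import deepcopy
from enum import Enum
from typing import Optional


class Mode(Enum):
    FULL = "full"
    METADATA_ONLY = "metadata_only"
    OFF = "off"


# The two fields that Metadata only withholds, per the mode table.
BODY_FIELDS = ("request_body", "response_body")


def forwarded_record(record: dict, mode: Mode) -> Optional[dict]:
    """Return what would reach the management plane, or None if nothing is sent."""
    if mode is Mode.OFF:
        # Off suppresses the whole request_log record.
        return None
    if mode is Mode.METADATA_ONLY:
        # Keep envelope, counters, costs, headers, llm_parameters; drop the bodies.
        return {k: deepcopy(v) for k, v in record.items() if k not in BODY_FIELDS}
    # Full forwards the complete record.
    return deepcopy(record)
```

Applying the three modes to the same record makes the superset relationship visible: the Metadata only result is the Full result minus the two body fields, and Off yields nothing.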
The setting is system-wide for the deployment. Switching takes effect across the data plane within a few seconds of saving: no restart, no reinstall.
Persona: Platform operator working in the Admin Dashboard, often in coordination with security or compliance stakeholders.
Estimated time: 5 to 10 minutes to choose, set, and verify a mode; longer if an external collector is also being wired up.
When this guide applies
The setting is relevant whenever a decision has to be made about how much request content may cross the cluster boundary toward Tetrate. Typical cases:
| Situation | Suggested mode |
|---|---|
| Default evaluation or development, where full per-request detail in the dashboard is wanted | Full |
| Prompt and completion content must stay inside the cluster, but usage and cost visibility is still required centrally | Metadata only |
| No request_log record may leave the cluster at all, for the strictest data-residency posture | Off |
| Full content is required for observability, but in a self-managed stack rather than the dashboard | Metadata only or Off, paired with an external OTEL destination |
Outcomes
By the end of this guide:
- The three modes are understood, including exactly what each one forwards and what it drops.
- A mode has been set from the dashboard and confirmed active with a smoke test.
- The audit trail for the change has been located.
- Where required, the full record has been confirmed arriving at a self-managed OTEL collector while content is withheld from the management plane.
What each mode affects
The mode affects only the records stored at the management plane. Its effect on each dashboard surface and downstream path is shown below:
| Surface | Full | Metadata only | Off |
|---|---|---|---|
| Dashboard Request logs view (per-request rows) | rows with bodies | rows, body panels empty | no rows |
| Dashboard Usage view (tokens, costs over time) | populated | populated | populated |
| Dashboard Audit logs view | populated | populated | populated |
| Billing / transactions table | populated | populated | populated |
| Customer-attached OTEL destination on the data plane | full record | full record (mode applies only to MP) | full record |
Two things are never affected by this setting:
- Billing. Transactions are written through a separate path. Every request still produces a transaction regardless of mode, so monthly billing remains accurate.
- Customer-attached destinations. A self-managed OTEL receiver (see the External destination section below) receives the full record under every mode; the toggle scopes only what crosses the cluster boundary toward Tetrate.
How to change the mode
The mode is changed entirely from the dashboard. No command-line access, redeployment, or pod restart is required.
- Sign in to the dashboard with an admin account.
- Go to Settings → Request logs.
- Pick a mode from the dropdown. The card shows a short description of what each mode stores.
- Click Save.
The save action:
- Persists the new mode to system settings.
- Pushes the change to the data plane via the existing self-heal channel.
- Writes an audit log entry recording who changed the mode and to what value.
Propagation takes a few seconds. Confirm it by sending a prompt through the gateway and checking the Request logs view (see the next section).
How to verify the mode is active
Because the change takes effect within seconds, the active mode is confirmed by sending a known prompt through the gateway and observing how it appears in the dashboard. This smoke test needs only the dashboard and an HTTP client such as curl (or the dashboard playground); no kubectl is required.
Setup once
In the dashboard, go to API Keys → Create key. Copy the sk-... value. Note the data plane's gateway URL (under Settings → Workspace).
Smoke test per mode
- Set the mode in Settings → Request logs.
- Send a prompt with a distinctive word:

      curl -X POST https://<your-gateway>/v1/chat/completions \
        -H "Authorization: Bearer sk-<your-key>" \
        -H "Content-Type: application/json" \
        -d '{"model":"claude-haiku-4-5","messages":[{"role":"user","content":"Say MANGO once and stop."}],"max_tokens":50}'

  Alternatively, send a prompt from the playground.
- Wait ~10 seconds, then check the dashboard.
Expected result by mode:
| Mode | Request logs view | Usage view |
|---|---|---|
| Full | New row for MANGO. Detail panel shows prompt + response bodies. | MANGO's tokens + cost reflected. |
| Metadata only | New row for MANGO. Detail panel shows tokens + headers; body panels say "No request body available" / "No response body available". | MANGO's tokens + cost reflected. |
| Off | No row for MANGO. | MANGO's tokens + cost still reflected (comes from the transaction path). |
Repeat for each mode under test, picking a different keyword each time (MANGO / PAPAYA / DURIAN / KIWI, etc.) so the rows are easy to spot.
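When the test is repeated across modes, it can help to script the request so that only the keyword changes. The following is a minimal sketch using the Python standard library; it mirrors the curl command above, and the helper names `build_smoke_request` and `send` are hypothetical. The gateway URL and key are placeholders to be replaced with real values.

```python
import json
import urllib.request


def build_smoke_request(gateway_url: str, api_key: str, keyword: str,
                        model: str = "claude-haiku-4-5") -> urllib.request.Request:
    """Build the same chat-completions request as the curl example, with a chosen keyword."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": f"Say {keyword} once and stop."}],
        "max_tokens": 50,
    }
    return urllib.request.Request(
        url=gateway_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A typical run sets the mode in the dashboard, calls `send(build_smoke_request(gateway, key, "PAPAYA"))`, waits about 10 seconds, and then looks for the keyword in the Request logs view.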
Audit log check
After any change, Audit logs in the dashboard shows a new row:
- Resource type: system_settings
- Resource ID: request_logs.mp_mode
- Action: UPDATE
- Body: {"mode":"<the saved value>"}
- Actor: the acting user account
- Source IP, User agent, Correlation ID populated
External destination: sending the full record to a custom OTEL collector
To keep prompt and completion content inside a self-managed observability stack while still letting Tetrate see usage and costs, configure Metadata only or Off mode on the management-plane path and wire an OTEL HTTP collector to the data-plane egress.
The data plane always emits the full record to any local destination, regardless of mode. The typical operator pattern is:
- Mode = metadata_only or off (depending on what reaches Tetrate)
- Local OTEL endpoint = an in-cluster collector
What to set
Two environment variables on the egress proxy container:
| Variable | Value |
|---|---|
| OTEL_LOGS_EXPORTER | otlp |
| OTEL_EXPORTER_OTLP_LOGS_ENDPOINT | the full URL of the collector's OTLP/HTTP logs endpoint, including the /v1/logs path |
For example, a collector listening at https://collector.observability.svc.cluster.local:4318 requires the endpoint https://collector.observability.svc.cluster.local:4318/v1/logs.
The /v1/logs suffix is mandatory. The OTEL exporter posts to the URL verbatim and most collectors only respond on that path. Omitting it is the single most common misconfiguration.
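Because the suffix is easy to forget, the endpoint value can be derived from the collector's base URL rather than typed by hand. A minimal sketch, assuming the collector serves logs at the standard /v1/logs path directly under the URL given; `logs_endpoint` is a hypothetical helper, not part of the product.

```python
from urllib.parse import urlsplit, urlunsplit

LOGS_PATH = "/v1/logs"


def logs_endpoint(collector_url: str) -> str:
    """Return collector_url with the /v1/logs path appended if it is missing."""
    parts = urlsplit(collector_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {collector_url!r}")
    # Drop any trailing slash so the result ends exactly in /v1/logs.
    path = parts.path.rstrip("/")
    if not path.endswith(LOGS_PATH):
        path += LOGS_PATH
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
```

The returned string is the value to place in OTEL_EXPORTER_OTLP_LOGS_ENDPOINT.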
Where to set it
In the Helm values for the egress chart, add the env block:
envoyProxy:
envoyGateway:
egress:
egressDataplane:
container:
env:
OTEL_LOGS_EXPORTER: otlp
OTEL_EXPORTER_OTLP_LOGS_ENDPOINT: https://<your-collector>/v1/logs
Apply with helm upgrade (or the normal chart-management flow). The egress pod restarts and the new endpoint becomes the local destination.
Authenticated collectors
If the collector requires an auth header, add it via OTEL_EXPORTER_OTLP_HEADERS:
env:
OTEL_LOGS_EXPORTER: otlp
OTEL_EXPORTER_OTLP_LOGS_ENDPOINT: https://<your-collector>/v1/logs
OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Bearer <token>"
For sensitive headers, bind from a Kubernetes Secret with valueFrom.secretKeyRef instead of inlining.
Verifying the external destination
The external destination is verified by setting the management-plane path to drop everything, then confirming the full record still arrives at the local collector.
- Stand up (or reuse) an OTEL collector reachable from the data-plane cluster on the URL above.
- Apply the Helm values change. Wait for the egress pod to roll.
- Set the dashboard mode to Off (the strongest case: Tetrate gets nothing).
- Send a test prompt via curl as above.
- Within ~10 seconds, the collector logs a record with:
  - Body: Request log
  - Attribute event.type: request_log
  - Attribute payload: gzipped JSON of the full record (decode with gunzip + a JSON viewer)
- The dashboard's Request logs view shows no new row for this prompt (off mode at MP). The dashboard's Usage view does show the tokens + cost (transaction path).
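The gunzip + JSON step can also be scripted when inspecting records exported from the collector. A minimal sketch, assuming the payload attribute reaches the script either as raw gzip bytes or as a base64 string (the form OTLP/JSON uses for byte values); how the attribute is actually serialised depends on the collector's exporter, and `decode_payload` is a hypothetical helper name.

```python
import base64
import gzip
import json


def decode_payload(payload) -> dict:
    """Decode a gzipped-JSON payload attribute into a Python dict.

    Accepts raw gzip bytes, or a str holding the base64 encoding of those bytes.
    """
    if isinstance(payload, str):
        raw = base64.b64decode(payload)
    else:
        raw = bytes(payload)
    return json.loads(gzip.decompress(raw).decode("utf-8"))
```

With the mode set to Off, finding the test keyword in the decoded record confirms that the full record, bodies included, still reaches the local collector.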
If the collector receives nothing:
- Confirm OTEL_EXPORTER_OTLP_LOGS_ENDPOINT ends in /v1/logs.
- Confirm OTEL_LOGS_EXPORTER=otlp is set; without it the local destination is disabled.
- Check the egress pod's container logs for OTEL SDK error lines; they include the failing URL and error code.
- Confirm the collector is reachable from the data-plane cluster's pod network. If the collector is outside the cluster (for example, behind a tunnel or external load balancer), test reachability with curl from a debug pod first.
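The first two checks are mechanical and can be run against the environment block before deploying. A minimal sketch, assuming the two variables are available as a plain dictionary (for example, loaded from the Helm values); `check_otel_env` is a hypothetical helper and covers only the configuration checks, not log inspection or network reachability.

```python
def check_otel_env(env: dict) -> list:
    """Return a list of configuration problems; an empty list means both checks pass."""
    problems = []
    if env.get("OTEL_LOGS_EXPORTER") != "otlp":
        problems.append(
            "OTEL_LOGS_EXPORTER must be 'otlp'; without it the local destination is disabled"
        )
    endpoint = env.get("OTEL_EXPORTER_OTLP_LOGS_ENDPOINT", "")
    if not endpoint:
        problems.append("OTEL_EXPORTER_OTLP_LOGS_ENDPOINT is not set")
    elif not endpoint.endswith("/v1/logs"):
        # The exporter posts to the URL verbatim, so the path must be present.
        problems.append("OTEL_EXPORTER_OTLP_LOGS_ENDPOINT must end in /v1/logs")
    return problems
```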
Frequently asked questions
Does switching to off break monthly billing?
No. Transactions are written through a separate code path and arrive regardless of mode. The Usage view and billing reports keep working.
Does switching to off lose audit information about who ran which prompt?
Yes, at the management plane. The request_log row is the per-request record there, and under off it is not stored. If that detail is required, run metadata_only (envelope + tokens reach MP, bodies do not) or wire an external destination as above.
Does switching to metadata_only retroactively strip bodies from rows already in the database?
No. The setting affects only records produced from the moment it takes effect onward. Existing rows are unchanged.
Can different workspaces have different modes?
Not in this version. The setting is system-wide for the deployment. A per-workspace control is on the roadmap if customers need it.
What happens when an unknown mode value is saved via a direct database write?
The data plane treats anything it does not recognize as full, so a misconfigured value can never silently drop data. The dashboard only ever writes one of the three known values.
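The fallback rule can be illustrated in a few lines. This is a sketch of the behaviour described above, not the data plane's code: the three value strings are the ones used in this guide, the exact-match comparison is an assumption, and `effective_mode` is a hypothetical name.

```python
KNOWN_MODES = ("full", "metadata_only", "off")


def effective_mode(stored_value) -> str:
    """Return the mode to apply, treating any unrecognised value as 'full'."""
    if isinstance(stored_value, str) and stored_value in KNOWN_MODES:
        return stored_value
    # Unknown or malformed values fall back to full, so data is never dropped silently.
    return "full"
```

The design choice is to fail open toward the management plane: a typo in a hand-written value results in more data being forwarded than intended, so direct database writes are best avoided in restrictive deployments.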
Why is the body panel empty in the dashboard?
Either the request genuinely had no body, the response was empty (for example, an error before a generation finished), or the mode when the request was processed was metadata_only. The dashboard does not currently distinguish these cases in the UI.
Where to go next