Dev Console quickstart

This 10-minute walkthrough takes you from sign-in to your first routed AI request through the gateway. By the end you will have an API key, a proxy endpoint URL, a working request in at least one language, and a verified entry in Request Logs. The Console is the developer-facing application. It gives you everything you need to call AI models, configure routing policies, and monitor usage, without managing provider credentials directly. All AI requests pass through the gateway, which handles authentication, routing, fallback, guardrail enforcement, and observability transparently.

info

The screenshots in this guide show a redacted URL. Your URL in a production environment will be something like router.<custom-name>.tetrate.ai. Your console URL will differ based on deployments in production and non-production.

Navigate to your Console URL (e.g., https://router.poc.tetrate.ai)

Click Sign In with Corporate SSO to authenticate through your organization's identity provider. If your deployment uses email/password authentication, enter your credentials directly.

Click Sign In with Corporate SSO

You land on the Dashboard, which displays your proxy endpoint URL and a summary of recent usage activity. Note the proxy endpoint URL; you will need it in Step 3.

Step 2: create an API key

API keys authenticate your application's requests to the gateway. Each key can be assigned its own routing policy (fallback chain, traffic splitting), rate limits, and model scope. For this quickstart, you will create a basic key with default settings.

In the sidebar, click API Keys

Click Add API Key

Click Add API Key

Enter a descriptive name for your key (e.g., my-agent-router-key). Choose a name that identifies the application or workload this key will serve; the key value is shown only once, but you can always identify keys by name in logs and usage reports.

Enter key name

Click Create key

Click Create key

Copy the generated key immediately and store it in a secure location (a secrets manager or environment variable). This is the only time the key value is displayed.

Copy the API key

API key created

caution

API keys grant the bearer the ability to route requests through the gateway and consume AI quota. Store your key securely; do not commit it to source control or include it in logs or error messages.

Step 3: find your proxy endpoint

Your proxy endpoint URL is the address your application calls instead of calling AI providers directly. When a request arrives at the proxy, the gateway:

Authenticates the request using your API key
Applies the routing policy configured for that key (fallback chain, traffic splitting rules)
Translates the request format if routing to a different provider (e.g., OpenAI format to Anthropic backend)
Returns the normalized response, including error normalization if the upstream provider returns an error

The proxy endpoint URL is displayed on your Dashboard and typically looks like:

https://proxy.poc.tetrate.ai/v1

Append the appropriate path for the API format you intend to use:

Format	Path	Description
OpenAI Chat Completions	`/v1/chat/completions`	Most widely supported format; compatible with all OpenAI-compatible SDKs and agent frameworks
OpenAI Responses	`/v1/responses`	Newer OpenAI Responses API with a simplified interface
Anthropic Messages	`/v1/messages`	Anthropic native format for Claude models

For detailed examples of each format including streaming, see Supported APIs.

Step 4: make your first request

Replace YOUR_API_KEY with the key you created in Step 2. All examples below call gpt-4o, but you can substitute any model name from your Model Catalog.

The only difference between calling the gateway and calling a provider directly is the base_url (or the curl URL). All other parameters, response formats, and SDK behaviours are identical.

Using curl

curl https://proxy.poc.tetrate.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello, world!"}]
  }'

Using Python

from openai import OpenAI

client = OpenAI(
    base_url="https://proxy.poc.tetrate.ai/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}]
)

print(response.choices[0].message.content)

Streaming with curl

curl https://proxy.poc.tetrate.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello, world!"}],
    "stream": true
  }'

Streaming with Python

from openai import OpenAI

client = OpenAI(
    base_url="https://proxy.poc.tetrate.ai/v1",
    api_key="YOUR_API_KEY"
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)

Using the playground

The Playground lets you test model routing interactively without writing any code, useful for evaluating models or verifying routing behaviour before integrating into an application.

In the sidebar, go to Build > Playground
Select a model from the dropdown. The list shows all models currently enabled by your administrator.
Type a message and press Send
View the response alongside token usage, latency, and the upstream provider that served the request

Step 5: check your request logs

Every request the gateway processes is recorded in Request Logs. This is your primary tool for debugging, auditing, and analysing cost.

Go to Monitoring > Request Logs
You should see your request with the model name, upstream provider, token counts, estimated cost, and total latency
Click any row to view the full request and response payloads, including which provider was selected and whether any fallback attempts were made

Request Logs also display the x-request-id correlation header that the gateway attaches to every response. Use this ID to locate the corresponding span in your OpenTelemetry tracing backend. See Gateway Behavior for details on correlation IDs and the full debugging workflow.

Step 6: view usage analytics

Usage Analytics provides aggregated metrics across all your requests. Use it to track consumption trends, compare model costs over time, and understand traffic distribution across API keys.

Go to Monitoring > Usage
Select a time range (e.g., Last 24 hours)
View breakdowns by model and API key, including total tokens consumed, estimated cost, and request volume

Evaluation checkpoint

Successfully signed in to the Console
Created an API key and stored it securely
Located the proxy endpoint URL on the Dashboard
Made a successful request via curl, Python, or the Playground
Confirmed the request appears in Request Logs with provider, latency, and token details
Reviewed usage analytics for the time period

Where to go next

Route requests across providers

Browse the model catalog and configure routing policies.

Admin Dashboard quickstart

Learn the operator experience.

Step 1: sign in​

Step 2: create an API key​

Step 3: find your proxy endpoint​

Step 4: make your first request​

Using curl​

Using Python​

Streaming with curl​

Streaming with Python​

Using the playground​

Step 5: check your request logs​

Step 6: view usage analytics​

Evaluation checkpoint​

Step 1: sign in