Working with budgets
AI traffic has a habit of growing without anybody noticing until the invoice arrives. A research team starts a benchmark, leaves it running over a weekend, and burns through three weeks of normal spend in two days. A new agent framework gets wired into a production path and starts making ten calls where a human would have made one. A leaked credential lands in a public repository and an attacker spends a few hours exploiting it before the security team notices. Each of these scenarios is preventable, but only if there is a clear answer to two questions: how much should this thing be allowed to spend, and who is responsible for noticing when it spends more?
The platform's answer to those questions is layered. There is no single "budgets" screen that owns the entire mechanism end-to-end. Instead, budgeting is the combination of three things that already exist: per-key rate limits, which enforce hard ceilings on token consumption inline at the gateway; Usage Analytics, where spending against those ceilings becomes visible after the fact; and the API-key-per-purpose convention, which makes the first two tools precise rather than blurry. This guide covers how to combine those mechanisms into an actual budgeting discipline, when each piece is the right tool, and the operational patterns that work across the most common spending scenarios.
Persona: Platform operator working in the Admin Dashboard, often in partnership with the developer teams that own specific API keys.
Estimated time: 15 to 25 minutes for an initial setup; ongoing as workloads evolve.
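
To make the layering concrete, here is a minimal Python sketch of how the three pieces fit together. It is an illustration only and not the platform's implementation: the KeyBudget class, the admit function, the key names, the fixed 60-second window, and the token figures are all assumptions chosen for the example. The per-key ceiling is checked inline before a request is admitted, the running total stands in for what Usage Analytics would report afterwards, and each key is tied to a single purpose.

```python
from dataclasses import dataclass


@dataclass
class KeyBudget:
    """Hypothetical per-key ceiling: maximum tokens per fixed time window."""
    purpose: str
    tokens_per_window: int
    window_seconds: int = 60
    window_start: float = 0.0
    used_in_window: int = 0
    total_used: int = 0  # stand-in for what an analytics view reports later

    def try_consume(self, tokens: int, now: float) -> bool:
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used_in_window = 0
        # Hard ceiling: reject inline instead of billing and alerting later.
        if self.used_in_window + tokens > self.tokens_per_window:
            return False
        self.used_in_window += tokens
        self.total_used += tokens
        return True


# One key per purpose, so a ceiling (and its usage total) maps to one workload.
# Key names and limits are made up for this sketch.
budgets = {
    "research-benchmark": KeyBudget("benchmarking", tokens_per_window=10_000),
    "prod-agent": KeyBudget("production agent", tokens_per_window=50_000),
}


def admit(key_id: str, tokens: int, now: float) -> bool:
    """Gateway-style check: unknown keys are refused, known keys are metered."""
    budget = budgets.get(key_id)
    if budget is None:
        return False
    return budget.try_consume(tokens, now)


first = admit("research-benchmark", 8_000, now=0.0)    # admitted
second = admit("research-benchmark", 3_000, now=10.0)  # rejected: over 10,000 in this window
third = admit("research-benchmark", 3_000, now=61.0)   # admitted: a new window has started
```

The point of the sketch is the division of labor: the ceiling stops the runaway request at the moment it happens, while the accumulated total is what someone has to look at to notice a slow drift that never trips the limit.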