Usage
The usage page shows how your organization consumes Privatemode over time.
What you can see
The portal provides usage summaries and charts for:
- prompt tokens
- cached prompt tokens
- completion tokens
- total tokens
It also provides usage by task for capabilities such as:
- chat
- embeddings
- transcription
Filter and group usage
Use the portal filters to inspect usage by:
- month
- token type
- grouping, such as model
- selected subsets of results
This helps answer questions like:
- which model is driving usage
- whether a usage spike came from prompt tokens or completion tokens
- how much usage came from embeddings or transcription
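If you keep your own copy of usage numbers, the same grouping can be reproduced offline. The following is a minimal sketch only: the record layout, the field names (`model`, `task`, `prompt_tokens`, `completion_tokens`), and the helper functions are assumptions made for illustration, not a documented Privatemode export format or API.

```python
from collections import defaultdict

# Hypothetical usage records. The field names are assumptions for
# illustration, not a documented Privatemode export format.
RECORDS = [
    {"model": "model-a", "task": "chat", "prompt_tokens": 1200, "completion_tokens": 300},
    {"model": "model-a", "task": "chat", "prompt_tokens": 800, "completion_tokens": 500},
    {"model": "model-b", "task": "embeddings", "prompt_tokens": 4000, "completion_tokens": 0},
]

TOKEN_TYPES = ("prompt_tokens", "completion_tokens")


def group_usage(records, key):
    """Sum each token type per distinct value of `key` (for example "model" or "task")."""
    totals = defaultdict(lambda: dict.fromkeys(TOKEN_TYPES, 0))
    for record in records:
        for token_type in TOKEN_TYPES:
            totals[record[key]][token_type] += record[token_type]
    return dict(totals)


def dominant_token_type(totals):
    """Return the token type with the largest total within one group."""
    return max(totals, key=totals.get)


by_model = group_usage(RECORDS, "model")
by_task = group_usage(RECORDS, "task")

for model, totals in by_model.items():
    print(model, totals, "driven by", dominant_token_type(totals))
```

Grouping by `model` answers which model is driving usage, grouping by `task` shows how much came from embeddings or transcription, and comparing token types within a group shows whether prompts or completions dominate.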
Cached tokens
Cached tokens are shown separately because they reduce repeated compute and latency. Cached tokens don't count toward rate limits.
Model multipliers
Some models use multipliers for effective usage accounting. See the Rate limits page to understand how they affect your organization's effective usage.
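As a rough illustration of what a multiplier does, the sketch below assumes that effective usage is the raw token count multiplied by a per-model factor, and that models without a listed multiplier count at 1.0. Both the formula and the multiplier values are assumptions for illustration; the Rate limits page is the authoritative source.

```python
# Hypothetical multipliers, chosen only for illustration.
# Look up the real values on the Rate limits page.
MULTIPLIERS = {"model-a": 1.0, "model-b": 2.0}


def effective_usage(tokens_by_model, multipliers, default=1.0):
    """Scale raw token counts by a per-model multiplier.

    Assumes effective usage = raw tokens * multiplier, and that models
    without an entry use `default`.
    """
    return {
        model: tokens * multipliers.get(model, default)
        for model, tokens in tokens_by_model.items()
    }


raw = {"model-a": 10_000, "model-b": 3_000, "model-c": 500}
effective = effective_usage(raw, MULTIPLIERS)
total_effective = sum(effective.values())

print(effective)
print(total_effective)
```

Under these assumptions, a model with a multiplier above 1 can account for a larger share of effective usage than its raw token count suggests.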
Best practices
- Review usage after deploying a new feature
- Check monthly trends before changing plans
- Use filtered views to investigate spikes by model or task
- Pair usage review with access key hygiene and billing review