Version: Next

Usage

The usage page shows how your organization consumes Privatemode over time.

What you can see

The portal provides usage summaries and charts for:

prompt tokens
cached prompt tokens
completion tokens
total tokens

It also breaks usage down by task, for capabilities such as:

chat
embeddings
transcription

Filter and group usage

Use the portal filters to inspect usage by:

month
token type
grouping, such as model
selected subsets of results

This helps answer questions like:

which model is driving usage
whether a usage spike came from prompt tokens or completion tokens
how much usage came from embeddings or transcription
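If you also keep your own usage records (for example, token counts logged by your application per request), the same grouping can be reproduced offline. The following is a minimal sketch, not a Privatemode export format: the record fields (`model`, `task`, `prompt_tokens`, `completion_tokens`), the model names, and the `group_usage` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical usage records logged by your own application.
# Field names and model names are assumptions for illustration only.
records = [
    {"model": "model-a", "task": "chat", "prompt_tokens": 1200, "completion_tokens": 300},
    {"model": "model-a", "task": "chat", "prompt_tokens": 800, "completion_tokens": 500},
    {"model": "model-b", "task": "embeddings", "prompt_tokens": 4000, "completion_tokens": 0},
]


def group_usage(records, key):
    """Sum prompt and completion tokens per distinct value of `key`."""
    totals = defaultdict(lambda: {"prompt_tokens": 0, "completion_tokens": 0})
    for record in records:
        bucket = totals[record[key]]
        bucket["prompt_tokens"] += record["prompt_tokens"]
        bucket["completion_tokens"] += record["completion_tokens"]
    return dict(totals)


by_model = group_usage(records, "model")  # which model is driving usage
by_task = group_usage(records, "task")    # how much came from each task
```

Comparing `prompt_tokens` against `completion_tokens` within one group is a quick way to tell whether a spike came from longer inputs or longer outputs.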

Cached tokens

Cached tokens are shown separately because reusing them avoids repeated compute and reduces latency. Cached tokens don't count toward rate limits.
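As a rough illustration of what that means for rate limits, the sketch below subtracts cached prompt tokens before counting. It assumes that cached prompt tokens are reported as a subset of prompt tokens; this page does not confirm that, so treat the function (a hypothetical helper named `rate_limited_tokens`) as an approximation rather than the portal's exact accounting.

```python
def rate_limited_tokens(prompt_tokens, cached_prompt_tokens, completion_tokens):
    """Estimate the tokens that count toward rate limits.

    Assumption (not confirmed by the documentation): cached prompt tokens
    are a subset of prompt tokens, so they are subtracted from the prompt count.
    """
    if not 0 <= cached_prompt_tokens <= prompt_tokens:
        raise ValueError("cached_prompt_tokens must be between 0 and prompt_tokens")
    return (prompt_tokens - cached_prompt_tokens) + completion_tokens


# A request with 1,000 prompt tokens, 600 of them cached, and 200 completion tokens.
counted = rate_limited_tokens(1000, 600, 200)
```

In this example only the 400 uncached prompt tokens and the 200 completion tokens are counted.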

Model multipliers

Some models use multipliers for effective usage accounting. Review the Rate limits page to understand how this affects your organization's effective usage.
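The arithmetic can be sketched as raw tokens times a per-model multiplier. The model names and multiplier values below are made up for illustration, and the `effective_usage` helper is hypothetical; the actual multipliers are the ones listed on the Rate limits page.

```python
# Hypothetical multipliers; the real values are documented on the Rate limits page.
MULTIPLIERS = {"small-model": 1.0, "large-model": 2.0}


def effective_usage(tokens_by_model, multipliers, default=1.0):
    """Scale each model's raw token count by its multiplier.

    Models without an entry fall back to `default` (an assumption of this sketch).
    """
    return {
        model: tokens * multipliers.get(model, default)
        for model, tokens in tokens_by_model.items()
    }


raw = {"small-model": 1000, "large-model": 1000}
effective = effective_usage(raw, MULTIPLIERS)
```

With these made-up values, the same 1,000 raw tokens count as 1,000 effective tokens on `small-model` and 2,000 on `large-model`.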

Best practices

Review usage after deploying a new feature
Check monthly trends before changing plans
Use filtered views to investigate spikes by model or task
Pair usage review with access key hygiene and billing review
