Version: Next

Usage

The usage page shows how your organization consumes Privatemode over time.

What you can see

The portal provides usage summaries and charts for:

  • prompt tokens
  • cached prompt tokens
  • completion tokens
  • total tokens

It also breaks usage down by task, for capabilities such as:

  • chat
  • embeddings
  • transcription

Filter and group usage

Use the portal filters to inspect usage by:

  • month
  • token type
  • grouping, such as model
  • selected subsets of results

This helps answer questions like:

  • which model is driving usage
  • whether usage spikes came from prompt tokens or completion tokens
  • how much usage came from embeddings or transcription
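
The same kind of filtering and grouping can be reproduced offline if you keep your own usage records. The following is a minimal sketch, not the portal's implementation: the record layout, the field names, and the `usage_by` helper are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical usage records. The field names are assumptions for
# illustration and do not reflect an actual portal export format.
records = [
    {"month": "2025-01", "model": "model-a", "task": "chat",
     "prompt_tokens": 1200, "completion_tokens": 300},
    {"month": "2025-01", "model": "model-b", "task": "embeddings",
     "prompt_tokens": 5000, "completion_tokens": 0},
    {"month": "2025-01", "model": "model-a", "task": "chat",
     "prompt_tokens": 800, "completion_tokens": 700},
    {"month": "2025-02", "model": "model-a", "task": "chat",
     "prompt_tokens": 400, "completion_tokens": 100},
]


def usage_by(records, group_key, month=None):
    """Sum prompt and completion tokens per group, optionally for one month."""
    totals = defaultdict(lambda: {"prompt_tokens": 0, "completion_tokens": 0})
    for record in records:
        # Skip records outside the requested month, if a month was given.
        if month is not None and record["month"] != month:
            continue
        bucket = totals[record[group_key]]
        bucket["prompt_tokens"] += record["prompt_tokens"]
        bucket["completion_tokens"] += record["completion_tokens"]
    return dict(totals)


january_by_model = usage_by(records, "model", month="2025-01")
january_by_task = usage_by(records, "task", month="2025-01")
```

Grouping by `"model"` shows which model is driving usage, and comparing the two token sums within a group shows whether a spike came from prompts or completions.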

Cached tokens

Cached prompt tokens are shown separately from other prompt tokens because they reduce repeated compute and latency. Cached tokens don't count toward rate limits.
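
As a rough illustration of what excluding cached tokens from rate limits means, the sketch below estimates how many tokens of a request would count toward a limit. It assumes that cached prompt tokens are reported as a subset of prompt tokens, which this page does not state, and the function name is hypothetical.

```python
def rate_limited_tokens(prompt_tokens, cached_prompt_tokens, completion_tokens):
    """Estimate the tokens that count toward rate limits.

    Assumption (not confirmed by this page): cached prompt tokens are
    included in the prompt token count, so they are subtracted once.
    """
    if cached_prompt_tokens > prompt_tokens:
        raise ValueError("cached prompt tokens cannot exceed prompt tokens")
    uncached_prompt_tokens = prompt_tokens - cached_prompt_tokens
    return uncached_prompt_tokens + completion_tokens


# A request with 1,000 prompt tokens, 600 of them served from cache,
# and 200 completion tokens.
counted = rate_limited_tokens(1000, 600, 200)
```

Under that assumption, only the 400 uncached prompt tokens and the 200 completion tokens count, so `counted` is 600.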

Model multipliers

Some models apply a multiplier when their tokens are counted toward effective usage. Review the Rate limits page to understand how multipliers affect your organization's effective usage.
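
A minimal sketch of multiplier-based accounting follows, assuming effective usage is raw tokens times a per-model multiplier. The model names and multiplier values are made up for illustration; the real values are documented on the Rate limits page.

```python
# Hypothetical multipliers for illustration only. The actual values are
# listed on the Rate limits page.
MODEL_MULTIPLIERS = {
    "model-a": 1.0,
    "model-b": 2.0,
    "model-c": 0.5,
}


def effective_usage(raw_tokens_by_model, multipliers, default=1.0):
    """Scale each model's raw token count by its multiplier and sum the result.

    Models without an entry fall back to `default` (an assumption).
    """
    return sum(
        tokens * multipliers.get(model, default)
        for model, tokens in raw_tokens_by_model.items()
    )


raw = {"model-a": 1000, "model-b": 1000, "model-c": 1000}
total_effective = effective_usage(raw, MODEL_MULTIPLIERS)
```

With these made-up values, 3,000 raw tokens become 3,500 effective tokens, which shows how a model with a higher multiplier can dominate effective usage even when raw token counts are equal.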

Best practices

  • Review usage after deploying a new feature
  • Check monthly trends before changing plans
  • Use filtered views to investigate spikes by model or task
  • Pair usage review with access key hygiene and billing review