Skip to main content
Version: 1.38

Models

Privatemode gives you access to the following models. For pricing and rate limits, see Pricing and Rate limits.

ModelModel IDTypeInputContext / limitEndpoints
Gemma 3 27Bgemma-3-27bChatText, image128k tokens/v1/chat/completions
gpt-oss-120bgpt-oss-120bChatText128k tokens/v1/chat/completions, /v1/completions, /v1/messages
Kimi K2.5kimi-k2.5ChatText, image262k tokens/v1/chat/completions, /v1/completions, /v1/messages
Qwen3-Coder 30B-A3B (deprecated)qwen3-coder-30b-a3bChatText128k tokens/v1/chat/completions, /v1/completions
Qwen3-Embedding 4Bqwen3-embedding-4bEmbeddingText32k tokens/v1/embeddings
Voxtral Mini 3B (preview)voxtral-mini-3bSpeech-to-textAudio50 MB/v1/audio/transcriptions
Whisper large-v3whisper-large-v3Speech-to-textAudio50 MB/v1/audio/transcriptions

All chat models support streaming, tool calling, and structured outputs.

Kimi K2.5

The model supports images as input. Follow this guide to use the feature. Kimi K2.5 performs reasoning steps to improve response quality. To disable reasoning and reduce latency at the cost of quality, extend your request with:

{
"chat_template_kwargs": {"thinking": false}
}

Gemma 3 27B

The model supports images as input. Follow this guide to use the feature.

The model supports tool calling with the following constraints:

  • Gemma requires alternating user/assistant roles, so you can't use multiple user messages without assistant messages between them. For tool calling, the user role can use the tool role instead, i.e., user -> assistant (tool call) -> tool (result) -> assistant.
  • Gemma doesn't support mixed text and tool call outputs. Make sure you don't ask it to generate both in the same response, but separate requests for tool use and text responses.

Qwen3-Embedding 4B

The model uses Matryoshka training and supports output dimensions of 1024 or 2560, set via the dimensions field in the embeddings request. For most tasks, 1024 dimensions is sufficient. For other dimensionalities, truncate the returned vector client-side and re-normalize it afterward.

Voxtral Mini 3B

Preview

This model is currently offered as a preview and may be removed without a prior deprecation period.

Use sufficiently high-quality audio with adequate bit rates. MPEG with the MP2 codec is too low quality and can result in cut-off or excessively long generations. Consider Whisper large-v3 if you face quality issues.

Qwen3-Coder 30B-A3B

Deprecated

This model is deprecated and will be removed in a future release. Migrate coding workflows to Kimi K2.5.