Models
Privatemode gives you access to the following models. For pricing and rate limits, see Pricing and Rate limits.
| Model | Model ID | Type | Input | Context / limit | Endpoints |
|---|---|---|---|---|---|
| Gemma 4 31B | gemma-4-31b | Chat | Text, image | 256k tokens | /v1/chat/completions, /v1/completions, /v1/messages |
| gpt-oss-120b | gpt-oss-120b | Chat | Text | 128k tokens | /v1/chat/completions, /v1/completions, /v1/messages |
| Kimi K2.6 | kimi-k2.6, kimi-latest | Chat | Text, image | 256k tokens | /v1/chat/completions, /v1/completions, /v1/messages |
| Qwen3-Embedding 4B | qwen3-embedding-4b | Embedding | Text | 32k tokens | /v1/embeddings |
| Voxtral Mini 3B | voxtral-mini-3b | Speech-to-text | Audio | 50 MB | /v1/audio/transcriptions |
| Whisper large-v3 | whisper-large-v3 | Speech-to-text | Audio | 50 MB | /v1/audio/transcriptions |
All chat models support streaming, tool calling, and structured outputs.
Kimi K2.6
The model supports images as input. Follow this guide to use the feature. Kimi K2.6 performs reasoning steps to improve response quality. To disable reasoning and reduce latency at the cost of quality, extend your request with:
{
"chat_template_kwargs": {"thinking": false}
}
Use the model ID kimi-latest to always route to the latest available Kimi model.
Gemma 4 31B
The model supports images as input. Follow this guide to use the feature. The model supports tool calling. Gemma 4 can perform reasoning steps to improve response quality. To disable reasoning and reduce latency at the cost of quality, extend your request with:
{
"chat_template_kwargs": {"enable_thinking": false}
}
Qwen3-Embedding 4B
The model uses Matryoshka training and supports output dimensions of 1024 or 2560, set via the dimensions field in the embeddings request. For most tasks, 1024 dimensions is sufficient. For other dimensionalities, truncate the returned vector client-side and re-normalize it afterward.
Voxtral Mini 3B
Use sufficiently high-quality audio with adequate bit rates. Consider Whisper large-v3 if you face issues with low quality audio.