Version: 1.37

Voxtral Mini 3B

info

This model is currently offered as a preview. It may be removed without a prior deprecation period.

Voxtral Mini 3B is an audio-capable model from Mistral AI that supports automatic speech recognition (speech-to-text) with multilingual language detection.

Model ID

voxtral-mini-3b

Source

Hugging Face

Modality

Input: Audio (e.g., .mp3, .mp4, .wav)
Output: Text

Endpoints

/v1/audio/transcriptions

note

Use high-quality audio with a sufficiently high bit rate. We don't recommend MPEG audio encoded with the MP2 codec: its quality is too low and can lead to cut-off or excessively long transcriptions. If you run into quality issues, consider using Whisper instead.
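As a rough illustration of calling the /v1/audio/transcriptions endpoint with the voxtral-mini-3b model ID, the sketch below builds a multipart upload using only the Python standard library. Several details are assumptions not stated on this page: the base URL (`https://api.example.com` is a placeholder), Bearer-token authentication, the form field names `model` and `file` (borrowed from the common OpenAI-style transcription convention), and a JSON response body. The helper names are hypothetical. Check your provider's API reference before relying on any of them.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Placeholder base URL (assumption): replace with your provider's actual host.
BASE_URL = "https://api.example.com"
MODEL_ID = "voxtral-mini-3b"


def build_transcription_request(audio_path, api_key, base_url=BASE_URL):
    """Build, but do not send, a multipart POST for the transcription endpoint."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"

    # Form field "model" (assumed field name).
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{MODEL_ID}\r\n"
    ).encode()
    # Form field "file" carrying the raw audio bytes (assumed field name).
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + path.read_bytes() + closing

    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/transcriptions",
        data=body,
        method="POST",
        headers={
            # Bearer authentication is an assumption.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio_path, api_key, base_url=BASE_URL):
    """Send the request and return the parsed response (assumed to be JSON)."""
    request = build_transcription_request(audio_path, api_key, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With a real base URL and key, `transcribe("meeting.wav", api_key)` would upload the file and return the decoded response. Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body without any network access.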

Rate limits

The effective quota for the Free tier is 100 minutes per month.
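To stay within the quota on the client side, you could track how much audio you have already submitted in the current month. The sketch below is a minimal illustration: it assumes the quota is counted by exact audio duration with no per-request rounding, which this page does not specify, and the function names are hypothetical.

```python
# Free tier quota stated above, in minutes per month.
FREE_TIER_MINUTES_PER_MONTH = 100


def remaining_minutes(durations_seconds, quota_minutes=FREE_TIER_MINUTES_PER_MONTH):
    """Minutes left this month, given durations (in seconds) already transcribed.

    Assumes usage is the plain sum of audio durations (no rounding per request).
    """
    used_minutes = sum(durations_seconds) / 60
    return max(0.0, quota_minutes - used_minutes)


def fits_in_quota(next_duration_seconds, durations_seconds,
                  quota_minutes=FREE_TIER_MINUTES_PER_MONTH):
    """True if one more file of the given length still fits in the quota."""
    return next_duration_seconds / 60 <= remaining_minutes(durations_seconds, quota_minutes)
```

For example, after transcribing a 10-minute and a 30-minute file, `remaining_minutes([600, 1800])` leaves 60 minutes under this counting assumption.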
