Version: 1.23

Translations API

Use the Privatemode translations API to translate audio files to English text. The API is compatible with the OpenAI translations API. To translate audio, send your requests to the privatemode-proxy. Audio requests and responses are encrypted, both in transit and during processing.

Generating translations

Send a multipart/form-data POST request to the following endpoint on your proxy:

POST /v1/audio/translations

This endpoint generates an English translation of the provided audio file.

Request body

  • model (string): The name of the model to use for translation, e.g., openai/whisper-large-v3.
  • file (file): The audio file to translate. Supported formats are flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm.
  • language (string, optional): The language of the audio as an ISO 639-1 code (e.g., de). If the language isn't set correctly, accuracy and performance can suffer.
  • prompt (string, optional): Text to guide the model's style or to continue a previous audio segment. The prompt should match the audio language.
  • For additional parameters, see the vLLM translations API documentation.

Returns

The translated text.

Examples

Note: To run the examples below, start the privatemode-proxy with a pre-configured API key or add an authentication header to the requests.

Example request

curl localhost:8080/v1/audio/translations \
  -H "Content-Type: multipart/form-data" \
  -F 'model=openai/whisper-large-v3' \
  -F 'language=de' \
  -F 'file=@hallo_welt.mp3'

Example response

{
  "text": "Hello World."
}
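
The request above can also be wrapped in a small shell function that sends the optional fields only when they're given and adds an authentication header when no pre-configured API key is used. This is a minimal sketch, not an official client: the environment variable names (PRIVATEMODE_PROXY_URL, PRIVATEMODE_API_KEY, PRIVATEMODE_MODEL) are hypothetical, and the "Authorization: Bearer" scheme is an assumption based on the API's OpenAI compatibility.

```shell
# Sketch: reusable wrapper around POST /v1/audio/translations.
# Assumptions (not from this page): the PRIVATEMODE_* variable names
# and the "Authorization: Bearer <key>" header scheme.
translate_audio() {
  local file="$1" language="${2:-}" prompt="${3:-}"
  local base_url="${PRIVATEMODE_PROXY_URL:-http://localhost:8080}"

  # Required form fields: model and file.
  set -- -F "model=${PRIVATEMODE_MODEL:-openai/whisper-large-v3}" -F "file=@${file}"

  # Optional form fields are only sent when provided.
  if [ -n "$language" ]; then
    set -- "$@" -F "language=${language}"
  fi
  if [ -n "$prompt" ]; then
    # --form-string sends the value literally, so ';' or '@' in the
    # prompt text isn't interpreted by curl.
    set -- "$@" --form-string "prompt=${prompt}"
  fi

  # Add the header only when a key is set; a proxy started with a
  # pre-configured API key doesn't need it.
  if [ -n "${PRIVATEMODE_API_KEY:-}" ]; then
    set -- "$@" -H "Authorization: Bearer ${PRIVATEMODE_API_KEY}"
  fi

  curl -sS "${base_url}/v1/audio/translations" "$@"
}
```

For example, translate_audio hallo_welt.mp3 de sends the same fields as the curl command above. The Content-Type header is omitted here because curl sets it automatically for -F form fields.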

Available translation models

To list the available translation models, call the /v1/models endpoint or see the models overview.
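
As a rough sketch, the model names can be pulled out of the /v1/models response on the command line. The response shape is an assumption here: the snippet expects an OpenAI-style list in which each model object carries an "id" field. The function name and the PRIVATEMODE_PROXY_URL variable are hypothetical.

```shell
# Sketch: print one model ID per line.
# Assumption: the response is an OpenAI-style list, e.g.
#   {"object": "list", "data": [{"id": "...", ...}, ...]}
list_model_ids() {
  curl -sS "${PRIVATEMODE_PROXY_URL:-http://localhost:8080}/v1/models" \
    | grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' \
    | sed 's/.*"\([^"]*\)"$/\1/'
}
```

The grep/sed pipeline avoids extra dependencies but matches every "id" field in the response, including nested ones. A JSON-aware tool is the more robust choice when one is available.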

Warning: Privatemode's serving backend only supports files up to 25 MB in size. For larger files, consider splitting the audio into smaller segments, or try compressing the file to reduce its size.
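
One possible way to handle the size limit is to check the file before uploading and split it when needed. The sketch below assumes ffmpeg is installed (it isn't part of Privatemode) and reads "25 MB" conservatively as 25,000,000 bytes. The function names and the MAX_UPLOAD_BYTES variable are hypothetical.

```shell
# Assumption: "25 MB" is read conservatively as 25,000,000 bytes.
MAX_UPLOAD_BYTES="${MAX_UPLOAD_BYTES:-25000000}"

# Succeeds (exit status 0) if the file is larger than the limit.
exceeds_upload_limit() {
  local size
  # The arithmetic expansion strips padding that some wc versions emit.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -gt "$MAX_UPLOAD_BYTES" ]
}

# Split an audio file into segments of roughly N seconds (default 600)
# without re-encoding. Writes <name>_000.<ext>, <name>_001.<ext>, ...
# next to the input file. Requires ffmpeg.
split_audio() {
  local input="$1" seconds="${2:-600}"
  local ext="${input##*.}" stem="${input%.*}"
  ffmpeg -i "$input" -f segment -segment_time "$seconds" -c copy \
    "${stem}_%03d.${ext}"
}
```

For example: if exceeds_upload_limit talk.mp3; then split_audio talk.mp3 600; fi. Each segment is then translated with its own request. Because segments are processed independently, context can be lost at the cut points; the prompt parameter can be used to continue from the previous segment.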