Translations API
Use the Privatemode translations API to translate audio files to English text. The API is compatible with the OpenAI translations API. To translate audio, send your requests to the privatemode-proxy. Audio requests and responses are encrypted, both in transit and during processing.
Generating translations
Send a POST form request to the following endpoint on your proxy:
POST /v1/audio/translations
This endpoint generates an English translation of the provided audio file.
Request body
- model (string): The name of the model to use for translation, e.g., openai/whisper-large-v3.
- file (file): The audio file to translate. Supported formats are flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm.
- language (string, optional): The language of the audio in ISO-639-1 format (e.g. de). Not setting the correct language can lead to poor accuracy and performance.
- prompt (string, optional): An optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
- For additional parameters see the vLLM translations API documentation.
Returns
The translated text.
Examples
Note: To run the examples below, start the privatemode-proxy with a pre-configured API key or add an authentication header to the requests.
Example request
curl localhost:8080/v1/audio/translations \
-H "Content-Type: multipart/form-data" \
-F 'model=openai/whisper-large-v3' \
-F 'language=de' \
-F 'file=@hallo_welt.mp3'
Example response
{
"text": "Hello World."
}
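The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_multipart and translate are hypothetical, and it assumes the proxy listens on localhost:8080 and was started with a pre-configured API key, so no authentication header is added.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns a tuple (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part carries the raw audio bytes.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{file_name}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def translate(path, model="openai/whisper-large-v3", language=None,
              base_url="http://localhost:8080"):
    """POST an audio file to /v1/audio/translations and return the text."""
    fields = {"model": model}
    if language:
        fields["language"] = language  # ISO-639-1 code of the audio, e.g. "de"
    with open(path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        fields, "file", os.path.basename(path), audio
    )
    req = urllib.request.Request(
        base_url + "/v1/audio/translations",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # The response is a JSON object with a "text" field.
        return json.load(resp)["text"]
```

With a running proxy, calling translate("hallo_welt.mp3", language="de") should return the value of the "text" field from the response.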
Available translation models
To list the available translation models, call the /v1/models endpoint or see the models overview.
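As a rough sketch, the model list could be fetched from Python as follows. The response shape is an assumption: the code expects an OpenAI-style list object with a "data" array of entries that each carry an "id". The helper names model_ids and list_models are hypothetical, and the proxy is assumed to run on localhost:8080 with a pre-configured API key.

```python
import json
import urllib.request


def model_ids(payload):
    """Extract model IDs from a parsed /v1/models response.

    Assumption: the payload has the OpenAI list shape,
    {"data": [{"id": "..."}, ...]}.
    """
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(base_url="http://localhost:8080"):
    """GET /v1/models from the proxy and return the model IDs."""
    with urllib.request.urlopen(base_url + "/v1/models") as resp:
        return model_ids(json.load(resp))
```

The returned list contains every model the endpoint reports, so the models overview remains the place to check which of them support translation.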
Privatemode's serving backend only supports files up to 25 MB in size. For larger files, consider splitting the audio into smaller segments, or try compressing the file to reduce its size.
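A client can check the file size before uploading to avoid a rejected request. The sketch below is illustrative only: the helper name fits_upload_limit is hypothetical, and it assumes "25 MB" means 25 * 1000 * 1000 bytes, which is the stricter of the two common readings.

```python
import os

# Assumption: "25 MB" is read as 25 * 1000 * 1000 bytes (the stricter reading).
MAX_UPLOAD_BYTES = 25 * 1000 * 1000


def fits_upload_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

If the check fails, split or compress the audio before sending it.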