Version: Next

Embeddings API

Use the Privatemode embeddings API to convert text into numerical vector embeddings. The API is compatible with the OpenAI Embeddings API. To create embeddings, send your requests to the privatemode-proxy. Embedding requests and responses are encrypted, both in transit and during processing.

Generating embeddings

Send a POST request to the following endpoint on your proxy:

POST /v1/embeddings

This endpoint returns vector embeddings for your provided text input.

Request body

  • input (string or list of strings): The texts for which you want embeddings. The maximum length of the input depends on the model.
  • model (string): The name of the embedding model, e.g., intfloat/multilingual-e5-large-instruct.
  • dimensions (int, optional): The number of dimensions of the output embedding vector. If not specified, the model's default is used. Note: Whether values other than the default are supported depends on the embedding model.
  • encoding_format (string, optional): Set to "float" for a list of float values or "base64" for base64 encoded values.
info

Check the available models for model-specific input requirements.
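When `encoding_format` is set to `"base64"`, the embedding values arrive as a single base64 string rather than a JSON array. The sketch below decodes such a string, assuming the OpenAI convention of packed little-endian 32-bit floats; the sample vector is synthetic:

```python
import base64
import struct


def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 embedding string into a list of floats.

    Assumes the OpenAI convention: the payload is a packed
    array of little-endian 32-bit floats.
    """
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round-trip a small synthetic vector to demonstrate the format.
vec = [0.0351, 0.0375, -0.0050]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
decoded = decode_embedding(encoded)
print([round(v, 4) for v in decoded])  # → [0.0351, 0.0375, -0.005]
```

Base64 responses are smaller on the wire than float arrays, at the cost of this extra decoding step.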

Returns

Returns an embeddings response object compatible with OpenAI's Embeddings API:

  • data: List of embedding objects (each with an embedding array and index).
  • object: Always "list".
  • model: The model used.
  • usage: Token usage statistics.
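The fields above can be read directly from the parsed JSON. A short sketch, using a synthetic response dict in the documented shape (the values are illustrative):

```python
# A synthetic stand-in for a parsed embeddings response,
# following the documented shape.
response = {
    "object": "list",
    "model": "intfloat/multilingual-e5-large-instruct",
    "data": [
        {"index": 0, "object": "embedding", "embedding": [0.0351, 0.0375, -0.0050]},
    ],
    "usage": {"prompt_tokens": 13, "total_tokens": 13},
}

# Sort by index so the vectors line up with the order of the inputs.
vectors = [
    item["embedding"]
    for item in sorted(response["data"], key=lambda d: d["index"])
]

print(len(vectors), len(vectors[0]))      # → 1 3
print(response["usage"]["total_tokens"])  # → 13
```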

Examples

Note: To run the examples below, start the privatemode-proxy with a pre-configured API key or add an authentication header to each request.

Example request

curl localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "intfloat/multilingual-e5-large-instruct",
    "encoding_format": "float"
  }'
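The same request can be sent from Python using only the standard library. This is a sketch, not an official client: the function names are illustrative, and the `localhost:8080` base URL matches the curl example above (adjust it, and add an authentication header if your proxy requires one):

```python
import json
import urllib.request


def build_payload(text: str) -> dict:
    """Request body matching the curl example."""
    return {
        "input": text,
        "model": "intfloat/multilingual-e5-large-instruct",
        "encoding_format": "float",
    }


def create_embeddings(text: str, base_url: str = "http://localhost:8080") -> dict:
    """POST an embeddings request to the privatemode-proxy and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=json.dumps(build_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example (requires a running proxy):
# response = create_embeddings("The food was delicious and the waiter...")
```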

Example response

{
  "id": "embd-b0f2e2ede7234a83aa5052128a239d9c",
  "object": "list",
  "created": 1747923707,
  "model": "intfloat/multilingual-e5-large-instruct",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [
        0.0351, 0.0375, -0.0050, ... // truncated for brevity
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "total_tokens": 13
  }
}
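Embedding vectors from the `data` array are typically compared with cosine similarity, e.g. for semantic search or deduplication. A minimal, dependency-free sketch (the input vectors are made up for illustration):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 4))  # → 1.0
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 4))  # → 0.0
```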

Available embedding models

To list the available embedding models, call the /v1/models endpoint or see the models overview.