Version: Next

Embeddings API

Use the Privatemode embeddings API to convert text into numerical vector embeddings. The API is compatible with the OpenAI Embeddings API. To create embeddings, send your requests to the privatemode-proxy. Embedding requests and responses are encrypted, both in transit and during processing.

Generating embeddings

Send a POST request to the following endpoint on your proxy:

POST /v1/embeddings

This endpoint returns vector embeddings for your provided text input.

Request body

  • input (string or list of strings): The texts for which you want embeddings. The maximum length of the input depends on the model.
  • model (string): The name of the embedding model, e.g., intfloat/multilingual-e5-large-instruct.
  • dimensions (int, optional): The number of dimensions of the output embedding vector. If not specified, the model's default is used. Note: Whether values other than the default are supported depends on the embedding model.
  • encoding_format (string, optional): Set to "float" for a list of float values or "base64" for base64 encoded values.
info

Check the available models for model-specific input requirements.
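When `encoding_format` is set to `"base64"`, the embedding values arrive as a single base64 string rather than a JSON array. The sketch below decodes such a string, assuming the OpenAI convention of packed little-endian 32-bit floats; the sample vector is synthetic:

```python
import base64
import struct


def decode_embedding(b64: str) -> list[float]:
    """Decode a base64 embedding string into a list of floats.

    Assumes the OpenAI convention: the payload is a packed
    array of little-endian 32-bit floats.
    """
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Round-trip a small synthetic vector to demonstrate the format.
vec = [0.0351, 0.0375, -0.0050]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
decoded = decode_embedding(encoded)
print([round(v, 4) for v in decoded])  # → [0.0351, 0.0375, -0.005]
```

Base64 responses are smaller on the wire than float arrays, at the cost of this extra decoding step.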

Returns

Returns an embeddings response object compatible with OpenAI's Embeddings API:

  • data: List of embedding objects (each with an embedding array and index).
  • object: Always "list".
  • model: The model used.
  • usage: Token usage statistics.
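The fields above can be read directly from the parsed JSON. A short sketch, using a synthetic response dict in the documented shape (the values are illustrative):

```python
# A synthetic stand-in for a parsed embeddings response,
# following the documented shape.
response = {
    "object": "list",
    "model": "intfloat/multilingual-e5-large-instruct",
    "data": [
        {"index": 0, "object": "embedding", "embedding": [0.0351, 0.0375, -0.0050]},
    ],
    "usage": {"prompt_tokens": 13, "total_tokens": 13},
}

# Sort by index so the vectors line up with the order of the inputs.
vectors = [
    item["embedding"]
    for item in sorted(response["data"], key=lambda d: d["index"])
]

print(len(vectors), len(vectors[0]))      # → 1 3
print(response["usage"]["total_tokens"])  # → 13
```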

Examples

Note: To run the examples below, start the privatemode-proxy with a pre-configured API key or add an authentication header to each request.

Example request

curl localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "intfloat/multilingual-e5-large-instruct",
    "encoding_format": "float"
  }'
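The same request can be sent from Python using only the standard library. This is a sketch, not an official client: the function names are illustrative, and the `localhost:8080` base URL matches the curl example above (adjust it, and add an authentication header if your proxy requires one):

```python
import json
import urllib.request


def build_payload(text: str) -> dict:
    """Request body matching the curl example."""
    return {
        "input": text,
        "model": "intfloat/multilingual-e5-large-instruct",
        "encoding_format": "float",
    }


def create_embeddings(text: str, base_url: str = "http://localhost:8080") -> dict:
    """POST an embeddings request to the privatemode-proxy and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/v1/embeddings",
        data=json.dumps(build_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example (requires a running proxy):
# response = create_embeddings("The food was delicious and the waiter...")
```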

Example response

{
  "id": "embd-b0f2e2ede7234a83aa5052128a239d9c",
  "object": "list",
  "created": 1747923707,
  "model": "intfloat/multilingual-e5-large-instruct",
  "data": [
    {
      "index": 0,
      "object": "embedding",
      "embedding": [
        0.0351, 0.0375, -0.0050, ... // truncated for brevity
      ]
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "total_tokens": 13
  }
}
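Embedding vectors from the `data` array are typically compared with cosine similarity, e.g. for semantic search or deduplication. A minimal, dependency-free sketch (the input vectors are made up for illustration):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 4))  # → 1.0
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 4))  # → 0.0
```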

Available embedding models

To list the available embedding models, call the /v1/models endpoint or see the models overview.