Embeddings API
Use the Privatemode embeddings API to convert text into high-dimensional vector embeddings. The API is compatible with the OpenAI Embeddings API. To create embeddings, send your requests to the privatemode-proxy. Embedding requests and responses are encrypted, both in transit and during processing.
Generating embeddings
Send a POST request to the following endpoint on your proxy:
POST /v1/embeddings
This endpoint returns vector embeddings for your provided text input.
Request body
- input (string or list of strings): The text(s) you want embeddings for. The maximum input length depends on the model.
- model (string): The name of the embedding model, e.g., intfloat/multilingual-e5-large-instruct.
- dimensions (int, optional): The number of dimensions of the output embedding vector. If not specified, the model's default is used. Note: whether a value other than the default is supported depends on the embedding model.
- encoding_format (string, optional): Set to "float" for a list of float values or "base64" for base64-encoded values.

Check the available models for model-specific input requirements.
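Putting the fields together, a full request body with both optional parameters set might look like the following sketch. The dimensions value here is illustrative; it must be one your chosen model actually supports.

```python
# Sketch of a complete request body. The "dimensions" value is
# illustrative and model-dependent; omit it to use the model's default.
import json

payload = {
    "input": ["The food was delicious and the waiter..."],
    "model": "intfloat/multilingual-e5-large-instruct",
    "dimensions": 1024,          # optional
    "encoding_format": "float",  # optional; "float" (default) or "base64"
}

body = json.dumps(payload)
print(body)
```

Send this body as the JSON payload of the POST request shown above.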
Returns
Returns an embeddings response object compatible with OpenAI's Embeddings API:
- data: List of embedding objects (each with an embedding array and index).
- object: Always "list".
- model: The model used.
- usage: Token usage statistics.
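When encoding_format is "base64", each embedding arrives as a base64 string instead of a float array. Under the OpenAI-compatible convention, that string is the raw little-endian float32 buffer of the vector; a minimal decoding sketch, assuming that convention holds here:

```python
# Decode a base64-encoded embedding into a list of floats.
# Assumes the OpenAI convention: the string encodes the raw
# little-endian float32 buffer of the vector.
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip demonstration with a tiny illustrative vector.
vec = [0.0351, 0.0375, -0.0050]
encoded = base64.b64encode(struct.pack(f"<{len(vec)}f", *vec)).decode()
decoded = decode_embedding(encoded)
print(decoded)  # close to the original values, up to float32 precision
```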
Examples
Note: To run the examples below, start the privatemode proxy with a pre-configured API key or add an authentication header to the requests.
Example request
curl localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": "The food was delicious and the waiter...",
"model": "intfloat/multilingual-e5-large-instruct",
"encoding_format": "float"
}'
Example response
{
"id": "embd-b0f2e2ede7234a83aa5052128a239d9c",
"object": "list",
"created": 1747923707,
"model": "intfloat/multilingual-e5-large-instruct",
"data": [
{
"index": 0,
"object": "embedding",
"embedding": [
0.0351, 0.0375, -0.0050, ... // truncated for brevity
]
}
],
"usage": {
"prompt_tokens": 13,
    "total_tokens": 13
}
}
Example request (batch input)
curl localhost:8080/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": [
"The food was delicious and the waiter...",
"I would definitely come back again!"
],
"model": "intfloat/multilingual-e5-large-instruct"
}'
Example response
{
"id": "embd-584a54ff36c84996b6ce667339ea3f40",
"created": 1747924226,
"model": "intfloat/multilingual-e5-large-instruct",
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [ 0.0351, ... ] // truncated
},
{
"object": "embedding",
"index": 1,
"embedding": [ 0.0096, ... ] // truncated
}
],
"usage": {
"prompt_tokens": 22,
"total_tokens": 22
}
}
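Batch embeddings are typically compared to each other, for example by cosine similarity. A minimal sketch in plain Python; the short vectors here are illustrative stand-ins for the full data[i].embedding arrays:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Illustrative short vectors standing in for real embedding arrays.
a = [0.0351, 0.0375, -0.0050]
b = [0.0096, 0.0412, -0.0031]
print(cosine_similarity(a, b))
```

With real embeddings from the batch response, pass data[0].embedding and data[1].embedding instead.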
Example usage with OpenAI Python client
# Use the OpenAI client to connect to the privatemode proxy.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="http://localhost:8080/v1",
)

# Find an embedding model to use.
models = client.models.list()
embed_models = [m for m in models.data if "embed" in m.tasks]
model = embed_models[0].id

# Create embeddings.
print("Embedding model:", model)
response = client.embeddings.create(
    input=[
        "Hello my name is",
        "Edgeless enables confidential and privacy-preserving AI",
    ],
    model=model,
)
for r in response.data:
    print(f"dim: {len(r.embedding)}, embedding: {r.embedding[:3]}...")
Output
Embedding model: intfloat/multilingual-e5-large-instruct
dim: 1024, embedding: [0.032440185546875, 0.004032135009765625, -0.01043701171875]...
dim: 1024, embedding: [0.0236663818359375, 0.035919189453125, -0.0012216567993164062]...
Available embedding models
To list the available embedding models, call the /v1/models
endpoint or see the models overview.