Messages API
Use the Privatemode messages API to generate text from a prompt via a large language model. The API is compatible with the Anthropic Messages API. To generate text, send your requests to the Privatemode proxy. Message requests and responses are encrypted, both in transit and during processing.
The messages API doesn't yet report cached tokens, so all tokens used with it are billed as non-cached.
Example prompting
For prompting, use the following proxy endpoint:
POST /v1/messages
This endpoint generates a response to a message prompt.
Request body
- model string: The name of a currently available model. Note that models are updated regularly, and support for older models is discontinued over time. Use GET /v1/models to get a list of available models as described in the models API.
- max_tokens integer: The maximum number of tokens to generate.
- messages list: The conversation messages for which a response is generated.
- stream boolean (optional): Whether to stream the response. Defaults to false.
- system string (optional): A system prompt to guide the model's behavior.
- Additional parameters: These mirror the Anthropic API and are supported based on the model server's capabilities.
Returns
The response is a message object containing:
- id string: Unique identifier for the message.
- type string: Object type, always "message".
- role string: The role of the generated message, always "assistant".
- content array: The content blocks generated by the model.
- model string: The model that generated the response.
- stop_reason string: The reason the model stopped generating.
- usage object: Token usage statistics.
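Because content is an array of blocks, the generated text is recovered by joining the text of blocks whose type is "text". A minimal sketch of such a parser (`extract_text` is an illustrative helper name):

```python
import json

def extract_text(message: dict) -> str:
    """Concatenate the text of all text-type content blocks in a message object."""
    return "".join(
        block["text"] for block in message["content"] if block["type"] == "text"
    )

# Abridged message object in the shape shown in the example response below:
message = json.loads("""
{
  "id": "msg-1234567890",
  "type": "message",
  "role": "assistant",
  "content": [{"type": "text", "text": "Why don't scientists trust atoms?"}],
  "model": "<model>",
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 12, "output_tokens": 18}
}
""")
print(extract_text(message))  # → Why don't scientists trust atoms?
```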
Example request (default)
#!/usr/bin/env bash
curl http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model>",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Tell me a joke."}
    ]
  }'
Example response
{
  "id": "msg-1234567890",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Why don't scientists trust atoms?\n\nBecause they make up everything!"
    }
  ],
  "model": "<model>",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18
  }
}
Example request (streaming)
#!/usr/bin/env bash
curl http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model>",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Tell me a joke."}
    ]
  }'
Example response
event: message_start
data: {"type":"message_start","message":{"id":"msg-1234567890","type":"message","role":"assistant","content":[],"model":"<model>","stop_reason":null,"usage":{"input_tokens":12,"output_tokens":0}}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Why"}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" don't"}}
...
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn"},"usage":{"output_tokens":18}}
event: message_stop
data: {"type":"message_stop"}
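The deltas above can be reassembled client-side by collecting every text_delta fragment in order. A minimal sketch, assuming the `data: ...` framing shown in the example (`accumulate_text` is an illustrative helper, and it ignores non-delta events such as message_start and message_stop):

```python
import json

def accumulate_text(sse_body: str) -> str:
    """Collect text_delta fragments from an SSE stream body into one string."""
    parts = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blanks
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event["delta"]
            if delta.get("type") == "text_delta":
                parts.append(delta["text"])
    return "".join(parts)

stream = (
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Why"}}\n'
    'data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" don\'t"}}\n'
    'data: {"type":"message_stop"}\n'
)
print(accumulate_text(stream))  # → Why don't
```

In a real client you would apply this logic incrementally as chunks arrive, rather than buffering the whole stream first.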
System prompts
You can set a system prompt using the system field in the request to guide the model's behavior:
{
  "model": "<model>",
  "max_tokens": 1024,
  "system": "You are a helpful assistant that speaks like a pirate.",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}
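The same field also works across multi-turn conversations: prior assistant replies go into messages while the system prompt stays a separate top-level field. A sketch of such a payload (the conversation content is illustrative):

```python
import json

# Multi-turn payload: the full history is replayed in `messages`,
# while `system` remains outside the conversation turns.
payload = {
    "model": "<model>",
    "max_tokens": 1024,
    "system": "You are a helpful assistant that speaks like a pirate.",
    "messages": [
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Ahoy, matey!"},
        {"role": "user", "content": "What's the weather like at sea?"},
    ],
}
print(json.dumps(payload, indent=2))
```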
Available models
To list the available models, call the /v1/models endpoint or see the models overview.