Version: 1.35

Messages API

Use the Privatemode messages API to generate text from a prompt via a large language model. The API is compatible with the Anthropic Messages API. To generate text, send your requests to the Privatemode proxy. Message requests and responses are encrypted, both in transit and during processing.

info

The messages API does not yet report cached tokens, so all tokens used with it are billed as non-cached.

Example prompting

For prompting, use the following proxy endpoint:

POST /v1/messages

This endpoint generates a response to a message prompt.

Request body

  • model string: The name of a currently available model. Note that models are updated regularly, and support for older models is discontinued over time. Use GET /v1/models to get a list of available models as described in the models API.
  • max_tokens integer: The maximum number of tokens to generate.
  • messages list: The conversation messages for which a response is generated.
  • stream boolean (optional): Whether to stream the response. Defaults to false.
  • system string (optional): A system prompt to guide the model's behavior.
  • Additional parameters: These mirror the Anthropic API and are supported based on the model server's capabilities.
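The request body can be assembled in any language. Below is a minimal Python sketch that builds a request from the fields listed above and posts it to the proxy; the local address http://localhost:8080 and the helper names are assumptions for illustration, matching the curl example further down.

```python
import json
import urllib.request

# Assumed local proxy address, as in the curl example below.
PROXY_URL = "http://localhost:8080/v1/messages"

def build_request(model, messages, max_tokens=1024, system=None, stream=False):
    """Assemble a messages API request body from the documented fields."""
    body = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if system is not None:
        body["system"] = system
    if stream:
        body["stream"] = True
    return body

def send_message(body):
    """POST the request body to the proxy and return the parsed message object."""
    req = urllib.request.Request(
        PROXY_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

body = build_request("<model>", [{"role": "user", "content": "Tell me a joke."}])
```

Optional fields (system, stream) are only included when set, so the default request stays minimal.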

Returns

The response is a message object containing:

  • id string: Unique identifier for the message.
  • type string: Object type, always "message".
  • role string: The role of the generated message, always "assistant".
  • content array: The content blocks generated by the model.
  • model string: The model that generated the response.
  • stop_reason string: The reason the model stopped generating, e.g. "end_turn".
  • usage object: Token usage statistics.

Example request

curl http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "<model>",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Tell me a joke."}
    ]
  }'
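When stream is set to true, the response arrives as a stream of server-sent events instead of a single JSON object. The following Python sketch collects the streamed text, assuming the event format mirrors the Anthropic streaming API (content_block_delta events carrying text_delta payloads); the sample lines are a hypothetical excerpt for illustration.

```python
import json

def collect_stream_text(lines):
    """Accumulate text from an SSE stream of Anthropic-style events.

    `lines` is any iterable of decoded SSE lines; only content_block_delta
    events with text_delta payloads contribute to the result.
    """
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                text.append(delta.get("text", ""))
    return "".join(text)

# Hypothetical stream excerpt for illustration:
sample = [
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Why don\'t"}}',
    'event: content_block_delta',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": " scientists trust atoms?"}}',
    'event: message_stop',
    'data: {"type": "message_stop"}',
]
streamed = collect_stream_text(sample)
```

In a real client, `lines` would be the decoded lines of the HTTP response body rather than a hard-coded list.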

Example response

{
  "id": "msg-1234567890",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "Why don't scientists trust atoms?\n\nBecause they make up everything!"
    }
  ],
  "model": "<model>",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18
  }
}
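Because content is an array of blocks, extracting the reply text means concatenating the text-type blocks. A short Python sketch, using the example response above:

```python
# Example message object as returned by POST /v1/messages (from above).
response = {
    "id": "msg-1234567890",
    "type": "message",
    "role": "assistant",
    "content": [
        {
            "type": "text",
            "text": "Why don't scientists trust atoms?\n\nBecause they make up everything!",
        }
    ],
    "model": "<model>",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 12, "output_tokens": 18},
}

def message_text(message):
    """Join the text of all text-type content blocks into one string."""
    return "".join(b["text"] for b in message["content"] if b["type"] == "text")

text = message_text(response)
# Total billed tokens (all non-cached with the messages API, per the note above):
total_tokens = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]
```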

System prompts

You can set a system prompt using the system field in the request to guide the model's behavior:

{
  "model": "<model>",
  "max_tokens": 1024,
  "system": "You are a helpful assistant that speaks like a pirate.",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}

Available models

To list the available models, call the /v1/models endpoint or see the models overview.