Version: 1.37

Qwen3-Embedding-4B

Qwen3-Embedding-4B is a model from the Qwen family designed specifically for text embedding tasks. It inherits the multilingual capabilities of its foundation model.

Model ID

qwen3-embedding-4b

Source

Hugging Face

Modality

Input: text
Output: embedding vector

Context limit

Embedding size:
- Matryoshka model trained for 32 to 2560 dimensions.
- You can request 1024 or 2560 output dimensions via the dimensions field in the request. For many tasks, 1024 dimensions perform on par with 2560 while cutting vector storage by 60%.
- For other dimensionalities, truncate the returned vector client-side and re-normalize it afterward.
Maximum input size: 32768 tokens
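
The client-side truncation described above can be sketched as follows. This is a minimal illustration, not an official client: the function name truncate_embedding and its min_dims parameter are hypothetical, and the lower bound of 32 is taken from the Matryoshka training range stated above.

```python
import math


def truncate_embedding(vector, dims, min_dims=32):
    """Keep the first `dims` components and rescale them to unit L2 norm.

    Hypothetical helper: `min_dims` defaults to the smallest dimensionality
    the model is stated to be trained for.
    """
    if not min_dims <= dims <= len(vector):
        raise ValueError(
            f"dims must be between {min_dims} and {len(vector)}, got {dims}"
        )
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    # Re-normalizing keeps cosine and dot-product similarity consistent
    # after the tail of the vector has been dropped.
    return [x / norm for x in head]
```

For example, truncate_embedding(embedding, 256) would reduce a returned 1024-dimensional vector to 256 dimensions. Assuming 32-bit floats, a 2560-dimensional vector takes 10,240 bytes, a 1024-dimensional one 4,096 bytes, and a 256-dimensional one 1,024 bytes.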

Endpoints

/v1/embeddings
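
A request to this endpoint might be built as sketched below. Several details are assumptions rather than facts from this page: BASE_URL is a placeholder, the Bearer-token header is a guess at the authentication scheme, and the model and input field names assume an OpenAI-compatible request schema. Only the dimensions field and its two accepted values (1024 and 2560) come from the description above. The helper name build_embedding_request is hypothetical.

```python
import json
import urllib.request

# Placeholder host: replace with the base URL of your provider (assumption).
BASE_URL = "https://api.example.com"


def build_embedding_request(texts, api_key, dimensions=1024):
    """Build (but do not send) a POST request for /v1/embeddings.

    The payload layout assumes an OpenAI-compatible schema.
    """
    if dimensions not in (1024, 2560):
        raise ValueError("the API accepts only 1024 or 2560 dimensions")
    payload = {
        "model": "qwen3-embedding-4b",  # Model ID from this page
        "input": texts,                 # assumed field name
        "dimensions": dimensions,
    }
    return urllib.request.Request(
        BASE_URL + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to send it. The shape of the response is not documented here, so check it against your provider's API reference before parsing.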

Rate limits

This model has a rate limit multiplier of 0.1. The effective rate limit for the Free and Standard tiers is 1,000,000 prompt tokens/minute. The effective monthly quota for the Free tier is 10,000,000 tokens/month.
