Qwen3-Embedding-4B
Qwen3-Embedding-4B is a model from the Qwen family designed specifically for text embedding tasks. It inherits the multilingual capabilities of its foundation model.
Model ID
qwen3-embedding-4b
Source
Modality
- Input: text
- Output: embedding vector
Context limit
- Embedding size:
  - Matryoshka model trained for 32 to 2560 dimensions.
  - You can request 1024 or 2560 output dimensions via the dimensions field in the request. For many tasks, 1024 dimensions are sufficient, matching the performance of 2560 while significantly reducing storage.
  - For other dimensionalities, truncate the returned vector client-side and re-normalize it afterward.
- Maximum input size: 32768 tokens
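The client-side truncation described under Embedding size can be sketched roughly as follows. This is a minimal illustration, not an official client: the helper names `truncate_and_renormalize` and `storage_bytes` are hypothetical, the embedding is assumed to arrive as a plain list of floats, and the storage figure assumes 4-byte (float32) values.

```python
import math

MIN_DIMS = 32  # smallest dimensionality the Matryoshka training covers


def truncate_and_renormalize(vector, dims):
    """Keep the first `dims` components, then rescale to unit L2 norm.

    Hypothetical helper: `vector` is assumed to be the embedding returned
    by the service as a list of floats (1024 or 2560 values).
    """
    if not MIN_DIMS <= dims <= len(vector):
        raise ValueError(f"dims must be between {MIN_DIMS} and {len(vector)}")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def storage_bytes(dims, bytes_per_value=4):
    """Bytes needed to store one vector, assuming float32 by default."""
    return dims * bytes_per_value
```

Under the float32 assumption, a 1024-dimensional vector takes 4,096 bytes and a 2560-dimensional vector takes 10,240 bytes, so requesting 1024 dimensions cuts per-vector storage by 60%. Re-normalizing after truncation matters if downstream code treats the dot product as cosine similarity.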
Endpoints
Rate limits
This model has a rate limit multiplier of 0.1. The effective rate limit for the Free and Standard tiers is 1,000,000 prompt tokens/minute. The effective monthly quota for the Free tier is 10,000,000 tokens/month.
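To get a feel for these limits, the following sketch estimates how many inputs of a given size fit into a token budget. It is back-of-the-envelope arithmetic only: the function name `max_inputs` is hypothetical, and it assumes that only input (prompt) tokens count against both the per-minute limit and the monthly quota.

```python
PROMPT_TOKENS_PER_MINUTE = 1_000_000   # effective limit, Free and Standard tiers
FREE_TIER_TOKENS_PER_MONTH = 10_000_000  # effective monthly quota, Free tier
MAX_INPUT_TOKENS = 32_768              # maximum input size per text


def max_inputs(budget_tokens, tokens_per_input):
    """Number of whole inputs of `tokens_per_input` tokens that fit in the budget.

    Assumption: only input tokens are counted against the budget.
    """
    if not 0 < tokens_per_input <= MAX_INPUT_TOKENS:
        raise ValueError(f"tokens_per_input must be in 1..{MAX_INPUT_TOKENS}")
    return budget_tokens // tokens_per_input


# Worst case: every input uses the full 32768-token context.
per_minute = max_inputs(PROMPT_TOKENS_PER_MINUTE, MAX_INPUT_TOKENS)    # 30
per_month = max_inputs(FREE_TIER_TOKENS_PER_MONTH, MAX_INPUT_TOKENS)   # 305
```

With typical inputs far shorter than the maximum, the counts scale up proportionally. For example, 500-token chunks allow 20,000 inputs within the Free tier's monthly quota.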