Version: 1.35

Qwen3-Embedding-4B

Qwen3-Embedding-4B is a model from the Qwen family designed specifically for text embedding tasks. It inherits the multilingual capabilities of its foundation model.

Model ID

qwen3-embedding-4b

Source

Hugging Face

Modality

  • Input: text
  • Output: embedding vector

Context limit

  • Embedding size:
    • Matryoshka model trained for 32 to 2560 dimensions.
    • You can request 1024 or 2560 output dimensions via the dimensions field in the request. For many tasks, 1024 dimensions are sufficient, performing on par with 2560 while significantly reducing storage.
    • For other dimensionalities, truncate the returned vector client-side and re-normalize it afterward.
  • Maximum input size: 32768 tokens
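
The client-side truncation mentioned above can be sketched as follows. This is a minimal illustration, not an official client: the function name truncate_and_normalize is hypothetical, and it assumes the returned embedding is available as a plain list of floats and that L2 normalization is the intended re-normalization.

```python
import math


def truncate_and_normalize(embedding, dims):
    """Keep the first `dims` values of an embedding and rescale to unit L2 norm.

    Hypothetical helper for illustration. The model is described as trained
    for 32 to 2560 dimensions, so `dims` would normally fall in that range.
    """
    if not 0 < dims <= len(embedding):
        raise ValueError("dims must be between 1 and len(embedding)")

    # Matryoshka-style truncation keeps the leading components.
    head = list(embedding[:dims])

    # Re-normalize so cosine and dot-product similarity remain comparable.
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]


if __name__ == "__main__":
    full = [3.0, 4.0, 12.0]  # toy vector standing in for a real embedding
    print(truncate_and_normalize(full, 2))
```

Assuming vectors are stored as 32-bit floats, a 2560-dimensional vector occupies 10,240 bytes and a 1024-dimensional one 4,096 bytes, which is where the storage saving comes from.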

Endpoints

Rate limits

This model has a rate limit multiplier of 0.1. The effective rate limit for the Free and Standard tiers is 1,000,000 prompt tokens per minute. The effective monthly quota for the Free tier is 10,000,000 tokens per month.
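
To get a feel for these limits, the arithmetic can be sketched as below. The helper name max_inputs_within_budget is hypothetical, and the sketch assumes that every input token counts once against the stated effective limits; it takes the figures from this page as given.

```python
TOKENS_PER_MINUTE = 1_000_000      # effective limit, Free and Standard tiers
FREE_TOKENS_PER_MONTH = 10_000_000  # effective monthly quota, Free tier
MAX_INPUT_TOKENS = 32_768           # maximum input size


def max_inputs_within_budget(token_budget, tokens_per_input):
    """Return how many inputs of a given token length fit in a token budget."""
    if tokens_per_input <= 0:
        raise ValueError("tokens_per_input must be positive")
    return token_budget // tokens_per_input


if __name__ == "__main__":
    # Worst case: every input uses the full context window.
    print(max_inputs_within_budget(TOKENS_PER_MINUTE, MAX_INPUT_TOKENS))      # 30
    print(max_inputs_within_budget(FREE_TOKENS_PER_MONTH, MAX_INPUT_TOKENS))  # 305
    # A more typical case: inputs of about 500 tokens.
    print(max_inputs_within_budget(TOKENS_PER_MINUTE, 500))                   # 2000
```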