Meta Llama 3.3 70B Instruct
Meta Llama 3.3 70B Instruct is a multilingual, instruction-tuned generative model with 70B parameters (text in/text out). Privatemode provides a variant of this model that was quantized with AutoAWQ from FP16 down to INT4, using GEMM kernels, zero-point quantization, and a group size of 128.
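The quantization settings described above map directly onto an AutoAWQ configuration. The sketch below is illustrative, not the exact script used to produce this variant; the config keys follow AutoAWQ's documented quantization options.

```python
# Hedged sketch: AutoAWQ settings matching the description above
# (INT4 weights, GEMM kernels, zero-point quantization, group size 128).
quant_config = {
    "w_bit": 4,           # quantize weights from FP16 down to INT4
    "q_group_size": 128,  # group size of 128
    "zero_point": True,   # zero-point (asymmetric) quantization
    "version": "GEMM",    # use GEMM kernels
}

# With AutoAWQ installed, quantization would look roughly like this
# (model path and output directory are placeholders):
#
# from awq import AutoAWQForCausalLM
# from transformers import AutoTokenizer
#
# model = AutoAWQForCausalLM.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")
# model.quantize(tokenizer, quant_config=quant_config)
# model.save_quantized("Meta-Llama-3.3-70B-Instruct-AWQ-INT4")
```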
Model ID
ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4
Source
Modality
- Input: text
- Output: text
Features
Context limit
- Context window: 70k tokens
- Max output length: 4028 tokens
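When calling the model, requests should reference the Model ID above and keep the completion within the max output length. A minimal sketch of a chat-completions request payload, assuming an OpenAI-compatible endpoint (the endpoint URL and prompt are illustrative assumptions):

```python
import json

# Hedged sketch: build a chat request for an OpenAI-compatible API.
# The model ID comes from this page; max_tokens stays within the
# documented max output length of 4028 tokens.
payload = {
    "model": "ibnzterrell/Meta-Llama-3.3-70B-Instruct-AWQ-INT4",
    "messages": [
        {"role": "user", "content": "Summarize AWQ quantization in one sentence."}
    ],
    "max_tokens": 4028,
}

body = json.dumps(payload)

# The serialized body would then be POSTed to the service's
# chat-completions endpoint (URL depends on your deployment), e.g.:
#
# import requests
# requests.post("http://localhost:8080/v1/chat/completions",
#               data=body, headers={"Content-Type": "application/json"})
```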