Llama-3.2-1B-Instruct is Meta’s multilingual, instruction-tuned large language model with 1.24 billion parameters, optimized for dialogue, summarization, and retrieval tasks. It builds on the Llama 3.1 architecture and is aligned with supervised fine-tuning (SFT) and direct preference optimization (DPO), with quantization-aware training used to produce efficient variants. The model officially supports eight languages (English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai) and was trained on a curated mix of publicly available online data, with a December 2023 knowledge cutoff. Llama-3.2-1B is lightweight enough to deploy on constrained devices such as smartphones, using quantization schemes like SpinQuant and quantization-aware training with LoRA adaptors (QLoRA) to reduce model size and latency. Despite its small size, it performs competitively on benchmarks such as MMLU, ARC, and TLDR summarization. The model is distributed under the Llama 3.2 Community License, which requires attribution and adherence to Meta’s Acceptable Use Policy.
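As a rough illustration of how the instruction-tuned checkpoint is typically consumed, the sketch below runs a short chat turn through the Hugging Face `transformers` text-generation pipeline. The Hub ID `meta-llama/Llama-3.2-1B-Instruct`, gated-repo access, and the generation settings are assumptions about your environment rather than part of the model card.

```python
# Minimal chat sketch, assuming `transformers` and `torch` are installed and
# your Hugging Face account has been granted access to the gated repo.
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed Hub identifier

chat = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,   # bf16 keeps the 1B model's footprint small
    device_map="auto",            # falls back to CPU if no GPU is present
)

messages = [
    {"role": "system", "content": "You are a concise multilingual assistant."},
    {"role": "user", "content": "Summarize the benefits of small on-device LLMs."},
]

out = chat(messages, max_new_tokens=128)
# The pipeline appends the assistant turn to the conversation; print its content.
print(out[0]["generated_text"][-1]["content"])
```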
Features
- 1.24B parameter instruction-tuned LLM optimized for dialogue tasks
- Multilingual support for eight officially supported languages (plus broader pretraining coverage), with text and code output
- Aligned with supervised fine-tuning (SFT), direct preference optimization (DPO), and rejection sampling
- Efficient on-device deployment with SpinQuant and QLoRA quantization (see the loading sketch after this list)
- Supports context lengths up to 128k tokens
- Fine-tuned with safety-focused datasets and refusal strategies
- Outperforms many open and closed models in the 1B–3B parameter range
- Released under Meta’s Llama 3.2 Community License with clear use guidelines
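Meta distributes the SpinQuant and QLoRA variants as separate quantized checkpoints aimed at on-device runtimes. The sketch below is not that path but a commonly used stand-in: generic 4-bit loading through `bitsandbytes`, which trades some accuracy for a much smaller GPU memory footprint. The NF4 settings and the Hub ID are illustrative assumptions, not Meta's released configuration.

```python
# Generic 4-bit loading sketch (not Meta's SpinQuant/QLoRA release); assumes
# `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU
# is available. Quantization settings here are illustrative, not official.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.2-1B-Instruct"  # assumed Hub identifier

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # common default, not Meta's scheme
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for quality
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "Give me three tips for writing unit tests."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt portion.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For production on-device use, prefer the official quantized checkpoints referenced in the feature list above; the generic 4-bit route is mainly convenient for quick experiments on a single GPU.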