ERNIE-4.5-21B-A3B-PT is Baidu’s post-trained Mixture-of-Experts (MoE) large language model for text understanding and generation. With 21 billion total parameters and only 3 billion activated per token, it keeps inference cost low while retaining strong text performance. Although the ERNIE-4.5 family is pre-trained in a multimodal setup, this variant is text-only and intended for post-training inference. Its MoE architecture routes each token to 6 of 64 text experts and supports context lengths of up to 131,072 tokens. The model can be further adapted with fine-tuning strategies such as Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Unified Preference Optimization (UPO).

ERNIE-4.5-21B-A3B-PT integrates with both Hugging Face Transformers and PaddlePaddle, and is compatible with ERNIEKit for training and FastDeploy for deployment. Support for vLLM is also being added for faster inference.
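The sketch below shows minimal chat-style generation through Hugging Face Transformers. It assumes the checkpoint is published on the Hub as `baidu/ERNIE-4.5-21B-A3B-PT` and that the repository may ship custom model code (hence `trust_remote_code=True`); adjust the model id, dtype, and device settings to your environment.

```python
# Minimal sketch: chat-style generation with Hugging Face Transformers.
# Assumes the checkpoint is hosted as "baidu/ERNIE-4.5-21B-A3B-PT" and may
# require trust_remote_code=True for custom model/tokenizer code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "baidu/ERNIE-4.5-21B-A3B-PT"  # assumed Hub id

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,   # load MoE weights in bf16 to reduce memory
    device_map="auto",            # spread layers/experts across available GPUs
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Summarize what a Mixture-of-Experts model is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens and decode only the newly generated text.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```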
Features
- 21B total parameters with 3B activated per token
- Optimized for post-training language tasks
- Long context length up to 131,072 tokens
- 64 text experts with 6 activated per token
- Supports SFT, DPO, UPO fine-tuning methods
- Compatible with PaddlePaddle, Hugging Face Transformers
- Designed for fast inference via FastDeploy and vLLM (see the serving sketch after this list)
- Apache 2.0 license for commercial use
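Because vLLM support is still being adapted, the following is a hedged offline-inference sketch rather than an official recipe: it assumes a vLLM build recent enough to load this MoE architecture and reuses the same assumed Hub id as above.

```python
# Sketch: offline batched inference with vLLM's Python API.
# Assumes a vLLM version that supports the ERNIE-4.5 MoE architecture;
# "baidu/ERNIE-4.5-21B-A3B-PT" is the same assumed Hub id as in the example above.
from vllm import LLM, SamplingParams

llm = LLM(
    model="baidu/ERNIE-4.5-21B-A3B-PT",
    trust_remote_code=True,      # custom model code, if the repo requires it
    tensor_parallel_size=2,      # split the 21B MoE across 2 GPUs (adjust to your setup)
    max_model_len=32768,         # raise toward 131072 if memory allows
)

sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)
outputs = llm.generate(
    ["Explain the difference between total and activated parameters in an MoE model."],
    sampling,
)
for out in outputs:
    print(out.outputs[0].text)
```

Once the architecture is supported, the same checkpoint should also work behind vLLM's OpenAI-compatible server for online serving; parameters such as `tensor_parallel_size` and `max_model_len` above are illustrative and depend on available GPU memory.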