ERNIE-4.5-21B-A3B-Paddle is a post-trained Mixture-of-Experts (MoE) large language model from Baidu, designed for high-performance text generation and understanding. It has 21 billion total parameters, of which 3 billion are activated per token, and is optimized for large-scale inference with the PaddlePaddle framework. The architecture supports efficient training and inference through advanced expert-routing strategies, FP8 mixed-precision training, expert parallelism, and quantization. While this release is text-only, the underlying architecture also includes vision experts for multimodal use. ERNIE-4.5 is post-trained with Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Unified Preference Optimization (UPO) for quality and alignment with user preferences. It supports context windows of up to 131,072 tokens and integrates with ERNIEKit for streamlined fine-tuning. Deployment is supported via FastDeploy, with adaptations for vLLM and Hugging Face Transformers in progress.
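For deployment with FastDeploy, a minimal offline-generation sketch is shown below. It assumes FastDeploy's vLLM-style Python API (`fastdeploy.LLM` and `SamplingParams`); the `max_model_len` value and the prompt are illustrative, and the exact layout of the returned output object may differ across FastDeploy versions.

```python
# Minimal sketch: offline generation with FastDeploy (assumed vLLM-style API).
from fastdeploy import LLM, SamplingParams

# Sampling configuration for generation.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Load the model; max_model_len caps the allocated context window
# (the architecture supports up to 131,072 tokens).
llm = LLM(model="baidu/ERNIE-4.5-21B-A3B-Paddle", max_model_len=32768)

# Generate a completion for a single prompt and inspect the result;
# the output object's structure may vary by FastDeploy version.
outputs = llm.generate("Explain expert routing in MoE models.", sampling_params)
print(outputs)
```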
Features
- 21B total parameters with 3B activated per token
- Text modality with extended context (131,072 tokens)
- PaddlePaddle-optimized for efficient deployment
- Supports SFT, DPO, and UPO fine-tuning via ERNIEKit
- Deployment via FastDeploy, with vLLM and Hugging Face Transformers support in progress (see the client sketch after this list)
- Expert routing and quantization for performance
- Heterogeneous architecture includes vision experts, inactive in this text-only release
- Designed for scalable inference across GPU setups
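Because FastDeploy can expose an OpenAI-compatible endpoint when serving the model, a running instance can be queried with the standard `openai` Python client. The sketch below assumes a server is already listening locally; the base URL, port, placeholder API key, and served model name are illustrative assumptions.

```python
# Hedged sketch: querying a locally served ERNIE-4.5 instance through an
# OpenAI-compatible endpoint. Host, port, and api_key are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8180/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="baidu/ERNIE-4.5-21B-A3B-Paddle",
    messages=[{"role": "user", "content": "Summarize what an MoE model is."}],
    temperature=0.8,
)
print(response.choices[0].message.content)
```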