ERNIE-4.5-300B-A47B-PT is a post-trained, text-only Mixture-of-Experts (MoE) model with 300 billion total parameters, of which 47 billion are activated per token. Built on Baidu's ERNIE 4.5 architecture, it inherits pretraining and routing techniques such as modality-isolated routing and token-balanced loss, even though this variant is purely text-focused. Designed for general-purpose natural language understanding and generation, it is post-trained with Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Unified Preference Optimization (UPO). Developers can fine-tune and deploy it with ERNIEKit, or integrate it via Hugging Face Transformers with full support for custom prompts and chat templates, as sketched below. It also supports efficient inference via FastDeploy, with multiple quantized variants (WINT4, WINT8, WINT2, FP8) covering a range of hardware setups.
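The following is a minimal Transformers sketch of loading the model and generating a reply through its chat template. The repo id `baidu/ERNIE-4.5-300B-A47B-PT`, the `trust_remote_code=True` flag, and the dtype/device settings are assumptions based on common Hugging Face conventions; consult the official model card for the exact recommended usage and hardware requirements.

```python
# Minimal sketch: running ERNIE-4.5-300B-A47B-PT with Hugging Face Transformers.
# Repo id, trust_remote_code, and dtype/device settings below are assumptions;
# a model of this size requires multi-GPU sharding in practice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "baidu/ERNIE-4.5-300B-A47B-PT"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Build the prompt with the model's built-in chat template.
messages = [{"role": "user", "content": "Give a short introduction to Mixture-of-Experts models."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```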
Features
- 300B total parameters with 47B activated per token
- Text-only MoE architecture optimized for language understanding and generation tasks
- Post-trained with SFT, DPO, and UPO methods
- Deployable with FastDeploy and Hugging Face Transformers (see the serving sketch after this list)
- Multiple quantization formats: FP8, WINT4/8/2
- Instruction fine-tuning and alignment with ERNIEKit
- Supports context lengths up to 131,072 tokens
- Includes prompt templates for web search in English and Chinese
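Once a (possibly quantized) variant is served with FastDeploy, it can typically be queried over an OpenAI-compatible chat completions endpoint. The sketch below assumes such an endpoint is available locally; the host, port, API key, and model name are placeholders to adjust for your actual deployment.

```python
# Minimal sketch: querying a locally served ERNIE-4.5-300B-A47B-PT instance.
# Assumes FastDeploy exposes an OpenAI-compatible endpoint; the base_url,
# api_key, and model name are illustrative placeholders, not confirmed defaults.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8180/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="ERNIE-4.5-300B-A47B-PT",  # placeholder model name
    messages=[
        {"role": "user", "content": "Summarize the benefits of MoE models in two sentences."}
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```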