distilbert-base-uncased is a compact, faster alternative to BERT base obtained through knowledge distillation. It retains 97% of BERT's language understanding performance while being 40% smaller and 60% faster. Pretrained on English Wikipedia and BookCorpus, it was distilled with BERT base as the teacher model using three training objectives: distillation loss, masked language modeling (MLM), and cosine embedding loss. The model is uncased (it treats "english" and "English" identically) and is suited to fine-tuning on downstream NLP tasks such as sequence classification, token classification, and question answering. While efficient, it inherits the biases present in the original BERT model. DistilBERT is released under the Apache 2.0 license and is compatible with PyTorch, TensorFlow, and JAX.
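Because the checkpoint was pretrained with an MLM objective, it can be exercised directly through the Hugging Face transformers library. The snippet below is a minimal sketch, assuming transformers and a PyTorch backend are installed; the example sentence is illustrative and not part of the project description.

```python
from transformers import pipeline

# Load distilbert-base-uncased into a fill-mask pipeline
# (the checkpoint is downloaded from the Hugging Face Hub on first use).
unmasker = pipeline("fill-mask", model="distilbert-base-uncased")

# The model ranks candidate tokens for the [MASK] position.
predictions = unmasker("Hello, I'm a [MASK] model.")
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```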

Features

  • 40% smaller and 60% faster than BERT base
  • Trained with distillation, MLM, and cosine loss
  • Retains 97% of BERT's performance on the GLUE benchmark
  • Pretrained on BookCorpus and English Wikipedia
  • Uncased: capitalization is ignored
  • Ideal for fine-tuning on classification and QA tasks (see the sketch after this list)
  • Available for PyTorch, TensorFlow, and JAX
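
The following sketch illustrates the fine-tuning point above by loading the checkpoint with a sequence-classification head in PyTorch. It assumes transformers and torch are installed; num_labels=2 and the sample sentence are illustrative choices, not part of the original project description.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Tokenizer and model share the same checkpoint name; the classification
# head is randomly initialized and would be trained on a downstream task.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=2,  # illustrative label count for a binary task
)

# Forward pass on a single example; logits have shape (batch_size, num_labels).
inputs = tokenizer("DistilBERT is smaller and faster than BERT.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.shape)  # torch.Size([1, 2])
```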

Categories

AI Models

Additional Project Details

Registered: 2025-07-01