Paper · training · v1.0

DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter

by Hugging Face · free · Last verified 2026-03-17

Introduces DistilBERT, a knowledge-distilled version of BERT that retains 97% of BERT's language understanding capabilities while being 40% smaller and 60% faster at inference. Demonstrates that task-agnostic knowledge distillation, applied during pretraining rather than on a downstream task, is an effective way to compress pretrained language models.

https://arxiv.org/abs/1910.01108
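
The distillation the summary refers to trains the smaller student model to match the teacher's output distribution. Below is a minimal sketch of a temperature-scaled soft-target loss of that kind, written against PyTorch; the function name, tensor shapes, temperature, and usage are illustrative assumptions, not the paper's training code, which additionally combines a masked-language-modeling loss and a cosine embedding loss on hidden states.

```python
import torch
import torch.nn.functional as F

def soft_target_loss(student_logits: torch.Tensor,
                     teacher_logits: torch.Tensor,
                     temperature: float = 2.0) -> torch.Tensor:
    """Temperature-scaled distillation loss (illustrative sketch only).

    Both tensors have shape (batch, vocab_size). The student is pushed toward
    the teacher's softened output distribution; the temperature**2 factor keeps
    gradient magnitudes comparable across temperatures.
    """
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, soft_targets,
                    reduction="batchmean") * temperature ** 2

# Hypothetical usage with random logits standing in for real model outputs.
student_logits = torch.randn(8, 30522)   # 30522 = BERT's WordPiece vocabulary size
teacher_logits = torch.randn(8, 30522)
loss = soft_target_loss(student_logits, teacher_logits)
```
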
Overall Grade: B (Above Average)
Adoption: A · Quality: A · Freshness: C · Citations: B+ · Engagement: F

Specifications

License
Apache 2.0
Pricing
free
Capabilities
knowledge-distillation, efficient-inference, text-classification
Integrations
huggingface-transformers (see the usage sketch after this list)
Use Cases
edge-deployment, mobile-nlp, production-inference, text-classification
API Available
No
Tags
distilbert, knowledge-distillation, bert, efficient, compression, huggingface
Added
2026-03-17
Completeness
100%
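
The Integrations and Use Cases fields above point to loading DistilBERT through the Hugging Face transformers library for text classification. The sketch below assumes the transformers package is installed and uses the publicly available distilbert-base-uncased-finetuned-sst-2-english sentiment checkpoint as an example; any DistilBERT sequence-classification checkpoint from the Hub can be substituted.

```python
from transformers import pipeline

# Text classification with a DistilBERT checkpoint fine-tuned on SST-2.
# The checkpoint name is an example, not a requirement of the paper.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("DistilBERT keeps most of BERT's accuracy at a fraction of the cost.")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The reduced size and latency are what make the listed edge-deployment and mobile-nlp use cases practical relative to the full BERT model.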

Index Score

69.9
Adoption
85
Quality
82
Freshness
40
Citations
78
Engagement
0
