
Training Compute-Optimal Large Language Models (Chinchilla)

by DeepMind · free · Last verified 2026-03-17

Challenges the Kaplan et al. scaling laws by showing that, for compute-optimal training, model size and training-token count should be scaled in equal proportion. Trains Chinchilla (70B parameters) on 4× more data than Gopher under the same compute budget, matching or beating models 4× its size and redefining compute-optimal training strategies.

https://arxiv.org/abs/2203.15556
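As a rough illustration of the paper's headline rule (a sketch, not code from the paper): combining the standard C ≈ 6·N·D approximation for training FLOPs with the roughly 20-tokens-per-parameter ratio commonly read off Chinchilla's results lets you back out a compute-optimal (parameters, tokens) pair from a FLOP budget. Both the 6·N·D formula and the 20:1 ratio are heuristics assumed here, not exact constants from the paper.

```python
import math

def compute_optimal_allocation(flops_budget: float,
                               tokens_per_param: float = 20.0):
    """Split a training FLOP budget C into a compute-optimal
    (parameters N, training tokens D) pair, Chinchilla-style.

    Assumes C ~= 6 * N * D and D ~= tokens_per_param * N; the
    ~20 tokens/parameter ratio is the commonly cited heuristic
    implied by the paper's results, not an exact constant.
    """
    # C = 6 * N * D with D = r * N  =>  N = sqrt(C / (6 * r))
    n_params = math.sqrt(flops_budget / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla's approximate budget (same as Gopher's):
# 6 FLOPs/param/token * 70e9 params * 1.4e12 tokens ~= 5.9e23 FLOPs
budget = 6 * 70e9 * 1.4e12
n, d = compute_optimal_allocation(budget)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")  # ~7.0e10 and ~1.4e12
```

Under this rule both N and D grow as √C, so a 10× compute increase implies a roughly 3.2× larger model trained on 3.2× more tokens; Kaplan et al. instead put most of the extra compute into model size, which is the allocation the paper corrects.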
Grade: B+ (Good)
Adoption: A · Quality: A+ · Freshness: B · Citations: A · Engagement: F

Specifications

License: Open Access
Pricing: free
Capabilities: language-modeling, reasoning, scaling-analysis
Integrations: (none listed)
Use Cases: language-modeling, compute-optimal-training, research
API Available: No
Tags: chinchilla, scaling-laws, compute-optimal, deepmind, training, foundational
Added: 2026-03-17
Completeness: 100%

Index Score

Overall: 75.4
Adoption: 85
Quality: 97
Freshness: 62
Citations: 88
Engagement: 0
