LiveBench
by LiveBench Team · open-source · Last verified 2026-03-01
A continuously updated benchmark whose question sets are refreshed monthly to prevent data contamination. It covers math, coding, reasoning, language, data analysis, and instruction following, with automatically verifiable answers that require no LLM judge.
https://livebench.ai
Overall Grade: B (Above Average)
Adoption: B · Quality: A · Freshness: A+ · Citations: B · Engagement: F
Specifications
- License
- Apache-2.0
- Pricing
- open-source
- Capabilities
- model-evaluation, contamination-free-testing, dynamic-assessment
- Integrations
- livebench-api
- Use Cases
- contamination-free-evaluation, ongoing-model-comparison, research
- API Available
- No
- Evaluated Models
- claude-4, gpt-5, gemini-2.5-pro, deepseek-v3, llama-4-405b
- Metrics
- global-average, math-score, coding-score, reasoning-score
- Methodology
- Monthly-refreshed question sets across 6 categories. All answers programmatically verifiable without LLM judges. Questions sourced from recent events to prevent training contamination.
- Last Run
- 2026-03-15
- Tags
- benchmark, evaluation, live, contamination-free, dynamic
- Added
- 2026-03-17
- Completeness
- 100%
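The methodology above (programmatic verification, per-category scores, a global average) can be sketched as a minimal exact-match grader. The question fields (`category`, `ground_truth`, `model_answer`) and the equal-weight averaging are illustrative assumptions, not LiveBench's actual release format or scoring scheme.

```python
def grade(questions):
    """Score model answers by exact match, per category and overall.

    `questions` is a list of dicts with keys category, ground_truth,
    and model_answer (assumed field names). Returns
    (per_category_scores, global_average), each on a 0-100 scale.
    """
    totals, correct = {}, {}
    for q in questions:
        cat = q["category"]
        totals[cat] = totals.get(cat, 0) + 1
        # Programmatic verification: normalized exact match, no LLM judge.
        if q["model_answer"].strip().lower() == q["ground_truth"].strip().lower():
            correct[cat] = correct.get(cat, 0) + 1
    per_cat = {c: 100.0 * correct.get(c, 0) / n for c, n in totals.items()}
    # The global average weights each category equally -- an assumption here.
    global_avg = sum(per_cat.values()) / len(per_cat)
    return per_cat, global_avg

# Illustrative toy run with made-up questions:
sample = [
    {"category": "math", "ground_truth": "42", "model_answer": "42"},
    {"category": "math", "ground_truth": "7", "model_answer": "8"},
    {"category": "coding", "ground_truth": "true", "model_answer": "True"},
]
scores, avg = grade(sample)
print(scores, avg)  # math 50.0, coding 100.0, average 75.0
```

Because every answer is checked mechanically, results are reproducible and cheap to recompute when a new monthly question set lands.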
Index Score
- Overall: 60.3
- Adoption: 68
- Quality: 88
- Freshness: 96
- Citations: 62
- Engagement: 0
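One plausible way the numeric sub-scores above map to the letter grades in the header is a simple threshold table. The cutoffs below are assumptions chosen only to be consistent with the displayed pairs (96 → A+, 88 → A, 68 → B, 62 → B, 0 → F); the index's real thresholds are not published here.

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 sub-score to a letter grade (assumed cutoffs)."""
    cutoffs = [(95, "A+"), (85, "A"), (60, "B"), (40, "C")]  # assumption
    for floor, grade in cutoffs:
        if score >= floor:
            return grade
    return "F"

for s in (96, 88, 68, 62, 0):
    print(s, letter_grade(s))  # A+, A, B, B, F
```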