Benchmark · LLMs · v2024-08

LiveBench

by LiveBench Team · open-source · Last verified 2026-03-01

Continuously updated benchmark with new questions released monthly to prevent data contamination. Covers math, coding, reasoning, language, data analysis, and instruction following with automatically verifiable answers that do not require LLM judges.

https://livebench.ai
Grade: B (Above Average)
Adoption: B · Quality: A · Freshness: A+ · Citations: B · Engagement: F

Specifications

License: Apache-2.0
Pricing: open-source
Capabilities: model-evaluation, contamination-free-testing, dynamic-assessment
Integrations: livebench-api
Use Cases: contamination-free-evaluation, ongoing-model-comparison, research
API Available: No
Evaluated Models: claude-4, gpt-5, gemini-2.5-pro, deepseek-v3, llama-4-405b
Metrics: global-average, math-score, coding-score, reasoning-score
Methodology: Monthly-refreshed question sets across 6 categories. All answers programmatically verifiable without LLM judges. Questions sourced from recent events to prevent training contamination.
Last Run: 2026-03-15
Tags: benchmark, evaluation, live, contamination-free, dynamic
Added: 2026-03-17
Completeness: 100%
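
The Methodology entry above describes answers that are verified programmatically rather than by an LLM judge, with per-category scores rolled up into a global average. The sketch below illustrates that idea only. It assumes a simple normalized exact-match check and an unweighted mean over categories; the function names `normalize` and `score_responses` and the record layout are hypothetical and are not taken from the LiveBench codebase, which may use task-specific scoring functions instead.

```python
from collections import defaultdict
from statistics import mean


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(answer.strip().lower().split())


def score_responses(records):
    """Score (category, model_answer, ground_truth) triples by exact match.

    Returns per-category scores on a 0-100 scale and their unweighted mean.
    Assumption: every category contributes equally to the global average.
    """
    per_category = defaultdict(list)
    for category, model_answer, ground_truth in records:
        correct = normalize(model_answer) == normalize(ground_truth)
        per_category[category].append(1.0 if correct else 0.0)

    category_scores = {c: 100 * mean(v) for c, v in per_category.items()}
    global_average = mean(list(category_scores.values()))
    return category_scores, global_average


# Hypothetical records, for illustration only.
records = [
    ("math", "42", " 42 "),
    ("math", "7", "8"),
    ("coding", "Foo", "foo"),
]
category_scores, global_average = score_responses(records)
print(category_scores, global_average)
```

Because the check is a deterministic string comparison, rerunning it on the same responses always gives the same score, which is the property that removes the need for a judge model.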

Index Score

60.3
Adoption: 68
Quality: 88
Freshness: 96
Citations: 62
Engagement: 0
