Benchmark · AI for Code · v1.0

Codeforces Benchmark

by Codeforces / Community · open-source · Last verified 2026-03-01

Evaluates models on competitive programming problems from the Codeforces platform across difficulty ratings. Tests algorithmic thinking, data structure knowledge, and the ability to produce correct and efficient solutions under competitive constraints.

https://codeforces.com

Overall grade: C (Below Average)

Adoption: B · Quality: A · Freshness: A · Citations: F · Engagement: F

Specifications

License: CC-BY-4.0
Pricing: open-source
Capabilities: model-evaluation, algorithmic-testing, competitive-programming-assessment
Integrations: codeforces-api
Use Cases: algorithmic-ability-testing, competitive-programming-evaluation, research
API Available: No
Evaluated Models: claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
Metrics: pass-rate, elo-rating
Methodology: Problems are drawn from Codeforces with difficulty ratings from 800 to 3000. Models generate solutions, which the Codeforces online judge checks for correctness and for compliance with each problem's time and memory limits.
Last Run: 2026-03-01
Tags: benchmark, evaluation, competitive-programming, algorithms, problem-solving
Added: 2026-03-17
Completeness: 80%
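
The two metrics listed above (pass-rate and elo-rating) can be illustrated with a minimal sketch. It assumes per-problem results are available as (problem rating, solved) pairs. The function names, the 400-point bucket size, and the bisection-based rating estimate are hypothetical choices for illustration; the benchmark's actual scoring procedure is not described here and may differ. The expected-solve probability uses the standard Elo logistic formula.

```python
from collections import defaultdict


def pass_rate_by_bucket(results, bucket_size=400):
    """Return the fraction of solved problems per rating bucket.

    results: list of (problem_rating, solved) pairs (assumed input shape).
    """
    totals = defaultdict(lambda: [0, 0])  # bucket start -> [solved, attempted]
    for rating, solved in results:
        bucket = (rating // bucket_size) * bucket_size
        totals[bucket][0] += int(solved)
        totals[bucket][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}


def expected_solve_probability(model_rating, problem_rating):
    """Standard Elo logistic curve: 0.5 when the two ratings are equal."""
    return 1.0 / (1.0 + 10 ** ((problem_rating - model_rating) / 400))


def estimate_rating(results, lo=0.0, hi=4000.0, iterations=50):
    """Bisect for the rating whose expected solve count matches the observed one.

    This is an illustrative estimator, not the benchmark's documented method.
    """
    observed = sum(int(solved) for _, solved in results)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        expected = sum(expected_solve_probability(mid, r) for r, _ in results)
        if expected < observed:
            lo = mid  # model looks stronger than mid
        else:
            hi = mid  # model looks weaker than (or equal to) mid
    return (lo + hi) / 2


# Hypothetical results for one model, invented purely to show the call shape.
sample = [(800, True), (800, True), (1200, True), (1200, False), (2000, False)]
print(pass_rate_by_bucket(sample))
print(round(estimate_rating(sample)))
```

Bucketing by rating keeps easy and hard problems from being averaged together, which matters because a single overall pass rate depends heavily on the mix of problem ratings sampled.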

