MGSM
by Google Research · free · Last verified 2026-03-01
MGSM (Multilingual Grade School Math) is a benchmark for evaluating the mathematical reasoning of large language models across multiple languages. It consists of 250 grade-school math problems from the GSM8K dataset, professionally translated into ten typologically diverse languages, including low-resource ones like Swahili and Telugu.
https://github.com/google-research/url-nlp/tree/main/mgsm
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B · Engagement: F
Specifications
- License
- Apache-2.0
- Pricing
- free
- Capabilities
- Evaluating multilingual mathematical reasoning; benchmarking large language models (LLMs); assessing cross-lingual transfer learning; testing numerical and algebraic reasoning skills; supporting evaluation in 10 languages (Bengali, Chinese, French, German, Japanese, Russian, Spanish, Swahili, Telugu, and Thai); analyzing model performance on low-resource languages
- API Available
- No
- Evaluated Models
- claude-4, gpt-5, gemini-2.5-pro, deepseek-v3
- Metrics
- average-accuracy, per-language-accuracy
- Methodology
- 250 GSM8K problems translated into 10 languages by professional translators. Models solve problems in each language with chain-of-thought prompting.
- Last Run
- 2026-02-01
- Tags
- benchmark, evaluation, math, multilingual, reasoning, llm-evaluation, cross-lingual-transfer, grade-school-math, numerical-reasoning, natural-language-understanding
- Added
- 2026-03-17
- Completeness
- 0.8%
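The two metrics listed above, per-language-accuracy and average-accuracy, can be sketched in a few lines. This is a minimal illustration, not the official MGSM scoring script: the helper names (`extract_final_number`, `score`), the rule of taking the last number in a chain-of-thought response as the model's answer, and the toy records are all assumptions made for the example.

```python
import re
from collections import defaultdict


def extract_final_number(response):
    """Return the last number in a response as a float, or None if there is none.

    Assumption: the final numeric token of a chain-of-thought response is the answer.
    """
    matches = re.findall(r"-?\d[\d,]*(?:\.\d+)?", response)
    if not matches:
        return None
    # Drop thousands separators such as "1,250" before converting.
    return float(matches[-1].replace(",", ""))


def score(records):
    """Compute per-language accuracy and its unweighted mean.

    `records` is an iterable of (language, model_response, gold_answer) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, response, gold in records:
        total[lang] += 1
        pred = extract_final_number(response)
        if pred is not None and abs(pred - gold) < 1e-6:
            correct[lang] += 1
    per_language = {lang: correct[lang] / total[lang] for lang in total}
    average = sum(per_language.values()) / len(per_language)
    return per_language, average


# Hypothetical toy records, not real MGSM items or model outputs.
records = [
    ("es", "Paso 1: 16 - 3 - 4 = 9. Paso 2: 9 * 2 = 18. La respuesta es 18.", 18),
    ("es", "La respuesta es 5.", 7),
    ("sw", "Jibu ni 1,250.", 1250),
]
per_language, average = score(records)
print(per_language, average)
```

The average here is the unweighted mean over languages. Because every MGSM language has the same 250 problems, this equals the pooled accuracy over all items on a full run; on partial runs the two can differ.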
Index Score
61.4
Adoption
70
Quality
82
Freshness
76
Citations
68
Engagement
0