MLPerf Inference
by MLCommons · open-source · Last verified 2026-03-17
MLPerf Inference is the industry-standard benchmark for measuring AI inference performance across hardware platforms. It covers image classification, object detection, NLP, speech recognition, and generative AI workloads, enabling apples-to-apples comparison of accelerators and inference stacks.
https://mlcommons.org/benchmarks/inference/

Overall grade: B+ (Good)
Adoption: A · Quality: A+ · Freshness: A · Citations: A · Engagement: F
Specifications
- License: Apache-2.0
- Pricing: open-source
- Capabilities: evaluation, inference-benchmarking, hardware-evaluation
- Integrations: tensorrt, onnx, triton
- Use Cases: model-evaluation, hardware-selection, inference-optimization
- API Available: No
- Evaluated Models: bert-large, llama-3-70b, stable-diffusion-xl, resnet-50
- Metrics: samples-per-second, latency-ms, tokens-per-second
- Methodology: Standardized scenarios (single-stream, multi-stream, server, offline) are run on physical hardware with reproducible software stacks. Results are submitted to MLCommons and peer-reviewed before publication. Closed and open divisions allow different levels of optimization. A minimal harness sketch follows this list.
- Last Run: 2026-02-25
- Tags: inference, throughput, latency, hardware, mlcommons
- Added: 2026-03-17
- Completeness: 100%
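The four scenarios above correspond to `TestScenario` values in MLPerf's LoadGen, the load-generation library that drives every official run. Below is a minimal sketch of an Offline performance run using the `mlperf_loadgen` Python bindings; `run_model`, the sample counts, and the no-op RAM callbacks are hypothetical stand-ins, and binding signatures can vary between LoadGen releases.

```python
# Minimal MLPerf LoadGen harness sketch (Offline scenario).
# Assumes the mlperf_loadgen Python bindings are installed;
# run_model is a hypothetical stand-in for a real backend.
import array

import mlperf_loadgen as lg


def run_model(sample_index: int) -> bytes:
    # Hypothetical backend call: return the raw prediction bytes
    # for one preloaded sample.
    return b"\x00\x00\x00\x00"


def issue_queries(query_samples):
    # LoadGen hands over a batch of QuerySample objects; answer each
    # with a QuerySampleResponse pointing at the prediction buffer.
    responses = []
    buffers = []  # keep buffers alive until QuerySamplesComplete returns
    for qs in query_samples:
        buf = array.array("B", run_model(qs.index))
        buffers.append(buf)
        addr, nitems = buf.buffer_info()
        responses.append(lg.QuerySampleResponse(qs.id, addr, nitems * buf.itemsize))
    lg.QuerySamplesComplete(responses)


def flush_queries():
    pass  # nothing is batched or buffered in this sketch


settings = lg.TestSettings()
settings.scenario = lg.TestScenario.Offline  # SingleStream / MultiStream / Server also exist
settings.mode = lg.TestMode.PerformanceOnly

TOTAL_SAMPLES = 1024  # hypothetical dataset size
PERF_SAMPLES = 256    # hypothetical count of samples held in memory

sut = lg.ConstructSUT(issue_queries, flush_queries)
qsl = lg.ConstructQSL(
    TOTAL_SAMPLES,
    PERF_SAMPLES,
    lambda indices: None,  # load samples to RAM: no-op in this sketch
    lambda indices: None,  # unload samples from RAM: no-op
)

lg.StartTest(sut, qsl, settings)
lg.DestroyQSL(qsl)
lg.DestroySUT(sut)
```

In an actual submission, each scenario adds its own constraints (Server, for instance, enforces a tail-latency bound), and accuracy is checked in a separate `lg.TestMode.AccuracyOnly` pass.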
Index Score: 73.1
- Adoption: 85
- Quality: 93
- Freshness: 85
- Citations: 82
- Engagement: 0