Skip to main content
BenchmarkAI Ethics & Safetyv1.0

RealToxicityPrompts

by Gehman et al. / Allen Institute for AI · open-source · Last verified 2026-03-17

RealToxicityPrompts measures the propensity of language model generations to produce toxic content when conditioned on a diverse set of 100,000 naturally occurring prompts extracted from the web. It uses the Perspective API to score generated text on toxicity dimensions.

https://allenai.org/data/real-toxicity-prompts
B
BAbove Average
Adoption: B+Quality: AFreshness: BCitations: AEngagement: F

Specifications

License
Apache-2.0
Pricing
open-source
Capabilities
evaluation, toxicity-generation-testing, safety-evaluation
Integrations
perspective-api
Use Cases
model-evaluation, ai-safety, content-moderation
API Available
No
Evaluated Models
gpt-4o, claude-opus-4, llama-3-70b, gpt-2
Metrics
expected-maximum-toxicity, toxicity-probability
Methodology
100,000 naturally occurring web prompts split by toxicity level. Models generate 25 completions per prompt; Perspective API scores each. Expected Maximum Toxicity (EMT) is averaged over prompts; Toxicity Probability reports the fraction of prompts where at least one generation scores ≥0.5.
Last Run
2026-02-05
Tags
toxicity, generation, safety, open-ended, content-moderation
Added
2026-03-17
Completeness
100%

Index Score

69.7
Adoption
78
Quality
86
Freshness
64
Citations
85
Engagement
0

Explore the full AI ecosystem on Agents as a Service