Benchmark · AI Ethics & Safety · v2.0

CyberSecEval

by Meta AI · open-source · Last verified 2026-03-17

CyberSecEval is Meta's benchmark for measuring cybersecurity risks of LLMs, covering insecure code suggestion, vulnerability exploitation assistance, malware generation, and social engineering attack facilitation. It enables safety teams to quantify the dual-use risk of code-capable models.

https://github.com/meta-llama/PurpleLlama/tree/main/CybersecurityBenchmarks
Overall Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: A · Citations: B+ · Engagement: F

Specifications

License
MIT
Pricing
open-source
Capabilities
evaluation, cybersecurity-risk, safety-evaluation
Integrations
Use Cases
model-evaluation, ai-safety, red-teaming
API Available
No
Evaluated Models
gpt-4o, claude-opus-4, llama-3-70b, gemini-2-5-pro
Metrics
insecure-code-rate, exploit-assist-rate, malware-gen-rate
Methodology
Three evaluation axes: (1) insecure coding practices measured by CWE violation rate in generated code via static analysis; (2) cyberattack assistance measured by compliance rate with attack-setup prompts; (3) prompt injection resistance. Lower scores indicate safer behavior.
Last Run
2026-02-26
Tags
cybersecurity, safety, code, insecure-code, social-engineering
Added
2026-03-17
Completeness
100%
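
The insecure-code axis of the methodology above reduces to a simple aggregate: the fraction of generated snippets that static analysis flags with at least one CWE violation. The sketch below illustrates that aggregation only; the finding format is an assumption for illustration, not CyberSecEval's actual data schema or API.

```python
# Hedged sketch of the insecure-code-rate aggregation described in the
# Methodology field. The per-snippet CWE lists are assumed to come from an
# external static analyzer; their shape here is illustrative, not the
# benchmark's real output format.

def insecure_code_rate(findings_per_snippet):
    """Fraction of generated code snippets with at least one CWE finding.

    findings_per_snippet: one entry per generated snippet, each entry being
    the list of CWE IDs flagged by static analysis for that snippet.
    Lower is safer, matching the benchmark's scoring direction.
    """
    if not findings_per_snippet:
        return 0.0
    flagged = sum(1 for cwes in findings_per_snippet if cwes)
    return flagged / len(findings_per_snippet)

# Example: 4 generated snippets, 2 flagged with CWE violations.
results = [["CWE-89"], [], ["CWE-798", "CWE-327"], []]
print(insecure_code_rate(results))  # → 0.5
```

The exploit-assist and malware-gen rates follow the same pattern, with a compliance judgment per prompt in place of the static-analysis flag.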

Index Score

63.8 (Adoption 70 · Quality 89 · Freshness 85 · Citations 72 · Engagement 0)
