CaseHOLD
by Zheng et al. / Berkeley Law / LexGLUE · free · Last verified 2026-03-17
CaseHOLD is a legal NLP benchmark that evaluates a model's ability to identify the correct holding statement for a US court case. Given a citing context, the model must choose the correct holding from five candidates. The dataset contains over 53,000 multiple-choice questions drawn from US case law, and it is one of the tasks in the LexGLUE benchmark suite for legal AI.
https://huggingface.co/datasets/lex_glue
C+ (Average)
Adoption: B · Quality: A · Freshness: B · Citations: B+ · Engagement: F
Specifications
- License
- CC BY 4.0
- Pricing
- free
- Capabilities
- Legal Reasoning Evaluation, Case Law Analysis, Contextual Understanding of Legal Texts, Precedent Identification, Distinguishing Nuanced Legal Statements, Multiple-Choice Question Answering, Information Retrieval from Legal Documents
- Integrations
- API Available
- No
- Evaluated Models
- gpt-4o, claude-opus-4, legal-bert, roberta-large
- Metrics
- accuracy, macro-f1
- Methodology
- Five-way multiple-choice classification: given a legal context with a masked holding, models select the correct holding from five candidates. Evaluated on a 3,900-example test split. Macro-F1 is the primary metric.
- Last Run
- 2025-12-01
- Tags
- legal-nlp, benchmark, case-law, legal-reasoning, multiple-choice, text-classification, lex-glue, us-law, information-retrieval, ai-evaluation
- Added
- 2026-03-17
- Completeness
- 0.9%
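The two metrics named in the Methodology entry can be illustrated with a minimal sketch in plain Python. It assumes each example's gold answer and model prediction are integer indices from 0 to 4 (one per candidate holding). The function names `accuracy` and `macro_f1` and the sample lists are hypothetical, chosen for illustration only, and are not part of any official CaseHOLD or LexGLUE evaluation script.

```python
from typing import Sequence

NUM_CHOICES = 5  # five candidate holdings per example, per the Methodology entry


def _check(gold: Sequence[int], pred: Sequence[int]) -> None:
    """Reject empty or mismatched inputs before scoring."""
    if len(gold) != len(pred) or len(gold) == 0:
        raise ValueError("gold and pred must be non-empty and the same length")


def accuracy(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of examples where the predicted index equals the gold index."""
    _check(gold, pred)
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold: Sequence[int], pred: Sequence[int],
             num_classes: int = NUM_CHOICES) -> float:
    """Unweighted mean of per-class F1 over all answer positions.

    Assumption: a class that never appears in gold or pred contributes
    an F1 of 0.0 instead of being dropped from the average.
    """
    _check(gold, pred)
    f1_scores = []
    for c in range(num_classes):
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is absent
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


if __name__ == "__main__":
    # Hypothetical gold labels and predictions for six examples
    gold = [0, 1, 2, 3, 4, 0]
    pred = [0, 1, 2, 3, 0, 0]
    print("accuracy:", accuracy(gold, pred))
    print("macro-F1:", macro_f1(gold, pred))
```

Averaging over all five answer positions is a design choice of this sketch. Libraries such as scikit-learn average only over labels that actually occur unless told otherwise, so scores can differ slightly on small samples.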
Index Score
58.8
Adoption
61
Quality
83
Freshness
62
Citations
71
Engagement
0