Script · AI Tools & APIs · v1.0

LLM Regression Testing

by AaaS · open-source · Last verified 2026-03-01

Detects regressions in LLM behavior across model updates, prompt changes, or configuration modifications. Runs golden test sets, compares outputs using semantic similarity and LLM judges, and flags significant quality degradation with detailed diff reports.
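The workflow described above can be sketched roughly as follows. This is a minimal illustration and not the tool's actual code: the names `GoldenCase`, `CaseResult`, `run_regression`, and `lexical_similarity`, as well as the 0.8 threshold, are hypothetical. The scorer here uses the standard library's `difflib` purely as a stand-in so the sketch runs without extra dependencies; a real run would presumably plug in an embedding-based similarity or an LLM judge through the `similarity` parameter.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List


@dataclass
class GoldenCase:
    """One entry of a golden test set (hypothetical structure)."""
    case_id: str
    prompt: str
    golden_output: str


@dataclass
class CaseResult:
    """Score and regression flag for one golden case."""
    case_id: str
    score: float
    regressed: bool


def lexical_similarity(a: str, b: str) -> float:
    """Stand-in scorer in [0, 1] based on character overlap.

    Assumption: this replaces the semantic similarity or LLM-judge
    scoring mentioned in the description, only to keep the sketch
    dependency-free.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def run_regression(
    cases: List[GoldenCase],
    generate: Callable[[str], str],
    similarity: Callable[[str, str], float] = lexical_similarity,
    threshold: float = 0.8,
) -> List[CaseResult]:
    """Generate a candidate output per case and flag scores below threshold."""
    results = []
    for case in cases:
        candidate = generate(case.prompt)
        score = similarity(case.golden_output, candidate)
        results.append(CaseResult(case.case_id, score, score < threshold))
    return results


if __name__ == "__main__":
    golden = [
        GoldenCase("capital", "Capital of France?", "Paris is the capital of France."),
        GoldenCase("sum", "What is 2 + 2?", "2 + 2 equals 4."),
    ]
    # Hypothetical outputs from the model version under test.
    new_outputs = {
        "Capital of France?": "Paris is the capital of France.",
        "What is 2 + 2?": "I am not sure.",
    }
    for result in run_regression(golden, lambda prompt: new_outputs[prompt]):
        status = "REGRESSED" if result.regressed else "ok"
        print(f"{result.case_id}: score={result.score:.2f} {status}")
```

Passing the scorer in as a parameter keeps the comparison step swappable, so the same loop could serve both a fast similarity check and a slower judge-based check.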

https://aaas.blog/script/regression-testing-llm
Grade: D (Poor)
Adoption: C+ · Quality: A · Freshness: A · Citations: F · Engagement: F

Specifications

License: MIT
Pricing: open-source
Capabilities: golden-set-evaluation, semantic-comparison, llm-judging, regression-detection, diff-reporting
Integrations: openai, anthropic, sentence-transformers, pytest
Use Cases: model-update-validation, prompt-change-testing, quality-monitoring, deployment-gating
API Available: No
Language: python
Dependencies: openai, anthropic, sentence-transformers, pytest, numpy
Environment: Python 3.11+
Est. Runtime: 5-20 minutes depending on test set size
Tags: script, automation, regression, testing, quality
Added: 2026-03-17
Completeness: 80%
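
For the deployment-gating use case listed above, one plausible shape is a small gate that turns per-case scores into a pass or fail decision. This is a sketch under stated assumptions, not the tool's documented behavior: `gate_deployment`, the 0.8 score threshold, and the 5% tolerated regression rate are all hypothetical choices made for illustration.

```python
from typing import Sequence


def gate_deployment(
    scores: Sequence[float],
    threshold: float = 0.8,
    max_regression_rate: float = 0.05,
) -> bool:
    """Return True if the share of cases scoring below threshold is tolerable.

    Assumption: scores are per-case similarity or judge scores in [0, 1].
    """
    if not scores:
        raise ValueError("no scores to evaluate")
    regressed = sum(1 for score in scores if score < threshold)
    return regressed / len(scores) <= max_regression_rate


def exit_code(scores: Sequence[float]) -> int:
    """Map the gate decision to a process exit code for a CI step."""
    return 0 if gate_deployment(scores) else 1
```

In a pytest suite, the same check could be expressed as a single test asserting that `gate_deployment(scores)` is true, so a failing gate blocks the pipeline like any other failing test.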

Index Score

Overall: 37
Adoption: 52
Quality: 82
Freshness: 80
Citations: 0
Engagement: 0
