Script · AI Tools & APIs · v1.0

LLM Regression Testing

by AaaS · open-source · Last verified 2026-03-01

Detects regressions in LLM behavior across model updates, prompt changes, or configuration modifications. Runs golden test sets, compares outputs using semantic similarity and LLM judges, and flags significant quality degradation with detailed diff reports.
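As a rough illustration of the golden-set comparison described above, the sketch below scores each candidate output against its golden output and flags cases that fall below a similarity threshold. It is not the script's actual code: `toy_embed`, `find_regressions`, `CaseResult`, the sample data, and the 0.8 threshold are all hypothetical, and the bag-of-words embedding is only a stand-in so the example runs with the standard library. A real run would presumably swap in a sentence-transformers model and an LLM judge.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: bag-of-words token counts (NOT a real semantic model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class CaseResult:
    case_id: str
    similarity: float
    regressed: bool


def find_regressions(golden, candidate, threshold=0.8, embed=toy_embed):
    """Compare candidate outputs to golden outputs; flag cases below threshold."""
    results = []
    for case_id, expected in golden.items():
        actual = candidate.get(case_id, "")  # a missing output counts as empty
        sim = cosine(embed(expected), embed(actual))
        results.append(CaseResult(case_id, sim, sim < threshold))
    return results


# Hypothetical golden set and candidate outputs from an updated model.
golden = {
    "greeting": "Hello, how can I help you today?",
    "refund": "Refunds are processed within five business days.",
}
candidate = {
    "greeting": "Hello, how can I help you today?",
    "refund": "I cannot answer that question.",
}

results = find_regressions(golden, candidate)
for r in results:
    print(f"{r.case_id}: similarity={r.similarity:.2f} regressed={r.regressed}")
```

Passing the embedding function as a parameter keeps the comparison logic independent of the model, so the threshold and reporting can be tested without downloading weights.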

https://aaas.blog/script/regression-testing-llm
Grade: C (Below Average)
Adoption: C+ · Quality: A · Freshness: A · Citations: C · Engagement: F

Specifications

License: MIT
Pricing: open-source
Capabilities: golden-set-evaluation, semantic-comparison, llm-judging, regression-detection, diff-reporting
Integrations: openai, anthropic, sentence-transformers, pytest
Use Cases: model-update-validation, prompt-change-testing, quality-monitoring, deployment-gating
API Available: No
Language: python
Dependencies: openai, anthropic, sentence-transformers, pytest, numpy
Environment: Python 3.11+
Est. Runtime: 5-20 minutes depending on test set size
Tags: script, automation, regression, testing, quality
Added: 2026-03-17
Completeness: 100%
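The diff-reporting capability and the deployment-gating use case listed above could look something like the following sketch. It uses only the standard-library `difflib` module; the function names `diff_report` and `gate`, the sample scores, and the threshold are assumptions made for illustration, not the script's documented interface.

```python
import difflib


def diff_report(case_id: str, baseline: str, candidate: str) -> str:
    """Render a unified diff between baseline and candidate outputs for one case."""
    lines = difflib.unified_diff(
        baseline.splitlines(),
        candidate.splitlines(),
        fromfile=f"{case_id}/baseline",
        tofile=f"{case_id}/candidate",
        lineterm="",
    )
    return "\n".join(lines)


def gate(scores: dict, threshold: float = 0.8, max_failures: int = 0) -> int:
    """Return a process exit code: 0 to allow deployment, 1 to block it."""
    failures = [cid for cid, score in scores.items() if score < threshold]
    return 0 if len(failures) <= max_failures else 1


# Hypothetical outputs for one golden case before and after a prompt change.
report = diff_report("refund", "Refunds take five days.", "Refunds take ten days.")
print(report)

# Hypothetical per-case similarity scores; one case falls below the threshold.
exit_code = gate({"greeting": 0.97, "refund": 0.42})
print("exit code:", exit_code)
```

In a CI pipeline, the value returned by `gate` could be passed to `sys.exit` so that a failing run blocks the deployment step.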

Index Score

48.7
Adoption: 52
Quality: 82
Freshness: 80
Citations: 46
Engagement: 0
