Agent Evaluation Framework
by AaaS · open-source · Last verified 2026-03-01
Evaluates AI agent performance across defined test scenarios with success criteria, step tracking, and automated scoring. Supports custom evaluation rubrics and regression detection, and generates detailed reports comparing agent versions over time.
https://aaas.blog/script/agent-evaluation-framework
Grade: C (Below Average)
Adoption: C+ · Quality: A · Freshness: A · Citations: C · Engagement: F
Specifications
- License: MIT
- Pricing: open-source
- Capabilities: scenario-testing, success-criteria-evaluation, step-tracking, regression-detection, report-generation
- Integrations: @anthropic-ai/sdk, openai, vitest, zod
- Use Cases: agent-quality-assurance, regression-testing, capability-assessment, version-comparison
- API Available: No
- Language: TypeScript
- Dependencies: @anthropic-ai/sdk, openai, vitest, zod, winston
- Environment: Node.js 20+
- Est. Runtime: 5-30 minutes, depending on scenario count
- Tags: script, automation, evaluation, testing, agents
- Added: 2026-03-17
- Completeness: 100%
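The scenario-testing, success-criteria-evaluation, and step-tracking capabilities above can be sketched as a minimal evaluation loop. This is an illustrative sketch, not the framework's actual API: the names `Scenario`, `Criterion`, `evaluateScenario`, and `StepRunner` are hypothetical, and the step runner stands in for a real agent call (e.g. via `@anthropic-ai/sdk` or `openai`).

```typescript
// Hypothetical sketch: a scenario lists steps and success criteria,
// and the score is the fraction of criteria that pass.

interface Criterion {
  name: string;
  // Predicate over the agent's transcript of step outputs.
  check: (transcript: string[]) => boolean;
}

interface Scenario {
  name: string;
  steps: string[];       // prompts sent to the agent, in order
  criteria: Criterion[]; // success criteria evaluated afterwards
}

interface Result {
  scenario: string;
  passed: string[];
  failed: string[];
  score: number; // passed / total, in [0, 1]
}

// Stand-in for a real agent call; here it just echoes the prompt.
type StepRunner = (prompt: string) => string;

function evaluateScenario(scenario: Scenario, runStep: StepRunner): Result {
  const transcript = scenario.steps.map(runStep); // step tracking
  const passed: string[] = [];
  const failed: string[] = [];
  for (const c of scenario.criteria) {
    (c.check(transcript) ? passed : failed).push(c.name);
  }
  return {
    scenario: scenario.name,
    passed,
    failed,
    score:
      scenario.criteria.length === 0
        ? 1
        : passed.length / scenario.criteria.length,
  };
}

// Example: a two-step scenario with two criteria, one of which fails.
const demo: Scenario = {
  name: "greeting",
  steps: ["say hello", "say goodbye"],
  criteria: [
    { name: "mentions hello", check: (t) => t.some((s) => s.includes("hello")) },
    { name: "mentions thanks", check: (t) => t.some((s) => s.includes("thanks")) },
  ],
};

const result = evaluateScenario(demo, (p) => `agent output for: ${p}`);
console.log(result.score); // 0.5 with the echo runner above
```

In practice the `check` predicates would be replaced by the framework's rubric definitions (the listed integrations suggest zod for schema validation and vitest as the test harness), but the scoring shape (criteria passed over criteria total) is the common pattern.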
Index Score: 48.3
- Adoption: 50
- Quality: 84
- Freshness: 86
- Citations: 46
- Engagement: 0
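The regression-detection and version-comparison use cases listed above could be sketched as a per-scenario score diff between two evaluation runs. This is a hedged sketch under the assumption that reports expose per-scenario scores; `VersionReport` and `detectRegressions` are illustrative names, not the framework's API.

```typescript
// Hypothetical regression check: flag scenarios whose score dropped
// from the baseline version by more than a tolerance.

interface VersionReport {
  version: string;
  scores: Record<string, number>; // scenario name -> score in [0, 1]
}

function detectRegressions(
  baseline: VersionReport,
  candidate: VersionReport,
  tolerance = 0
): string[] {
  const regressions: string[] = [];
  for (const [scenario, oldScore] of Object.entries(baseline.scores)) {
    const newScore = candidate.scores[scenario];
    // Only compare scenarios present in both runs.
    if (newScore !== undefined && oldScore - newScore > tolerance) {
      regressions.push(scenario);
    }
  }
  return regressions;
}

// Example: "greeting" regressed between v1.0 and v1.1, "math" improved.
const v1: VersionReport = { version: "1.0", scores: { greeting: 1.0, math: 0.8 } };
const v2: VersionReport = { version: "1.1", scores: { greeting: 0.5, math: 0.9 } };
const flagged = detectRegressions(v1, v2);
console.log(flagged); // ["greeting"]
```

A small `tolerance` (e.g. 0.05) avoids flagging noise from non-deterministic agent runs as regressions.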