Paper · ai-evaluation · v1.0

Holistic Evaluation of Text-To-Image Models

by Stanford CRFM · free · Last verified 2026-03-17

Presents HEIM, a comprehensive framework for evaluating text-to-image models across 12 aspects, including alignment, quality, aesthetics, bias, and toxicity. The study benchmarks 26 models, finding that no single model excels in all aspects and highlighting significant safety gaps in current generative AI.

https://arxiv.org/abs/2311.04287

C (Below Average)

Adoption: B · Quality: A · Freshness: B · Citations: F · Engagement: F

Specifications

License: Apache-2.0
Pricing: free
Capabilities: text-to-image model evaluation, multi-aspect performance assessment, social bias and fairness auditing, toxicity and safety analysis, image quality and aesthetics scoring, originality and compositionality testing, reasoning and knowledge evaluation, comparative model benchmarking
Integrations
API Available: No
Tags: evaluation, text-to-image, holistic-evaluation, benchmark, multimodal-ai, ai-safety, generative-ai, responsible-ai, model-comparison, ai-ethics
Added: 2026-03-17
Completeness: 0.9%

