Paper · Computer Vision · v1.0

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)

by Google Brain · free · Last verified 2026-03-17

Introduces Imagen, a text-to-image diffusion model that pairs a large, frozen pretrained language model (T5-XXL) as the text encoder with a cascade of diffusion models for image synthesis. The paper finds that scaling the text encoder improves sample fidelity and image-text alignment more than scaling the image diffusion model, and it introduces DrawBench, a benchmark of text prompts for evaluating text-to-image models.
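As a rough illustration of the cascaded design described above, the following is a shape-level sketch of the data flow only, not the paper's implementation. It assumes the three resolutions reported in the paper (64, then 256, then 1024 pixels per side); the `Stage` class, the `encode_text` stand-in, the whitespace tokenization, and the `embed_dim` value are all hypothetical placeholders, and no actual diffusion or denoising is performed.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class Stage:
    """One stage of a cascade (hypothetical helper, not from the paper)."""
    name: str
    in_res: int   # expected input resolution; 0 means "starts from noise"
    out_res: int  # resolution of the image this stage produces


# Resolutions as reported for Imagen: a base model followed by two
# text-conditional super-resolution models.
IMAGEN_CASCADE = [
    Stage("base", 0, 64),
    Stage("super_res_1", 64, 256),
    Stage("super_res_2", 256, 1024),
]


def encode_text(prompt: str, embed_dim: int = 8) -> Shape:
    """Stand-in for the frozen text encoder.

    Returns only the shape (tokens, embed_dim) of the embedding sequence.
    Whitespace tokenization and embed_dim=8 are assumptions for illustration.
    """
    return (len(prompt.split()), embed_dim)


def run_cascade(prompt: str, stages: List[Stage]) -> List[Tuple[str, Shape, Shape]]:
    """Trace which shapes flow through each stage of the cascade.

    The text is encoded once, and the same embedding conditions every stage.
    Each stage must accept the resolution the previous stage produced.
    """
    text_shape = encode_text(prompt)
    trace = []
    current_res = 0  # the base stage starts from noise, not from an image
    for stage in stages:
        if stage.in_res != current_res:
            raise ValueError(
                f"{stage.name} expects {stage.in_res}px input, got {current_res}px"
            )
        current_res = stage.out_res
        trace.append((stage.name, text_shape, (current_res, current_res, 3)))
    return trace


if __name__ == "__main__":
    for name, text_shape, image_shape in run_cascade(
        "a corgi riding a bike", IMAGEN_CASCADE
    ):
        print(name, "text:", text_shape, "image:", image_shape)
```

The point of the sketch is the wiring: one text encoding is reused across stages, and each super-resolution stage consumes the previous stage's output, so a mismatched stage order fails fast.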

https://arxiv.org/abs/2205.11487
Overall grade: B+ (Good)
Adoption: B+ · Quality: A+ · Freshness: B+ · Citations: A · Engagement: F

Specifications

License: Open Access
Pricing: free
Capabilities: text-to-image, cascaded-diffusion, photorealistic-synthesis
Integrations: google-vertex-ai
Use Cases: photorealistic-image-generation, creative-ai, commercial-design
API Available: Yes
Tags: imagen, text-to-image, diffusion, t5, photorealism
Added: 2026-03-17
Completeness: 100%

Index Score

Overall: 72.2
Adoption: 78
Quality: 95
Freshness: 75
Citations: 88
Engagement: 0
