Paper · Computer Vision · v1.0

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)

by Google Brain · free · Last verified 2026-03-17

Introduces Imagen, a text-to-image diffusion model that pairs a large, frozen pretrained language model (T5-XXL) for text encoding with a cascade of diffusion models for image synthesis. The paper finds that scaling the text encoder improves results more than scaling the image diffusion model, and it introduces DrawBench as a new evaluation benchmark.
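The data flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every function and class name here (`encode_text`, `base_diffusion`, `super_resolution`, `run_cascade`, `ImageStub`) is hypothetical, the stages are stubs that track only image size, and the 64 → 256 → 1024 resolutions are the ones reported in the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Side lengths of the cascade as reported in the Imagen paper:
# a 64x64 base model followed by two super-resolution stages.
CASCADE_RESOLUTIONS: Tuple[int, ...] = (64, 256, 1024)


@dataclass(frozen=True)
class TextEmbedding:
    """Placeholder for the output of a frozen text encoder such as T5-XXL."""
    tokens: Tuple[str, ...]


@dataclass(frozen=True)
class ImageStub:
    """Tracks only the side length of a square image, not pixel data."""
    side: int


def encode_text(prompt: str) -> TextEmbedding:
    # Hypothetical stand-in: a real system would run a frozen language model.
    return TextEmbedding(tokens=tuple(prompt.lower().split()))


def base_diffusion(embedding: TextEmbedding, side: int) -> ImageStub:
    # Hypothetical stand-in for the text-conditioned base diffusion model.
    return ImageStub(side=side)


def super_resolution(image: ImageStub, embedding: TextEmbedding, side: int) -> ImageStub:
    # Hypothetical stand-in for a text-conditioned super-resolution model.
    if side <= image.side:
        raise ValueError("each cascade stage must increase resolution")
    return ImageStub(side=side)


def run_cascade(prompt: str) -> List[ImageStub]:
    # The prompt is encoded once and the same embedding conditions every stage.
    embedding = encode_text(prompt)
    outputs = [base_diffusion(embedding, CASCADE_RESOLUTIONS[0])]
    for side in CASCADE_RESOLUTIONS[1:]:
        outputs.append(super_resolution(outputs[-1], embedding, side))
    return outputs
```

Calling `run_cascade("a photo of a cat")` returns one stub per stage, which makes the structure easy to see: the text is encoded once, and each later stage consumes the previous stage's output together with the same embedding.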

https://arxiv.org/abs/2205.11487

C+ (Average)

Adoption: B+ · Quality: A+ · Freshness: B+ · Citations: F · Engagement: F

Specifications

License: Open Access
Pricing: free
Capabilities: text-to-image, cascaded-diffusion, photorealistic-synthesis
Integrations: google-vertex-ai
Use Cases: photorealistic-image-generation, creative-ai, commercial-design
API Available: Yes
Tags: imagen, text-to-image, diffusion, t5, photorealism
Added: 2026-03-17
Completeness: 100%

