Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding (Imagen)
by Google Brain · free · Last verified 2026-03-17
Introduced Imagen, a text-to-image diffusion model that pairs a large, frozen pretrained language model (T5-XXL) for text encoding with a cascade of diffusion models for image synthesis. The paper showed that scaling the text encoder improves sample fidelity and image-text alignment more than scaling the image diffusion model, and it introduced DrawBench, a benchmark for evaluating text-to-image models.
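The pipeline described above can be sketched at the level of data flow. This is a minimal, shape-only illustration and not the authors' implementation: the names `Stage`, `CASCADE`, `encode_text`, `run_stage`, and `run_cascade` are hypothetical, the text encoder is a whitespace-tokenizer stand-in for the frozen T5-XXL model, and no actual denoising is performed. The 64 → 256 → 1024 resolutions follow the cascade reported in the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    """One diffusion model in the cascade (hypothetical, shape-only)."""
    name: str
    in_res: int   # resolution of the conditioning image; 0 means "start from noise"
    out_res: int  # resolution of the image this stage produces


# Base model followed by two super-resolution models, as in the paper's cascade.
CASCADE = [
    Stage("base", 0, 64),
    Stage("sr_256", 64, 256),
    Stage("sr_1024", 256, 1024),
]


def encode_text(prompt: str) -> List[str]:
    """Stand-in for the frozen text encoder: returns lowercase tokens, not embeddings."""
    return prompt.lower().split()


def run_stage(stage: Stage, text: List[str], current_res: int) -> int:
    """Pretend to run one stage; only checks conditioning and returns the new resolution."""
    if not text:
        raise ValueError("each stage is text-conditioned; the prompt must not be empty")
    if current_res != stage.in_res:
        raise ValueError(f"{stage.name} expects {stage.in_res}px input, got {current_res}px")
    return stage.out_res


def run_cascade(prompt: str) -> List[Tuple[str, int]]:
    """Encode the prompt once, then pass the result through every stage in order."""
    text = encode_text(prompt)  # computed once and shared by all stages
    res = 0
    trace = []
    for stage in CASCADE:
        res = run_stage(stage, text, res)
        trace.append((stage.name, res))
    return trace


if __name__ == "__main__":
    for name, res in run_cascade("a corgi riding a bicycle"):
        print(f"{name}: {res}x{res}")
```

The point of the sketch is the structure: the text representation is computed once by an encoder that is not trained with the image models, and each later stage consumes the previous stage's lower-resolution output together with that same text conditioning.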
https://arxiv.org/abs/2205.11487
B+ (Good)
Adoption: B+ · Quality: A+ · Freshness: B+ · Citations: A · Engagement: F
Specifications
- License: Open Access
- Pricing: free
- Capabilities: text-to-image, cascaded-diffusion, photorealistic-synthesis
- Integrations: google-vertex-ai
- Use Cases: photorealistic-image-generation, creative-ai, commercial-design
- API Available: Yes
- Tags: imagen, text-to-image, diffusion, t5, photorealism
- Added: 2026-03-17
- Completeness: 100%
Index Score: 72.2
- Adoption: 78
- Quality: 95
- Freshness: 75
- Citations: 88
- Engagement: 0