Dataset · LLMs · v1.0

LAION-400M Text Captions

by LAION · free · Last verified 2026-03-17

The text caption component of the LAION-400M dataset: roughly 400 million English alt-text captions scraped from the web. Each caption was kept only if CLIP scored it as sufficiently similar to its corresponding image (a cosine similarity of at least 0.3 between the image and text embeddings, per the LAION announcement linked below). The captions can also be used on their own, without the images, for large-scale NLP and multimodal research.
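
The similarity filter described above can be illustrated with a minimal sketch. It assumes the image and text embeddings have already been computed by some CLIP model and are available as plain lists of floats; the function names `cosine_similarity` and `filter_pairs`, the toy three-dimensional vectors, and the default threshold of 0.3 are illustrative assumptions, not LAION's actual pipeline code.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record type: (caption, image_embedding, text_embedding).
Pair = Tuple[str, Sequence[float], Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def filter_pairs(pairs: Sequence[Pair], threshold: float = 0.3) -> List[str]:
    """Keep captions whose image/text similarity is at least `threshold`.

    The default of 0.3 is an assumption made for this sketch.
    """
    return [
        caption
        for caption, image_emb, text_emb in pairs
        if cosine_similarity(image_emb, text_emb) >= threshold
    ]


# Toy embeddings (made up for illustration; real CLIP vectors are much larger).
pairs = [
    ("a dog on a beach", [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    ("click here", [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
]
kept = filter_pairs(pairs)
print(kept)
```

In this toy run the first caption points in nearly the same direction as its image vector and is kept, while the second is orthogonal to its image vector and is dropped.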

https://laion.ai/blog/laion-400-open-dataset/
Grade: B (Above Average)
Adoption: B+ · Quality: B+ · Freshness: C · Citations: A · Engagement: F

Specifications

License: CC-BY-4.0
Pricing: free
Capabilities: caption-generation, image-text-alignment, concept-grounding, large-scale-language-model-training, multimodal-model-pre-training, visual-question-answering-dataset-creation, zero-shot-classification-research, text-to-image-model-training
Integrations:
Use Cases:
API Available: Yes
Tags: nlp, captions, image-text, multilingual, clip, large-scale, web-scraped, multimodal-research, dataset, natural-language-processing, computer-vision
Added: 2026-03-17
Completeness: 0.8%

Index Score

66.3
Adoption: 74 · Quality: 76 · Freshness: 45 · Citations: 86 · Engagement: 0
