Paper · LLMs · v1.0

Flamingo: a Visual Language Model for Few-Shot Learning

by DeepMind · free · Last verified 2026-03-17

The paper introduces Flamingo, a family of visual language models that bridge powerful pretrained vision and language models. Trained on arbitrarily interleaved sequences of images, video, and text, the models can pick up a diverse range of multimodal tasks from only a few examples. Flamingo set a new few-shot state of the art on 16 benchmarks.
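
As a rough illustration of what "interleaved" few-shot prompting means, the sketch below assembles a prompt in which each support example contributes one image followed by its text, and the query image comes last with an open-ended prefix for the model to complete. The `<image>` placeholder string, the `Shot` class, and the `build_few_shot_prompt` helper are hypothetical names chosen for this example; they are not an API published with the paper, and no model is called.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: a plain-text placeholder marks where each image sits in the sequence.
IMAGE_TAG = "<image>"


@dataclass
class Shot:
    """One support example: an image and the text that should follow it."""
    image_path: str
    text: str


def build_few_shot_prompt(
    shots: List[Shot], query_image: str, query_prefix: str = ""
) -> Tuple[str, List[str]]:
    """Interleave support examples and a query into one text prompt.

    Returns the prompt text and the image paths in the order their
    placeholders appear, so the two can be aligned downstream.
    """
    parts: List[str] = []
    images: List[str] = []
    for shot in shots:
        images.append(shot.image_path)
        parts.append(f"{IMAGE_TAG}{shot.text}")
    # The query image goes last; the model would continue after the prefix.
    images.append(query_image)
    parts.append(f"{IMAGE_TAG}{query_prefix}")
    return "\n".join(parts), images


shots = [
    Shot("cat.jpg", "Output: A cat sleeping on a sofa."),
    Shot("dog.jpg", "Output: A dog catching a ball."),
]
prompt, images = build_few_shot_prompt(shots, "bird.jpg", "Output:")
print(prompt)
```

Changing the task (captioning, visual question answering) only changes the text attached to each shot; the interleaving pattern stays the same.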

https://arxiv.org/abs/2204.14198

C+ (Average)

Adoption: B+ · Quality: A+ · Freshness: B+ · Citations: F · Engagement: F

Specifications

License: Open Access
Pricing: free
Capabilities: visual-question-answering, image-captioning, few-shot-learning, video-understanding
Integrations
Use Cases: few-shot-multimodal-tasks, vqa, image-captioning
API Available: No
Tags: flamingo, multimodal, few-shot, vision-language, deepmind
Added: 2026-03-17
Completeness: 100%
