Paper · LLMs · v1.0

Gemini: A Family of Highly Capable Multimodal Models

by Google DeepMind · freemium · Last verified 2026-03-17

Introduces the Gemini family of multimodal models (Ultra, Pro, and Nano), trained natively to process and combine text, images, audio, and video. The paper reports that Gemini Ultra is the first model to reach human-expert performance on MMLU and that it advances the state of the art on 30 of the 32 benchmarks evaluated.

https://arxiv.org/abs/2312.11805
Overall grade: B+ (Good)
Adoption: A+ · Quality: A+ · Freshness: A · Citations: A · Engagement: F

Specifications

License: Proprietary
Pricing: freemium
Capabilities: text-generation, image-understanding, audio-understanding, video-understanding, code-generation
Integrations: google-ai-studio, vertex-ai
Use Cases: multimodal-reasoning, code-generation, document-understanding, on-device-inference
API Available: Yes
Tags: gemini, multimodal, google, deepmind, foundation-model
Added: 2026-03-17
Completeness: 100%

Index Score

Overall: 77.8
Adoption: 92
Quality: 95
Freshness: 84
Citations: 88
Engagement: 0
