Paper · LLMs · v1.0

Gemini: A Family of Highly Capable Multimodal Models

by Google DeepMind · freemium · Last verified 2026-03-17

This paper introduces the Gemini family of multimodal models (Ultra, Pro, and Nano), trained natively to process and combine text, images, audio, and video. It reports that Gemini Ultra is the first model to reach human-expert performance on MMLU and that it advances the state of the art on 30 of the 32 benchmarks evaluated.

https://arxiv.org/abs/2312.11805

C+ (Average)

Adoption: A+ · Quality: A+ · Freshness: A · Citations: F · Engagement: F

Specifications

License: Proprietary
Pricing: freemium
Capabilities: text-generation, image-understanding, audio-understanding, video-understanding, code-generation
Integrations: google-ai-studio, vertex-ai
Use Cases: multimodal-reasoning, code-generation, document-understanding, on-device-inference
API Available: Yes
Tags: gemini, multimodal, google, deepmind, foundation-model
Added: 2026-03-17
Completeness: 100%
