Model · multimodal · v1.0

Cambrian-1

by New York University (NYU) · free · Last verified 2026-03-17

Cambrian-1 is a research vision-language model from NYU focused on spatial intelligence and visual grounding. It introduces the Spatial Vision Aggregator, a connector that fuses features from multiple vision encoders, and achieves strong results on spatial-reasoning and visual-understanding benchmarks. The model, code, and data are released as a fully open research platform for multimodal model development.

https://cambrian-mllm.github.io
Overall grade: C (Below Average)
Adoption: D · Quality: B+ · Freshness: A · Citations: C · Engagement: F

Specifications

License
Apache 2.0
Pricing
free
Capabilities
vision, visual-question-answering, spatial-reasoning, image-understanding, visual-grounding
Integrations
Hugging Face
Use Cases
spatial-intelligence-research, visual-qa, multimodal-research, robotics-perception
API Available
No
Parameters
34B
Context Window
8K
Modalities
text, image
Training Cutoff
2024
Tags
nyu, research, vision-language, spatial-intelligence, open-source
Added
2026-03-17
Completeness
95%

Index Score

41.8
Adoption
35
Quality
79
Freshness
80
Citations
48
Engagement
0
