InternVL 2

by Shanghai AI Laboratory · open-source · Last verified 2026-03-17

InternVL 2 is Shanghai AI Laboratory's open-source vision-language model family, reported to reach GPT-4V-level performance on multimodal benchmarks. It supports dynamic input resolution and is trained progressively, with configurations ranging from 1B to 108B parameters.

https://github.com/OpenGVLab/InternVL ↗

Overall grade: C (Below Average)

Adoption: C
Quality: B+
Freshness: C+
Citations: C+
Engagement: F

Specifications

License: MIT
Pricing: open-source
Capabilities: image-understanding, visual-reasoning, dynamic-resolution, multi-image-understanding, ocr
Integrations: huggingface, vllm, transformers, lmdeploy
Use Cases: visual-qa, document-understanding, chart-analysis, multimodal-research
API Available: No
Parameters: 108B
Context Window: 32K tokens
Modalities: text, image
Training Cutoff: Mid 2024
Tags: multimodal, vision, open-source, shanghai-ai-lab, scalable
Added: 2026-03-17
Completeness: 90%
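
To put the parameter range in practical terms, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone at common precisions. It takes the 1B and 108B figures listed on this page at face value. The helper name `weight_memory_gb` and the precision table are illustrative assumptions and are not part of InternVL or any of the listed integrations. Activations, the KV cache, and the vision encoder's runtime overhead are not counted, so real requirements will be higher.

```python
# Hypothetical helper, for illustration only: estimates weight storage,
# not total inference memory.
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes).

    params_billion * 1e9 parameters * bytes_per_param bytes / 1e9 bytes per GB
    simplifies to params_billion * bytes_per_param.
    """
    return params_billion * bytes_per_param


# Bytes per parameter at common storage precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

if __name__ == "__main__":
    # Smallest and largest configurations as listed on this page.
    for size_b in (1, 108):
        for precision, nbytes in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(size_b, nbytes)
            print(f"{size_b}B @ {precision}: ~{gb:.1f} GB of weights")
```

Under these assumptions, the largest listed configuration needs on the order of 216 GB for weights at 16-bit precision, which generally implies multi-GPU serving or quantization, while the smallest fits comfortably on a single consumer GPU.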
