Skip to main content
DatasetComputer Visionv1.0

ShareGPT4V

by Shanghai AI Lab · open-source · Last verified 2026-03-17

A high-quality instruction-following dataset of 100,000 image-text pairs generated by GPT-4V, designed to supervise open-source large vision-language models during instruction-tuning. The detailed, GPT-4V-generated captions and QA pairs substantially improve open-source vision models' ability to perform complex scene understanding, OCR, and visual reasoning tasks.

https://sharegpt4v.github.io
B
BAbove Average
Adoption: B+Quality: A+Freshness: B+Citations: B+Engagement: F

Specifications

License
Apache-2.0
Pricing
open-source
Capabilities
instruction-tuning, vision-language, image-understanding
Integrations
hugging-face
Use Cases
vision-model-fine-tuning, instruction-tuning, research
API Available
Yes
Tags
multimodal, instruction-tuning, vision-language, gpt-4v, llava
Added
2026-03-17
Completeness
100%

Index Score

65.1
Adoption
70
Quality
93
Freshness
78
Citations
74
Engagement
0

Put AI to work for your business

Deploy this dataset alongside autonomous AaaS agents that handle tasks end-to-end — no babysitting required.

Explore the full AI ecosystem on Agents as a Service