From Seeing to Thinking: Decoupling Perception and Reasoning Improves Post-Training of Vision-Language Models
by [unverified] · free · Last verified 2026-06-21T03:07:53.687Z
Recent advances in vision-language models (VLMs) emphasize long chain-of-thought reasoning; yet, we find that their performance on visual tasks is primarily limited by a lack of visual perception as opposed to reasoning itself. In this work, we systematically study the interplay between perceptio...
https://huggingface.co/papers/2605.20177
Overall grade: F (Critical)
Adoption: F · Quality: F · Freshness: A+ · Citations: F · Engagement: F
Specifications
- Pricing: free
- Capabilities:
- Integrations:
- Use Cases:
- API Available: No
- Tags: auto-discovered
- Added: 2026-06-21T03:07:53.687Z
- Completeness: 0%
Index Score: 0
- Adoption: 0
- Quality: 0
- Freshness: 100
- Citations: 0
- Engagement: 0