Paper · robotics · v1.0

VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

by Stanford University · free · Last verified 2026-03-17

VoxPoser uses large language models (LLMs) and vision-language models (VLMs) to synthesize 3D voxel-based value maps, covering both affordances and constraints, that guide a robot motion planner. This enables zero-shot generalization to novel language instructions and object configurations. The approach produces trajectories without any robot-specific training by composing these maps in 3D space.
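To make the idea of composing maps in 3D space concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`distance_map`, `compose_cost_map`, `greedy_path`), the parameters `avoid_radius` and `avoid_weight`, and the use of plain greedy descent in place of a real motion planner are all hypothetical. In the actual system the target and obstacle locations would come from a perception model; here they are passed in as voxel coordinates.

```python
import itertools

import numpy as np


def distance_map(shape, point):
    """Euclidean distance (in voxels) from every voxel in the grid to `point`."""
    grid = np.indices(shape).astype(float)  # shape (3, X, Y, Z)
    offset = grid - np.asarray(point, dtype=float).reshape(3, 1, 1, 1)
    return np.linalg.norm(offset, axis=0)


def compose_cost_map(shape, target, obstacle, avoid_radius=3.0, avoid_weight=10.0):
    """Sum an affordance term (be near `target`) and a constraint term (stay away from `obstacle`).

    Hypothetical composition rule: lower cost is better.
    """
    affordance = distance_map(shape, target)
    # The penalty is positive only within `avoid_radius` voxels of the obstacle.
    constraint = np.clip(avoid_radius - distance_map(shape, obstacle), 0.0, None)
    return affordance + avoid_weight * constraint


def greedy_path(cost, start, max_steps=200):
    """Repeatedly step to the cheapest of the 26 neighbouring voxels.

    Stops at a voxel with no cheaper neighbour, which may be a local minimum
    rather than the target; a real planner would need to handle that case.
    """
    offsets = [o for o in itertools.product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    path = [tuple(start)]
    for _ in range(max_steps):
        current = path[-1]
        best, best_cost = current, cost[current]
        for off in offsets:
            nxt = tuple(c + d for c, d in zip(current, off))
            in_bounds = all(0 <= n < s for n, s in zip(nxt, cost.shape))
            if in_bounds and cost[nxt] < best_cost:
                best, best_cost = nxt, cost[nxt]
        if best == current:
            break
        path.append(best)
    return path
```

Calling `compose_cost_map((16, 16, 16), target=(12, 12, 12), obstacle=(3, 12, 3))` and passing the result to `greedy_path` with a start voxel yields a list of voxel coordinates. Because the maps are plain arrays, adding another instruction-derived constraint in this sketch amounts to adding another weighted term to the sum.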

https://arxiv.org/abs/2307.05973
Grade: C+ (Average)
Adoption: B · Quality: A · Freshness: B+ · Citations: B · Engagement: F

Specifications

License: Open Access
Pricing: free
Capabilities: 3d-reasoning, zero-shot-manipulation, value-map-synthesis, motion-planning
Integrations: none listed
Use Cases: robotic-manipulation, novel-instruction-following, dexterous-robot-tasks
API Available: No
Tags: robotics, 3d, value-maps, language-models, manipulation, zero-shot
Added: 2026-03-17
Completeness: 100%

Index Score

57.7
Adoption: 60
Quality: 87
Freshness: 78
Citations: 65
Engagement: 0
