RT-2
by Google DeepMind · paid · Last verified 2026-03-17
RT-2 (Robotics Transformer 2) is Google DeepMind's vision-language-action (VLA) model. It maps camera observations and natural-language instructions directly to robot actions, and its web-scale vision-language pretraining lets robots generalize to novel tasks, objects, and instructions. It combines foundation-model capabilities with physical robot control in a single model.
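One way to picture the mapping from instructions and images to robot actions is to treat each action as a short string of integer tokens that a language model can emit. The sketch below is illustrative only and is not RT-2's actual code: the 256-bin uniform discretization follows the published description of the model, while the normalized range `[-1, 1]` and the names `encode_action` and `decode_action` are assumptions introduced here.

```python
from typing import List, Sequence, Tuple

NUM_BINS = 256          # bins per continuous action dimension
LOW, HIGH = -1.0, 1.0   # assumed normalized action range (hypothetical)


def encode_action(values: Sequence[float], terminate: bool = False) -> str:
    """Turn continuous action values into a space-separated token string.

    The first token is the discrete terminate flag (0 or 1); each following
    token is the bin index of one continuous dimension.
    """
    tokens = [str(int(terminate))]
    for v in values:
        clipped = min(max(v, LOW), HIGH)  # keep the value inside the range
        # Scale to [0, NUM_BINS - 1] and round to the nearest bin.
        bin_index = int((clipped - LOW) / (HIGH - LOW) * (NUM_BINS - 1) + 0.5)
        tokens.append(str(bin_index))
    return " ".join(tokens)


def decode_action(text: str) -> Tuple[bool, List[float]]:
    """Invert encode_action, up to the quantization error of one bin."""
    parts = text.split()
    terminate = parts[0] == "1"
    values = [LOW + int(p) / (NUM_BINS - 1) * (HIGH - LOW) for p in parts[1:]]
    return terminate, values


if __name__ == "__main__":
    # Hypothetical 7-dimensional action: 3 position deltas, 3 rotation
    # deltas, and a gripper extension value.
    action = [0.10, -0.25, 0.0, 0.0, 0.5, -1.0, 1.0]
    encoded = encode_action(action)
    print(encoded)
    print(decode_action(encoded))
```

Under this scheme the model's output vocabulary needs no robot-specific head: a decoded string is parsed back into numbers and passed to the robot controller. Quantization limits precision to roughly one bin width over the assumed range.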
https://robotics-transformer2.github.io/
Overall grade: C+ (Average)
Adoption: C · Quality: A+ · Freshness: B+ · Citations: A · Engagement: F
Specifications
- License: Proprietary
- Pricing: paid
- Capabilities: visual-instruction-following, robot-action-generation, zero-shot-task-generalization, natural-language-robot-control
- Integrations: Google Robot Platforms
- Use Cases: tabletop manipulation, household task automation, novel object interaction, instruction-following robotics
- API Available: No
- Parameters: ~55B
- Context Window: N/A
- Modalities: vision, text, action
- Training Cutoff: 2023
- Tags: robotics, google, vision-language-action, embodied-ai, manipulation
- Added: 2026-03-17
- Completeness: 100%
Index Score: 54
- Adoption: 40
- Quality: 90
- Freshness: 72
- Citations: 80
- Engagement: 0