SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation

SimToolReal is a generalist, object-centric policy for dexterous manipulation of unseen tools and tasks

Abstract

The ability to manipulate tools significantly expands the set of tasks a robot can perform. Yet tool manipulation is a demanding class of dexterity: it requires grasping thin objects, rotating objects in-hand, and sustaining forceful interactions. Because teleoperation data for these behaviors is difficult to collect, sim-to-real reinforcement learning (RL) is a promising alternative. However, prior approaches typically require substantial engineering effort to model objects and tune reward functions for each task.

In this work, we propose SimToolReal, a step towards generalizing sim-to-real RL policies for tool manipulation. Instead of focusing on a single object and task, we procedurally generate a large variety of tool-like object primitives in simulation and train a single RL policy with the universal goal of manipulating each object to random goal poses. This approach enables SimToolReal to perform general dexterous tool manipulation at test time without any object- or task-specific training. We demonstrate that SimToolReal outperforms prior retargeting and fixed-grasp methods by 37% while matching the performance of specialist RL policies trained on specific target objects and tasks. Finally, we show that SimToolReal generalizes across a diverse set of everyday tools, achieving strong zero-shot performance over 120 real-world rollouts spanning 24 tasks, 12 object instances, and 6 tool categories.
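
To make the training setup described above more concrete, the following is a minimal illustrative sketch of sampling one training episode: a procedurally generated tool-like primitive plus a random goal pose. It is not the authors' implementation. The handle-plus-head box parameterization, all size and workspace ranges, and the names `ToolPrimitive`, `GoalPose`, and `sample_episode` are assumptions introduced purely for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ToolPrimitive:
    """Hypothetical tool-like shape: a handle box with a head box at one end."""
    handle_size: tuple  # (length, width, height) in meters, assumed ranges
    head_size: tuple    # (length, width, height) in meters, assumed ranges


@dataclass
class GoalPose:
    position: tuple     # (x, y, z) in meters, assumed workspace bounds
    quaternion: tuple   # unit quaternion (w, x, y, z)


def sample_tool(rng):
    # Long, thin handle and a shorter head; the ranges are illustrative only.
    handle = (rng.uniform(0.10, 0.30), rng.uniform(0.01, 0.04), rng.uniform(0.01, 0.04))
    head = (rng.uniform(0.02, 0.10), rng.uniform(0.02, 0.10), rng.uniform(0.01, 0.05))
    return ToolPrimitive(handle, head)


def sample_unit_quaternion(rng):
    # Normalizing a 4D Gaussian sample yields a uniformly distributed rotation.
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 1e-9:
            return tuple(c / norm for c in q)


def sample_goal(rng):
    # Random target position inside an assumed workspace, with a random orientation.
    position = (rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2), rng.uniform(0.1, 0.4))
    return GoalPose(position, sample_unit_quaternion(rng))


def sample_episode(seed):
    # One seeded draw of (tool, goal); a single policy would train across many such draws.
    rng = random.Random(seed)
    return sample_tool(rng), sample_goal(rng)
```

Calling `sample_episode(seed)` with different seeds yields different tool shapes and goal poses, which is the sense in which a single goal-conditioned policy sees a broad distribution of objects and targets rather than one fixed object and task.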

Video

Method

Real-World Robot Performance

All SimToolReal videos are played at 1x speed.

Baselines


Failures and Recovery Behavior


Visualization of Policy Observations
