Learning tool manipulation from unconstrained instructional videos, shown here for learning the spade task policy on the Panda robot. The input video demonstration (A) is first processed to extract the 3D trajectories of the human and the manipulated tool (B). The tool trajectory extracted from the video is then used to learn the robot policy in a simulated environment (C). Finally, the learned policy is applied to the real robot (D).
Seamless integration of robots into human environments requires robots to learn how to use existing human tools. Current approaches for learning tool manipulation skills mostly rely on expert demonstrations provided in the target robot environment, for example, by manually guiding the robot manipulator or by teleoperation. In this work, we introduce an automated approach that replaces the expert demonstration with a YouTube video for learning a tool manipulation strategy. The main contributions are twofold. First, we design a procedure that aligns the simulated environment with the real-world scene observed in the video. This is formulated as an optimization problem that finds a spatial alignment of the tool trajectory maximizing the sparse goal reward given by the environment. Second, we describe an imitation learning approach that focuses on the trajectory of the tool rather than the motion of the human. To this end, we combine reinforcement learning with an optimization procedure to find a control policy and the placement of the robot based on the tool motion in the aligned environment.
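To make the alignment idea more concrete, here is a minimal sketch of searching for a spatial alignment of a tool trajectory that maximizes a sparse goal reward. Everything in it is an assumption made for illustration: the functions `transform`, `sparse_reward`, and `align` are hypothetical names, the alignment is restricted to a planar translation plus a rotation about the vertical axis, the reward is a toy stand-in for the simulated environment, and plain random search replaces whatever optimizer the paper actually uses.

```python
import numpy as np


def transform(traj, dx, dy, yaw):
    """Rotate a (T, 3) trajectory about the z-axis by yaw, then shift it in x-y."""
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return traj @ rot.T + np.array([dx, dy, 0.0])


def sparse_reward(traj, goal, radius=0.1):
    """Toy sparse reward (assumption): 1.0 if the last point is within radius of goal."""
    return float(np.linalg.norm(traj[-1] - goal) < radius)


def align(traj, goal, n_samples=5000, seed=0):
    """Random search over (dx, dy, yaw); returns the best parameters and their reward."""
    rng = np.random.default_rng(seed)
    best_params, best_reward = None, -1.0
    for _ in range(n_samples):
        params = (
            rng.uniform(-1.0, 1.0),      # dx in metres (assumed search range)
            rng.uniform(-1.0, 1.0),      # dy in metres (assumed search range)
            rng.uniform(-np.pi, np.pi),  # yaw in radians
        )
        reward = sparse_reward(transform(traj, *params), goal)
        if reward > best_reward:
            best_params, best_reward = params, reward
    return best_params, best_reward


# Hypothetical tool trajectory: a straight descent from 0.2 m height to the ground.
demo_traj = np.linspace([0.0, 0.0, 0.2], [0.3, 0.0, 0.0], num=20)
demo_goal = np.array([0.5, 0.4, 0.0])
demo_params, demo_reward = align(demo_traj, demo_goal)
```

Because the reward is sparse, it carries no gradient information, which is why the sketch uses a sampling-based search. In the setting described above, the reward would instead come from the simulated environment when the tool follows the transformed trajectory.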
K. Zorina, J. Carpentier, J. Sivic, V. Petrík. Learning to Manipulate Tools by Aligning Simulation to Video Demonstration. Accepted to IEEE Robotics and Automation Letters (RA-L), 2021 (hosted on arXiv).
Acknowledgements