Multi-Contact Task and Motion Planning Guided by Video Demonstration

K. Zorina

D. Kovar

F. Lamiraux

N. Mansard

J. Carpentier

J. Sivic

V. Petrík

[Paper]

[Benchmark]

[Code]

[Video]

The proposed planning approach is guided by a demonstration video (A). The video depicts a person manipulating a known object, here a Cheez-It box. A video can contain several pick-and-place actions with multiple objects; only a short clip with one object and one action is shown here. From the video we recognize (i) the contact states between the human hand and the object, marked by red bounding boxes in (B), and (ii) the 6D pose of the object (3D translation and 3D rotation w.r.t. the camera) at the grasp and release contact states, marked in yellow in (B). The robot trajectory planned by the proposed approach is shown in (C), where the start and goal object poses are shown in magenta and green, respectively.

Abstract

This work aims to leverage instructional video to solve complex multi-contact task-and-motion planning problems in robotics. Towards this goal, we propose an extension of the well-established Rapidly-Exploring Random Tree (RRT) planner, which simultaneously grows multiple trees around grasp and release states extracted from the guiding video. Our key novelty lies in combining contact states and 3D object poses extracted from the guiding video with a traditional planning algorithm, which allows us to solve tasks with sequential dependencies, for example, when an object needs to be placed at a specific location to be grasped later. To demonstrate the benefits of the proposed video-guided planning approach, we design a new benchmark with three challenging tasks: (i) 3D re-arrangement of multiple objects between a table and a shelf, (ii) multi-contact transfer of an object through a tunnel, and (iii) transferring objects using a tray, similar to how a waiter transfers dishes. We demonstrate the effectiveness of our planning algorithm on several robots, including the Franka Emika Panda and the KUKA KMR iiwa.
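The idea of growing several trees at once can be illustrated with a toy example. The sketch below is not the planner from the paper: it assumes a 2D point robot in the unit square with circular obstacles, and it treats the tree roots as stand-ins for the start, grasp, release, and goal configurations that the video would provide. All names (`multi_tree_rrt`, `STEP`, `CONNECT_RADIUS`) and parameter values are hypothetical.

```python
import math
import random
from collections import deque

STEP = 0.05                  # assumed maximum extension length per iteration
CONNECT_RADIUS = 2 * STEP    # assumed radius for linking two different trees


def collision_free(p, q, obstacles, checks=10):
    """Sample the segment p -> q and test it against circles (cx, cy, r)."""
    for i in range(checks + 1):
        t = i / checks
        x, y = p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True


def steer(p, q, step=STEP):
    """Move from p towards q by at most `step`."""
    d = math.dist(p, q)
    if d <= step:
        return q
    t = step / d
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def bfs(adj, src, dst):
    """Shortest path (in number of edges) between two node indices."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    path, u = [], dst
    while u is not None:
        path.append(u)
        u = prev[u]
    return path[::-1]


def multi_tree_rrt(roots, obstacles, max_iters=3000, seed=0):
    """Grow one tree per root; return a path visiting the roots in order, or None."""
    rng = random.Random(seed)
    nodes = list(roots)                  # node index -> configuration
    label = list(range(len(roots)))      # node index -> tree it was added to
    adj = [set() for _ in roots]         # undirected edges over node indices
    parent = list(range(len(roots)))     # union-find over tree labels

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    def nearest(q, tree):
        members = [i for i in range(len(nodes)) if label[i] == tree]
        return min(members, key=lambda i: math.dist(nodes[i], q))

    for _ in range(max_iters):
        sample = (rng.random(), rng.random())
        for tree in range(len(roots)):
            near = nearest(sample, tree)
            new = steer(nodes[near], sample)
            if not collision_free(nodes[near], new, obstacles):
                continue
            nodes.append(new)
            label.append(tree)
            adj.append({near})
            idx = len(nodes) - 1
            adj[near].add(idx)
            # Try to link the new node to every tree not yet merged with this one.
            for other in range(len(roots)):
                if find(other) == find(tree):
                    continue
                cand = nearest(new, other)
                if (math.dist(nodes[cand], new) <= CONNECT_RADIUS
                        and collision_free(nodes[cand], new, obstacles)):
                    adj[idx].add(cand)
                    adj[cand].add(idx)
                    parent[find(other)] = find(tree)
        if len({find(t) for t in range(len(roots))}) == 1:
            # All trees are merged: stitch root-to-root segments in order.
            path = [0]
            for i in range(len(roots) - 1):
                path += bfs(adj, i, i + 1)[1:]
            return [nodes[i] for i in path]
    return None
```

Calling `multi_tree_rrt([(0.1, 0.1), (0.1, 0.9), (0.9, 0.9), (0.9, 0.1)], [(0.5, 0.5, 0.2)])` asks for a path that passes through the four roots in the given order while avoiding one circular obstacle. Rooting trees at intermediate states means the search does not have to discover those states by random sampling alone, which is the intuition behind using grasp and release states from the video.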

Supplementary video

Paper and Supplementary Material

K. Zorina, D. Kovar, F. Lamiraux, N. Mansard, J. Carpentier, J. Sivic, V. Petrík
Multi-Contact Task and Motion Planning Guided by Video Demonstration
Accepted to IEEE International Conference on Robotics and Automation, 2023.

@inproceedings{2023VideoGuidedTAMP,
    title={Multi-Contact Task and Motion Planning Guided by Video Demonstration},
    author={Kateryna Zorina and David Kovar and Florent Lamiraux and Nicolas Mansard and Justin Carpentier and Josef Sivic and Vladimir Petrik},
    booktitle={IEEE International Conference on Robotics and Automation},
    pages={0--0},
    year={2023}
}


Acknowledgements

This work is part of the AGIMUS project, funded by the European Union under GA no. 101070165. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them. This work was also funded by the European Regional Development Fund under the project IMPACT (reg. No. CZ.02.1.01/0.0/0.0/15_003/0000468) and by the Grant Agency of the Czech Technical University in Prague, grant No. SGS21/178/OHK3/3T/17.