Robot Learns Context-Driven User Preferences & How to Safely Handle a Knife

Researchers from the Cornell University Personal Robotics Lab have developed an algorithm for robots to learn user preferences. The robot can generalize its learning and produce preferred trajectories for new environments and situations, such as household chores and grocery checkout tasks. In this video, Baxter the robot does not know how to move a knife safely at first. After just three preference feedback iterations of the algorithm, the robot learns to safely manipulate a knife in the presence of humans.



Transcript

00:00:00 In this work, we learn user preferences over robot trajectories. We consider the grocery store scenario, where the robot checks out various objects in the presence of a human. For example, if the robot checks out a knife, then without any learning it doesn't care about the knife's distance from the human. Hence, it scares the human by passing a knife very close to

00:00:18 him. We teach the robot user preferences by iteratively improving the trajectory it proposes in the following rounds of iterations. In the first iteration, the user moves Baxter's arm, teaching it to keep the knife away from humans. The robot improves its model from this feedback and updates the heat map. In the next iteration, the robot ranks

00:00:41 trajectories and displays the top three trajectories in a simulator. As feedback, the user improves the ranking by moving trajectory 3 over 1 and 2. In the third iteration, the user further improves the trajectory by rotating the knife. In just three preference feedback iterations of our algorithm, the robot

00:01:08 learns to safely manipulate the knife in the presence of humans. It now also understands that humans prefer to stay a safe distance from the knife. Despite being trained by a non-expert user, our algorithm's regret bound decays at the same rate as learning from an expert. Not only this, the robot also generalizes to

00:01:28 environments it has not seen before and manipulates the knife without pointing it at humans and other delicate objects along its trajectory.
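
The feedback loop described in the transcript (propose a trajectory, receive a slightly better one from the user, update the model, re-rank) can be illustrated with a small sketch. The transcript does not name the update rule, so this assumes a linear scoring function over trajectory features and a perceptron-style update toward the user's improved trajectory. The two features (knife distance from the human, and whether the blade points at the human), the candidate values, and all function names are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear score of a trajectory (assumed model, not confirmed by the source)."""
    return sum(w * f for w, f in zip(weights, features))

def rank(weights: Sequence[float], candidates: List[Sequence[float]], top_k: int = 3) -> List[int]:
    """Return indices of the top_k candidates, highest score first."""
    order = sorted(range(len(candidates)),
                   key=lambda i: score(weights, candidates[i]),
                   reverse=True)
    return order[:top_k]

def update(weights: Sequence[float], proposed: Sequence[float],
           improved: Sequence[float], step: float = 1.0) -> List[float]:
    """Shift the weights toward the features of the user's improved trajectory."""
    return [w + step * (b - a) for w, a, b in zip(weights, proposed, improved)]

# Hypothetical features: [knife distance from human (m), blade points at human (0 or 1)]
candidates = [
    [0.1, 1.0],  # 0: passes close to the human, blade toward the human
    [0.8, 1.0],  # 1: keeps its distance, blade toward the human
    [0.8, 0.0],  # 2: keeps its distance, blade rotated away
]

weights = [0.0, 0.0]                      # no preferences learned yet
first_ranking = rank(weights, candidates)

# Feedback 1: the user moves the arm away from the human (0 -> 1).
weights = update(weights, candidates[0], candidates[1])
# Feedback 2: the user prefers the trajectory with the knife rotated (1 -> 2).
weights = update(weights, candidates[1], candidates[2])

final_ranking = rank(weights, candidates)
print(first_ranking, final_ranking)
```

In this toy setup the user never supplies an optimal trajectory, only one that is better than the robot's current proposal, which mirrors the non-expert feedback described in the transcript.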