A technique was developed to quickly teach robots novel traversal behaviors with minimal human oversight. The technique allows mobile robot platforms to navigate autonomously in environments while carrying out actions a human would expect of the robot in a given situation.
Robot teammates can serve as initial investigators in potentially dangerous scenarios, keeping humans farther from harm. To do so, the robot must be able to use its learned intelligence to perceive, reason, and make decisions. The new technique focuses on how that intelligence can be learned from a few human demonstrations. Because so little demonstration is required, the technique is well suited to on-the-fly learning in the field when mission requirements change.
Researchers focused their initial investigation on learning robot traversal behaviors with respect to the robot’s visual perception of terrain and objects in the environment. More specifically, the robot was taught how to navigate from various points in the environment while staying near the edge of a road, and also how to traverse covertly using buildings as cover.
Given different mission tasks, the most appropriate learned traversal behavior can be activated during robot operation. This is done by leveraging inverse optimal control, also commonly referred to as inverse reinforcement learning, which is a class of machine learning that seeks to recover a reward function given a known optimal policy. In this case, a human demonstrates the optimal policy by driving a robot along a trajectory that best represents the behavior to be learned. These trajectory exemplars are then related to the visual terrain/object features — such as grass, roads, and buildings — to learn a reward function with respect to these environment features.
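To make the idea concrete, the following is a minimal sketch of learning terrain costs (negative rewards) from a single driven trajectory. It is not the researchers' implementation: it assumes a tiny hand-made grid map, one-hot terrain features, a linear cost, and a simple perceptron-style feature-matching update in which a planner's path is repeatedly compared with the demonstration. The names `features`, `plan`, `feature_counts`, and `learn_costs`, the grid, and all parameter values are hypothetical.

```python
import heapq
import numpy as np

# Hypothetical terrain classes for a tiny grid map (not from the article).
GRASS, ROAD, BUILDING = 0, 1, 2
N_FEATURES = 3


def features(grid):
    """One-hot terrain features per cell, shape (H, W, N_FEATURES)."""
    return np.eye(N_FEATURES)[grid]


def plan(cost, start, goal):
    """Dijkstra on a 4-connected grid; the cost of a step is the cost of the cell entered."""
    h, w = cost.shape
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist[cell]:
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + cost[nr, nc]
                if nd < dist.get((nr, nc), np.inf):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def feature_counts(phi, path):
    """Sum of feature vectors over the cells a path visits."""
    return np.sum([phi[r, c] for r, c in path], axis=0)


def learn_costs(grid, demos, iters=50, lr=0.1, min_cost=0.05):
    """Lower the cost of features the demonstrations favor over the planner's paths."""
    phi = features(grid)
    theta = np.ones(N_FEATURES)  # one cost weight per terrain class
    for _ in range(iters):
        grad = np.zeros(N_FEATURES)
        for demo in demos:
            planned = plan(phi @ theta, demo[0], demo[-1])
            grad += feature_counts(phi, demo) - feature_counts(phi, planned)
        # Keep costs strictly positive so Dijkstra remains valid.
        theta = np.maximum(theta - lr * grad, min_cost)
    return theta


# A road loops around a grassy patch; the "human" drives the longer road route.
grid = np.array([[ROAD, GRASS, GRASS,    GRASS, ROAD],
                 [ROAD, GRASS, BUILDING, GRASS, ROAD],
                 [ROAD, ROAD,  ROAD,     ROAD,  ROAD]])
demo = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (1, 4), (0, 4)]

theta = learn_costs(grid, [demo])
learned_path = plan(features(grid) @ theta, demo[0], demo[-1])
print("learned costs (grass, road, building):", theta)
print("planner reproduces the demonstration:", learned_path == demo)
```

With uniform initial costs the planner cuts straight across the grass. After the update, road cells become cheaper than grass, so the planner follows the road as the demonstrator did. A real system would use richer visual features and many more cells, but the relationship between demonstrated trajectories, environment features, and the recovered cost is the same in spirit.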
The work seeks to create intelligent robotic systems that operate reliably in warfighter environments, where the scene is highly unstructured and possibly noisy, and where the robot must act with relatively little a priori knowledge of the current state of the environment. The research has helped demonstrate the feasibility of quickly learning and encoding traversal behaviors.
The learning framework is flexible enough to use a priori intelligence that may be available about an environment. This could include information about areas that are likely visible to adversaries or areas known to have reliable communication. Such information may be relevant for certain mission scenarios, and learning with respect to these additional features would enhance the intelligence of the mobile robot. The researchers are also exploring how this type of behavior learning transfers between different mobile platforms.
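One way to picture this flexibility, assuming a linear cost over per-cell features, is to treat each piece of a priori intelligence as an extra feature layer stacked alongside the perceived terrain classes. The sketch below is illustrative only: the map, the layer values, the helper `stack_features`, and the weights in `theta` are all hypothetical, and in practice the weights would be learned from demonstrations rather than set by hand.

```python
import numpy as np

# Hypothetical 2x3 map with terrain classes 0=grass, 1=road, 2=building.
terrain = np.array([[0, 1, 1],
                    [0, 2, 1]])
terrain_onehot = np.eye(3)[terrain]  # shape (2, 3, 3)

# Hypothetical a priori layers, one value in [0, 1] per cell.
adversary_visibility = np.array([[0.9, 0.8, 0.1],
                                 [0.7, 0.0, 0.1]])
comms_reliability = np.array([[0.2, 0.9, 0.9],
                              [0.1, 0.5, 0.9]])


def stack_features(onehot, *layers):
    """Append each prior layer as one more feature channel per cell."""
    extra = [np.asarray(layer, dtype=float)[..., np.newaxis] for layer in layers]
    return np.concatenate([onehot, *extra], axis=-1)


phi = stack_features(terrain_onehot, adversary_visibility, comms_reliability)

# Illustrative weights: (grass, road, building, visibility, comms).
# Visibility is penalized; reliable communication is mildly rewarded.
theta = np.array([1.0, 0.3, 1.0, 2.0, -0.5])
cost_map = phi @ theta  # shape (2, 3): one traversal cost per cell
print(cost_map)
```

Because the prior layers enter the cost in the same way as the visual features, the same demonstration-driven learning can, in principle, weight them without changes to the rest of the pipeline. A negative weight can drive a cell's cost below zero, so a planner that requires positive costs would need the result clipped or offset.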
Evaluation to date has been performed with an unmanned Clearpath Husky robot, whose visual field of view is relatively low to the ground. Transferring this technology to larger platforms will introduce new perception viewpoints and different platform maneuvering capabilities. Learning to encode behaviors that can be easily transferred between platforms would be extremely valuable for a team of heterogeneous robots, because a behavior could then be learned once on one platform instead of separately on each platform.