Borrowing a training technique commonly used to teach dogs to sit and stay, computer scientists showed a robot how to teach itself several new tricks, including stacking blocks. With this method, the robot, named Spot, learned in days what typically takes a month. By using positive reinforcement, an approach familiar to anyone who has used treats to change a dog’s behavior, the team dramatically improved the robot’s skills and did so quickly enough to make training robots for real-world work a more feasible enterprise.
Unlike humans and animals, which are born with highly intuitive brains, computers are blank slates and must learn everything from scratch. But true learning is often accomplished through trial and error, and roboticists are still figuring out how robots can learn efficiently from their mistakes. The team tackled that problem by devising a reward system that works for a robot the way treats work for a dog: where a dog might get a cookie for a job well done, the robot earned numeric points.
To stack blocks, Spot the robot needed to learn how to focus on constructive actions. As the robot explored the blocks, it quickly learned that correct behaviors for stacking earned high points but incorrect ones earned nothing. Spot earned the most by placing the last block on top of a four-block stack.
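To make the point system concrete, here is a minimal sketch of how such a reward could be scored. It is an illustration under stated assumptions, not the team's actual code: the function name `stacking_reward`, its parameters, and the specific point values are all hypothetical. The only properties taken from the description above are that unhelpful actions earn nothing, helpful ones earn points, and placing the final block of a four-block stack earns the most.

```python
def stacking_reward(height_before: int, height_after: int, target_height: int = 4) -> float:
    """Hypothetical point scheme for one action in a block-stacking task.

    All names and point values here are assumptions made for illustration.
    """
    # Actions that do not make the stack taller (including knocking it over)
    # earn nothing.
    if height_after <= height_before:
        return 0.0

    # Constructive actions earn more points the taller the stack becomes.
    reward = float(height_after)

    # Completing the full stack earns the largest payout (assumed: doubled).
    if height_after == target_height:
        reward *= 2.0

    return reward


# One imagined practice run, recorded as (height before, height after) pairs:
# two good placements, one wasted move, then the final block.
episode = [(1, 2), (2, 3), (3, 3), (3, 4)]
total_points = sum(stacking_reward(before, after) for before, after in episode)
```

In this sketch the wasted move contributes zero, so a learner that maximizes its total is nudged toward actions that actually build the stack.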
The training tactic not only worked, it also took just days to teach the robot what used to take a month. The team reduced the practice time by first training a simulated robot, a process much like playing a video game, and then running tests with Spot. The robot quickly learned the right behavior to earn the best reward. In fact, reaching 100 percent accuracy, which used to take the robot a month of practice, took two days.
Positive reinforcement did more than help the robot teach itself to stack blocks. With the point system, the robot also quickly learned several other tasks, including how to play a simulated navigation game. The ability to learn from mistakes in all types of situations is critical for designing a robot that can adapt to new environments.
The team imagines these findings could help train household robots to do laundry and wash dishes, tasks that could help seniors live independently. The findings could also help in designing improved self-driving cars or robots that perform product assembly.