Meet RoboAgent: Enabling Robots to Acquire Manipulation Abilities

The way babies learn and explore their surroundings inspired researchers at Carnegie Mellon University and Meta AI to develop a new way to teach robots to learn multiple skills simultaneously and leverage them to tackle unseen, everyday tasks. The researchers set out to develop a robotic AI agent with manipulation abilities equivalent to those of a 3-year-old child. Learn more with this video.

“RoboAgent is a critical milestone toward general robotic agents that are efficient learners, effective in novel situations and capable of expanding their behaviors over time,” said Vikash Kumar, adjunct faculty in the School of Computer Science’s Robotics Institute. “Current robots are highly specialized and trained for individual tasks in isolation. In contrast, we set out to create a single artificial intelligence agent capable of exhibiting a wide range of skills in unseen scenarios. RoboAgent learns like human babies, leveraging a combination of abundant passive observations and limited active play.”



Transcript

00:00:03 When we think of intelligence, only one picture comes to mind: an organism that is capable of some movements and interactions with the rest of the world. So an intelligent agent is the one that changes the environment for its own advantage. This is not the central point that today's AI is taking. They usually think of predictive AI: some images are there, I want to predict what is

00:00:27 happening in the images; there is some sequence of letters, I want to predict the next letter in the sequence. We know how to solve specific tasks where the scenario of the task, or the scene in which that task lives, is relatively stationary or pre-specified, and that is where robotics is today. What we are after is to create AI agents that are flexible,

00:00:50 general and able to evolve over time in unseen situations. The perspective that we take is: let's look at a human baby, an infant, and it's trying to learn different skills over time. The babies observe a lot; they see a lot of things around them, and they are not initially capable of interacting with them. Initial learning comes from the passive modalities. We are trying to build

00:01:15 paradigms that are an active combination of passive experiences and active experiences. The passive experiences, the way we capitalize on them is from internet data, human videos, a lot of pictures, and usually you can think of these passive experiences as giving real-world priors or real-world understanding to our agent. Then the self-experiences, or the things

00:01:40 the agent is trying itself in its own embodiment, help connect this hallucinated world, which we know something about, to the true world and how my embodiment is actually going to work in that true world. Our approach is basically: a robot is placed in a particular scene, and you instruct the robot, maybe through a goal image or through a language instruction, of what

00:02:02 it should do, and it just does that particular task. So that is the level of generalization we are after. One of the main aims of AI is to enable systems that can help people in everyday tasks. One of the ways of doing that is by actually deploying robots that live in the physical world, in people's homes, offices and other places, that can help them solve everyday tasks. When we learn

00:02:28 multiple skills together, the agent is able to internalize the common structure that is shared across all these skills. If I know how to open a microwave in this particular room, I would know how to open a similar microwave in maybe my own home or in another office and so on, because the details of the rest of the scene do not matter. Multitask learning allows us to be able to leverage our experiences

00:02:52 across each of these skills and then learn one common thing across all of them, and that one common thing often reflects how the world works. We can act in universal settings, new settings, and this is the holy grail for robotics. This is what is going to take artificially intelligent agents out of robotic cages into the real world. Robotics has always been data hungry, and there have

00:03:15 been very limited data sets, and what we are releasing is a large data set on very common hardware that is accessible and used by many labs across the world. And I think this will motivate other people to actually join us in this effort and be able to multiply this data set with their own data sets. Being able to curate these data sets, and hope that other people adopt it and multiply it

00:03:37 with their own experiences, is going to be critical towards this journey.