Today’s robots can accomplish many repetitive tasks, but their inability to understand the nuances of human language makes them mostly useless for more complicated requests. For example, if a specific tool is placed in a toolbox and a robot is asked to “pick it up,” it would be completely lost. Picking it up means being able to see and identify objects, understand commands, recognize that the “it” in question is the tool, recall the moment when the tool was put down, and distinguish it from other tools of similar shapes and sizes.

A system similar to Alexa, the voice service behind Amazon’s Echo, allows robots to understand a wide range of commands that require contextual knowledge about objects and their environments. Called ComText (for “commands in context”), the system can handle the toolbox situation described above: if ComText is told that “the tool I put down is my tool,” it adds that fact to its knowledge base. The robot can be updated with information about other objects, and can execute a range of tasks, such as picking up different sets of objects based on different commands.
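To make the idea concrete, here is a minimal Python sketch of how a statement like “the tool I put down is my tool” might be folded into a knowledge base. The KnowledgeBase class, its fields, and the assert_ownership method are illustrative assumptions, not ComText’s actual implementation.

from dataclasses import dataclass, field

@dataclass
class ObjectFact:
    """What the robot knows about one object (hypothetical record)."""
    name: str
    owner: str | None = None  # set by statements like "... is my tool"

@dataclass
class KnowledgeBase:
    """Toy stand-in for ComText's knowledge base; names are illustrative."""
    facts: dict[str, ObjectFact] = field(default_factory=dict)

    def assert_ownership(self, obj_name: str, owner: str) -> None:
        # "The tool I put down is my tool" becomes a stored ownership fact.
        self.facts.setdefault(obj_name, ObjectFact(obj_name)).owner = owner

kb = KnowledgeBase()
kb.assert_ownership("tool", "user")  # update from the spoken statement
print(kb.facts["tool"].owner)        # prints "user"

Once a fact like this is stored, a later command such as “pick up my tool” can be matched against the ownership record rather than requiring the user to re-describe the object.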

Things like dates, birthdays, and facts are forms of “declarative memory.” There are two kinds of declarative memory: semantic memory, which is based on general facts like “the sky is blue,” and episodic memory, which is based on personal experiences like remembering what happened at a party. Most approaches to robot learning have focused only on semantic memory, which leaves a big knowledge gap about events or facts that may be relevant context for future actions. ComText can observe a range of visuals and natural language to glean episodic memory about an object’s size, shape, position, and type, and even whether it belongs to somebody. From this knowledge base, it can then reason, infer meaning, and respond to commands. With ComText, a robot executed the right command about 90 percent of the time.
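As a rough illustration of the semantic/episodic split, the following sketch keeps timeless facts and timestamped events in separate stores, and resolves “it” to the most recently put-down object. The data structures and the resolve_it heuristic are assumptions made for illustration, not the researchers’ algorithm.

import time
from dataclasses import dataclass

@dataclass
class Episode:
    """One timestamped event the robot observed (episodic memory)."""
    timestamp: float
    action: str  # e.g. "put_down"
    obj: str     # which object the action involved

semantic_memory = {"sky": {"color": "blue"}}  # timeless, general facts
episodic_memory: list[Episode] = []           # personal, time-ordered events

def observe(action: str, obj: str) -> None:
    """Record an observed event in the episodic store."""
    episodic_memory.append(Episode(time.time(), action, obj))

def resolve_it() -> str | None:
    """Resolve "it" to the object of the most recent put-down event."""
    for episode in reversed(episodic_memory):
        if episode.action == "put_down":
            return episode.obj
    return None

observe("put_down", "screwdriver")
observe("put_down", "wrench")
print(resolve_it())  # prints "wrench", the most recently put-down object

The key design point is that episodic memory is ordered in time, so a pronoun like “it” can be grounded by scanning backward through recent events, something a store of general facts alone cannot support.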

By creating much less constrained interactions, this line of research could enable better communication for a range of robotic systems, from self-driving cars to household helpers.

For more information, contact Adam Conner-Simons of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) at 617-324-9135.