Researchers at the Massachusetts Institute of Technology (MIT) have developed an Alexa-like system that allows robots to understand a wide range of commands that require contextual knowledge about objects and their environment.
Alexa is Amazon’s artificial intelligence (AI) assistant that powers the Echo smart speaker. It provides capabilities, or skills, that let customers interact with devices more intuitively using voice.
Today’s robots are still very limited in what they can do. They can be great for many repetitive tasks, but their inability to understand the nuances of human language makes them mostly useless for more complicated requests.
For example, if you put a specific tool in a toolbox and ask a robot to “pick it up,” it would be completely lost.
With the new system, dubbed “ComText” for “commands in context,” if someone tells the system that “the tool I put down is my tool,” it adds that fact to its knowledge base.
One can then update the robot with more information about other objects and have it execute a range of tasks like picking up different sets of objects based on different commands.
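The idea of adding declared facts to a knowledge base and then resolving later commands against it can be illustrated with a toy sketch. This is not ComText's actual implementation, which the article does not describe; the class `KnowledgeBase`, the function `pick_up`, and the object names are all hypothetical, chosen only to show a fact being stored and then used to interpret a command.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy store of declared facts (hypothetical; not ComText's model)."""
    # Maps an object name to the set of labels a speaker has attached to it.
    labels: dict[str, set[str]] = field(default_factory=dict)

    def tell(self, obj: str, label: str) -> None:
        """Record a declared fact such as 'the tool I put down is my tool'."""
        self.labels.setdefault(obj, set()).add(label)

    def resolve(self, label: str) -> list[str]:
        """Return every known object carrying the label, sorted by name."""
        return sorted(o for o, ls in self.labels.items() if label in ls)


def pick_up(kb: KnowledgeBase, label: str) -> str:
    """Turn a 'pick up <label>' command into an action string."""
    matches = kb.resolve(label)
    if not matches:
        # Without a stored fact, the command cannot be grounded.
        return f"cannot resolve '{label}'"
    return "pick up " + ", ".join(matches)


kb = KnowledgeBase()
print(pick_up(kb, "my tool"))   # no fact declared yet
kb.tell("wrench", "my tool")    # "the tool I put down is my tool"
print(pick_up(kb, "my tool"))   # the command now maps to the wrench
```

Before any fact is declared the command cannot be grounded; once the label is stored, the same command maps to a concrete object. A real system would also have to ground the words in sensor data, which this sketch leaves out.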
“Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds, and 3-D maps generated from sensors,” said Rohan Paul from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).
“This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say,” Paul said in a statement released by MIT.
The team tested ComText on Baxter, a two-armed humanoid robot developed for Rethink Robotics by former CSAIL director Rodney Brooks.
With ComText, Baxter was successful in executing the right command about 90 per cent of the time, according to a paper presented at the International Joint Conference on Artificial Intelligence (IJCAI) in Australia.
In the future, the team hopes to enable robots to understand more complicated information, such as multi-step commands and the intent of actions, and to use the properties of objects to interact with them more naturally.
For example, if you tell a robot that one box on a table has crackers, and one box has sugar, and then ask the robot to “pick up the snack,” the hope is that the robot could deduce that sugar is a raw material and therefore unlikely to be somebody’s “snack.”
The researchers believe that by making interactions far less constrained, this line of research could enable better communication for a range of robotic systems, from self-driving cars to household helpers.