Teleoperating robots with virtual reality

by MIT News

12 October 2017

by Rachel Gordon
Consisting of a headset and hand controllers, CSAIL’s new VR system enables users to teleoperate a robot using an Oculus Rift headset.
Photo: Jason Dorfman/MIT CSAIL

Certain industries have traditionally not had the luxury of telecommuting. Many manufacturing jobs, for example, require a physical presence to operate machinery.

But what if such jobs could be done remotely? Last week researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) presented a virtual reality (VR) system that lets you teleoperate a robot using an Oculus Rift headset.

The system embeds the user in a VR control room with multiple sensor displays, making it feel like they’re inside the robot’s head. By using hand controllers, users can match their movements to the robot’s movements to complete various tasks.

“A system like this could eventually help humans supervise robots from a distance,” says CSAIL postdoc Jeffrey Lipton, who was the lead author on a related paper about the system. “By teleoperating robots from home, blue-collar workers would be able to tele-commute and benefit from the IT revolution just as white-collar workers do now.”

The researchers even imagine that such a system could help employ increasing numbers of jobless video-gamers by “gameifying” manufacturing positions.

The team used the Baxter humanoid robot from Rethink Robotics, but said that the system can work on other robot platforms and is also compatible with the HTC Vive headset.

Lipton co-wrote the paper with CSAIL Director Daniela Rus and researcher Aidan Fay. They presented the paper at the recent IEEE/RSJ International Conference on Intelligent Robots and Systems in Vancouver.

There have traditionally been two main approaches to using VR for teleoperation.

In a direct model, the user’s vision is directly coupled to the robot’s state. With these systems, a delayed signal could lead to nausea and headaches, and the user’s viewpoint is limited to one perspective.

In a cyber-physical model, the user is separate from the robot. The user interacts with a virtual copy of the robot and the environment. This requires much more data, and specialized spaces.

The CSAIL team’s system is halfway between these two methods. It solves the delay problem, since the user is constantly receiving visual feedback from the virtual world. It also solves the cyber-physical issue of being distinct from the robot: Once a user puts on the headset and logs into the system, they’ll feel as if they’re inside Baxter’s head.

The system mimics the homunculus model of the mind — the idea that there’s a small human inside our brains controlling our actions, viewing the images we see, and understanding them for us. While it’s a peculiar idea for humans, for robots it fits: Inside the robot is a human in a virtual control room, seeing through its eyes and controlling its actions.

Using Oculus’ controllers, users can interact with controls that appear in the virtual space to open and close the hand grippers to pick up, move, and retrieve items. A user can plan movements based on the distance between the arm’s location marker and their hand while looking at the live display of the arm.

To make these movements possible, the human’s space is mapped into the virtual space, and the virtual space is then mapped into the robot space to provide a sense of co-location.
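The article does not give the mapping itself, but the idea of chaining the human space into the virtual space and then into the robot space can be illustrated with composed coordinate transforms. The following is a minimal sketch, not the CSAIL implementation: the function names, the choice of a scale-plus-yaw-plus-translation transform, and all calibration numbers are hypothetical and chosen only for illustration.

```python
import math

def make_transform(scale, yaw_rad, translation):
    """Build a 4x4 homogeneous transform: uniform scale and rotation about z, then translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [
        [scale * c, -scale * s, 0.0, tx],
        [scale * s, scale * c, 0.0, ty],
        [0.0, 0.0, scale, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def compose(a, b):
    """Matrix product a * b: the result applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply(t, point):
    """Apply a 4x4 transform to a 3-D point and return the transformed point."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical calibration values (assumptions, not taken from the paper).
HUMAN_TO_VIRTUAL = make_transform(1.0, 0.0, (0.0, 0.0, -1.2))
VIRTUAL_TO_ROBOT = make_transform(0.8, math.pi / 2, (0.5, 0.0, 0.9))

# One combined mapping: human space -> virtual space -> robot space.
HUMAN_TO_ROBOT = compose(VIRTUAL_TO_ROBOT, HUMAN_TO_VIRTUAL)

def hand_to_robot_target(hand_position):
    """Map a tracked hand position (human space) to a target in robot space."""
    return apply(HUMAN_TO_ROBOT, hand_position)

def marker_distance(hand_position, arm_marker_in_virtual):
    """Distance in virtual space between the user's hand and the arm's location marker."""
    hand_virtual = apply(HUMAN_TO_VIRTUAL, hand_position)
    return math.dist(hand_virtual, arm_marker_in_virtual)

if __name__ == "__main__":
    hand = (0.3, 0.1, 1.2)  # metres, in the user's tracking frame (made-up value)
    print(hand_to_robot_target(hand))
    print(marker_distance(hand, (0.3, 0.1, 0.0)))
```

Precomposing the two mappings means each controller update needs only one matrix application, while the intermediate virtual-space position stays available for interface elements such as the arm's location marker.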

The system is also more flexible than previous systems, which require many resources. Other systems might extract 2-D information from each camera, build out a full 3-D model of the environment, and then process and redisplay the data. In contrast, the CSAIL team’s approach bypasses all of that by simply taking the 2-D images that are displayed to each eye. (The human brain does the rest by automatically inferring the 3-D information.)

To test the system, the team first teleoperated Baxter to do simple tasks like picking up screws or stapling wires. They then had the test users teleoperate the robot to pick up and stack blocks.

Users successfully completed the tasks at a much higher rate compared to the direct model. Unsurprisingly, users with gaming experience found the system much easier to use.

Tested against current state-of-the-art systems, CSAIL’s system was better at grasping objects 95 percent of the time and 57 percent faster at doing tasks. The team also showed that the system could pilot the robot from hundreds of miles away; testing included controlling Baxter at MIT from a hotel’s wireless network in Washington.

“This contribution represents a major milestone in the effort to connect the user with the robot’s space in an intuitive, natural, and effective manner,” says Oussama Khatib, a computer science professor at Stanford University who was not involved in the paper.

The team eventually wants to focus on making the system more scalable, with many users and different types of robots that can be compatible with current automation technologies.

The project was funded, in part, by the Boeing Company and the National Science Foundation.