Robohub.org
 

UMich team works on perception and localization using cameras

27 January 2015




Some new results from the NGV Team at the University of Michigan describe different approaches for perception (detecting obstacles on the road) and localization (figuring out precisely where you are). Ford helped fund some of the research, so they issued press releases about it and got some media stories. Here’s a look at what they propose.

Many hope to be able to solve robotics (and thus car) problems with just cameras. While LIDAR is going to become cheap, it is not cheap yet, and cameras are much cheaper. I outline many of the trade-offs between the systems in my article on cameras vs lasers. Everybody hopes for a research breakthrough in computer vision that makes vision systems reliable enough for safe operation.

The Michigan lab’s approach is a special machine vision one. They map the road in advance in 3D and visible light by using a mapping car equipped with lots of expensive LIDAR and other sensors. They build a 3D representation of the road similar to what you need for a video game engine, and from that, with the use of GPUs, they can indeed create a 2D image of what a camera should see from any given point.

The car goes out into the world and its actual camera delivers a 2D frame of what it sees. Their system then compares that with generated 2D images of what the camera should see until it finds the closest match. Effectively, it’s like you looking out a window and then going into a video game and wandering around looking for a place that looks like what you see out that window, and then you know where the window is.

Of course it is not “wandering,” and they develop efficient search algorithms to quickly find the location that looks most like the real world image. We’ve all seen video game images, and know they only approximate the real world, so nothing will be an exact match, but if the system is good enough, there will be a “most similar” match that also corresponds with what other sensors, like your GPS and your odometer/dead reckoning system, tell you about where you probably are.
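To make the render-and-compare idea concrete, here is a toy sketch in Python. It is not the Michigan team's code: the "map" is a 1D strip of ground intensities, `render` is a hypothetical stand-in for the GPU renderer, and normalized cross-correlation is my own assumed similarity score (the description above does not say which measure they use). The search is limited to positions near a prior estimate, standing in for what GPS and odometry would tell you.

```python
import math

def render(world, x, width=5):
    """Toy stand-in for the GPU renderer: the 'view' from position x is a
    window of the prior map (a 1D strip of ground intensities)."""
    return world[x:x + width]

def ncc(a, b):
    """Normalized cross-correlation of two equal-length views, in [-1, 1].
    Assumed similarity measure; it ignores overall gain and offset."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # a flat view carries no information
    return sum(p * q for p, q in zip(da, db)) / denom

def localize(world, camera_view, prior_x, search_radius):
    """Score every candidate position near the prior and return the best."""
    width = len(camera_view)
    lo = max(0, prior_x - search_radius)
    hi = min(len(world) - width, prior_x + search_radius)
    best_x, best_score = None, -2.0
    for x in range(lo, hi + 1):
        score = ncc(render(world, x, width), camera_view)
        if score > best_score:
            best_x, best_score = x, score
    return best_x, best_score

# Hypothetical prior map and a camera frame taken at position 6.
world = [10, 10, 80, 10, 10, 80, 80, 10, 40, 90, 20, 10, 10, 80, 10, 10]
true_x = 6
# The real camera sees the same scene with a different gain and offset.
camera_view = [0.5 * v + 20 for v in render(world, true_x)]
estimate, score = localize(world, camera_view, prior_x=8, search_radius=3)
print(estimate, score)
```

With a prior that is off by two cells, the search still settles on position 6, because the score ignores the global brightness change. Real lighting differences are of course much harder than a uniform gain and offset.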

Localization with cameras has been done before, but this is a new approach that takes advantage of new generations of GPUs, so it’s interesting. The big challenge is simulating the lighting, because the real world is full of different lighting, high dynamic range, and shadows. The human visual system has no problem understanding a stripe on the road as it moves through the shadow of a tree, but computer systems have a pretty tough time with that. Sun shadows can be mapped well with GPUs, but shadows from things like the moving limbs of trees are not possible to simulate, nor are the shadows of other vehicles and road users. At night, light and shadows come from car headlights and urban lights. The team is optimistic about how well they will handle these problems.

The much larger challenge is object perception. Once you have a simulation of what the camera should see, you can notice when there are things present that are not in the prediction — like another car or pedestrian, or a new road sign. (Right now their system mostly is looking at the ground.) Once you identify the new region, you can attempt to classify it using computer vision techniques, and also by watching it move against the expected background.
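A minimal sketch of the "notice what is not in the prediction" step, under my own assumptions: both frames are small grayscale grids, a fixed difference threshold marks a pixel as unexpected, and neighboring unexpected pixels are grouped into regions with a flood fill. The function name, threshold and minimum region size are hypothetical, and a real system would need far more robust comparison than a per-pixel difference.

```python
def find_unexpected_regions(predicted, observed, threshold=50, min_pixels=3):
    """Return bounding boxes (top, left, bottom, right) of connected pixel
    groups where the observed frame differs from the predicted frame."""
    rows, cols = len(predicted), len(predicted[0])
    changed = [[abs(observed[r][c] - predicted[r][c]) > threshold
                for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not changed[r][c] or seen[r][c]:
                continue
            # Flood fill (4-connected) to collect one region.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and changed[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(pixels) >= min_pixels:  # ignore isolated noisy pixels
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes

predicted = [[100] * 8 for _ in range(6)]  # rendered view: empty road
observed = [row[:] for row in predicted]
for r in range(2, 5):                      # an object that is not in the map
    for c in range(3, 5):
        observed[r][c] = 200
observed[0][7] = 180                       # a single noisy pixel
boxes = find_unexpected_regions(predicted, observed)
print(boxes)
```

Each returned box is a candidate region to hand to a classifier or to track against the expected background; the lone noisy pixel is dropped by the minimum-size check.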

This is where it gets challenging, because the bar is very high. To be used for driving it must effectively always work. Even if you miss 1 pedestrian in a million you have a real problem, because there are billions of pedestrians encountered by a billion drivers every day. This is why people love LIDAR: if something (other than a mirror or sheet of glass) sufficiently large is sufficiently close to you, you’re going to get laser returns from it, and not from what’s behind it. It has the reliability number that is needed. The challenge of vision systems is to meet that reliability goal.
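The arithmetic behind that bar is simple enough to spell out. The figure of 2 billion pedestrian encounters per day is purely an illustrative assumption (the text above only says "billions"), and the function name is hypothetical.

```python
def expected_misses_per_day(encounters_per_day, one_in):
    """Expected missed detections per day if 1 in `one_in` encounters is missed."""
    return encounters_per_day / one_in

encounters_per_day = 2_000_000_000  # assumption: "billions" taken as 2 billion
misses = expected_misses_per_day(encounters_per_day, one_in=1_000_000)
print(misses)
```

Under that assumption, a one-in-a-million miss rate still means about two thousand undetected pedestrians every day.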

This work is interesting because it does a lot without relying on AI “computer vision” techniques. It is not trying to look at a picture and recognize a person. Humans are able to look at 2D pictures with bizarre lighting and still tell you not just what the things in the picture are, but often how far away they are and what they are doing. While we can be fooled in a 2D image, once you have a moving dynamic world, humans are generally reliable enough at spotting other things on the road. (Though of course, with 1.2 million dead each year, and probably 50 million or more accidents, the majority because somebody was “not looking,” we are far from perfect.)

Some day, computer vision will be as good at recognizing and understanding the world as people are, and will in fact surpass us. There are fields (like identifying traffic signs from photos) where computers already surpass us. For those not willing to wait until that day, new techniques in perception that don’t require full object understanding are always interesting.

I should also point out that while lowering cost is of course a worthwhile goal, it is a false goal at this time. Today, maximal safety is the overriding goal, and as such, nobody will actually release a vehicle to consumers without LIDAR just to save the estimated 2017 cost of LIDAR, which will be sub-$500. Only later, when cameras get so good that they completely replace LIDAR safety capabilities for less money, would people release such a system to save cost. On the other hand, improving cameras to be used together with LIDAR is a real goal: superior safety, not lower cost.

A version of this article originally appeared on robocars.com. Want to learn more about the University of Michigan research cited in this article? Check out the research paper by Ryan Wolcott and Ryan Eustice (Best Paper Award at IROS 2014!), as well as Wolcott’s IROS 2014 Webcam research pitch.





Brad Templeton, Robocars.com is an EFF board member, Singularity U faculty, a self-driving car consultant, and entrepreneur.










©2021 - ROBOTS Association


 











