EE362 Final Project, Winter 2005

Inferring Depth from Images

David Lieb
Andrew Lookingbill
Keith Rauenbuehler

Introduction

As part of research conducted through the Stanford Artificial Intelligence Laboratory, our group investigated methods for inferring depth information from an image taken by a single camera. Such methods could replace the traditional 3-D reconstruction approaches typically used in computer vision. The end goal was to implement code that could run on the robot (part of the DARPA LAGR project) pictured below.

This work was intended to build on previous work that segmented a 2-D scene into traversable and non-traversable regions. Given the output of an algorithm that correctly labeled trees as hazards in a monocular camera image (blue and red regions in the picture below), we sought to infer the depths of the different obstacles in the scene, which would allow the robot to navigate around them.

In keeping with the spirit of this course, we approached this problem by examining how the human visual system tackles depth perception. We examined the classes of depth cues discussed by E. Bruce Goldstein in his book Sensation and Perception. These cues, along with any analogs in the computer vision field, are discussed in the Depth Cues section. We implemented several of these approaches in C and MATLAB to evaluate their performance, and integrated elements from several of them into what turned out to be our most successful approach to monocular depth perception. The results from this work are shown in the Implementation section.