Object Recognition from Images

My thesis work addressed the problem of recognizing so-called “wiry” objects. Wiry objects are distinguished by a prevalence of very thin, elongated, stick-like components; examples include tables, chairs, bicycles, and desk lamps. They are difficult to recognize because their shapes are complex and they tend to lack distinctive color or texture characteristics. Recognizing them in images is important in a number of problem areas because they are relatively common. Our approach takes as input a set of training images of a target object in typical background environments. Binary edges extracted from the training images are labeled as belonging to the target object or the clutter. Here is an example training image: edges on the ladder are marked in green, and edges in the background are marked in red.

Using these training images, we train a cascade of classifiers to discriminate object edges from background edges by automatically selecting informative shape cues. Below is an example result of our algorithm applied to a test image: the test image (top left), detected edges shown in green (top right), edges classified as “ladder” edges by our algorithm (bottom left), and post-processing to cluster all the ladder edges into a final box answer (bottom right).
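To make the cascade idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm from the thesis: each edge is assumed to be already summarized by a short vector of numeric shape cues (the two cues in the toy data are hypothetical), each stage is a simple threshold on one cue, and the greedy training rule is a stand-in for the actual cue-selection procedure. An edge is labeled “object” only if it passes every stage; it is rejected as background at the first stage it fails.

```python
from typing import List, Sequence, Tuple

# One cascade stage: (index of the shape cue, minimum value needed to pass).
Stage = Tuple[int, float]


def train_cascade(obj: Sequence[Sequence[float]],
                  bg: Sequence[Sequence[float]],
                  max_stages: int = 3) -> List[Stage]:
    """Greedily add stages. Each stage passes every object example and
    rejects as many of the still-surviving background examples as it can."""
    stages: List[Stage] = []
    remaining_bg = list(bg)
    n_cues = len(obj[0])
    for _ in range(max_stages):
        if not remaining_bg:
            break
        best = None  # (number rejected, cue index, threshold)
        for f in range(n_cues):
            # Largest threshold that still passes all object examples.
            thr = min(x[f] for x in obj)
            rejected = sum(1 for x in remaining_bg if x[f] < thr)
            if best is None or rejected > best[0]:
                best = (rejected, f, thr)
        rejected, f, thr = best
        if rejected == 0:
            break  # no single cue separates what is left
        stages.append((f, thr))
        remaining_bg = [x for x in remaining_bg if x[f] >= thr]
    return stages


def classify(stages: List[Stage], x: Sequence[float]) -> bool:
    """True if x passes every stage; all() stops at the first failed stage."""
    return all(x[f] >= thr for f, thr in stages)


# Toy data: two hypothetical cues per edge, e.g. (straightness, parallelism).
OBJ = [(0.9, 0.8), (0.8, 0.9), (0.95, 0.7)]
BG = [(0.2, 0.9), (0.9, 0.1), (0.3, 0.2), (0.85, 0.75)]

stages = train_cascade(OBJ, BG)
labels = [classify(stages, x) for x in BG]
```

On this toy data the cascade ends up with two stages, one per cue, and the last background edge still passes both. A false positive of that kind is what a post-processing step, such as clustering the surviving edges into a single box, would need to tolerate.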

My thesis contains further results on the ladder in different environments (a lab, cubicle, warehouse, apartment, conference room, and classroom), as well as results on recognizing a chair, a push cart, and a stool. Finally, the thesis presents an extension of this technique so that edge detection and recognition are automatically optimized as a single, unified process.

Honda Corporation is currently evaluating this technique for possible incorporation into the vision systems on their humanoid robots, shown above.

Web Site:

WORD: Wiry Object Recognition Database.
All the images I used in my thesis (about 10,000), with ground truth.

Thesis:

O. Carmichael. Discriminative Techniques For The Recognition Of Complex-Shaped Objects. PhD Thesis, The Robotics Institute, Carnegie Mellon University. September 18, 2003. Technical Report CMU-RI-TR-03-34. [pdf, 7.4 MB]

Publications:

O. Carmichael, M. Hebert, Shape-based Recognition Of Wiry Objects, Proceedings CVPR 2003. [pdf, 921 K]

O. Carmichael, M. Hebert, Object Recognition by a Cascade of Edge Probes, Proceedings of the British Machine Vision Conference, September 2002. [paper, pdf, 481 K]

------------------------------------------------------------------------------------------------------------------

Prior to my thesis, I worked on methods for parts-based recognition of objects with rich visual texture. I proposed the use of convolutional image filters, called Discriminant Filters, which are tuned so that their outputs discriminate between images of parts of the object, and images of parts of the background. We tune the filters based on example images of a target object and clutter, where individual object parts have been labeled. Here are two training images of a mug.

Here is an example result in which Discriminant Filters, together with a standard classifier, detect the presence of parts of the mug in a cluttered scene. To keep the display from getting too cluttered, we only searched for parts 5, 6, 7, 8, 11, and 13. “C” means “clutter.”
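As a rough illustration of a filter tuned so that its output separates object parts from background, here is a small Python sketch. The tuning rule is an assumption made for the example: the filter is simply the difference between the mean part patch and the mean clutter patch, with a threshold halfway between the two mean responses. The technical report's actual tuning procedure is not described on this page, and the toy patches and image below are hypothetical.

```python
from typing import List, Tuple

Patch = List[float]  # a k x k image patch, flattened row by row


def dot(a: Patch, b: Patch) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_patch(patches: List[Patch]) -> Patch:
    n = len(patches)
    return [sum(p[i] for p in patches) / n for i in range(len(patches[0]))]


def tune_filter(part: List[Patch], clutter: List[Patch]) -> Tuple[Patch, float]:
    """Assumed tuning rule: difference of class means, plus a threshold
    halfway between the filter's responses to the two class means."""
    mp, mc = mean_patch(part), mean_patch(clutter)
    w = [a - b for a, b in zip(mp, mc)]
    threshold = (dot(w, mp) + dot(w, mc)) / 2
    return w, threshold


def respond(image: List[List[float]], w: Patch, k: int) -> List[List[float]]:
    """Slide the k x k filter over the image (no padding) and return
    the filter response at each position."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - k + 1):
        row = []
        for c in range(cols - k + 1):
            patch = [image[r + i][c + j] for i in range(k) for j in range(k)]
            row.append(dot(w, patch))
        out.append(row)
    return out


# Toy 2 x 2 patches: the "part" is a dark-to-bright vertical edge,
# the clutter examples are flat bright and flat dark regions.
PART = [[0, 1, 0, 1]]
CLUTTER = [[1, 1, 1, 1], [0, 0, 0, 0]]
IMAGE = [[0, 0, 1, 1],
         [0, 0, 1, 1]]

w, threshold = tune_filter(PART, CLUTTER)
responses = respond(IMAGE, w, k=2)
detections = [[v > threshold for v in row] for row in responses]
```

In this toy case the filter responds strongly only at the position centered on the edge and stays at zero over the flat regions. In a parts-based system, one such filter per labeled part would feed its responses to a classifier rather than to a fixed threshold.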

Publication:

Discriminant Filters For Object Recognition, CMU Technical Report, [pdf, 370 K].

Recently, a group of researchers at CMU used Discriminant Filters as part of a system to discriminate between tissue types in images of vocal cords.

Their Publication:

Allin, S., Galeotti, J., Dailey, S. and Stetten, G. Enhanced Snake-Based Segmentation of Vocal Folds. IEEE International Symposium on Biomedical Imaging, Washington D.C., 2004.