Recognizing objects in cluttered range data is a challenging problem. During my first two years at CMU I investigated how to use example range images and 3D models of target objects together in the recognition process. I extended the Spin Image technique of Johnson and Hebert so that parameters of the recognition procedure are estimated from example range data in which each region has been labeled as either an object of interest or clutter. Here is an example in which we take range images of a scene containing four highly similar objects (animal statues, shown below) and are interested in detecting only one of them (“A”).
Detecting object “A” in the presence of B, C, and D is challenging because all four objects look alike. When we apply my algorithm to tune the Spin Image technique to this specific problem, recognition accuracy improves progressively as more labeled example range images are used at training time. In particular, when only one or two example images are used to tune the Spin Image process, model “A” is falsely detected at various incorrect locations (see images 1 and 2 below). When three example images are used for training, model “A” is located correctly but there is one false detection in the upper left. When four or more example images are used to tune the recognition process, model “A” is located correctly with no false detections (see “4” below).
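For readers unfamiliar with spin images, here is a minimal sketch of how one descriptor is built around a single oriented point. It follows the standard definition (each neighboring point is mapped to a radial distance alpha from the normal line and a signed height beta along the normal, then binned into a 2D histogram), but it is simplified: it uses plain binning rather than interpolated accumulation, and the function name `spin_image` and its parameters `bin_size` and `image_width` are illustrative names chosen here, not identifiers from the papers. Bin size and image width are examples of the kind of parameter that can be tuned for a given recognition problem.

```python
import math

def spin_image(points, p, n, bin_size, image_width):
    """Build a square spin image for the oriented point (p, n).

    points: iterable of (x, y, z) surface points
    p:      (x, y, z) basis point
    n:      surface normal at p (need not be unit length)
    """
    norm = math.sqrt(sum(c * c for c in n))
    n = [c / norm for c in n]
    # The image is centered vertically on the tangent plane at p.
    half_height = bin_size * image_width / 2.0
    image = [[0] * image_width for _ in range(image_width)]
    for x in points:
        d = [xi - pi for xi, pi in zip(x, p)]
        # beta: signed distance above the tangent plane
        beta = sum(di * ni for di, ni in zip(d, n))
        # alpha: distance from the line through p along n
        alpha = math.sqrt(max(sum(di * di for di in d) - beta * beta, 0.0))
        row = math.floor((half_height - beta) / bin_size)
        col = math.floor(alpha / bin_size)
        # Points outside the image support are ignored.
        if 0 <= row < image_width and 0 <= col < image_width:
            image[row][col] += 1
    return image

# Usage: three points around the origin with the normal along +z.
pts = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 2.0, 0.5)]
img = spin_image(pts, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0, 4)
```

Because alpha and beta depend only on the point's position relative to the normal line, the descriptor is unchanged when the surface is rotated about that normal, which is what lets model and scene descriptors be compared without knowing the object's pose.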
Paper:
Large Data Sets and Confusing Scenes in 3-D Surface Matching and Recognition, 3DIM 1999, [pdf, 929 K].
I also presented the 3D Cueing method, which uses a 3D model of the target object to quickly discard cluttered regions of the range data at recognition time. 3D Cueing compresses the 3D model into a compact representation and makes fast comparisons between the compressed model and shape data extracted from the scene. Based on these comparisons, each point in the scene is marked as likely belonging either to the target object or to the clutter. In the example below, a 3D model of the “U” joint is compressed and fast comparisons are made between the compressed model and points in the scene.
Points marked in green are labeled as being likely to belong to the U joint. Note that most of the points on the other objects have been quickly discarded.
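The general idea of filtering scene points against a compressed model can be illustrated with a small sketch. This is not the representation or comparison used in the 3D Cueing paper; it is a hypothetical stand-in in which the model's local shape descriptors are quantized into a set of integer keys, so that each scene point can be checked with a single set lookup. The names `quantize`, `compress_model`, `cue`, and the `step` parameter are all assumptions made for illustration.

```python
import math

def quantize(descriptor, step):
    """Map a real-valued descriptor to a coarse integer key (hypothetical scheme)."""
    return tuple(math.floor(v / step) for v in descriptor)

def compress_model(model_descriptors, step):
    """Compress the model into the set of distinct quantized descriptor keys."""
    return {quantize(d, step) for d in model_descriptors}

def cue(scene_descriptors, compressed_model, step):
    """Return one flag per scene point: True if it may belong to the target."""
    return [quantize(d, step) in compressed_model for d in scene_descriptors]

# Usage: two model descriptors, three scene descriptors.
model = compress_model([(0.1, 0.2), (1.05, 0.0)], step=0.5)
flags = cue([(0.3, 0.4), (0.9, 0.1), (1.2, 0.3)], model, step=0.5)
```

Scene points flagged `False` would be discarded before the more expensive full recognition step runs, which is the role the filter plays in the pipeline described above.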
Paper: 3D Cueing: A Data Filter For Object Recognition, ICRA 1999, [pdf, 592 K].
Video: Recognition using spin images (mpg 14 MB).
These recognition techniques were incorporated into the Artisan system for remote nuclear power plant decommissioning. Here is an example image from the Artisan user interface. On the left is a camera image of a mock-up scene of a nuclear plant, and on the right is a range image of the scene (darker points are farther away). The “J”-shaped joint at the top left, shown in white, has been recognized by the system.
Paper: V. Broz, O. Carmichael, S. Thayer, J. Osborn, and M. Hebert. ARTISAN: An Integrated Scene Mapping and Object Recognition System. American Nuclear Society 8th Intl. Topical Meeting on Robotics and Remote Systems, American Nuclear Society, April, 1999. [pdf, 814 K]
I also investigated the specific application of the Spin Image technique to recognition of faces across range sensing modalities during an internship at Minolta Research Lab in Osaka, Japan.