We have developed a scalable method for detecting multiple objects in a video stream in real time. The method is shape-based and therefore suitable for texture-less objects. It relies on constellations of edgelets, which are easy to compute and tolerant to occlusion. Scalability is achieved through fixed scanning paths that limit the number of constellations considered. During training, the views of each object are searched for constellations along a few fixed scanning paths, which builds a library of transformation-invariant descriptors. During testing, the image is searched for constellations of edgelets along the same pre-defined scanning paths. When a constellation is found, its descriptor is compared against the library to find candidate matches. The method was tested on up to 30 three-dimensional objects (more than 100 views per object), achieving recall of over 50% at 7 fps.
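The descriptor-library idea can be illustrated with a minimal Python sketch. This is not the descriptor or data structure from our papers. It assumes an edgelet is a hypothetical `(x, y, theta)` tuple, describes a constellation by quantised relative angles and distance ratios (which do not change under translation, rotation, or uniform scaling), and stores descriptors in a plain dictionary. The function names and bin counts are illustrative assumptions, and the fixed scanning paths that select which edgelets form a constellation are not modelled.

```python
import math
from collections import defaultdict


def descriptor(edgelets, angle_bins=16, ratio_bins=8):
    """Quantised descriptor for an ordered constellation of (x, y, theta) edgelets.

    Assumed form (illustrative only): for each consecutive pair, the angle between
    the first edgelet's orientation and the direction to the next edgelet, followed
    by the ratios of consecutive inter-edgelet distances. The orientation of the
    last edgelet is not used in this simplified version.
    """
    codes, dists = [], []
    for (x0, y0, t0), (x1, y1, _) in zip(edgelets, edgelets[1:]):
        dx, dy = x1 - x0, y1 - y0
        dists.append(math.hypot(dx, dy))
        # Edge orientation is undirected, so the relative angle is taken modulo pi.
        rel = (math.atan2(dy, dx) - t0) % math.pi
        codes.append(int(rel / math.pi * angle_bins) % angle_bins)
    for d0, d1 in zip(dists, dists[1:]):
        ratio = d1 / (d0 + d1)  # lies in (0, 1) and is unaffected by scale
        codes.append(min(int(ratio * ratio_bins), ratio_bins - 1))
    return tuple(codes)


def build_library(training):
    """Map each descriptor to the (object_id, view_id) pairs that produced it."""
    library = defaultdict(list)
    for object_id, view_id, edgelets in training:
        library[descriptor(edgelets)].append((object_id, view_id))
    return library


def candidates(library, edgelets):
    """Return the candidate (object_id, view_id) matches for a test constellation."""
    return library.get(descriptor(edgelets), [])


def transform(edgelets, angle, scale, tx, ty):
    """Rotate, scale, and translate a constellation (used here only for the demo)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty, t + angle)
            for x, y, t in edgelets]


# Made-up training constellation for a hypothetical object "mug", view 0.
view = [(0.0, 0.0, 0.3), (10.0, 2.0, 0.9), (14.0, 12.0, 2.0)]
library = build_library([("mug", 0, view)])

# The same constellation seen rotated, scaled, and shifted still hits the library.
moved = transform(view, angle=0.7, scale=2.5, tx=30.0, ty=-5.0)
matches = candidates(library, moved)

# A constellation with a different shape produces a different descriptor.
other = [(0.0, 0.0, 0.3), (10.0, 2.0, 0.9), (30.0, 2.0, 2.0)]
no_matches = candidates(library, other)
```

A hash-table lookup keeps the cost of each query roughly independent of the number of stored views, which is one plausible way to read the scalability claim. Hard quantisation is brittle near bin boundaries, so a practical system would need some tolerance to noise that this sketch omits.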
Real-time Learning and Detection of 3D Texture-less Objects: A Scalable Approach. British Machine Vision Conference (BMVC), 2012, pdf, abstract
Egocentric Real-time Workspace Monitoring using an RGB-D Camera. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2012, pdf
Detecting and Localising Multiple 3D Objects: A Fast and Scalable Approach. IROS Workshop on Active Semantic Perception and Object Search in the Real World (ASP-AVS-11), 2011, pdf
When using this code, kindly reference our BMVC paper below.