Image database query by object recognition algorithms
Guillaume-Alexandre Bilodeau
Robert Bergevin (Supervisor)
Problem: With the popularity of the WWW and information technologies, very large databases of images and video sequences must be processed automatically. This is the case for image database query and video surveillance. Current systems are of limited use because they do not model the objects in images adequately. Textures, colours and 2D shapes do not characterize objects sufficiently well: models built on them are sensitive to viewpoint, and they give textures and colours much more importance than these cues have in reality.
Motivation: The goal of this project is to design a representation model, software and tools that permit the recognition and comparison of multi-part manufactured objects in the foreground of images.
Approach: The approach used to reach our objectives is inspired by two theories in cognitive psychology. The first theory, stated by Biederman, was developed from experiments with human subjects. These experiments have shown that humans perceive objects as a hierarchy of grouped primitives. The second theory demonstrates that when humans are shown randomly oriented straight line segments and asked to group them in pairs, they naturally group them by length, proximity, orientation, and level of overlap. These are the laws of perceptual grouping. The theoretical representation model used in this project is based on the first theory, whereas the algorithms used to build the model are based on the second. The representation model is an attributed graph in which the nodes are simple volumetric primitives and the arcs reflect the spatial arrangements of those primitives. The projections of the volumetric primitives can be viewed as straight line segments and circular arcs that perceptually form groups. Hence, the algorithms for creating models group lines in accordance with the laws of perceptual grouping.
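As an illustration only, the four grouping cues named above (length, proximity, orientation, overlap) can be sketched as a pairwise score between straight line segments. This is a minimal sketch and not the project's algorithm: the names Segment, affinity and group_pairs, the equal weighting of the cues, the greedy pairing and the 0.5 threshold are all assumptions made for the example, and segments are assumed to have non-zero length.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Segment:
    """A straight line segment with endpoints (x1, y1) and (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def length(self):
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1)

    @property
    def angle(self):
        return math.atan2(self.y2 - self.y1, self.x2 - self.x1)

    @property
    def midpoint(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)


def affinity(a, b):
    """Hypothetical grouping score in [0, 1]; higher means 'group these'."""
    la, lb = a.length, b.length
    # Length cue: ratio of the shorter to the longer segment.
    length_sim = min(la, lb) / max(la, lb)
    # Proximity cue: midpoint distance relative to the mean length.
    (ax, ay), (bx, by) = a.midpoint, b.midpoint
    dist = math.hypot(bx - ax, by - ay)
    proximity = 1.0 / (1.0 + dist / ((la + lb) / 2))
    # Orientation cue: 1 for parallel segments, 0 for perpendicular ones.
    orientation = abs(math.cos(a.angle - b.angle))
    # Overlap cue: project b onto the direction of a and measure the
    # shared extent relative to the shorter segment.
    ux, uy = (a.x2 - a.x1) / la, (a.y2 - a.y1) / la
    t1 = (b.x1 - a.x1) * ux + (b.y1 - a.y1) * uy
    t2 = (b.x2 - a.x1) * ux + (b.y2 - a.y1) * uy
    shared = max(0.0, min(la, max(t1, t2)) - max(0.0, min(t1, t2)))
    overlap = shared / min(la, lb)
    # Assumption: the four cues are weighted equally.
    return (length_sim + proximity + orientation + overlap) / 4


def group_pairs(segments, threshold=0.5):
    """Greedily pair segments, best-scoring pairs first (an assumption)."""
    scored = sorted(
        ((affinity(segments[i], segments[j]), i, j)
         for i, j in combinations(range(len(segments)), 2)),
        reverse=True,
    )
    used, pairs = set(), []
    for score, i, j in scored:
        if score >= threshold and i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs


# Two horizontal neighbours and two vertical neighbours far away.
segments = [
    Segment(0, 0, 10, 0),
    Segment(0, 1, 10, 1),
    Segment(50, 50, 50, 53),
    Segment(51, 50, 51, 53),
]
print(group_pairs(segments))  # [(0, 1), (2, 3)]
```

In this toy input the parallel, nearby, equal-length segments pair up, while pairs that mix a horizontal and a vertical segment score low on every cue and stay below the threshold.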
Challenges: Comparing and querying images requires software with abilities similar to those of a human operator. Colours, textures, and interest points in the images are not enough, because two chairs, for example, might differ in colour, dimensions and proportions. Queries on images and video sequences must be done at the basic semantic level of object identity (e.g. chair, lamp, table, human), while taking into account the context in which the object is found. To this day, no software or algorithm has this functionality. The inability to design such algorithms stems from the difficulty of dividing an image into its constituent objects, of modeling shapes generically, and of abstracting away reflections, textures, shadows and distortions caused by the image capture process.
Applications: Query of databases of manufactured object images, video surveillance, robotic vision systems.
Expected results: The expected results of this project are the proposal and design of a theoretical representation model usable in practice to describe manufactured objects, the design of algorithms for comparing the described objects, the testing of the implemented algorithms, and an evaluation of the results obtained in order to understand the difficulties specific to our approach.
Calendar: May 1999 - December 2003
Support: This work is supported by a postgraduate scholarship from the Natural Sciences and Engineering Research Council of Canada (NSERC) and from le Fonds pour la Formation de Chercheurs et l’Aide à la Recherche (FCAR).
Web reference: /~bilodeau/plastique.html
Last modification: Sep 28 2007 2:01PM by bilodeau


©2002-. Computer Vision and Systems Laboratory. All rights reserved