Getting computer vision systems to recognise reality

Enabling cognitive computer vision systems to emulate human capabilities is the driving force behind the VAMPIRE project, which will demonstrate object localisation via hybrid tracking methods, context-aware scene augmentation and interactive object learning at the IST 2004 event.

To date, computer vision systems have been unable to emulate the full capabilities of the human visual system. The human eye-brain combination has proved able to categorise previously unseen objects with ease, using background knowledge and context. We recognise a pig as a pig because of the shape of its body and because we see it in a farmyard or field.

Context and background knowledge essential

The VAMPIRE project, however, seeks to enable cognitive computer vision systems to develop similar capabilities. Project participants are working on the thesis that learning and cognitive capabilities in vision systems cannot be realised without Visual Active Memory (VAM) processes, which provide the context and background knowledge needed to learn and categorise objects and behaviours in a dynamically changing environment.

The aim of VAMPIRE (due to complete in July 2005) is to develop and test the hypothesis that Visual Active Memory is instrumental to cognition. One test scenario uses two static cameras to provide visual input for an action recognition system in an office environment. The resulting algorithms for object and action recognition enable the system to develop a consistent interpretation of the scene.

Other scenarios include locating the position of augmented-reality (AR) users, interactive object learning (a user interactively teaches the system to recognise new objects), and scene augmentation based on visually-detected events (the user touches a book in front of him, whereupon the system displays information about the book within his visual field).

Providing assistance while it learns

What is innovative about the project is that it tightly couples object acquisition and recognition processes while including the human brain in the processing loop. The system aims to gather the knowledge that enables it to learn while simultaneously providing information to the human user. It makes use of a general memory infrastructure that stores visual events, learns new concepts and retrieves past events in order to provide the necessary object categorisation.
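The store, learn and retrieve cycle described above can be illustrated with a deliberately small sketch. It is not the project's memory infrastructure: the class names, the feature vectors, the nearest-neighbour matching and the `context_bonus` parameter are all assumptions made for illustration. It shows only how a remembered context might tip the categorisation of an otherwise ambiguous object.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import List, Optional, Tuple


@dataclass
class VisualEvent:
    """One observation: a feature vector plus the context it was seen in."""
    features: Tuple[float, ...]
    context: str
    label: Optional[str] = None  # None until a user teaches the system a name


@dataclass
class ActiveMemory:
    """Toy memory (hypothetical): stores events and categorises new ones."""
    events: List[VisualEvent] = field(default_factory=list)

    def store(self, event: VisualEvent) -> None:
        self.events.append(event)

    def teach(self, event: VisualEvent, label: str) -> None:
        """Interactive learning: the user names an object and it is remembered."""
        event.label = label
        self.store(event)

    def categorise(self, event: VisualEvent,
                   context_bonus: float = 0.5) -> Optional[str]:
        """Return the label of the closest labelled memory, favouring
        memories that were recorded in the same context."""
        best_label, best_score = None, float("inf")
        for past in self.events:
            if past.label is None:
                continue
            score = sqrt(sum((a - b) ** 2
                             for a, b in zip(event.features, past.features)))
            if past.context == event.context:
                score -= context_bonus  # matching context makes a match likelier
            if score < best_score:
                best_label, best_score = past.label, score
        return best_label


# Two taught objects that look almost alike but were seen in different places.
memory = ActiveMemory()
memory.teach(VisualEvent((1.0, 0.0), "farmyard"), "pig")
memory.teach(VisualEvent((0.9, 0.1), "office"), "stapler")

# A new observation halfway between them: the context decides.
guess_on_farm = memory.categorise(VisualEvent((0.95, 0.05), "farmyard"))
guess_in_office = memory.categorise(VisualEvent((0.95, 0.05), "office"))
```

In this toy setting the same visual features are categorised differently depending on where they are seen, which is the role the article attributes to context and background knowledge.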

VAMPIRE has also developed several innovations in computer vision, such as real-time object tracking, hybrid tracking that integrates inertial and visual cues, use of attention cues, acquisition of object models from a very limited range of example images, and categorisation from contextual reasoning.
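As a rough illustration of what integrating inertial and visual cues can mean, the sketch below uses a simple complementary filter for a single heading angle. The article does not say which fusion method the project uses, so the technique, the function name `fuse_heading` and the `visual_weight` parameter are all assumptions. Angle wrap-around is ignored to keep the example short.

```python
from typing import Optional


def fuse_heading(previous: float, gyro_rate: float, dt: float,
                 visual_heading: Optional[float],
                 visual_weight: float = 0.1) -> float:
    """Complementary-filter sketch (hypothetical, one heading angle in radians).

    The inertial sensor predicts the new heading by integrating the rotation
    rate; a visual measurement, when one is available, pulls the prediction
    back towards an absolute reference to limit drift.
    """
    predicted = previous + gyro_rate * dt
    if visual_heading is None:
        # No visual fix this frame: rely on the inertial prediction alone.
        return predicted
    return (1.0 - visual_weight) * predicted + visual_weight * visual_heading


# Inertial-only update, then an update corrected by a visual measurement.
inertial_only = fuse_heading(0.0, gyro_rate=1.0, dt=0.5, visual_heading=None)
corrected = fuse_heading(0.0, gyro_rate=1.0, dt=0.5,
                         visual_heading=1.0, visual_weight=0.5)
```

The appeal of such a scheme is that the inertial cue responds quickly between camera frames, while the visual cue keeps accumulated drift in check.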

First mobile demonstrators on show

The project has already presented its first mobile AR demonstrators, which show object localisation via hybrid tracking methods as well as context-aware scene augmentation and interactive object learning. All these abilities are to be demonstrated on the VAMPIRE project stand at IST 2004. A project-related workshop has also generated positive feedback from industry on several potential application areas, including quality assurance in manufacturing and teaching systems.