A Framework for Visual Information Retrieval
Disciplines
Computer Sciences (100%)
Keywords
Visual Information Retrieval,
Content-based Image Retrieval,
Similarity Measurement,
Content-based Video Retrieval,
MPEG-7,
Media Processing
The project "VizIR - A Framework for Visual Information Retrieval", submitted by the Institute of Software Technology and Interactive Systems at the Vienna University of Technology, pursues three major goals.

(1) Integration of past visual information retrieval research results with our current research on similarity modeling, (semantic) feature extraction and query acceleration. We have developed a process-oriented similarity model that is based on psychological insights into human similarity perception as well as on information retrieval methods. The central idea is that visual similarity is more than distance measurement between numerical feature vectors. In feature design, we have developed a concept for semantic feature modeling. Additionally, we are integrating and evaluating the visual MPEG-7 descriptors and developing suitable descriptor schemes for visual querying.

(2) Implementation of an asset framework for content-based retrieval of visual media (image, video). Assets include class frameworks for feature extraction, querying methods and user interface components as well as benchmarking algorithms, test sets and documentation. This asset framework has to be open, portable, extendible and well-documented. Open means that the VizIR outcome (including source code and API documentation) will be regularly released to the public and that interested researchers are invited (in publications, etc.) to use this toolbox. VizIR is portable because it is fully based on Java and the Java SDK. Where platform-dependent packages are used (database, media handling), they are encapsulated in wrapper classes so that these components can be replaced without changing the framework API. VizIR is designed to be extendible: users can add feature extraction methods, query engines, indexing methods, user interface components, etc. Finally, well-documented APIs and components are ensured by using Javadoc and state-of-the-art software development processes and tools.

(3) Cooperation with other visual information retrieval research groups. In VizIR, we are using innovations from other groups, such as the Multimedia Retrieval Markup Language and test sets, and are contributing to other projects (e.g. Benchathlon, an initiative to design benchmarks for visual information retrieval).

The VizIR project started in autumn 2001. With funding from the FWF, we hope to accelerate project progress and maximize the scientific output.
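The extension and wrapper ideas described under goal (2) can be illustrated with a minimal Java sketch. It is not taken from the VizIR code base: the names `FeatureExtractor`, `GrayHistogramExtractor`, `FeatureStore`, `InMemoryFeatureStore` and `ExtensionSketch` are hypothetical and chosen for illustration only. The sketch assumes that a feature extraction method maps gray-level pixel values to a numerical feature vector and that the platform-dependent storage is hidden behind an interface, so that it can be swapped without touching the calling code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical extension point: a user-supplied feature extraction method.
interface FeatureExtractor {
    String name();
    double[] extract(int[] grayPixels); // pixel intensities, expected in 0..255
}

// Example plug-in: normalized gray-level histogram with a fixed number of bins.
final class GrayHistogramExtractor implements FeatureExtractor {
    private final int bins;

    GrayHistogramExtractor(int bins) {
        if (bins < 1 || bins > 256) {
            throw new IllegalArgumentException("bins must be in 1..256");
        }
        this.bins = bins;
    }

    public String name() {
        return "gray-histogram-" + bins;
    }

    public double[] extract(int[] grayPixels) {
        double[] histogram = new double[bins];
        if (grayPixels.length == 0) {
            return histogram; // all zeros for an empty image
        }
        for (int pixel : grayPixels) {
            int clamped = Math.max(0, Math.min(255, pixel));
            histogram[clamped * bins / 256]++;
        }
        for (int i = 0; i < bins; i++) {
            histogram[i] /= grayPixels.length; // normalize so the bins sum to 1
        }
        return histogram;
    }
}

// Hypothetical wrapper interface hiding a platform-dependent storage back-end.
interface FeatureStore {
    void put(String mediaId, String featureName, double[] vector);
    double[] get(String mediaId, String featureName); // null if absent
}

// Stand-in implementation; a database-backed class could replace it unchanged.
final class InMemoryFeatureStore implements FeatureStore {
    private final Map<String, double[]> data = new LinkedHashMap<>();

    public void put(String mediaId, String featureName, double[] vector) {
        data.put(mediaId + "/" + featureName, vector.clone());
    }

    public double[] get(String mediaId, String featureName) {
        double[] vector = data.get(mediaId + "/" + featureName);
        return vector == null ? null : vector.clone();
    }
}

final class ExtensionSketch {
    // Depends only on the two interfaces, never on concrete classes.
    static void index(String mediaId, int[] pixels, FeatureExtractor fx, FeatureStore store) {
        store.put(mediaId, fx.name(), fx.extract(pixels));
    }

    public static void main(String[] args) {
        FeatureStore store = new InMemoryFeatureStore();
        FeatureExtractor fx = new GrayHistogramExtractor(4);
        index("img-1", new int[] {0, 10, 70, 130, 200, 255, 255, 255}, fx, store);
        System.out.println(Arrays.toString(store.get("img-1", fx.name())));
    }
}
```

Because `index` sees only the interfaces, a new extractor or a different storage back-end can be added without changing it, which is the property the text calls extendible and portable.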
The VizIR project aims at perceptual understanding of audio-visual media. The scientific results of the FWF-funded share of the project fall into four major areas: feature modelling, similarity modelling, software design in multimedia information retrieval, and visualisation and interaction.

Feature Modelling. We contributed novel research results in the areas of audiovisual feature extraction, standardisation and evaluation. Firstly, we described a novel paradigm for the modelling of semantic features, in which high-level features are based on hierarchies of low-level features and enriched by domain, user and system knowledge. Furthermore, we introduced a system-based approach for the quantitative analysis of feature vectors. We analysed the content-based visual MPEG-7 descriptors and showed that they suffer from several shortcomings (such as high redundancy and sensitivity to noise).

Similarity Modelling. The VizIR-related work on modelling human visual similarity perception is documented in several publications. We investigated the facets of visual retrieval models from various points of view and analysed the potential and performance of multi-feature approaches. Considerable effort went into the analysis of successful distance (similarity) measurement methods, including powerful psychological perception measures. The results of these studies have been incorporated into the VizIR querying components.

Retrieval Software Design. The introduction of state-of-the-art software design has been one of the major concerns of the VizIR project. We analysed the most prominent retrieval architectures and defined the VizIR software architecture. VizIR promotes a two-layer structure with a back-end for data management and a front-end for visualisation and interaction. All components are loosely coupled and interact through well-defined ports and XML-based service descriptions. Management of media data and feature data is based on a transparent object-oriented database layer. Media handling, delivery and visualisation are based on powerful yet lightweight interfaces.

Visualisation and Interaction. The VizIR project introduces a novel 3D user interface approach to visual information retrieval. Furthermore, we introduced a novel paradigm for video visualisation and interaction. The Video Browser allows for parallel visualisation of the temporal and content structure of video streams. The implementation is based on lightweight components and W3C standards. Additionally, we have published an idea paper that describes a highly experimental retrieval process that utilises rich visualisation techniques, on-line interaction and real-time feedback. Finally, a joint publication with partners from a European project suggests a novel way to exploit content-based retrieval technology for multimedia authoring.
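To make the notion of a multi-feature approach with distance measurement concrete, the following Java sketch combines per-feature Minkowski distances by a weighted sum. This is a generic textbook construction used here as an assumption; it is not the VizIR similarity model, which the text describes as going beyond plain distance measurement. The class and method names (`MultiFeatureDistance`, `minkowski`, `combined`) and the weights are hypothetical.

```java
final class MultiFeatureDistance {
    private MultiFeatureDistance() {}

    // Minkowski distance of order p (p = 1: city block, p = 2: Euclidean).
    static double minkowski(double[] a, double[] b, double p) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        if (p < 1.0) {
            throw new IllegalArgumentException("order p must be >= 1");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    // Weighted linear combination of the distances of several features.
    // query[f] and candidate[f] hold the vectors of feature f for both objects.
    static double combined(double[][] query, double[][] candidate, double[] weights, double p) {
        if (query.length != candidate.length || query.length != weights.length) {
            throw new IllegalArgumentException("feature counts and weights must match");
        }
        double total = 0.0;
        for (int f = 0; f < query.length; f++) {
            total += weights[f] * minkowski(query[f], candidate[f], p);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two hypothetical features per media object: a 2-D and a 1-D vector.
        double[][] query = {{0.0, 0.0}, {1.0}};
        double[][] candidate = {{3.0, 4.0}, {3.0}};
        double[] weights = {0.5, 0.5};
        System.out.println(combined(query, candidate, weights, 2.0));
    }
}
```

In such a scheme the weights express how much each feature contributes to the overall dissimilarity; a smaller combined value means the candidate is ranked as more similar to the query.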
- Technische Universität Wien - 100%
Research Output
- 66 Citations
- 2 Publications
2006
Title: Discrimination and Retrieval of Animal Sounds
DOI: 10.1109/mmmc.2006.1651344
Type: Conference Proceeding Abstract
Author: Mitrovic D
Pages: 1-5
2003
Title: VizIR—a framework for visual information retrieval
DOI: 10.1016/s1045-926x(03)00035-1
Type: Journal Article
Author: Eidenberger H
Journal: Journal of Visual Languages & Computing
Pages: 443-469