Situated Vision to Perceive Object Shape and Affordances
DACH: Austria - Germany - Switzerland
Disciplines
Electrical Engineering, Electronics, Information Engineering (40%); Computer Sciences (60%)
Keywords
Computer Vision, Cognitive Vision, Robotics, Attention, Shape, Affordances
The objective is to provide models and methods to detect, recognize, and categorize the 3D shape of everyday objects and their affordances in homes. The planned innovations are: (1) We propose the Situated Vision paradigm and develop 3D visual perception capabilities from the viewpoint of a robot, its task, and the environment it operates in. (2) We show the generality of the Situated Vision approach by evaluating its performance on different robots at the project partners' sites and in different environments. The Situated Vision approach is inspired by recent work in cognitive science and neuroscience and by interdisciplinary work in EU projects: it fuses qualitative and quantitative cues to extract and group 3D shape elements and relate them to affordance categories. Cognitive mechanisms such as situation-based visual attention and task-oriented visual search let the robot execute primitive actions to exploit the perceived affordances. Perception integrates quantitative and qualitative shape information from multiple 2D and 3D measurements. The shape analysis is used to find instances of semantic 3D concepts, such as providing support to objects, which can in turn be used to find semantic entities and to learn affordance categories. The system will be tested in three typical home scenarios, with clutter in the form of five or more objects around the target object. Four renowned research teams combine their experience to show that the combination of attention (Uni Bonn), categorization (RWTH Aachen), shape perception (TU Wien) and learning (IDIAP) will bring about a big step forward in cognitive robotics.
The scope of the project in the context of research at TU Wien has been the development of algorithms that detect those aspects of objects that contribute to their functionality. These aspects are determined from color and 3D images of the objects. We call these functional aspects affordances or affordance features. For example, when a person observes the concave container part of a cup, this aspect of the cup lets the person determine that the cup can hold or contain solid or liquid substances within its concave cylindrical boundaries. Hence the cup affords contain-ability. In other words, the affordance feature in question is contain-ability. Such information can also be used to label the object under study as a cup, a mug or a pot, as the case might be. Similarly, it is possible to determine the various geometries associated with objects and the functionality these geometries give to the objects. At TU Wien, we have developed computer vision algorithms that take color and 3D images as input, determine the parts of the objects in the scene, ascertain the geometries comprising the parts and finally determine the affordances or affordance features that these geometries provide to the object under study. For the first time in the field of affordances, we have been able to develop a concrete and rigorous approach to defining affordances and generating algorithms that help estimate them. The result of this effort is a database that describes over 200 objects commonly found in indoor environments in terms of their functions (or affordances), using a combination of only 35 affordance feature types. Research on such a function-based approach to recognizing objects is still nascent, and our contributions, made in cooperation with our project partners, should significantly broaden the scope of research in this field.
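The last step of the pipeline described above (from part geometries to affordance features) can be illustrated with a minimal Python sketch. It is not the project's algorithm: the `Part` record, the rule table in `affordance_features`, and the feature names "support-ability" and "grasp-ability" are hypothetical and chosen only for illustration. Only "contain-ability" and the cup example come from the text. The sketch assumes that part segmentation and geometry fitting have already been done.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """A segmented object part with a coarse geometric description (hypothetical)."""
    name: str
    shape: str      # fitted primitive, e.g. "cylinder", "plane", "torus"
    concave: bool   # True if the part encloses an open cavity


def affordance_features(part: Part) -> set[str]:
    """Map one part's geometry to affordance features using hand-written rules.

    The rules are illustrative assumptions, not the project's rule set.
    """
    features: set[str] = set()
    if part.concave and part.shape in {"cylinder", "sphere", "box"}:
        features.add("contain-ability")   # concave container, as in the cup example
    if part.shape == "plane" and not part.concave:
        features.add("support-ability")   # hypothetical: flat surface supports objects
    if part.shape == "torus":
        features.add("grasp-ability")     # hypothetical: handle-like part
    return features


def describe_object(parts: list[Part]) -> set[str]:
    """Union of the affordance features of all parts of one object."""
    result: set[str] = set()
    for part in parts:
        result |= affordance_features(part)
    return result


# A cup modelled as a concave cylindrical body plus a handle.
cup = [Part("body", "cylinder", True), Part("handle", "torus", False)]
print(sorted(describe_object(cup)))  # ['contain-ability', 'grasp-ability']
```

Describing each object as a set of features drawn from a small fixed vocabulary mirrors the idea of covering many objects with a limited number of affordance feature types.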
It should be noted that such a cognitive approach to understanding objects, and the scenes containing them, on the basis of their functionality is key to building robots and artificial intelligence systems that can solve cognitive problems and approach more realistic, human-like performance. Such technology is envisaged for deployment in robotic assistants in both domestic and industrial scenarios. Furthermore, the concept of characterizing objects by their function, and of associating the shape and design of objects with their function, is closely tied to the design philosophy of "form follows function", which has implications not just for product design but also for architecture and civil engineering.
- Technische Universität Wien - 100%
- Bastian Leibe, RWTH Aachen - Germany
- Barbara Caputo, Politecnico di Torino - Italy
Research Output
- 525 Citations
- 20 Publications
- 2014; Title: Attention-Driven Object Detection and Segmentation of Cluttered Table Scenes using 2.5D Symmetry; DOI: 10.1109/icra.2014.6907584; Type: Conference Proceeding Abstract; Author: Potapova E; Pages: 4946-4952
- 2014; Title: 4D Space-Time Mereotopogeometry-Part Connectivity Calculus for Visual Object Representation; DOI: 10.1109/icpr.2014.740; Type: Conference Proceeding Abstract; Author: Varadarajan K; Pages: 4316-4321
- 2013; Title: Interactive object modelling based on piecewise planar surface patches; DOI: 10.1016/j.cviu.2013.01.010; Type: Journal Article; Author: Prankl J; Journal: Computer Vision and Image Understanding; Pages: 718-731
- 2013; Title: Geometric data abstraction using B-splines for range image segmentation; DOI: 10.1109/icra.2013.6630569; Type: Conference Proceeding Abstract; Author: Morwald T; Pages: 148-153
- 2013; Title: Multimodal Cue Integration through Hypotheses Verification for RGB-D Object Recognition and 6DOF Pose Estimation; DOI: 10.1109/icra.2013.6630859; Type: Conference Proceeding Abstract; Author: Aldoma A; Pages: 2104-2111
- 2012; Title: Segmentation of Unknown Objects in Indoor Environments; DOI: 10.1109/iros.2012.6385661; Type: Conference Proceeding Abstract; Author: Richtsfeld A; Pages: 4791-4796
- 2012; Title: Real-time Quadric Fitting for Point Cloud Parametrization using Particle Convergence.; Type: Conference Proceeding Abstract; Author: Varadarajan Km
- 2012; Title: Segmentation of Unknown Objects in Indoor Environments.; Type: Conference Proceeding Abstract; Author: Richtsfeld A
- 2012; Title: A Global Hypotheses Verification Method for 3D Object Recognition; DOI: 10.1007/978-3-642-33712-3_37; Type: Book Chapter; Author: Aldoma A; Publisher: Springer Nature; Pages: 511-524
- 2014; Title: Find my mug: Efficient object search with a mobile robot using semantic segmentation; DOI: 10.48550/arxiv.1404.5765; Type: Preprint; Author: Wolf D
- 2014; Title: Learning of perceptual grouping for object segmentation on RGB-D data; DOI: 10.1016/j.jvcir.2013.04.006; Type: Journal Article; Author: Richtsfeld A; Journal: Journal of Visual Communication and Image Representation; Pages: 64-73
- 2011; Title: Object Part Segmentation and Classification in Range Images for Grasping; DOI: 10.1109/icar.2011.6088647; Type: Conference Proceeding Abstract; Author: Varadarajan K; Pages: 21-27
- 2011; Title: Object Part Segmentation and Classification in Range Images for Grasping.; Type: Conference Proceeding Abstract; Author: Varadarajan Km
- 2011; Title: k-TR: Karmic Tabula Rasa - A Theory of Visual Perception.; Type: Conference Proceeding Abstract; Author: Varadarajan Km; Conference: Conference of the International Society of Psychophysics - ISP, Herzliya, Israel
- 2013; Title: Localizing and Segmenting Objects with 3D Objectness.; Type: Conference Proceeding Abstract; Author: Aldoma Buchaca A; Conference: Computer Vision Winter Workshop (CVWW), 2013
- 2013; Title: Parallel Deep Learning with Suggestive Activation for Object Category Recognition; DOI: 10.1007/978-3-642-39402-7_36; Type: Book Chapter; Author: Varadarajan K; Publisher: Springer Nature; Pages: 354-363
- 2013; Title: Probabilistic Cue Integration for Real-Time Object Pose Tracking; DOI: 10.1007/978-3-642-39402-7_26; Type: Book Chapter; Author: Prankl J; Publisher: Springer Nature; Pages: 254-263
- 2013; Title: AfNet: The Affordance Network; DOI: 10.1007/978-3-642-37331-2_39; Type: Book Chapter; Author: Varadarajan K; Publisher: Springer Nature; Pages: 512-523
- 2013; Title: Gaussian-weighted Jensen–Shannon divergence as a robust fitness function for multi-model fitting; DOI: 10.1007/s00138-013-0513-1; Type: Journal Article; Author: Zhou K; Journal: Machine Vision and Applications; Pages: 1107-1119
- 2013; Title: MRF Guided Anisotropic Depth Diffusion for Kinect Range Image Enhancement; DOI: 10.1007/978-3-642-37484-5_19; Type: Book Chapter; Author: Varadarajan K; Publisher: Springer Nature; Pages: 223-235