Autonomous Learning of the Meaning of Objects
Disciplines
Electrical Engineering, Electronics, Information Engineering (60%); Computer Sciences (40%)
Keywords
Robot learning, Web mining, Object recognition, Ontology, Active segmentation
When working with and for humans, robots and autonomous systems must know about the objects involved in human activities, e.g. the parts and tools in manufacturing, the professional items used in service applications, and the objects of daily life in assisted living. While great progress has been made in object instance and class recognition, a robot is always limited to knowing about the objects it has been trained to recognize. The goal of ALOOF is to enable robots to exploit the vast amount of knowledge on the Web in order to learn about previously unseen objects and to use this knowledge when acting in the real world. We will develop techniques that allow robots to use the Web to learn not just the appearance of new objects, but also their properties, including where they might be found in the robot's environment. To achieve our goal, we will provide a mechanism for translating between the representations robots use in their real-world experience and those found on the Web. Our proposed translation mechanism is a meta-modal representation (i.e. a representation which contains and structures representations from other modalities), composed of meta-modal entities and relations between them. A single entity represents a single object type, and is composed of modal features extracted from robot sensors or the Web. The combined features are linked to the semantic properties associated with each entity. The robot's collection of meta-modal entities is organized into a structured ontology, supporting formal reasoning. This representation is complemented with methods for detecting gaps in the knowledge of the robot (i.e. unknown objects and properties), and for planning how to fill these gaps. As the robot's main source of new knowledge will be the Web, we will also contribute techniques for extracting relevant knowledge from Web resources using novel machine reading and computer vision algorithms.
Our scenario for evaluating project progress consists of an open-ended domestic setting where robots have to find objects. Our measure of progress will be how many knowledge gaps (i.e. situations where the robot has incomplete information about objects) can be resolved autonomously given specific prior knowledge. We will integrate the results on multiple mobile robots, including the MetraLabs SCITOS robot and the home service robot HOBBIT.
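As a rough illustration of the meta-modal representation and the progress measure described above, the following Python sketch models an entity as a bundle of modal features plus semantic properties, flags missing properties as knowledge gaps, and computes the fraction of gaps resolved. All names (`MetaModalEntity`, `REQUIRED_PROPERTIES`, `find_gaps`, `resolved_fraction`) and the example property set are hypothetical and chosen for illustration only; the project's actual data model and ontology are not specified here.

```python
from dataclasses import dataclass, field

# Hypothetical set of semantic properties a robot wants for every object type.
REQUIRED_PROPERTIES = ("appearance", "shape_3d", "typical_location")


@dataclass
class MetaModalEntity:
    """One object type, combining features from several modalities."""
    name: str
    # Modal features keyed by source, e.g. "robot_rgbd" or "web_image".
    modal_features: dict = field(default_factory=dict)
    # Semantic properties linked to the combined features.
    properties: dict = field(default_factory=dict)


def find_gaps(entity, required=REQUIRED_PROPERTIES):
    """Return the required properties the entity does not yet have."""
    return [p for p in required if p not in entity.properties]


def resolved_fraction(gaps_before, gaps_after):
    """Fraction of the initial gaps that were filled (1.0 if there were none)."""
    if not gaps_before:
        return 1.0
    resolved = [g for g in gaps_before if g not in gaps_after]
    return len(resolved) / len(gaps_before)


# Toy example: the robot knows what a mug looks like, but nothing else.
mug = MetaModalEntity(name="mug", properties={"appearance": "web_image_model"})
before = find_gaps(mug)

# Simulate filling one gap with knowledge mined from the Web.
mug.properties["typical_location"] = "kitchen"
after = find_gaps(mug)

print(before, after, resolved_fraction(before, after))
```

In this toy setting a gap is simply a missing dictionary key; a fuller system would presumably also treat entirely unknown object types as gaps and would reason over the ontology rather than a flat property list.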
When working with and for humans, robots and autonomous systems must know about the objects involved in human activities, e.g. the parts and tools in manufacturing, the professional items used in service applications, and the objects of daily life in assisted living. While great progress has been made in object instance and class recognition, a robot is always limited to knowing about the objects it has been trained to recognize. The goal of ALOOF is to enable robots to exploit the vast amount of knowledge on the Web in order to learn about previously unseen objects and to use this knowledge when acting in the real world. We developed techniques that allow robots to use the Web to learn not just the appearance of new objects, but also their properties, including where they might be found in the robot's environment. To achieve our goal, we provided algorithms for mapping between the perceptual representations robots use in their real-world experience and those found on the Web. During the project, TU Wien and the other project partners achieved several results in robot perception and in learning from visual localities across Web and situated data, producing an integrated system in which these components are combined to fill knowledge gaps detected by robot platforms.
Specifically, key results have been achieved in:
- deep learning architectures that learn to leverage global and local properties of Web and situated visual data in an unsupervised, data-driven fashion, bridging the gap between the visual representations in these two domains;
- a coherent, integrated framework for robotic knowledge gap detection that triggers semantic and perceptual Web mining;
- perceptual and semantic algorithms for automatic Web mining, able to create very large-scale collections of structured, annotated data consisting of text, RGB images and depth maps without any need for manual annotators;
- algorithms for the extraction of semantic information from scene segmentation with 3D Entangled Forests (3DEF);
- a robot system integrating the components highlighted above, able to autonomously fill situated knowledge gaps on the basis of Web information it has collected autonomously.
The results achieved in the project match the objectives planned in the technical annex, i.e., robot systems able to autonomously learn about unknown entities: their object type, their 3D shape and visual appearance, what other objects are likely to be close to them, and the qualitative spatial relations between them.
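One of the learned properties mentioned above is which other objects are likely to be close to a given object. The sketch below shows one simple way such a likelihood could be estimated, by counting how often object labels occur together in the same scene or text snippet. This is an illustrative assumption, not the method used in the project; the function names and the toy observations are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(observations):
    """Count, for each object, how often every other object appears with it.

    `observations` is a list of label lists, one per scene or text snippet.
    """
    counts = defaultdict(Counter)
    for objects in observations:
        unique = set(objects)
        for a in unique:
            for b in unique:
                if a != b:
                    counts[a][b] += 1
    return counts


def likely_neighbours(counts, target, top_k=2):
    """Return up to `top_k` (object, relative frequency) pairs for `target`."""
    total = sum(counts[target].values())
    if total == 0:
        return []
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts[target].items(), key=lambda kv: (-kv[1], kv[0]))
    return [(obj, n / total) for obj, n in ranked[:top_k]]


# Toy observations standing in for mined Web data or robot scene labels.
observations = [
    ["mug", "kettle", "spoon"],
    ["mug", "kettle"],
    ["mug", "keyboard"],
    ["keyboard", "monitor"],
]
counts = cooccurrence_counts(observations)
neighbours = likely_neighbours(counts, "mug")
print(neighbours)
```

Relative frequencies of this kind only capture co-location; qualitative spatial relations such as "on" or "next to" would need geometric information from the robot's 3D perception.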
- Technische Universität Wien - 100%
- Barbara Caputo, Politecnico di Torino - Italy
- Nick Hawes, The University of Birmingham
Research Output
- 97 Citations
- 3 Publications
2018
- Title: User Experience Results of Setting Free a Service Robot for Older Adults at Home; DOI: 10.5772/intechopen.70453; Type: Book Chapter; Author: Vincze M; Publisher: IntechOpen
2019
- Title: An Empirical Evaluation of Ten Depth Cameras; DOI: 10.1109/mra.2018.2852795; Type: Journal Article; Author: Halmetschlager-Funek G; Journal: IEEE Robotics & Automation Magazine; Pages: 67-77
2018
- Title: Recognizing Objects in-the-Wild: Where do we Stand?; DOI: 10.1109/icra.2018.8460985; Type: Conference Proceeding Abstract; Author: Loghmani M; Pages: 2170-2177