ATLANTIS: ArTificial LANguage undersTanding In robotS
ERA-Net: CHIST-ERA
Disciplines
Electrical Engineering, Electronics, Information Engineering (20%); Computer Sciences (50%); Linguistics and Literature (30%)
Keywords
Language Grounding, Multi-Modal Language Learning, Construction Grammar, Multi-Modal Language Understanding, Ontogenetic Ritualisation, Multi-Modal Object Reference
Abstract
ATLANTIS attempts to understand and model the very first stages of grounded language learning, as seen in children up to the age of three: how pointing and other symbolic gestures emerge from the ontogenetic ritualization of instrumental actions, how words are learned very rapidly in contextualized language games, and how the first grammatical constructions emerge from concrete sentences. This requires a global, computational theory of symbolic development that informs us about which forces motivate language development, which strategies learners and caregivers exploit in their interactions to arrive at more complex compositional meanings, how new grammatical structures and novel interaction patterns are formed, and how the multitude of developmental pathways observed in humans leads to a full system of multi-modal communication skills. This ambitious aim is feasible because of recent, very significant advances in humanoid robotics and in the development of sensory-motor competence; the time is ripe to push all this to a higher level of symbolic intelligence, going beyond simple sensory-motor loops and pattern-based intelligence towards grounded semantics and incremental, long-term, autonomous language learning.
Final Report
The overall goal of the multinational project ATLANTIS was to understand and model the very first stages of grounded language learning, as seen in children up to the age of three. This includes how pointing and other symbolic gestures emerge from the ontogenetic ritualization of instrumental actions, how words are learned very rapidly in contextualized language games, and how the first grammatical constructions emerge from concrete sentences. This requires studying questions such as: which strategies are exploited in learner-caregiver interactions to arrive at more complex compositional meanings, how new grammatical structures and novel interaction patterns are formed, and how the multitude of developmental pathways observed in humans leads to a full system of multi-modal communication skills. In this way, the project enables future robots to learn, from humans in everyday situations, basic elements of natural language and of the relations between natural language utterances and concrete situations. Moreover, computational modelling of theoretical insights is an appropriate means of validating them, which in turn helps refine the initial hypotheses and, as an iterative process, supports scientific understanding.

The Austrian sub-project was concerned with investigating and modelling how humans refer to objects in the world when communicating. Imagine somebody saying "Give me the green one over there". What information does the addressee need in order to identify which object the speaker is referring to? At the very least, there must be an indication of where "over there" is, for instance a pointing gesture, a directional eye gaze, or a head movement by the speaker. Once the direction in which to look for the object in question is clear, the addressee may look for a green object, and if there is only one, the addressee will assume it is the one the speaker referred to (a sketch of this resolution step is given below). To carry out this kind of complex reasoning, an infant first needs to have learned mappings between objects, locations and actions on the one hand and words on the other. Moreover, it needs to understand that pointing and looking can have communicative functions. All of this comes naturally to humans, but it is a tremendous challenge to model on a computer, and understanding the underlying mechanisms requires a good deal of basic research. The funding provided by the FWF enabled the Austrian Research Institute for Artificial Intelligence to investigate and develop models for multi-modal object reference resolution based on insights from human tutoring situations, and to implement a component that enables a robot to learn words for objects during such tutoring situations.
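To make the resolution step concrete, here is a minimal sketch in Python of how two cues, a pointing direction and a colour word, can be intersected to pick out a unique referent. This is an illustration under simplifying assumptions (a 2D scene, a fixed angular tolerance for the pointing cone), not the project's actual implementation; names such as SceneObject and resolve_reference are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    color: str
    position: tuple  # (x, y) in metres, in a shared scene frame

def bearing(origin, target):
    """Direction from origin to target, in radians."""
    return math.atan2(target[1] - origin[1], target[0] - origin[0])

def resolve_reference(objects, point_origin, point_direction, color,
                      tolerance=math.radians(20)):
    """Intersect two cues: the pointing cone and the colour word.

    Returns the single matching object, or None if the reference
    remains ambiguous or no object matches both cues."""
    candidates = []
    for obj in objects:
        diff = bearing(point_origin, obj.position) - point_direction
        diff = (diff + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi]
        if abs(diff) <= tolerance and obj.color == color:
            candidates.append(obj)
    return candidates[0] if len(candidates) == 1 else None

# "Give me the green one over there", with the speaker pointing
# roughly towards the two nearby blocks.
scene = [
    SceneObject("block-1", "green", (1.0, 0.2)),
    SceneObject("block-2", "red",   (1.0, 0.3)),
    SceneObject("block-3", "green", (-0.5, 1.0)),
]
speaker = (0.0, 0.0)
target = resolve_reference(scene, speaker,
                           point_direction=bearing(speaker, (1.0, 0.25)),
                           color="green")
print(target.name if target else "ambiguous")  # -> block-1
```

Note that in this example neither cue alone suffices: the colour word matches two objects and the pointing cone contains two objects, but their intersection is unique, which is exactly the reasoning described above.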
Project Participants
- Ann Nowé, Vrije Universiteit Brussel - Belgium
- Luc Steels, Vrije Universiteit Brussel - Belgium
- Thierry Poibeau, École Normale Supérieure - France
- Remi van Trijp, Sony CSL Paris - France
Research Output
- 4 Citations
- 1 Publication
2018
- Hirschmanner M: Grounded Word Learning on a Pepper Robot. Conference Proceeding, pp. 351-352. DOI: 10.1145/3267851.3267903