Obugrian Database: Text Corpora and Dictionaries
DACH: Österreich - Deutschland - Schweiz
Disciplines
Linguistics and Literature (100%)
Keywords
Endangered Finno-Ugric Languages,
Morphological and Syntactic Analysis,
Ob-Ugric Languages and Dialects,
Concordance Dictionaries,
Text Corpus,
Information Portal
The goal of this joint project of the Universities of Munich and Vienna (DFG-FWF) is to compile comprehensive databases of undescribed and under-described dialects of two related, endangered Ob-Ugric (< Ugric < Finno-Ugric) languages, Khanty (Ostyak) and Mansi (Vogul). The planned dialect modules are Western Mansi, which is extinct, and Yugan Khanty; they will enrich the existing resources on Ob-Ugric languages created within the framework of the ESF EuroBABEL project (containing materials on Kazym Khanty and Surgut Khanty as well as Northern Mansi and, in part, Eastern Mansi). The modules will include all existing published texts of the extinct dialect as well as texts and fieldwork materials for Yugan Khanty. As the texts are published in different transcriptions, the first step will be their phonological analysis and transliteration into the IPA. The text corpus will be translated into English (for Yugan Khanty, also into German) and fully glossed in FLEx, so that dictionaries of lexemes and morphemes with a concordance function can be compiled for each dialect; the text corpus will additionally be provided with syntactic annotations. On the basis of allomorphic analysis, paradigms and position-slot models for the parts of speech will be compiled, resulting in grammatical descriptions that have been absent until now. For Yugan Khanty, phonetic analysis will be carried out on the basis of fieldwork recordings (audio and video, with phonetic tagging in ELAN). Compiling bilingual dictionaries for these two dialects will also enlarge the base of the Ob-Ugric onomasiological dictionary; for etymological information, lexicon entries will be linked with the Uralonet database. An additional goal is to share the analysed data with The Language Archive of the Max Planck Institute for Psycholinguistics (Nijmegen, The Netherlands).
This international research project was a joint DFG/FWF project of the University of Vienna and the Ludwig-Maximilians-Universität of Munich, running from July 2014 to June 2017.
The project worked on the two Ob-Ugric languages Mansi (above all in Vienna) and Khanty (solely in Munich). The speakers of Mansi are a minority people in the Russian Federation, living in northwestern Siberia. According to the 2010 census they number 12,269 people, of whom only a part speak Mansi as their mother tongue. The language falls into several dialect groups, most of which are already extinct.
The task of the Viennese team was to compile and process a corpus of Western Mansi. This extinct branch of the Mansi language had five dialects. Although the field of Finno-Ugric linguistics has two large text collections at its disposal, both of which contain Western Mansi material, little attention had been paid to these dialects. The two text collections contain texts from four of the five Western Mansi dialects in differing amounts. At the beginning of the project the team set itself the task of describing only one dialect, the Pelym dialect, but by the end of the project all the Western Mansi dialects contained in the collections had been dealt with.
All the texts were transcribed in IPA (International Phonetic Alphabet), digitized, analyzed morphologically (with software developed specifically for linguistic purposes) and translated into English. Annotations were written on problematic or uncertain text passages, and a corpus-related lexicon was created. For each of the three Western Mansi dialects with a sufficient number of texts, a detailed grammar based on the analysis of the texts was written. The western branch of Mansi was thus described to this extent for the very first time, with 106 texts from four dialects analyzed and processed in the manner described above.
Through this work the Viennese team produced a Ph.D. dissertation and a Master's thesis, both of which were submitted when the project ended.
The University of Munich was responsible for the technical implementation and carried out all work relating to it. The results of the project were made available to the public on the project homepage.
- Universität Wien - 100%