Models of phoneme and grapheme frequencies
Research disciplines
Mathematics (50%); Linguistics and Literature (50%)
Keywords
Quantitative Linguistics,
Mathematical Models,
Phoneme Frequencies,
Grapheme Frequencies,
Probability Distributions
The proposed project is interdisciplinary, joining mathematics (mainly probability theory and statistics) and linguistics. It focuses on phonemes and graphemes and on their frequency behavior; we are interested in phoneme and grapheme frequencies as systems, not in the frequencies of individual units. The first necessary step is to derive well-fitting mathematical models (usually given by discrete probability distributions) for grapheme and phoneme frequencies. Our ambition, however, is to arrive not only at a description of the phenomena but also at their explanation. Hence, we shall also aim at interpreting the model parameters and at formulating and testing hypotheses. We suppose that a general (not language-specific) model exists. The proposed research program combines theoretical investigation with empirical work and computer programming. Theoretical, as yet unsolved, mathematical and statistical problems relevant to the project include sample size problems, the development of new distribution models that take into account the steps almost always observed in phoneme and grapheme frequencies, parameter estimation in these models, and the study of their mutual relations. The empirical work consists mainly of creating or completing databases for Slavic languages, developing tools for mining data from these databases, and preparing samples for analysis. Given the very large sample sizes and the relatively demanding numerical procedures for parameter estimation, the use of computers is unavoidable. We shall use existing software and write new software where necessary. Cooperation among the members of the research team, a blend of linguists who are competent users of mathematical and statistical methods and a mathematician with experience in linguistic research, gives the project a good chance of achieving its goals.
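As a rough illustration of what fitting a discrete probability distribution to ranked grapheme frequencies can involve, the following minimal sketch counts graphemes in a text, orders the counts by rank, and estimates the single parameter of a right-truncated geometric distribution by maximum likelihood. The choice of this distribution is an assumption made only for the example; it is not the model the project proposes, and the function names `rank_frequencies` and `fit_truncated_geometric` are hypothetical.

```python
from collections import Counter
import math


def rank_frequencies(text):
    """Return grapheme counts in descending order (rank 1 first).

    Simplifying assumption: every alphabetic character is one grapheme.
    """
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    return sorted(counts.values(), reverse=True)


def fit_truncated_geometric(freqs, iterations=200):
    """Maximum-likelihood estimate of q for the right-truncated geometric
    distribution P(r) = (1 - q) * q**(r - 1) / (1 - q**n), r = 1..n,
    where n is the number of ranks and freqs[r - 1] is the count at rank r.
    """
    n = len(freqs)

    def log_likelihood(q):
        norm = math.log(1.0 - q) - math.log(1.0 - q ** n)
        return sum(f * ((r - 1) * math.log(q) + norm)
                   for r, f in enumerate(freqs, start=1))

    # The log-likelihood is unimodal in q, so a ternary search on (0, 1)
    # is enough for this sketch.
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_likelihood(m1) < log_likelihood(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


if __name__ == "__main__":
    freqs = rank_frequencies("a small sample text, used only for illustration")
    print(freqs, fit_truncated_geometric(freqs))
```

A one-parameter model of this kind is convenient for demonstration because the estimate can be checked by hand, but it cannot represent the steps in the frequency sequence mentioned above; models with that property would need additional parameters and more careful estimation.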
- Universität Graz - 100%