Information-Theoretic Markov Aggregation
Information-Theoretic Markov Aggregation
Disciplines
Electrical Engineering, Electronics, Information Engineering (40%); Computer Sciences (40%); Mathematics (20%)
Keywords
-
(hidden) Markov models,
State Space Aggregation,
Information Theory,
Lumpability,
Information Bottleneck,
Coarse Graining
Markov models are important mathematical models for random processes and are used in many scientific disciplines. They act as tools in communications, genetics, systems biology, speech communication, and machine learning. In some of these application areas, however, the resulting models are too large to be useful. On the one hand it might become impossible to simulate a given model efficiently, on the other hand the available data might not suffice to determine the model parameters. Hence, in these scientific disciplines it is necessary to simplify the mathematical models. The goal of this research stay is to aggregate Markov models with information-theoretic methods. Aggregation means that states of the original model are grouped together to a single state in speech communication, for example, the words research, apply, and hope could be grouped together to verbs. While the simplification of Markov models became increasingly popular in the last few years, information-theoretic methods are still used sparingly. But what are information-theoretic methods? Information theory deals with the transmission of information and its mathematical fundamentals. To aggregate Markov models with information-theoretic methods means, loosely speaking, to simplify the model while preserving as much information as possible. We will not only develop theoretical results during the research stay, but we will also design algorithms which simplify a given Markov model. The application areas of such algorithms are manifold: In concrete terms, we hope to find a way to simplify the Viterbi algorithm, an important algorithm in communications. Aside from that we will apply our methods in speech communication (especially during the return phase in Austria). Who knows, maybe some of our results will even find their way into the source code of Siri, Google Now, or Microsoft Cortana!
How can you simplify a complex mathematical model without losing too much information? This is precisely the question that was addressed during this Schroedinger Fellowship. We placed our focus on Markov models, which are popular models for sequences of random objects in fields as diverse as language, biology, speech communication, and machine learning. Specifically, the goal of this Schroedinger Fellowship was to aggregate Markov models with information- theoretic methods. Aggregation in this context means that objects are grouped together to clusters - in natural language processing, for example, the objects ro research, to apply, and to hope would be grouped together to the cluster verbs. While aggregation is an old idea, combining it with information- theoretic methods has become popular only recently. But what are information-theoretic methods? Information theory deals with the transmission of information and its fundamental limits. To aggregate a Markov model with information-theoretic methods means, loosely speaking, to simplify the mathematical model while preserving as much information as possible. Although the project - the name says it all - was of a mainly theoretical nature, we have found several interesting practical applications of the developed theory and algorithms. For example, Markov aggregation techniques can be used to cluster documents, thus revealing topics and groups of words that indicate them. Another application is the grouping of movies based on user ratings. The results of this grouping can subsequently be used to recommend movies to users based on the ratings of other users with similar interests. Finally, Markov aggregation can be used to detect communities in dramatic plays, connecting this Schroedinger Fellowship to literary network analysis, an exciting field of research in the digital humanities. Who knows? Perhaps it is such methods that will tell whether Romeo and Juliet`s marriage would have lasted, had their story ended differently.
- Technische Universität München - 100%
- Technische Universität Graz - 100%
Research Output
- 97 Citations
- 13 Publications
-
2022
Title Understanding Neural Networks and Individual Neuron Importance via Information-Ordered Cumulative Ablation DOI 10.1109/tnnls.2021.3088685 Type Journal Article Author Amjad R Journal IEEE Transactions on Neural Networks and Learning Systems Pages 7842-7852 Link Publication -
2016
Title Information-Theoretic Analysis of Memoryless Deterministic Systems DOI 10.3390/e18110410 Type Journal Article Author Geiger B Journal Entropy Pages 410 Link Publication -
2016
Title Graph-Based Lossless Markov Lumpings DOI 10.1109/isit.2016.7541856 Type Conference Proceeding Abstract Author Geiger B Pages 3033-3037 Link Publication -
2016
Title Greedy Algorithms for Optimal Distribution Approximation DOI 10.3390/e18070262 Type Journal Article Author Geiger B Journal Entropy Pages 262 Link Publication -
2018
Title The Fractality of Polar and Reed–Muller Codes †DOI 10.3390/e20010070 Type Journal Article Author Geiger B Journal Entropy Pages 70 Link Publication -
2018
Title Co-Clustering via Information-Theoretic Markov Aggregation DOI 10.1109/tkde.2018.2846252 Type Journal Article Author Blchl C Journal IEEE Transactions on Knowledge and Data Engineering Pages 720-732 Link Publication -
2018
Title A Rate-Distortion Approach to Caching DOI 10.1109/tit.2017.2768058 Type Journal Article Author Timo R Journal IEEE Transactions on Information Theory Pages 1957-1976 Link Publication -
2018
Title Information Loss in Deterministic Signal Processing Systems DOI 10.1007/978-3-319-59533-7 Type Book Author Geiger B Publisher Springer Nature -
2017
Title A sufficient condition for a unique invariant distribution of a higher-order Markov chain DOI 10.1016/j.spl.2017.07.006 Type Journal Article Author Geiger B Journal Statistics & Probability Letters Pages 49-56 Link Publication -
2017
Title Semi-supervised cross-entropy clustering with information bottleneck constraint DOI 10.1016/j.ins.2017.07.016 Type Journal Article Author Smieja M Journal Information Sciences Pages 254-271 Link Publication -
2017
Title On the Information Dimension Rate of Stochastic Processes DOI 10.1109/isit.2017.8006656 Type Conference Proceeding Abstract Author Geiger B Pages 888-892 Link Publication -
2017
Title Secret-key Binding to Physical Identifiers with Reliability Guarantees DOI 10.1109/icc.2017.7996732 Type Conference Proceeding Abstract Author Günlü O Pages 1-6 -
2017
Title Divergence Scaling of Fixed-Length, Binary-Output, One-to-One Distribution Matching DOI 10.1109/isit.2017.8007095 Type Conference Proceeding Abstract Author Schulte P Pages 3075-3079 Link Publication