DACH: Austria - Germany - Switzerland
Disciplines
Biology (30%); Computer Sciences (70%)
Keywords
Genomics, Phylogeny, Sequence Evolution, Molecular Systematics, Evolution
We propose to study two phenomena that can leave us lost in tree space during phylogenetic inference. The first is discordance between gene trees and species trees, which requires reconciliation; the second is the existence of terraces in tree space, which requires further scrutiny. Our overarching goal is to understand why we get lost in tree space and how we can navigate it in a more targeted and computationally efficient manner. The proposed projects build upon the highly successful collaboration between the two labs over the two preceding funding periods, and on the experience accumulated by the junior researchers funded through the preceding grant. More specifically, we will develop methods and algorithms, and make them available as open-source tools, to (i) sample, enumerate, and summarize trees residing on a terrace in tree space, (ii) search tree space and evaluate tree topologies more efficiently in the presence of terraces under maximum likelihood and parsimony, and (iii) conduct scalable, efficient, and accurate gene tree/species tree reconciliations.
Biological significance: The biological significance of our work is underlined by the fact that only a handful of easy-to-use, likelihood-based gene tree/species tree reconciliation tools exist. Although only a prototype implementation of GeneRax is available at present, and it lacks numerous desirable features, it is already being used by some early adopters. Given the large user base of RAxML-NG and IQ-TREE, every improvement in their search efficiency can save thousands of CPU hours. In addition, as shown by Dobrin, Zwickl, and Sanderson (2018), a plethora of current phylogenomic datasets contain terraces. In other words, this is not an exotic theoretical property of search spaces but a real problem with empirical data that needs to be addressed and better studied. If our initial findings on quasi-terraces are confirmed (see WP 1.1), terraces will affect a substantially larger fraction of empirical phylogenetic analyses, because the occurrence of terrace-like structures will not depend on a specific branch linkage model. Because terraces occur in the presence of missing sequences, one could assume that sequencing complete genomes would solve the problem entirely. However, this is not the case. Biological diversity and gene deletions mean that not all genes are present in all organisms, so missing sequences are an inherent property of large phylogenomic alignments. Missing data therefore remains an important issue that phylogenomic software must account for systematically. This is especially important if we attempt to resolve the Tree of Life, which comprises extremely diverse species whose genomes contain different collections of genes.
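To make the link between missing sequences and terraces concrete, the following minimal Python sketch checks whether two tree topologies induce the same subtree on every partition's taxon set. Under parsimony, or under likelihood with branch lengths unlinked across partitions, such trees receive the same score and therefore lie on one terrace. This is an illustration only, not code from GeneRax, RAxML-NG, IQ-TREE, or Gentrius; the nested-tuple tree encoding, the function names, and the five-taxon example are assumptions made for this sketch.

```python
def clusters(tree):
    """Return (leaf set of `tree`, list of leaf sets of all its subtrees).

    A tree is a taxon name (str) or a tuple of child trees (assumed encoding).
    """
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clusters = clusters(child)
        leaves |= child_leaves
        found.extend(child_clusters)
    found.append(leaves)
    return leaves, found


def induced_splits(tree, taxa):
    """Non-trivial bipartitions of the subtree induced by `taxa`.

    Assumes `taxa` is a subset of the leaves of `tree`.
    """
    taxa = frozenset(taxa)
    _, found = clusters(tree)
    splits = set()
    for cluster in found:
        side = cluster & taxa
        other = taxa - side
        # Splits with fewer than two taxa on a side carry no topological signal.
        if len(side) >= 2 and len(other) >= 2:
            splits.add(frozenset([side, other]))
    return splits


def on_same_terrace(tree_a, tree_b, partitions):
    """True if both trees induce identical subtrees on every partition."""
    return all(
        induced_splits(tree_a, taxa) == induced_splits(tree_b, taxa)
        for taxa in partitions
    )


# Hypothetical example: taxon D is missing from gene 2, taxon E from gene 1.
partitions = [{"A", "B", "C", "D"}, {"A", "B", "C", "E"}]
tree_1 = (("A", "B"), ("C", ("D", "E")))
tree_2 = (("A", "B"), ("D", ("C", "E")))
tree_3 = (("A", "C"), ("B", ("D", "E")))

print(on_same_terrace(tree_1, tree_2, partitions))  # same induced subtrees
print(on_same_terrace(tree_1, tree_3, partitions))  # different induced subtrees
```

Because D and E never occur in the same partition, the data cannot distinguish tree_1 from tree_2, whereas tree_3 differs already on the first partition. With complete taxon coverage in any one partition, the induced subtree on that partition would be the full tree and the terrace would shrink to a single topology.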
We have developed bioinformatics methods to better understand how phylogenetic (family) trees are reconstructed. A particular focus was the problem of multiple equally good solutions, which we were able to analyze with a specially developed software tool, even for large data sets. We also examined potential applications of neural networks in phylogenetics. We showed that neural networks can efficiently estimate models of sequence evolution and can help distinguish between alternative family trees better than previous methods. This opens up a new field of application for future research.
- Universität Wien - 100%
Research Output
- 9951 Citations
- 9 Publications
- 6 Datasets & models
- 1 Software
-
2025
Title: When the Past Fades: Detecting Phylogenetic Signal with SatuTe; DOI: 10.1093/molbev/msaf090; Type: Journal Article; Author: Manuel C; Journal: Molecular Biology and Evolution
2024
Title: Learning From an Artificial Neural Network in Phylogenetics; DOI: 10.1109/tcbb.2024.3352268; Type: Journal Article; Author: Leuchtenberger AF; Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics; Pages: 278-288
2024
Title: Gentrius: Generating Trees Compatible With a Set of Unrooted Subtrees and its Application to Phylogenetic Terraces; DOI: 10.1093/molbev/msae219; Type: Journal Article; Author: Chernomor O; Journal: Molecular Biology and Evolution
2024
Title: Training and Interpretation of Artificial Neural Networks in Phylogenetics; Type: PhD Thesis; Author: Alina Leuchtenberger
2022
Title: Molecular archaeology of human cognitive traits; DOI: 10.1016/j.celrep.2022.111287; Type: Journal Article; Author: Kaczanowska J; Journal: Cell Reports; Pages: 111287
2020
Title: IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era; DOI: 10.1093/molbev/msaa015; Type: Journal Article; Author: Minh B; Journal: Molecular Biology and Evolution; Pages: 1530-1534
2023
Title: Parallel Inference of Phylogenetic Stands with Gentrius; Type: Conference Proceeding Abstract; Author: Anastasis Togkousidis; Conference: 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW); Pages: 139-143
2023
Title: ModelRevelator: Fast phylogenetic model estimation via deep learning; DOI: 10.1016/j.ympev.2023.107905; Type: Journal Article; Author: Burgstaller-Muehlbacher S; Journal: Molecular Phylogenetics and Evolution; Pages: 107905
2020
Title: Distinguishing Felsenstein Zone from Farris Zone Using Neural Networks; DOI: 10.1093/molbev/msaa164; Type: Journal Article; Author: Leuchtenberger A; Journal: Molecular Biology and Evolution; Pages: 3632-3641
-
2023
Title: Gentrius Parallelization; Type: Data analysis technique; Public Access
2023
Title: Gentrius Algorithm; Type: Data analysis technique; Public Access
2023
Title: ModelRevelator; Type: Data analysis technique; Public Access
2020
Title: FarFelDiscerner; Type: Computer model/algorithm; Public Access
2024
Title: SatuTe; Type: Data analysis technique; Public Access
2024
Title: EvoNAPS; Type: Database/Collection of data; Public Access