Gaussian Graphical Models: Theory and Applications
Disciplines
Mathematics (100%)
Keywords
Algebraic statistics,
Causal inference,
Gaussian graphical models,
Occam's razor,
Maximum likelihood estimation,
Convex optimization
My research goals are to study graphical models in statistics and develop practical algorithms that allow application of these models to scientifically important novel problems. The proposed project lies at the interface of mathematical statistics, convex optimization and applied algebraic geometry, three areas which have developed a strong interplay in recent years. My goals over the next six years are to apply my expertise in graphical models to expand and deepen the connections between mathematical statistics, convex optimization and applied algebraic geometry through my research program and my educational activities. My proposal consists of three long-term projects all at the interface of the three areas. The first project will develop methods for determining causal relationships between variables based on observational data, without the need for randomized controlled trials. Using the framework of directed Gaussian graphical models, I will develop methods that incorporate prior knowledge and allow feedback loops; this is particularly important for biological applications, where causal feedback in pathways is common. I will apply the results obtained to learn time-varying gene regulatory networks from time series gene expression data throughout the development of Drosophila. I will also analyze an emerging data set from the Genotype-Tissue Expression (GTEx) project with the aim of inferring tissue- and person-specific gene regulatory networks. This project could not be more timely, since the GTEx project is now in the scale-up phase of donor collection and tissue analysis. The second project is about maximum likelihood estimation in Gaussian models with linear constraints on the covariance matrix. Such models arise in many applications, including stationary stochastic processes from repeated time series data, phylogenetic tree reconstruction, and network tomography models used to analyze the structure of connections in the Internet. 
I will develop scalable methods for learning the structure in such models and apply the new methodology to infer phylogenetic trees based on intron lengths and to infer the path data takes through the Internet. The third project is about Bayesian model selection in Gaussian graphical models. The G-Wishart distribution is extremely attractive because it is the conjugate prior for this model; however, it has fallen short of its promise because its normalizing constant seemed intractable. In a recent collaboration I solved this 20-year-old problem and found a closed-form formula for the normalizing constant. We will turn this theoretical result into a practical procedure for computing these normalizing constants, develop methodology for Bayesian graphical model selection with thousands of nodes, and apply this new methodology to weather forecasting.
My research is centered around the study of graphical models in statistics and the development of practical algorithms that allow application of these models to scientifically important novel problems. This project was at the interface of mathematical statistics, convex optimization and applied algebraic geometry, three areas which have developed a strong interplay in recent years. My goals were to apply my expertise in graphical models to expand and deepen the connections between mathematical statistics, convex optimization and applied algebraic geometry through my research program and my educational activities. My proposal consisted of three long-term projects all at the interface of the three areas. The first project developed methods for determining causal relationships between variables based on observational data, without the need for randomized controlled trials. Using the framework of directed Gaussian graphical models, I developed methods that incorporate prior knowledge; this is particularly important for biological applications. I applied the results obtained to learn gene regulatory networks from expression data. In particular, I analyzed single-cell gene expression data from the newly developed drop-seq technology to learn cell-type specific gene regulatory networks. The second project was about maximum likelihood estimation in Gaussian models with linear constraints on the covariance matrix. Such models arise in many applications, including stationary stochastic processes from repeated time series data, phylogenetic tree reconstruction, and network tomography models used to analyze the structure of connections in the Internet. I developed scalable methods for learning the structure in such models and applied the new methodology to infer phylogenetic trees. The third project was about Bayesian model selection in Gaussian graphical models. 
The G-Wishart distribution is extremely attractive because it is the conjugate prior for this model; however, it had fallen short of its promise because its normalizing constant seemed intractable. We solved this 20-year-old problem and found a closed-form formula for the normalizing constant. We are currently applying these results to weather forecasting applications, where methodologies for Bayesian graphical model selection with thousands of nodes are needed.
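For readers unfamiliar with the G-Wishart distribution, the role of its normalizing constant can be made concrete with the standard definition. The notation below is an assumption following common convention in the literature and does not appear in this report: G is an undirected graph on p nodes, K is the precision (inverse covariance) matrix, P_G is the cone of positive definite matrices with zeros at the non-edges of G, δ is a shape parameter, and D is a positive definite scale matrix.

```latex
% Assumed notation (not from the report): G, K, P_G, \delta, D, S, n, p.
\begin{align*}
  % G-Wishart density on the cone P_G
  f_G(K \mid \delta, D)
    &= \frac{1}{I_G(\delta, D)}\,
       |K|^{(\delta-2)/2}
       \exp\!\Big(-\tfrac{1}{2}\operatorname{tr}(K D)\Big)\,
       \mathbf{1}\{K \in P_G\}, \\
  % Normalizing constant: the quantity long considered intractable for general G
  I_G(\delta, D)
    &= \int_{P_G} |K|^{(\delta-2)/2}
       \exp\!\Big(-\tfrac{1}{2}\operatorname{tr}(K D)\Big)\, dK, \\
  % Conjugacy: for n i.i.d. samples x_i \sim N(0, K^{-1}) with S = \sum_i x_i x_i^{\top},
  % the posterior is again G-Wishart with updated parameters
  K \mid x_1, \dots, x_n
    &\sim W_G(\delta + n,\; D + S), \\
  % Marginal likelihood of the graph G, used to compare models
  p(x_1, \dots, x_n \mid G)
    &= (2\pi)^{-np/2}\,
       \frac{I_G(\delta + n,\; D + S)}{I_G(\delta, D)} .
\end{align*}
```

Under these assumptions, comparing two graphs in Bayesian model selection reduces to comparing ratios of normalizing constants, which is why a closed-form expression for I_G matters for problems with many nodes.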
Research Output
- 291 Citations
- 9 Publications
- 2018: Mohammadi F. Generalized Permutohedra from Probabilistic Graphical Models. SIAM Journal on Discrete Mathematics, pp. 64-93. DOI: 10.1137/16m107894x. Journal article.
- 2016: Michalek M. Exponential varieties. Proceedings of the London Mathematical Society, pp. 27-56. DOI: 10.1112/plms/pdv066. Journal article.
- 2016: Solus L. Extremal positive semidefinite matrices whose sparsity pattern is given by graphs without K5 minors. Linear Algebra and its Applications, pp. 247-275. DOI: 10.1016/j.laa.2016.07.026. Journal article.
- 2016: Uhler C. Geometric control and modeling of genome reprogramming. BioArchitecture, pp. 1-9. DOI: 10.1080/19490992.2016.1201620. Journal article.
- 2017: Fallat S. Total positivity in Markov structures. The Annals of Statistics, pp. 1152-1184. DOI: 10.1214/16-aos1478. Journal article.
- 2017: Wang Y. Orientation and repositioning of chromosomes correlate with cell geometry–dependent gene expression. Molecular Biology of the Cell, pp. 1997-2009. DOI: 10.1091/mbc.e16-12-0825. Journal article.
- 2018: Uhler C. Exact formulas for the normalizing constants of Wishart distributions for graphical models. The Annals of Statistics, pp. 90-118. DOI: 10.1214/17-aos1543. Journal article.
- n.d.: Solus L. Consistency guarantees for Permutation-based causal inference algorithms. Other.
- 2016: Zwiernik P. Maximum Likelihood Estimation for Linear Gaussian Covariance Models. Journal of the Royal Statistical Society Series B: Statistical Methodology, pp. 1269-1292. DOI: 10.1111/rssb.12217. Journal article.