In the first part of the talk I present a novel constructive approach called HdBCS to generate large-scale undirected Gaussian graphical models based on a sparse representation of the joint distribution of covariates via sets of linear regressions. I discuss the validity of my stochastic search algorithm and show how to estimate various dependence measures (e.g., Kendall's tau, Spearman's rho) while taking model uncertainty into account. I briefly introduce GraphExplore, a Java application for presenting, visualizing and interrogating large, complex networks of interactions. I illustrate the use of HdBCS and GraphExplore to mine efficiently across multiple microarray data sets.

Next I develop a comprehensive framework for combining ordered categorical and continuous covariates into parsimonious predictive models for categorical variables with two or more levels. I emphasize the importance of parallel computing in exploring huge model spaces with thousands of covariates. The final example of the talk involves the identification of multivariate patterns of association among gene expression profiles, SNPs and clinical data that are predictive of atherosclerosis burden in human target tissues.