Bayesian Integrative models

Statistical Tests for Large Tree-Structured Data

We develop a general statistical framework for the analysis and inference of large tree-structured data, with a focus on developing asymptotic goodness-of-fit tests. We first propose a consistent statistical model for binary trees, from which we …

Prediction-Oriented Marker Selection (PROMISE): With Application to High-Dimensional Regression

In personalized medicine, biomarkers are used to select therapies with the highest likelihood of success based on an individual patient’s biomarker/genomic profile. Two goals are to choose important biomarkers that accurately predict treatment …

Sparse Multi-Dimensional Graphical Models: A Unified Bayesian Framework

Multi-dimensional data constituted by measurements along multiple axes have emerged across many scientific areas such as genomics and cancer surveillance. A common objective is to investigate the conditional dependencies among the variables along …

PCAN: Probabilistic correlation analysis of two non-normal data sets

Most cancer research now involves one or more assays profiling various biological molecules, e.g., messenger RNA and microRNA, in samples collected on the same individuals. The main interest with these genomic data sets lies in the identification of …

A Semiparametric Bayesian Model for Comparing DNA Copy Numbers

We propose a two-step method for the analysis of copy number data. We first define the partitions of genome aberrations and, conditional on the partitions, we introduce a semiparametric Bayesian model for the analysis of multiple samples from patients …

DEMARCATE: Density-based magnetic resonance image clustering for assessing tumor heterogeneity in cancer

Tumor heterogeneity is a crucial area of cancer research wherein inter- and intra-tumor differences are investigated to assess and monitor disease development and progression. The proliferation of imaging and linked genomic data …

Integrative Bayesian analysis of neuroimaging-genetic data with application to cocaine dependence

Neuroimaging and genetic studies provide distinct and complementary information about the structural and biological aspects of a disease. Integrating the two sources of data facilitates the investigation of the links between genetic variability and …

Latent feature decompositions for integrative analysis of diverse high-throughput genomic data

A general method for regressing a continuous response upon large groups of diverse genetic covariates via dimension reduction is developed and exemplified. It is shown that allowing latent features derived from different covariate groups to interact …

Latent feature decompositions for integrative analysis of multi-platform genomic data

Increased availability of multi-platform genomics data on matched samples has sparked research efforts to discover how diverse molecular features interact both within and between platforms. In addition, simultaneous measurements of genetic and …

Bayesian Variable Selection in Linear Regression in One Pass for Large Datasets

Bayesian models are generally fitted with Markov chain Monte Carlo (MCMC) methods. The main disadvantage of MCMC methods is the large number of iterations they need to sample the posterior distributions of model parameters, especially for large …