Integrating multi-platform genomic data using hierarchical Bayesian relevance vector machines


We present the hierarchical relevance vector machine (H-RVM), a statistical framework for improved prediction of scalar outcomes from interacting high-dimensional covariates drawn from different sources. We illustrate the methodology by integrating genomic data from multiple platforms to predict observed clinical phenotypes. H-RVM is a hierarchical Bayesian generalization of the relevance vector machine, and its learning algorithm is a special case of the computationally efficient variational method of the hierarchical kernel learning framework. We apply H-RVM to data from The Cancer Genome Atlas (TCGA) glioblastoma study to predict imaging-based tumor volume by integrating gene and miRNA expression data, and show that H-RVM outperforms competing methods in prediction.

EURASIP J Bioinform Syst Biol