Latent feature decompositions for integrative analysis of diverse high-throughput genomic data


A general method for regressing a continuous response on large groups of diverse genetic covariates via dimension reduction is developed and exemplified. It is shown that allowing latent features derived from different covariate groups to interact improves prediction when interactions exist among the original covariates. A means of selecting a subset of relevant covariates from the original set is proposed, and a simulation study is performed to demonstrate the effectiveness of the procedure for prediction and variable selection. The procedure is applied to a high-dimensional lung cancer data set to model the effects of gene expression, copy number variation, and methylation on a drug response.
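The general idea of regressing a response on interacting latent features can be illustrated with a minimal sketch. The abstract does not state which decomposition is used, so the sketch assumes principal components as the latent features and ordinary least squares as the regression. The function names (`latent_features`, `design_with_interactions`), the number of components `k`, and the simulated data are hypothetical and are not taken from the paper.

```python
import numpy as np

def latent_features(X, k):
    """Top-k principal component scores of a column-centered covariate block.
    (Assumption: PCA stands in for the paper's unspecified decomposition.)"""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def design_with_interactions(blocks, k):
    """Intercept, per-block latent features, and all cross-block pairwise products."""
    n = blocks[0].shape[0]
    Z = [latent_features(X, k) for X in blocks]
    cols = [np.ones((n, 1))] + Z
    for a in range(len(Z)):
        for b in range(a + 1, len(Z)):
            # Every latent feature of block a times every latent feature of block b.
            inter = (Z[a][:, :, None] * Z[b][:, None, :]).reshape(n, -1)
            cols.append(inter)
    return np.hstack(cols)

def rss(D, y):
    """Residual sum of squares of the least-squares fit of y on D."""
    beta, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.sum((y - D @ beta) ** 2))

# Simulated stand-ins for two covariate groups (hypothetical data).
rng = np.random.default_rng(0)
n, k = 200, 2
expr = rng.normal(size=(n, 50))   # e.g. gene expression
meth = rng.normal(size=(n, 40))   # e.g. methylation

# Response driven by an interaction between the leading latent features.
z_expr = latent_features(expr, k)
z_meth = latent_features(meth, k)
y = z_expr[:, 0] * z_meth[:, 0] + rng.normal(scale=0.5, size=n)

D_full = design_with_interactions([expr, meth], k)
D_main = D_full[:, : 1 + 2 * k]   # intercept and main effects only

rss_main = rss(D_main, y)
rss_full = rss(D_full, y)
print(f"RSS without interactions: {rss_main:.1f}")
print(f"RSS with interactions:    {rss_full:.1f}")
```

Because the main-effects design is nested inside the interaction design, the in-sample residual sum of squares cannot increase when the interaction columns are added. Whether the extra terms help out-of-sample prediction would need to be checked on held-out data, which this sketch does not do.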

IEEE/ACM Trans Comput Biol Bioinform