Functional data analyses

A Nonparametric Bayesian Technique for High-Dimensional Regression

This paper proposes a nonparametric Bayesian framework called VariScan for simultaneous clustering, variable selection, and prediction in high-throughput regression settings. Poisson-Dirichlet processes are utilized to detect lower-dimensional latent …

Bayesian hierarchical structured variable selection methods with application to MIP studies in breast cancer

The analysis of alterations that may occur in nature when segments of chromosomes are copied (known as copy number alterations) has been a focus of research to identify genetic markers of cancer. One high-throughput technique recently adopted is the …

Bayesian Joint Selection of Genes and Pathways: Applications in Multiple Myeloma Genomics. Cancer

It is well-established that the development of a disease, especially cancer, is a complex process that results from the joint effects of multiple genes involved in various molecular signaling pathways. In this article, we propose methods to discover …

Bayesian Variable Selection in Linear Regression in One Pass for Large Datasets

Bayesian models are generally computed with Markov Chain Monte Carlo (MCMC) methods. The main disadvantage of MCMC methods is the large number of iterations they need to sample the posterior distributions of model parameters, especially for large …

Bayesian disease classification using copy number data

DNA copy number variations (CNVs) have been shown to be associated with cancer development and progression. The detection of these CNVs has the potential to impact the basic knowledge and treatment of many types of cancers, and can play a role in the …

Bayesian ensemble methods for survival prediction in gene expression data

We propose a Bayesian ensemble method for survival prediction in high-dimensional gene expression data. We specify a fully Bayesian hierarchical approach based on an ensemble ‘sum-of-trees’ model and illustrate our method using three popular survival …

Estimating Shared Copy Number Aberrations for Array CGH Data: the Linear-Median Method

Existing methods for estimating copy number variations in array comparative genomic hybridization (aCGH) data are limited to estimations of the gain/loss of chromosome regions for single sample analysis. We propose the linear-median method for …

Bayesian Random Segmentation Models to Identify Shared Copy Number Aberrations for Array CGH Data

Array-based comparative genomic hybridization (aCGH) is a high-resolution, high-throughput technique for studying the genetic basis of cancer. The resulting data consist of log fluorescence ratios as a function of the genomic DNA location and …

Fast PCA and Bayesian Variable Selection for Large Data Sets Based on SQL and UDFs

Large amounts of data are stored in relational DBMSs. However, statistical analysis is frequently performed outside the DBMS using statistical tools, such as the well-known R package, leading to slow processing when data sets cannot fit in main …

Reduced rank mixed effects models for spatially correlated hierarchical functional data

Hierarchical functional data are widely seen in complex studies where sub-units are nested within units, which in turn are nested within treatment groups. We propose a general framework of functional mixed effects models for such data: within unit and …