Special Seminar
Thierry Chekouo, PhD
Postdoctoral Fellow, Department of Biostatistics, The University of Texas MD Anderson Cancer Center
A Bayesian approach for the integrative analysis of omics data: A kidney cancer case study
ALL ARE WELCOME
Abstract:
Integration of genomic data from multiple platforms can increase precision, accuracy, and statistical power in the identification of prognostic biomarkers. A fundamental problem in many multi-platform studies is unbalanced sample sizes, which arise because measurements cannot be obtained from every platform for every patient in the study. We have developed a novel Bayesian approach that integrates multiple regression models to identify a small set of biomarkers that can accurately predict time-to-event outcomes. This method fully exploits the information available across platforms and does not exclude any subjects from the analysis. Moreover, interactions between platforms can be incorporated through prior distributions to inform biomarker selection and improve its accuracy. Through simulations, we demonstrate the utility of our method and compare its performance to that of methods that do not borrow information across regression models. Applied to The Cancer Genome Atlas kidney renal cell carcinoma dataset, which motivated this work, our methodology provides novel insights missed by non-integrative models.
Bio:
Dr. Thierry Chekouo is currently a postdoctoral fellow at The University of Texas MD Anderson Cancer Center under the mentorship of Dr. Kim-Anh Do and Dr. Francesco Stingo. He obtained a PhD in Statistics in 2012 from the University of Montreal under the supervision of Dr. Alejandro Murua. He also obtained a Master's degree in Mathematics in 2004 from the University of Yaoundé I (Cameroon) and a Master's degree in Statistics and Economics in 2007 from the Upper National School of Statistics and Applied Economics in Abidjan, Côte d'Ivoire. His research interests are in developing new statistical frameworks for analyzing datasets characterized by high dimensionality and complex structures, such as high-throughput genomic, proteomic, and imaging data.