Bayesian Hidden Markov models with linear time decoding for the analysis of cancer genomes
Chris Holmes (University of Oxford)
Wednesday 17th April, 2013 16:00-17:00 Maths 203
Cancer genomes often exhibit large-scale structural variation in which stretches of DNA are duplicated or deleted relative to the inherited germline genome. These so-called copy number aberrations (CNAs) are known to be key drivers of tumour formation and progression. We have developed Bayesian hidden Markov models (HMMs) for detecting CNAs in data from high-throughput genotyping arrays and whole-genome sequencing platforms. We pay particular attention to the problem of computationally efficient genome segmentation, which addresses scientific questions of the type "find me the most probable deletion events in this genome" or "find me the most probable duplication events". To this end we have devised posterior decoding algorithms, linear-time in the length of the sequence, that retrieve the optimal segmentation for a fixed number of events (transitions) under an HMM. The methods are illustrated on real-world examples from studies of chronic lymphocytic leukemia and breast cancer.
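
To make the idea of decoding with a fixed number of transitions concrete, here is a minimal sketch of a textbook Viterbi recursion with an added transition counter. It is not the speaker's algorithm and makes no claim to match it. It only illustrates how the most probable state path with exactly k state changes can be found at a cost that grows linearly with sequence length. The function name `k_transition_viterbi`, the two-state toy model and all probabilities are assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def k_transition_viterbi(loglik, log_init, log_trans, k):
    """Most probable state path with exactly k state changes (illustrative).

    loglik[t][s]    : log p(observation t | state s)
    log_init[s]     : log p(first state = s)
    log_trans[r][s] : log p(next state = s | current state = r)
    Returns (best_log_prob, path); path is None if no such path exists.
    Cost is O(n * k * S^2), i.e. linear in the sequence length n.
    """
    n, S = len(loglik), len(log_init)
    # score[j][s]: best log prob of a path ending in state s after j changes
    score = [[NEG_INF] * S for _ in range(k + 1)]
    for s in range(S):
        score[0][s] = log_init[s] + loglik[0][s]
    back = []  # back[t - 1][j][s]: predecessor state of s at position t
    for t in range(1, n):
        new = [[NEG_INF] * S for _ in range(k + 1)]
        ptr = [[-1] * S for _ in range(k + 1)]
        for j in range(k + 1):
            for s in range(S):
                # Option 1: stay in state s (uses no transition).
                best, arg = score[j][s] + log_trans[s][s], s
                # Option 2: switch from some r != s (uses one transition).
                if j > 0:
                    for r in range(S):
                        if r != s:
                            cand = score[j - 1][r] + log_trans[r][s]
                            if cand > best:
                                best, arg = cand, r
                new[j][s] = best + loglik[t][s]
                ptr[j][s] = arg
        score = new
        back.append(ptr)
    # Choose the best final state among paths that used exactly k changes.
    s = max(range(S), key=lambda i: score[k][i])
    best = score[k][s]
    if best == NEG_INF:
        return best, None
    # Trace back, decrementing the change counter at each state switch.
    path, j = [s], k
    for ptr in reversed(back):
        r = ptr[j][s]
        if r != s:
            j -= 1
        s = r
        path.append(s)
    path.reverse()
    return best, path

# Toy example (assumed numbers): state 0 = normal copy number, 1 = deletion.
hi, lo = math.log(0.9), math.log(0.1)
favoured = [0, 0, 1, 1, 0, 0]  # state each observation favours
loglik = [[hi if s == f else lo for s in (0, 1)] for f in favoured]
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [[hi, lo], [lo, hi]]
log_prob, path = k_transition_viterbi(loglik, log_init, log_trans, k=2)
```

In this toy setting, asking for k=2 corresponds to a question such as "find the single most probable deletion segment", since one deletion has an entry and an exit transition. A posterior decoding approach in the Bayesian setting would also account for parameter uncertainty, which this sketch ignores.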