Segmenting eukaryotic genomes with the generalized Gibbs sampler
Contributor(s) |
D. M. Waterman S. Istrail |
---|---|
Date(s) |
01/01/2006
|
Abstract |
Eukaryotic genomes display segmental patterns of variation in various properties, including GC content and degree of evolutionary conservation. DNA segmentation algorithms are aimed at identifying statistically significant boundaries between such segments. Such algorithms may provide a means of discovering new classes of functional elements in eukaryotic genomes. This paper presents a model and an algorithm for Bayesian DNA segmentation and considers the feasibility of using it to segment whole eukaryotic genomes. The algorithm is tested on a range of simulated and real DNA sequences, and the following conclusions are drawn. Firstly, the algorithm correctly identifies non-segmented sequence, and can thus be used to reject the null hypothesis of uniformity in the property of interest. Secondly, estimates of the number and locations of change-points produced by the algorithm are robust to variations in algorithm parameters and initial starting conditions and correspond to real features in the data. Thirdly, the algorithm is successfully used to segment human chromosome 1 according to GC content, thus demonstrating the feasibility of Bayesian segmentation of eukaryotic genomes. The software described in this paper is available from the author's website (www.uq.edu.au/~uqjkeith/) or upon request to the author. |
Identifier | |
Language(s) |
eng |
Publisher |
Mary Ann Liebert Inc |
Keywords | #Eukaryotic Genomes #Genome Segmentation #GC Content #Functional Non-coding RNA #Bayesian Modelling #Markov Chain Monte Carlo #Generalized Gibbs Sampler #Mathematics, Interdisciplinary Applications #Biochemical Research Methods #Biotechnology & Applied Microbiology #Computer Science, Interdisciplinary Applications #Statistics & Probability #DNA-sequence Segmentation #Isochore Chromosome Maps #Models #Complex #RNAs #01 Mathematical Sciences |
Type |
Journal Article |