EGASP: the human ENCODE Genome Annotation Assessment Project
Contributor(s) |
Universitat Pompeu Fabra |
---|---|
Date(s) |
02/07/2013 |
Abstract |
Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence. |
Identifier | |
Language(s) |
eng |
Publisher |
BioMed Central |
Rights |
info:eu-repo/semantics/openAccess © 2006 BioMed Central Ltd. The electronic version of this article is the complete one and can be found online at <a href="http://genomebiology.com/2006/7/S1/S2">http://genomebiology.com/2006/7/S1/S2</a> |
Keywords | #Bioinformatics #Genomes #Molecular biology #ENCODE GASP #Alternative Splicing #Animals #Computational Biology #Genetic Databases #Genes #Human Genome #Genomics #Humans #Mice #RNA Sequence Analysis #DNA |
Type |
info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion |