DEVELOPMENT OF NOVEL METHODS TO MINIMIZE THE IMPACT OF SEQUENCING ERRORS IN THE NEXT-GENERATION SEQUENCING DATA ANALYSIS


Author(s): Zheng, Xiaofeng
Date(s)

01/05/2013

Abstract

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping, and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, indicating that MTM’s performance does not depend on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data, respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM showed better performance than Quake and comparable results to Hammer. With the improved error correction provided by MTM, the quality of downstream analysis, such as mapping and SNP detection, was improved. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (

Format

application/pdf

Identifier

http://digitalcommons.library.tmc.edu/utgsbs_dissertations/338

http://digitalcommons.library.tmc.edu/cgi/viewcontent.cgi?article=1373&context=utgsbs_dissertations

Publisher

DigitalCommons@The Texas Medical Center

Source

UT GSBS Dissertations and Theses (Open Access)

Keywords #next-generation sequencing #sequencing error #error correction #SNP detection #Bioinformatics #Biostatistics
Type

text