A score system for quality evaluation of RNA sequence tags: an improvement for gene expression profiling


Author(s): PINHEIRO, Daniel G; GALANTE, Pedro AF; SOUZA, Sandro J de; ZAGO, Marco A; SILVA JR, Wilson A
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

18/04/2012

2009

Abstract

Background: High-throughput molecular approaches for gene expression profiling, such as Serial Analysis of Gene Expression (SAGE), Massively Parallel Signature Sequencing (MPSS) and Sequencing-by-Synthesis (SBS), are powerful techniques that provide global transcription profiles of different cell types by sequencing short fragments of transcripts, called sequence tags. These techniques have improved our understanding of the relationships between expression profiles and cellular phenotypes. Even so, more reliable datasets are still needed. In this work, we present a web-based tool named S3T (Score System for Sequence Tags) that indexes sequenced tags according to their reliability. This is done through a series of evaluations based on a defined rule set. S3T allows the identification and selection of the tags considered more reliable for further gene expression analysis.

Results: This methodology was applied to a public SAGE dataset. To compare the data before and after filtering, a hierarchical clustering analysis was performed on samples from the same tissue type under distinct biological conditions, using both the unfiltered and the filtered datasets. Our results provide evidence suggesting that more congruous clusters can be found after applying the S3T scoring system.

Conclusion: These results support the proposed application as a way to generate more reliable data. This is a significant contribution to the determination of global gene expression profiles. Library analysis with S3T is freely available at http://gdm.fmrp.usp.br/s3t/. S3T source code and datasets can also be downloaded from the same website.
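The abstract describes indexing tags by reliability through a rule set and then selecting the more reliable ones. The following is only a minimal sketch of that general idea, not the S3T implementation: the three rules, their equal weighting, the `Tag` type and the threshold are all hypothetical, since the record does not list the actual S3T rules.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Tag:
    """A sequence tag and its observed count in one library (hypothetical type)."""
    sequence: str
    count: int


# Hypothetical rules: each returns 1.0 when the tag passes and 0.0 otherwise.
# They are illustrative assumptions, not the rule set used by S3T.
def rule_min_count(tag: Tag) -> float:
    # Singleton tags are treated here as less reliable.
    return 1.0 if tag.count >= 2 else 0.0


def rule_valid_bases(tag: Tag) -> float:
    # Only unambiguous nucleotides are accepted.
    return 1.0 if set(tag.sequence) <= set("ACGT") else 0.0


def rule_not_homopolymer(tag: Tag) -> float:
    # Tags made of a single repeated base are penalized.
    return 0.0 if len(set(tag.sequence)) == 1 else 1.0


RULES: List[Callable[[Tag], float]] = [
    rule_min_count,
    rule_valid_bases,
    rule_not_homopolymer,
]


def score_tag(tag: Tag, rules: Sequence[Callable[[Tag], float]] = RULES) -> float:
    """Return the fraction of rules the tag satisfies (equal weights assumed)."""
    return sum(rule(tag) for rule in rules) / len(rules)


def filter_tags(tags: Sequence[Tag], threshold: float = 1.0) -> List[Tag]:
    """Keep only tags whose score reaches the (hypothetical) threshold."""
    return [tag for tag in tags if score_tag(tag) >= threshold]


if __name__ == "__main__":
    library = [
        Tag("ACGTACGTAC", 5),
        Tag("AAAAAAAAAA", 5),
        Tag("ACGTNCGTAC", 1),
    ]
    for tag in library:
        print(tag.sequence, round(score_tag(tag), 2))
    print([tag.sequence for tag in filter_tags(library)])
```

In a comparison like the one described in the Results, the filtered tag list would replace the full library as input to the hierarchical clustering step.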

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Center for Research on Cell-based Therapy (CEPID/FAPESP)

Ludwig Institute for Cancer Research

(NIH) Fogarty International Center, NIH [5D43TW007015-02]

Identifier

BMC BIOINFORMATICS, v. 10, 2009

1471-2105

http://producao.usp.br/handle/BDPI/15299

10.1186/1471-2105-10-170

http://dx.doi.org/10.1186/1471-2105-10-170

Language(s)

eng

Publisher

BIOMED CENTRAL LTD

Relation

BMC Bioinformatics

Rights

openAccess

Copyright BIOMED CENTRAL LTD

Keywords #SERIAL ANALYSIS #HUMAN GENOME #SAGE #IDENTIFICATION #TRANSCRIPTOME #LONGSAGE #DATABASE #Biochemical Research Methods #Biotechnology & Applied Microbiology #Mathematical & Computational Biology

Type

article

original article

publishedVersion