Identification of unannotated exons of low abundance transcripts in Drosophila melanogaster and cloning of a new serine protease gene upregulated upon injury


Author(s): Maia, Rafaela M; Valente, Valeria; Cunha, Marco AV; Sousa, Josane F; Araujo, Daniela D; Silva, Wilson A; Zago, Marco A; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew JG; Monesi, Nadia; Ramos, Ricardo GP; Espreafico, Enilza M; Paçó-Larson, Maria L
Contributor(s)

UNIVERSIDADE DE SÃO PAULO

Date(s)

26/08/2013

26/08/2013

01/07/2007

Abstract

Background: The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~14,000), indicating that mechanisms that generate transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that one third of these are estimated to be missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being pursued using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome.

Results: Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genome-scale microarray analyses. In agreement with the proposal that this locus is co-regulated in response to microbial infection, we show here that SP212 is also up-regulated upon injury.

Conclusion: Using the ORESTES methodology, we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.
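
The abstract does not describe how clusters were assigned to unannotated regions. As a rough illustration only, the idea of flagging cluster alignments that fall outside every annotated exon can be sketched as an interval-overlap check. The function names, the coordinate layout and the toy data below are assumptions made for this sketch and are not taken from the article.

```python
from typing import Dict, List, Tuple

# Inclusive genomic interval (start, end); a hypothetical representation.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def unannotated_clusters(
    alignments: Dict[str, Tuple[str, int, int]],
    exons: Dict[str, List[Interval]],
) -> List[str]:
    """List cluster IDs whose alignment overlaps no annotated exon.

    alignments maps cluster ID -> (chromosome arm, start, end).
    exons maps chromosome arm -> list of annotated exon intervals.
    """
    novel = []
    for cluster_id, (chrom, start, end) in alignments.items():
        annotated = exons.get(chrom, [])
        if not any(overlaps((start, end), exon) for exon in annotated):
            novel.append(cluster_id)
    return sorted(novel)


# Toy, made-up coordinates (not real Drosophila annotation).
exons = {"3R": [(1000, 1500), (2000, 2600)]}
alignments = {
    "cluster_A": ("3R", 1100, 1400),  # inside an annotated exon
    "cluster_B": ("3R", 1600, 1900),  # between annotated exons
    "cluster_C": ("3R", 2550, 2700),  # partially overlaps an exon
    "cluster_D": ("X", 50, 300),      # arm with no annotation in this toy set
}

novel = unannotated_clusters(alignments, exons)
print(novel)
```

In this sketch a partial overlap counts as annotated; a real pipeline would also need to handle strand, spliced alignments and alignment quality, none of which are modelled here.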

We thank Fernanda S. Zanola, Adriana A. Marques, Cristiane Ayres Ferreira, Alexandre C. de Oliveira, Benedita O. de Souza and Cirlei A.V. Saraiva for their dedicated technical assistance, Valdir Mazzucatto for the maintenance of the Drosophila room and Juçara Parra for administrative assistance. The FAPESP/Ludwig Institute for Cancer Research Consortium, as well as a FAPESP grant (MLPL) and stipends from FAEPA-FMRP, supported the work. Sequencing and bioinformatics analysis were carried out at the Centro de Terapia Celular, supported by FAPESP. RMM received a fellowship from CAPES. VV and JFS were supported by FAPESP fellowships.

Identifier

BMC Genomics. 2007 Jul 24;8(1):249

1471-2164

http://www.producao.usp.br/handle/BDPI/32790

10.1186/1471-2164-8-249

http://www.biomedcentral.com/1471-2164/8/249

Language(s)

eng

Relation

BMC Genomics

Rights

openAccess

Maia et al; licensee BioMed Central Ltd. - This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Type

article

original article