16 results for Annotation protéique
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
The identification and annotation of protein-coding genes is one of the primary goals of whole-genome sequencing projects, and the accuracy of predicting the primary protein products of gene expression is vital to the interpretation of the available data and the design of downstream functional applications. Nevertheless, the comprehensive annotation of eukaryotic genomes remains a considerable challenge. Many genomes submitted to public databases, including those of major model organisms, contain significant numbers of wrong and incomplete gene predictions. We present a community-based reannotation of the Aspergillus nidulans genome with the primary goal of increasing the number and quality of protein functional assignments through the careful review of experts in the field of fungal biology. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
Background: Cutaneous mycoses are common human infections among healthy and immunocompromised hosts, and the anthropophilic fungus Trichophyton rubrum is the most prevalent microorganism isolated from such clinical cases worldwide. The aim of this study was to determine the transcriptional profile of T. rubrum exposed to various stimuli in order to obtain insights into the responses of this pathogen to different environmental challenges. Therefore, we generated an expressed sequence tag (EST) collection by constructing one cDNA library and nine suppression subtractive hybridization libraries. Results: The 1388 unigenes identified in this study were functionally classified based on the Munich Information Center for Protein Sequences (MIPS) categories. The identified proteins were involved in transcriptional regulation, cellular defense and stress, protein degradation, signaling, transport, and secretion, among other functions. Analysis of these unigenes revealed 575 T. rubrum sequences that had not been previously deposited in public databases. Conclusion: In this study, we identified novel T. rubrum genes that will be useful for ORF prediction in genome sequencing and facilitating functional genome analysis. Annotation of these expressed genes revealed metabolic adaptations of T. rubrum to carbon sources, ambient pH shifts, and various antifungal drugs used in medical practice. Furthermore, challenging T. rubrum with cytotoxic drugs and ambient pH shifts extended our understanding of the molecular events possibly involved in the infectious process and resistance to antifungal drugs.
Abstract:
Cuticle renewal is a complex biological process that depends on the cross talk between hormone levels and gene expression. This study characterized the expression of two genes encoding cuticle proteins sharing the four conserved amino acid blocks of the Tweedle family, AmelTwdl1 and AmelTwdl2, and a gene encoding a cuticle peroxidase containing the Animal haem peroxidase domain, Ampxd, in the honey bee. Gene sequencing and annotation validated the formerly predicted tweedle genes, and revealed a novel gene, Ampxd, in the honey bee genome. Expression of these genes was studied in the context of the ecdysteroid-coordinated pupal-to-adult molt, and in different tissues. Higher transcript levels were detected in the integument after the ecdysteroid peak that induces apolysis, coinciding with the synthesis and deposition of the adult exoskeleton and its early differentiation. The effect of this hormone was confirmed in vivo by tying a ligature between the thorax and abdomen of early pupae to prevent the abdominal integument from coming in contact with ecdysteroids released from the prothoracic gland. This procedure impaired the natural increase in transcript levels in the abdominal integument. Both tweedle genes were expressed at higher levels in the empty gut than in the thoracic integument and trachea of pharate adults. In contrast, Ampxd transcripts were found in higher levels in the thoracic integument and trachea than in the gut. Together, the data strongly suggest that these three genes play roles in ecdysteroid-dependent exoskeleton construction and differentiation and also point to a possible role for the two tweedle genes in the formation of the cuticle (peritrophic membrane) that internally lines the gut.
Abstract:
Background: Melanoma progression occurs through three major stages: radial growth phase (RGP), confined to the epidermis; vertical growth phase (VGP), when the tumor has invaded into the dermis; and metastasis. In this work, we used suppression subtractive hybridization (SSH) to investigate the molecular signature of melanoma progression, by comparing a group of metastatic cell lines with an RGP-like cell line showing characteristics of early neoplastic lesions including expression of the metastasis suppressor KISS1, lack of alpha v beta 3-integrin and low levels of RHOC. Methods: Two subtracted cDNA collections were obtained, one (RGP library) by subtracting the RGP cell line (WM1552C) cDNA from a cDNA pool from four metastatic cell lines (WM9, WM852, 1205Lu and WM1617), and the other (Met library) by the reverse subtraction. Clones were sequenced and annotated, and expression validation was done by Northern blot and RT-PCR. Gene Ontology annotation and searches in large-scale melanoma expression studies were done for the genes identified. Results: We identified 367 clones from the RGP library and 386 from the Met library, of which 351 and 368, respectively, match human mRNA sequences, representing 288 and 217 annotated genes. We confirmed the differential expression of all genes selected for validation. In the Met library, we found an enrichment of genes in the growth factors/receptor, adhesion and motility categories whereas in the RGP library, enriched categories were nucleotide biosynthesis, DNA packing/repair, and macromolecular/vesicular trafficking. Interestingly, 19% of the genes from the RGP library map to chromosome 1 against 4% of the ones from Met library. Conclusion: This study identifies two populations of genes differentially expressed between melanoma cell lines from two tumor stages and suggests that these sets of genes represent profiles of less aggressive versus metastatic melanomas. 
A search for expression profiles of melanoma in available expression study databases allowed us to point to a great potential of involvement in tumor progression for several of the genes identified here. A few sequences obtained here may also contribute to extend annotated mRNAs or to the identification of novel transcripts.
Abstract:
The molecular pathogenesis of myelodysplastic syndromes (MDS) is poorly understood. To expand our knowledge of genetic defects in MDS, we determined the overall profile of genes expressed in bone marrow from patients with refractory anemia with excess blasts (RAEB) by serial analysis of gene expression (SAGE). The present report describes a partial transcriptome of RAEB bone marrow derived from 56,694 sequenced tags that provides information about expressed gene products. This is the first attempt to determine an overall profile of gene expression specifically in RAEB at diagnosis using SAGE, which should be useful in the understanding of the physiopathology of MDS and in identifying the genes involved.
Abstract:
Mycoplasma suis, the causative agent of porcine infectious anemia, has never been cultured in vitro, and the mechanisms by which it causes disease are poorly understood. Thus, the objective herein was to use whole-genome sequencing and analysis of M. suis to define pathogenicity mechanisms and biochemical pathways. M. suis was harvested from the blood of an experimentally infected pig. Following DNA extraction and construction of a paired-end library, whole-genome sequencing was performed using GS-FLX (454) and Titanium chemistry. Reads on paired-end constructs were assembled using GS De Novo Assembler and gaps were closed by primer walking; the assembly was validated by PFGE. Glimmer and Manatee Annotation Engine were used to predict and annotate protein-coding sequences (CDS). The M. suis genome consists of a single, 742,431 bp chromosome with a low G+C content of 31.1%. A total of 844 CDS, 3 single-copy, unlinked rRNA genes and 32 tRNAs were identified. Gene homologies and the GC skew graph show that M. suis has a typical Mollicutes oriC. The predicted metabolic pathway is concise, showing evidence of adaptation to the blood environment. M. suis is a glycolytic species, obtaining energy through sugar fermentation and ATP synthase. The pentose-phosphate pathway, metabolism of cofactors and vitamins, pyruvate dehydrogenase and NAD(+) kinase are missing. Thus, ribose, NADH, NADPH and coenzyme A are possibly essential for its growth. M. suis can generate purines from hypoxanthine, which is secreted by RBCs, and cytidine nucleotides from uracil. Toxin orthologs were not identified. We suggest that M. suis may cause disease by scavenging and competing for host nutrients, leading to decreased life-span of RBCs. In summary, genome analysis shows that M. suis is dependent on host cell metabolism and this characteristic is likely to be linked to its pathogenicity. The prediction of essential nutrients will aid the development of in vitro cultivation systems.
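The G+C content and GC skew mentioned in this abstract are simple sequence statistics. The following is a minimal sketch of how they are commonly computed, not the authors' pipeline; the function names, the non-overlapping window scheme and the toy sequence are assumptions made for illustration only.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_skew(seq, window):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows.

    Windows without any G or C get a skew of 0.0. A trailing partial
    window is ignored (an illustrative simplification).
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Running sum of the per-window skews, as plotted in a cumulative skew graph."""
    total, out = 0.0, []
    for value in values:
        total += value
        out.append(total)
    return out


# Toy sequence, purely hypothetical (not M. suis data).
toy = "GGGCCCCG"
print(gc_content(toy))
print(cumulative(gc_skew(toy, window=4)))
```

In many bacterial genomes the point where the skew changes sign is used as a hint for the location of the replication origin, which is why a skew graph is often inspected alongside gene homologies.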
Abstract:
Efficient automatic protein classification is of central importance in genomic annotation. As an independent way to check the reliability of the classification, we propose a statistical approach to test if two sets of protein domain sequences coming from two families of the Pfam database are significantly different. We model protein sequences as realizations of Variable Length Markov Chains (VLMC) and we use the context trees as a signature of each protein family. Our approach is based on a Kolmogorov-Smirnov-type goodness-of-fit test proposed by Balding et al. [Limit theorems for sequences of random trees (2008), DOI: 10.1007/s11749-008-0092-z]. The test statistic is a supremum over the space of trees of a function of the two samples; its computation grows, in principle, exponentially fast with the maximal number of nodes of the potential trees. We show how to transform this problem into a max-flow over a related graph which can be solved using a Ford-Fulkerson algorithm in time polynomial in that number. We apply the test to 10 randomly chosen protein domain families from the seed of the Pfam-A database (high quality, manually curated families). The test shows that the distributions of context trees coming from different families are significantly different. We emphasize that this is a novel mathematical approach to validate the automatic clustering of sequences in any context. We also study the performance of the test via simulations on Galton-Watson related processes.
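The abstract does not describe how the graph is built from the context trees, so the sketch below only illustrates the generic max-flow step. It uses the Edmonds-Karp variant of Ford-Fulkerson (augmenting paths chosen by breadth-first search), which is one standard way to obtain a polynomial running time; the function name, the dictionary-based graph encoding and the toy network are assumptions, not the authors' construction.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum flow via Edmonds-Karp (Ford-Fulkerson with BFS augmenting paths).

    `capacity` maps each node to a dict {neighbour: edge capacity}.
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    # Residual graph: forward capacities plus zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow along the path and update reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical toy network, unrelated to the Pfam data.
toy_network = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
print(max_flow(toy_network, "s", "t"))
```

Choosing shortest augmenting paths bounds the number of augmentations independently of the capacity values, which is what makes the running time polynomial in the size of the graph.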
Abstract:
Background: NADPH-cytochrome-P450 oxidoreductase (CPR) is a ubiquitous enzyme that belongs to a family of diflavin oxidoreductases and is required for activity of the microsomal cytochrome-P450 monooxygenase system. CPR gene-disruption experiments have demonstrated that absence of this enzyme causes developmental defects in both mouse and insect. Results: Annotation of the sequenced genome of D. discoideum revealed the presence of three genes (redA, redB and redC) that encode putative members of the diflavin oxidoreductase protein family. redA transcripts are present during growth and early development but then decline, reaching undetectable levels after the mound stage. redB transcripts are present at the same levels during growth and development, while redC expression was detected only in vegetatively growing cells. We isolated a mutant strain of Dictyostelium discoideum following restriction enzyme-mediated integration (REMI) mutagenesis in which redA was disrupted. This mutant develops only to the mound stage and accumulates a bright yellow pigment. The mound-arrest phenotype is cell-autonomous, suggesting that the defect occurs within the cells rather than in intercellular signaling. Conclusion: The developmental arrest due to disruption of redA implicates CPR in the metabolism of compounds that control cell differentiation.
Abstract:
Sequencing technologies and new bioinformatics tools have led to the complete sequencing of various genomes. However, information regarding the human transcriptome and its annotation is yet to be completed. The Human Cancer Genome Project, using ORESTES (open reading frame EST sequences) methodology, contributed to this objective by generating data from about 1.2 million expressed sequence tags. Approximately 30% of these sequences did not align to ESTs in the public databases and were considered no-match ORESTES. On the basis that a set of these ESTs could represent new transcripts, we constructed a cDNA microarray. This platform was used to hybridize against 12 different normal or tumor tissues. We identified 3421 transcribed regions not associated with annotated transcripts, representing 83.3% of the platform. The total number of differentially expressed sequences was 1007. Also, 28% of the analyzed sequences could represent noncoding RNAs. Our data reinforce the knowledge of the human genome being pervasively transcribed, and point out molecular marker candidates for different cancers. To reinforce our data, we confirmed, by real-time PCR, the differential expression of three out of eight potential tumor markers in prostate tissues. Lists of the 1007 differentially expressed sequences and the 291 potentially noncoding tumor markers were provided.
Abstract:
Study Design: Data mining of single nucleotide polymorphisms (SNPs) in gene pathways related to spinal cord injury (SCI). Objectives: To identify gene polymorphisms putatively implicated in neuronal damage evolution pathways, potentially useful to SCI study. Setting: Departments of Psychiatry and Orthopedics, Faculdade de Medicina, Universidade de Sao Paulo, Brazil. Methods: Genes involved with processes related to SCI, such as apoptosis, inflammatory response, axonogenesis, peripheral nervous system development and axon ensheathment, were determined by evaluating the 'Biological Process' annotation of Gene Ontology (GO). Each gene of these pathways was mapped using MapViewer, and gene coordinates were used to identify their polymorphisms in the SNP database. As a proof of concept, the frequency of a subset of SNPs, located in four genes (ALOX12, APOE, BDNF and NINJ1), was evaluated in the DNA of a group of 28 SCI patients and 38 individuals with no spinal cord lesions. Results: We could identify a total of 95 276 SNPs in a set of 588 genes associated with the selected GO terms, including 3912 nucleotide alterations located in coding regions of genes. The five non-synonymous SNPs genotyped in our small group of patients showed a significant frequency, reinforcing their potential use for the investigation of SCI evolution. Conclusion: Despite the importance of SNPs in many aspects of gene expression and protein activity, these gene alterations have not been explored in SCI research. Here we describe a set of potentially useful SNPs, some of which could underlie the genetic mechanisms involved in post-trauma spinal cord damage.
Abstract:
In highly eusocial insects, such as the honey bee, Apis mellifera, the reproductive bias has become embedded in morphological caste differences. These are most expressively denoted in ovary size, with adult queens having large ovaries consisting of 150-200 ovarioles each, while workers typically have only 1-20 ovarioles per ovary. This morphological differentiation is a result of hormonal signals triggered by the diet change in the third larval instar, which eventually generate caste-specific gene expression patterns. To reveal these we produced differential gene expression libraries by Representational Difference Analysis (RDA) for queen and worker ovaries in a developmental stage when cell death is a prominent feature in the ovarioles of workers, whereas all ovarioles are maintained and extend in length in queens. In the queen library, 48% of the gene set represented homologs of known Drosophila genes, whereas in the worker ovary, the largest set (59%) were ESTs evidencing novel genes, not even computationally predicted in the honey bee genome. Differential expression was confirmed by quantitative RT-PCR for a selected gene set, denoting major differences for two queen and two worker library genes. These included two unpredicted genes located in chromosome 11 (Group11.35 and Group11.31, respectively) possibly representing long non-coding RNAs. Being candidates as modulators of ovary development, their expression and functional analysis should be a focal point for future studies. (C) 2011 Elsevier Ltd. All rights reserved.
Abstract:
The cellular and molecular characteristics of a cell line (BME26) derived from embryos of the cattle tick Rhipicephalus (Boophilus) microplus were studied. The cells contained glycogen inclusions, numerous mitochondria, and vesicles with heterogeneous electron densities dispersed throughout the cytoplasm. Vesicles contained lipids and sequestered palladium meso-porphyrin (Pd-mP) and rhodamine-hemoglobin, suggesting their involvement in the autophagic and endocytic pathways. The cells phagocytosed yeast and expressed genes encoding the antimicrobial peptides microplusin and defensin. A cDNA library was made and 898 unique mRNA sequences were obtained. Among them, 556 sequences were not significantly similar to any sequence found in public databases. Annotation using Gene Ontology revealed transcripts related to several different functional classes. We identified transcripts involved in immune response such as ferritin, serine proteases, protease inhibitors, antimicrobial peptides, heat shock protein, glutathione S-transferase, peroxidase, and NADPH oxidase. BME26 cells transfected with a plasmid carrying a red fluorescent protein reporter gene (DsRed2) transiently expressed DsRed2 for up to 5 weeks. We conclude that BME26 can be used to experimentally analyze diverse biological processes that occur in R. (B.) microplus such as the innate immune response to tick-borne pathogens. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
While watching TV, viewers use the remote control to turn the TV set on and off, change channel and volume, adjust the image and audio settings, etc. Worldwide, research institutes collect information about audience measurement, which can also be used to provide personalization and recommendation services, among others. Interactive digital TV offers viewers the opportunity to interact with interactive applications associated with the broadcast program. Interactive TV infrastructure supports the capture of the user-TV interaction at fine-grained levels. In this paper we propose the capture of all the user interaction with a TV remote control, including short-term and instant interactions: we argue that the corresponding captured information can be used to create content pervasively and automatically, and that this content can be used by a wide variety of services, such as audience measurement, personalization and recommendation services. The capture of fine-grained data about instant and interval-based interactions also allows the underlying infrastructure to offer services at the same scale, such as annotation services and adaptive applications. We present the main modules of an infrastructure for TV-based services, along with a detailed example of a document used to record the user-remote control interaction. Our approach is evaluated by means of a proof-of-concept prototype which uses the Brazilian Digital TV System, the Ginga-NCL middleware.
Abstract:
The literature reports research efforts allowing the editing of interactive TV multimedia documents by end-users. In this article we propose complementary contributions relative to end-user generated interactive video, video tagging, and collaboration. In earlier work we proposed the watch-and-comment (WaC) paradigm as the seamless capture of an individual's comments so that corresponding annotated interactive videos can be automatically generated. As a proof of concept, we implemented a prototype application, the WACTOOL, that supports the capture of digital ink and voice comments over individual frames and segments of the video, producing a declarative document that specifies both the structure and the synchronization of the different media streams. In this article, we extend the WaC paradigm in two ways. First, user-video interactions are associated with edit commands and digital ink operations. Second, focusing on collaboration and distribution issues, we employ annotations as simple containers for context information by using them as tags in order to organize, store and distribute information in a P2P-based multimedia capture platform. We highlight the design principles of the watch-and-comment paradigm, and demonstrate related results including the current version of the WACTOOL and its architecture. We also illustrate how an interactive video produced by the WACTOOL can be rendered in an interactive video environment, the Ginga-NCL player, and include results from a preliminary evaluation.
Abstract:
Laryngeal squamous cell carcinoma is very common in head and neck cancer, with high mortality rates and poor prognosis. In this study, we compared expression profiles of clinical samples from 13 larynx tumors and 10 non-neoplastic larynx tissues using a custom-built cDNA microarray containing 331 probes for 284 genes previously identified by informatics analysis of EST databases as markers of head and neck tumors. Thirty-five genes showed statistically significant differences (SNR >= 11.01, p <= 0.001) in expression between tumor and non-tumor larynx tissue samples. Functional annotation indicated that these genes are involved in cellular processes relevant to the cancer phenotype, such as apoptosis, cell cycle, DNA repair, proteolysis, protease inhibition, signal transduction and transcriptional regulation. Six of the identified transcripts map to intronic regions of protein-coding genes and may comprise non-annotated exons or as yet uncharacterized long ncRNAs with a regulatory role in the gene expression program of larynx tissue. The differential expression of 10 of these genes (ADCY6, AES, AL2SCR3, CRR9, CSTB, DUSP1, MAP3K5, PLAT, UBL1 and ZNF706) was independently confirmed by quantitative real-time RT-PCR. Among these, the CSTB gene product has cysteine protease inhibitor activity that has been associated with an antimetastatic function. Interestingly, CSTB showed low expression in the tumor samples analyzed (p<0.0001). The set of genes identified here contributes to a better understanding of the molecular basis of larynx cancer, and provides candidate markers for improving diagnosis, prognosis and treatment of this carcinoma.