894 results for SEQUENCE DATABASES
Abstract:
Lunasin is a peptide from soybean seeds which has been demonstrated to have anticancer properties. It has also been reported in cereal seeds: wheat, rye, barley and Triticale. However, extensive searches of transcriptome and DNA sequence databases for wheat and other cereals have failed to identify sequences encoding either the lunasin peptide or a precursor protein. This raises the question of the origin of the lunasin reported in cereal grain.
Abstract:
Background: MS-based proteomics was applied to the analysis of the medicinal plant Artemisia annua, exploiting a recently published contig sequence database (Graham et al. (2010) Science 327, 328–331) and other genomic and proteomic sequence databases for comparison. A. annua is the predominant natural source of artemisinin, the precursor for artemisinin-based combination therapies (ACTs), which are the WHO-recommended treatment for P. falciparum malaria. Results: The comparison of various databases containing A. annua sequences (NCBInr/viridiplantae, UniProt/viridiplantae, UniProt/A. annua, an A. annua trichome Trinity contig database, the above contig database and another A. annua EST database) revealed significant differences in their suitability for proteomic analysis, showing that an organism-specific database that has undergone extensive curation, leading to longer contig sequences, can greatly increase the number of true positive protein identifications while reducing the number of false positives. Compared to previously published data, an order of magnitude more proteins have been identified from trichome-enriched A. annua samples, including proteins known to be involved in the biosynthesis of artemisinin, as well as other highly abundant proteins, which suggest that additional enzymatic processes important for the biosynthesis of artemisinin occur within the trichomes. Conclusions: The newly gained information allows for the possibility of an enzymatic pathway, utilizing peroxidases, for the less well understood final stages of artemisinin’s biosynthesis, as an alternative to the known non-enzymatic in vitro conversion of dihydroartemisinic acid to artemisinin. Data are available via ProteomeXchange with identifier PXD000703.
Abstract:
This article contains raw and processed data related to research published by Bryant et al. [1]. Data was obtained by MS-based proteomics, analysing trichome-enriched, trichome-depleted and whole leaf samples taken from the medicinal plant Artemisia annua and searching the acquired MS/MS data against a recently published contig database [2] and other genomic and proteomic sequence databases for comparison. The processed data shows that an order-of-magnitude more proteins have been identified from trichome-enriched Artemisia annua samples in comparison to previously published data. Proteins known to have a role in the biosynthesis of artemisinin and other highly abundant proteins were found which imply additional enzymatically driven processes occurring within the trichomes that are significant for the biosynthesis of artemisinin.
Abstract:
Protein–ligand binding site prediction methods aim to predict, from amino acid sequence, protein–ligand interactions, putative ligands, and ligand binding site residues using either sequence information, structural information, or a combination of both. In silico characterization of protein–ligand interactions has become extremely important to help determine a protein’s functionality, as in vivo-based functional elucidation is unable to keep pace with the current growth of sequence databases. Additionally, in vitro biochemical functional elucidation is time-consuming, costly, and may not be feasible for large-scale analysis, such as drug discovery. Thus, in silico prediction of protein–ligand interactions must be utilized to aid in functional elucidation. Here, we briefly discuss protein function prediction, prediction of protein–ligand interactions, the Critical Assessment of Techniques for Protein Structure Prediction (CASP) and the Continuous Automated Model EvaluatiOn (CAMEO) competitions, along with their role in shaping the field. We also discuss, in detail, our cutting-edge web-server method, FunFOLD, for the structurally informed prediction of protein–ligand interactions. Furthermore, we provide a step-by-step guide on using the FunFOLD web server and FunFOLD3 downloadable application, along with some real-world examples where the FunFOLD methods have been used to aid functional elucidation.
Abstract:
Homology-driven proteomics is a major tool to characterize proteomes of organisms with unsequenced genomes. This paper addresses practical aspects of automated homology-driven protein identifications by LC-MS/MS on a hybrid LTQ Orbitrap mass spectrometer. All essential software elements supporting the presented pipeline are either hosted at a publicly accessible web server or are available for free download. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
The first reference map of the proteome of pooled normal dog tears was created using 2-dimensional polyacrylamide gel electrophoresis and the identity of a number of the major species determined using matrix-assisted laser desorption time of flight mass spectrometry (MALDI-TOF) and peptide mass fingerprint matching on protein sequence databases. In order to understand the changes in protein expression in the tear film of dogs with cancer, tears from such animals were similarly examined. A number of differences were found between the tears of healthy dogs and the dogs with cancer. Differences were found in levels of actin and albumin and in an unidentified protein which may be analogous to human lacryglobulin. These findings suggest that it may be possible to develop tear film analysis to provide a simple non-invasive test for the diagnosis and/or management of canine cancers. (C) 2007 Elsevier Ltd. All rights reserved.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The influenza virus has been a challenge to science due to its ability to withstand new environmental conditions. Given the growth of virus sequence databases, computational approaches can help to understand virus behavior over time. Furthermore, they can suggest new directions for dealing with influenza. This work presents triplet entropy analysis as a potential phylodynamic tool to quantify the nucleotide organization of viral sequences. Applying this measure to segments of hemagglutinin (HA) and neuraminidase (NA) of the H1N1 and H3N2 virus subtypes revealed variability over time, from which inferences about virus evolution can be drawn. Sequences were divided by year and compared by virus subtype (H1N1 and H3N2). The nonparametric Mann-Whitney test was used for comparison between groups. Results show that differentiation in entropy precedes differentiation in GC content for both groups. For the HA fragment, both triplet entropy and GC content show an intersection in 2009, the year of the recent pandemic. Some conclusions about possible flu evolutionary lines were drawn. © 2013 Elsevier B.V.
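As an illustrative sketch only (the abstract does not give the exact definition used), triplet entropy can be read as the Shannon entropy of the nucleotide-triplet frequency distribution of a sequence, computed alongside GC content. The function names, the choice between overlapping and non-overlapping triplets, and the handling of ambiguous bases are all assumptions here, not the authors' implementation. The per-year group comparison with the Mann-Whitney test is not shown.

```python
import math
from collections import Counter

VALID_BASES = set("ACGT")


def triplet_entropy(seq: str, overlapping: bool = True) -> float:
    """Shannon entropy (in bits) of the triplet distribution of a sequence.

    Assumption: triplets containing ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper()
    step = 1 if overlapping else 3
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - 2, step)]
    triplets = [t for t in triplets if set(t) <= VALID_BASES]
    if not triplets:
        return 0.0
    total = len(triplets)
    counts = Counter(triplets)
    # H = sum over observed triplets of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases that are G or C."""
    bases = [b for b in seq.upper() if b in VALID_BASES]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


if __name__ == "__main__":
    example = "ATGAAGGCAATACTAGTAGTTCTGCTATATACATTTGCA"  # hypothetical fragment
    print(triplet_entropy(example), gc_content(example))
```

Values computed per sequence in this way could then be grouped by year and subtype before any statistical comparison.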
Abstract:
Snake venom proteomes/peptidomes are highly complex and maintenance of their integrity within the gland lumen is crucial for the expression of toxin activities. There has been considerable progress in the field of venom proteomics, however, peptidomics does not progress as fast, because of the lack of comprehensive venom sequence databases for analysis of MS data. Therefore, in many cases venom peptides have to be sequenced manually by MS/MS analysis or Edman degradation. This is critical for rare snake species, as is the case of Bothrops cotiara (BC) and B. fonsecai (BF), which are regarded as near threatened with extinction. In this study we conducted a comprehensive analysis of the venom peptidomes of BC, BF, and B. jararaca (BJ) using a combination of solid-phase extraction and reversed-phase HPLC to fractionate the peptides, followed by nano-liquid chromatography-tandem MS (LC-MS/MS) or direct infusion electrospray ionization-(ESI)-MS/MS or MALDI-MS/MS analyses. We detected marked differences in the venom peptidomes and identified peptides ranging from 7 to 39 residues in length by de novo sequencing. Forty-four unique sequences were manually identified, out of which 30 are new peptides, including 17 bradykinin-potentiating peptides, three poly-histidine-poly-glycine peptides and interestingly, 10 L-amino acid oxidase fragments. Some of the new bradykinin-potentiating peptides display significant bradykinin potentiating activity. Automated database search revealed fragments from several toxins in the peptidomes, mainly from L-amino acid oxidase, and allowed the determination of the peptide bond specificity of proteinases and amino acid occurrences for the P4-P4' sites. We also demonstrate that the venom lyophilization/resolubilization process greatly increases the complexity of the peptidome because of the imbalance caused to the venom proteome and the consequent activity of proteinases on venom components. 
The use of proteinase inhibitors clearly showed different outcomes in the peptidome characterization and suggested that degradomic-peptidomic analysis of snake venoms is highly sensitive to the conditions of sampling procedures. Molecular & Cellular Proteomics 11: 10.1074/mcp.M112.019331, 1245-1262, 2012.
Abstract:
Cogo K, de Andrade A, Labate CA, Bergamaschi CC, Berto LA, Franco GCN, Goncalves RB, Groppo FC. Proteomic analysis of Porphyromonas gingivalis exposed to nicotine and cotinine. J Periodont Res 2012; 47: 766–775. (c) 2012 John Wiley & Sons A/S. Background and Objective: Smokers are more predisposed than nonsmokers to infection with Porphyromonas gingivalis, one of the most important pathogens involved in the onset and development of periodontitis. It has also been observed that tobacco, and tobacco derivatives such as nicotine and cotinine, can induce modifications to P. gingivalis virulence. However, the effect of the major compounds derived from cigarettes on protein expression by P. gingivalis is poorly understood. Therefore, this study aimed to evaluate and compare the effects of nicotine and cotinine on the P. gingivalis proteomic profile. Material and Methods: Total proteins of P. gingivalis exposed to nicotine and cotinine were extracted and separated by two-dimensional electrophoresis. Differentially expressed proteins were identified through liquid chromatography-mass spectrometry and searches of primary sequence databases using the MASCOT search engine, and gene ontology analysis was carried out using DAVID tools. Results: Of the approximately 410 protein spots that were reproducibly detected on each gel, 23 were differentially expressed in at least one of the treatments. A particular increase was seen in proteins involved in metabolism, virulence and acquisition of peptides, protein synthesis and folding, transcription and oxidative stress. Few proteins showed significant decreases in expression; those that did are involved in cell envelope biosynthesis and proteolysis and also in metabolism. Conclusion: Our results characterized the changes in the proteome of P. gingivalis following exposure to nicotine and cotinine, suggesting that these substances may modulate, with minor changes, protein expression.
The present study is, in part, a step toward understanding the potential smoke-pathogen interaction that may occur in smokers with periodontitis.
Abstract:
Background: MicroRNAs (miRNAs) are small regulatory RNAs, some of which are conserved in diverse plant genomes. Therefore, computational identification and further experimental validation of miRNAs from non-model organisms is both feasible and instrumental for addressing miRNA-based gene regulation and evolution. Sugarcane (Saccharum spp.) is an important biofuel crop with publicly available expressed sequence tag and genomic survey sequence databases, but little is known about miRNAs and their targets in this highly polyploid species. Results: In this study, we have computationally identified 19 distinct sugarcane miRNA precursors, of which several are highly similar to their sorghum homologs at both nucleotide and secondary structure levels. The accumulation pattern of mature miRNAs varies in organs/tissues from the commercial sugarcane hybrid as well as in its corresponding founder species S. officinarum and S. spontaneum. Using sugarcane MIR827 as a query, we found a novel MIR827 precursor in the sorghum genome. Based on our computational tool, a total of 46 potential targets were identified for the 19 sugarcane miRNAs. Several targets for highly conserved miRNAs are transcription factors that play important roles in plant development. Conversely, target genes of lineage-specific miRNAs, such as SsCBP1, seem to play roles in diverse physiological processes. SsCBP1 was experimentally confirmed to be a target for the monocot-specific miR528. Our findings support the notion that the regulation of SsCBP1 by miR528 is shared at least within graminaceous monocots, and that this miRNA-based post-transcriptional regulation evolved exclusively within the monocot lineage after the divergence from eudicots. Conclusions: Using publicly available nucleotide databases, 19 sugarcane miRNA precursors and one new sorghum miRNA precursor were identified and classified into 14 families.
Comparative analyses between sugarcane and sorghum suggest that these two species retain homologous miRNAs and targets in their genomes. Such conservation may help to clarify specific aspects of miRNA regulation and evolution in the polyploid sugarcane. Finally, our dataset provides a framework for future studies on sugarcane RNAi-dependent regulatory mechanisms.
Abstract:
Membrane proteins are a large and important class of proteins. They are responsible for several of the key functions in a living cell, e.g. transport of nutrients and ions, cell-cell signaling, and cell-cell adhesion. Despite their importance it has not been possible to study their structure and organization in much detail because of the difficulty to obtain 3D structures. In this thesis theoretical studies of membrane protein sequences and structures have been carried out by analyzing existing experimental data. The data comes from several sources including sequence databases, genome sequencing projects, and 3D structures. Prediction of the membrane spanning regions by hydrophobicity analysis is a key technique used in several of the studies. A novel method for this is also presented and compared to other methods. The primary questions addressed in the thesis are: What properties are common to all membrane proteins? What is the overall architecture of a membrane protein? What properties govern the integration into the membrane? How many membrane proteins are there and how are they distributed in different organisms? Several of the findings have now been backed up by experiments. An analysis of the large family of G-protein coupled receptors pinpoints differences in length and amino acid composition of loops between proteins with and without a signal peptide and also differences between extra- and intracellular loops. Known 3D structures of membrane proteins have been studied in terms of hydrophobicity, distribution of secondary structure and amino acid types, position specific residue variability, and differences between loops and membrane spanning regions. An analysis of several fully and partially sequenced genomes from eukaryotes, prokaryotes, and archaea has been carried out. 
Several differences in the membrane protein content between organisms were found, the most important being the total number of membrane proteins and the distribution of membrane proteins with a given number of transmembrane segments. Of the properties that were found to be similar in all organisms, the most obvious is the bias in the distribution of positive charges between the extra- and intracellular loops. Finally, an analysis of homologues to membrane proteins with known topology uncovered two related, multi-spanning proteins with opposite predicted orientations. The predicted topologies were verified experimentally, providing a first example of "divergent topology evolution".
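The hydrophobicity analysis mentioned in this abstract can be illustrated with a generic sliding-window hydropathy scan. This is a minimal sketch that swaps in the widely used Kyte-Doolittle scale; it is not the novel method presented in the thesis. The 19-residue window and the 1.6 threshold are conventional choices assumed here, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Raises KeyError if the sequence contains a non-standard residue.
    """
    scores = [KYTE_DOOLITTLE[a] for a in seq.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Merge consecutive above-threshold windows into candidate segments.

    Returns half-open residue ranges (start, end) covered by each run of
    windows, so segments are wider than the hydrophobic core itself.
    """
    profile = hydropathy_profile(seq, window)
    segments = []
    start = None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            segments.append((start, i - 1 + window))
            start = None
    if start is not None:
        segments.append((start, len(profile) - 1 + window))
    return segments


if __name__ == "__main__":
    toy = "D" * 20 + "L" * 20 + "D" * 20  # artificial sequence
    print(candidate_tm_segments(toy))
```

A simple threshold scan like this over-predicts on globular hydrophobic cores and says nothing about topology, which is one reason dedicated methods are compared in studies of this kind.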
Abstract:
The Poxviruses are a family of double stranded DNA (dsDNA) viruses that cause disease in many species, both vertebrate and invertebrate. Their genomes range in size from 135 to 365 kbp and show conservation in both organization and content. In particular, the central genomic regions of the chordopoxvirus subfamily (those capable of infecting vertebrates) contain 88 genes which are present in all the virus species characterised to date and which mostly occur in the same order and orientation. In contrast, however, the terminal regions of the genomes frequently contain genes that are species or genera-specific and that are not essential for the growth of the virus in vitro but instead often encode factors with important roles in vivo including modulation of the host immune response to infection and determination of the host range of the virus. The Parapoxviruses (PPV), of which Orf virus is the prototypic species, represent a genus within the chordopoxvirus subfamily of Poxviridae and are characterised by their ability to infect ruminants and humans. The genus currently contains four recognised species of virus, bovine papular stomatitis virus (BPSV) and pseudocowpox virus (PCPV) both of which infect cattle, orf virus (OV) that infects sheep and goats, and parapoxvirus of red deer in New Zealand (PVNZ). The ORFV genome has been fully sequenced, as has that of BPSV, and is ~138 kb in length encoding ~132 genes. The vast majority of these genes allow the virus to replicate in the cytoplasm of the infected host cell and therefore encode proteins involved in replication, transcription and metabolism of nucleic acids. These genes are well conserved between all known genera of poxviruses. There is however another class of genes, located at either end of the linear dsDNA genome, that encode proteins which are non-essential for replication and generally dictate host range and virulence of the virus. 
The non-essential genes are often the most variable within and between species of virus and therefore are potentially useful for diagnostic purposes. Given their role in subverting the host immune response to infection they are also targets for novel therapeutics. The function of only a relatively small number of these proteins has been elucidated and there are several genes whose function still remains obscure, principally because there is little similarity between them and proteins of known function in current sequence databases. It is thought that by selectively removing some of the virulence genes, or at least neutralising the proteins in some way, current vaccines could be improved. The evolution of poxviruses has been proposed to be an adaptive process involving frequent events of gene gain and loss, such that the virus co-evolves with its specific host. Gene capture or horizontal gene transfer from the host to the virus is considered an important source of new viral genes including those likely to be involved in host range and those enabling the virus to interfere with the host immune response to infection. Given the low rate of nucleotide substitution, recombination can be seen as an essential evolutionary driving force although it is likely underestimated. Recombination in poxviruses is intimately linked to DNA replication, with both viral and cellular proteins participating in this recombination-dependent replication. It has been shown, in other poxvirus genera, that recombination between isolates and perhaps even between species does occur, thereby providing another mechanism for the acquisition of new genes and for the rapid evolution of viruses. Such events may result in viruses that have a selective advantage over others, for example in re-infections (a characteristic of the PPV), or in viruses that are able to jump the species barrier and infect new hosts.
Sequence data from viral strains isolated from goats suggest that recombination events may have occurred between OV and PCPV (Ueda et al. 2003). Recombination events are frequent during poxvirus replication, and comparative genomic analysis of several poxvirus species has revealed that recombination occurs frequently in the right terminal region. Intraspecific recombination can occur between strains of the same PPV species, but interspecific recombination can also occur when there is enough sequence similarity between distinct PPV species. The most important prerequisite for successful recombination is the co-infection of an individual host by different virus strains or species. Consequently, the following factors affecting the distribution of different viruses to shared target cells need to be considered: dose of inoculated virus, time interval between inoculation of the first and the second virus, distance between the marker mutations, and genetic homology. At present there are no available data on the replication dynamics of PPV in permissive and non-permissive hosts and, regarding co-infections, there is no information on the interference mechanisms occurring during the simultaneous replication of viruses of different species. This work was carried out to set up permissive substrates allowing the replication of different PPV species, in particular keratinocyte monolayers and organotypic skin cultures. Furthermore, a method to isolate and expand ovine skin stem cells was set up to investigate further aspects of viral cellular tropism during natural infection. The study produced important data to elucidate the replication dynamics of OV and PCPV in vitro as well as the mechanisms of interference that can arise during co-infection with different viral species.
Moreover, the analysis carried out on the genomic right terminal region of PCPV 1303/05 contributed to a better knowledge of the viral genes involved in host interaction and pathogenesis and helped to locate recombination breakpoints and genetic homologies between PPV species. Taken together, these data fill several crucial gaps in the study of interspecific recombination of PPVs, which is thought to be important for a better understanding of viral evolution and for improving the biosafety of antiviral therapy and PPV-based vectors.
Abstract:
Complete NotI, SfiI, XbaI and BlnI cleavage maps of Escherichia coli K-12 strain MG1655 were constructed. Techniques used included: CHEF pulsed-field gel electrophoresis; transposon mutagenesis; fragment hybridization to the ordered $\lambda$ library of Kohara et al.; fragment and cosmid hybridization to Southern blots; correlation of fragments and cleavage sites with EcoMap, a sequence-modified version of the genomic restriction map of Kohara et al.; and correlation of cleavage sites with DNA sequence databases. In all, 105 restriction sites were mapped and correlated with the EcoMap coordinate system. NotI, SfiI, XbaI and BlnI restriction patterns of five commonly used E. coli K-12 strains were compared to those of MG1655. The variability between strains, some of which are separated by numerous steps of mutagenic treatment, is readily detectable by pulsed-field gel electrophoresis. A model is presented to account for the difference between the strains on the basis of simple insertions, deletions, and in one case an inversion. Insertions and deletions ranged in size from 1 kb to 86 kb. Several of the larger features have previously been characterized and some of the smaller rearrangements can potentially account for previously reported genetic features of these strains. Some aspects of the frequency and distribution of NotI, SfiI, XbaI and BlnI cleavage sites were analyzed using a method based on Markov chain theory. Overlaps of Dam and Dcm methylase sites with XbaI and SfiI cleavage sites were examined. The one XbaI-Dam overlap in the database is in accord with the expected frequency of this overlap. Certain types of SfiI-Dcm overlap are overrepresented. Of the four subtypes of SfiI-Dcm overlap, only one has a partial inhibitory effect on the activity of SfiI. Recognition sites for all four enzymes are rarer than expected based on oligonucleotide frequency data, with this effect being much stronger for XbaI and BlnI than for NotI and SfiI.
The latter two enzyme sites are rare mainly due to apparent negative selection against GGCC (both) and CGGCCG (NotI). The former two enzyme sites are rare mainly due to effects of the VSP repair system on certain di-, tri- and tetranucleotides, most notably CTAG. Models are proposed to explain several of the anomalies of oligonucleotide distribution in E. coli, and the biological significance of the systems that produce these anomalies is discussed.
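One common way to judge whether a site such as CTAG is rarer than expected is to compare its observed count with the count predicted from shorter oligonucleotide frequencies under a Markov chain model. The sketch below uses a first-order model estimated from the sequence itself. The abstract says only that the method is based on Markov chain theory, so the model order, the estimator and the function names are assumptions.

```python
def count_overlapping(seq, word):
    """Count occurrences of word in seq, allowing overlaps."""
    return sum(
        1 for i in range(len(seq) - len(word) + 1) if seq.startswith(word, i)
    )


def expected_count_markov1(seq, word):
    """Expected count of word under a first-order Markov chain.

    Estimator (assumed, standard form):
        E(w1..wk) = N(w1w2) * N(w2w3) * ... * N(w(k-1)wk)
                    / (N(w2) * ... * N(w(k-1)))
    where N(.) are overlapping counts observed in seq.
    """
    seq, word = seq.upper(), word.upper()
    if len(word) < 2:
        raise ValueError("word must contain at least two bases")
    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= count_overlapping(seq, word[i:i + 2])
    denominator = 1.0
    for i in range(1, len(word) - 1):
        denominator *= count_overlapping(seq, word[i])
    return numerator / denominator if denominator else 0.0


def observed_to_expected(seq, word):
    """Ratio below 1 suggests the word is under-represented."""
    expected = expected_count_markov1(seq, word)
    if expected == 0:
        return float("nan")
    return count_overlapping(seq.upper(), word.upper()) / expected


if __name__ == "__main__":
    genome_fragment = "GGCCTAGGATCCTAGACTAGTTCTAGA"  # hypothetical sequence
    print(observed_to_expected(genome_fragment, "CTAG"))
```

Higher-order models follow the same pattern, conditioning on longer overlapping sub-words; degenerate sites such as the SfiI site would need each concrete variant handled separately.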
Abstract:
Retinitis pigmentosa (RP) is a name given to a group of inherited retinal dystrophies that lead to progressive photoreceptor degeneration, and thus, visual impairment. It is evident at both the clinical and the molecular level that these are heterogeneous disorders, with wide variation in severity, mode of inheritance, and phenotype. The genetics of RP are not simple; the disease can be inherited in dominant, recessive, X-linked, and digenic modes. Autosomal dominant RP (adRP) results from mutations in at least ten mapped loci, but there may be dozens of genetic loci where mutations can cause RP. To date, there are over a hundred genes known to cause retinal degenerative diseases, and less than half of these have been cloned (RetNet). Among the dozens of retinitis pigmentosa loci known to exist, only a few have been identified and the remainder are inferred from linkage studies. Today, the genes for seven of the twelve adRP loci have been identified, and these are rhodopsin, peripherin/RDS, NRL, ROM1, CRX, RP13 and RP1. My research projects involved a combination of the continued search for genes involved in retinal dystrophies, as well as the investigation into the role of peripherin/RDS and RP1 in the disease etiology of autosomal dominant RP. Most of the mutations leading to inherited retinal disorders have been identified in predominantly retina-expressed genes like rhodopsin, peripherin/RDS, and RP1. Expressed sequence tags (ESTs) that were retina-specific were culled from sequence databases and, together with laboratory analysis, were analyzed as potential candidate genes for retinal dystrophies. Thirteen of the fifty-five identified retina-specific ESTs mapped to within candidate regions for inherited retinopathies. One of these is RP1L1, a homologue of RP1 and a potential cause of adRP.
Once a disease-associated gene has been identified, elucidating the role of that gene in the visual process is essential for understanding what happens when the process is defective, as it is in adRP. My next projects involved investigating the role of a novel 5′ donor +3 splice site mutation on the mRNA of peripherin/RDS in adRP-affected individuals, and comparative sequencing in RP1 to define conserved regions of the protein. Comparative sequencing is a powerful way to delineate critical regions of a sequence because different regions of a gene have different functions, and each region is subject to different levels of functional or structural constraints. Establishing a framework of conserved domains is beneficial not only for structural or functional studies, but can also aid in determining the potential effects of mutations. With the completion of sequencing of the human genome and the genomes of other organisms such as Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila, the facility of comparative sequencing will only increase in the future. Comparative sequencing has already become an established procedure for pinpointing conserved regions of a protein, and is an efficient way to target regions of a protein for experimental and/or evolutionary analysis.