84 results for sequence database
Abstract:
Only a small fraction of spectra acquired in LC-MS/MS runs matches peptides from target proteins upon database searches. The remaining, operationally termed background, spectra originate from a variety of poorly controlled sources and affect the throughput and confidence of database searches. Here, we report an algorithm and its software implementation that rapidly removes background spectra, regardless of their precise origin. The method estimates the dissimilarity distance between screened MS/MS spectra and unannotated spectra from a partially redundant background library compiled from several control and blank runs. Filtering MS/MS queries enhanced the protein identification capacity when searches lacked spectrum to sequence matching specificity. In sequence-similarity searches it reduced by, on average, 30-fold the number of orphan hits, which were not explicitly related to background protein contaminants and required manual validation. Removing high quality background MS/MS spectra, while preserving in the data set the genuine spectra from target proteins, decreased the false positive rate of stringent database searches and improved the identification of low-abundance proteins.
Abstract:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad-scale phylogeny and to attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology, but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve, we conducted a series of experiments using different data partitions and heterogeneous substitution models. Usually more complex modeling schemes are favored regardless of the partitions recognized, but model choice had little effect on topology or support values. In contrast, there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest that incongruent signal has contributed to our inability to confidently resolve these problem areas. (c) 2007 Elsevier Inc. All rights reserved.
Abstract:
Many of the controversies around the concept of homology rest on the subjectivity inherent to primary homology propositions. Dynamic homology partially solves this problem, but there has been up to now scant application of it outside of the molecular domain. This is probably because morphological and behavioural characters are rich in properties, connections and qualities, so that there is less space for conflicting character delimitations. Here we present a new method for the direct optimization of behavioural data, a method that relies on the richness of this database to delimit the characters, and on dynamic procedures to establish character state identity. We use between-species congruence in the data matrix and topological stability to choose the best cladogram. We test the methodology using sequences of predatory behaviour in a group of spiders that evolved the highly modified predatory technique of spitting glue onto prey. The cladogram recovered is fully compatible with previous analyses in the literature, and thus the method seems consistent. Besides the advantage of enhanced objectivity in character proposition, the new procedure allows the use of complex, context-dependent behavioural characters in an evolutionary framework, an important step towards the practical integration of the evolutionary and ecological perspectives on diversity. (C) The Willi Hennig Society 2010.
Abstract:
Caspases are central players in proteolytic pathways that regulate cellular processes such as apoptosis and differentiation. To accelerate the discovery of novel caspase substrates we developed a method combining in silico screening and in vitro validation. With this approach, we identified TAF15 as a novel caspase substrate in a trial study. We found that TAF15 was specifically cleaved by caspases-3 and -7. Site-directed mutagenesis revealed the consensus sequence (106)DQPD/Y(110) as the only site recognized by these caspases. Surprisingly, TAF15 was cleaved at more than one site in staurosporine-treated Jurkat cells. In addition, we generated two oncogenic TAF15-CIZ/NMP4-fused proteins which have been found in acute myeloid leukemia and demonstrate that caspases-3 and -7 cleave the fusion proteins at one single site. Broad application of this combination approach should expedite identification of novel caspase-interacting proteins and provide new insights into the regulation of caspase pathways leading to cell death in normal and cancer cells. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
The genome of the most virulent among 22 Brazilian geographical isolates of Spodoptera frugiperda nucleopolyhedrovirus, isolate 19 (SfMNPV-19), was completely sequenced and shown to comprise 132 565 bp and 141 open reading frames (ORFs). A total of 11 ORFs with no homology to genes in the GenBank database were found. Of those, four had typical baculovirus promoter motifs and polyadenylation sites. Computer-simulated restriction enzyme cleavage patterns of SfMNPV-19 were compared with published physical maps of other SfMNPV isolates. Differences were observed in terms of the restriction profiles and genome size. Comparison of SfMNPV-19 with the sequence of the SfMNPV isolate 3AP2 indicated that they differed due to a 1427 bp deletion, as well as by a series of smaller deletions and point mutations. The majority of genes of SfMNPV-19 were conserved in the closely related Spodoptera exigua NPV (SeMNPV) and Agrotis segetum NPV (AgseMNPV-A), but a few regions experienced major changes and rearrangements. Syntenic maps for the genomes of group II NPVs revealed that gene collinearity was observed only within certain clusters. Analysis of the dynamics of gene gain and loss along the phylogenetic tree of the NPVs showed that group II had only five defining genes and supported the hypothesis that these viruses form ten highly divergent ancient lineages. Crucially, more than 60% of the gene gain events followed a power-law relation to genetic distance among baculoviruses, indicative of temporal organization in the gene accretion process.
Abstract:
Human parvovirus B19 is the only member of the genus Erythrovirus that causes human disease. Recent findings of several strains with considerable sequence divergence from B19 have suggested a new classification for parvovirus genotypes as 1 (B19), 2 (A-6 and LaLi) and 3 (V9). In their overall DNA sequence, the three genotypes differ by ~10%. Here, we report the isolation of a genotype-3-related strain named BR543 during a prospective study conducted in São Paulo, Brazil. Analysis of the nearly full-length genome sequence of BR543 indicates that this B19 variant sequence clusters with Gh2768, a strain from Ghana belonging to subtype 3b, and shows mostly synonymous substitutions.
Abstract:
Paracoccidioidomycosis (PCM) is a systemic granulomatous disease caused by the dimorphic fungus Paracoccidioides brasiliensis. Anti-PCM vaccine formulations based on the secreted fungal cell wall protein (gp43) or the derived P10 sequence containing a CD4(+) T-cell-specific epitope have shown promising results. In the present study, we evaluated new anti-PCM vaccine formulations based on the intranasal administration of P. brasiliensis gp43 or the P10 peptide in combination with the Salmonella enterica FliC flagellin, an innate immunity agonist binding specifically to the Toll-like receptor 5, in a murine model. BALB/c mice immunized with gp43 developed high-specific-serum immunoglobulin G1 responses and enhanced interleukin-4 (IL-4) and IL-10 levels. On the other hand, mice immunized with recombinant purified flagellins genetically fused with P10 at the central hypervariable domain, either flanked or not by two lysine residues, or the synthetic P10 peptide admixed with purified FliC elicited a prevailing Th1-type immune response based on lung cell-secreted type 1 cytokines. Mice immunized with gp43 and FliC and intratracheally challenged with P. brasiliensis yeast cells had increased fungal proliferation and lung tissue damage. In contrast, mice immunized with the chimeric flagellins and particularly those immunized with P10 admixed with FliC reduced P. brasiliensis growth and lung damage. Altogether, these results indicate that S. enterica FliC flagellin modulates the immune response to P. brasiliensis P10 antigen and represents a promising alternative for the generation of anti-PCM vaccines.
Abstract:
Plasmodium falciparum, the causative agent of human malaria, invades host erythrocytes using several proteins on the surface of the invasive merozoite, which have been proposed as potential vaccine candidates. Members of the multi-gene PfRh family are surface antigens that have been shown to play a central role in directing merozoites to alternative erythrocyte receptors for invasion. Recently, we identified a large structural polymorphism, a 0.58 kb deletion, in the C-terminal region of the PfRh2b gene, present at a high frequency in parasite populations from Senegal. We hypothesize that this region is a target of humoral immunity. Here, by analyzing 371 P. falciparum isolates we show that this major allele is present at varying frequencies in different populations within Senegal, Africa, and throughout the world. For allelic dimorphisms in the asexual stage antigens, Msp-2 and EBA-175, we find minimal geographic differentiation among parasite populations from Senegal and other African localities, suggesting extensive gene flow among these populations and/or immune-mediated frequency-dependent balancing selection. In contrast, we observe a higher level of inter-population divergence (as measured by F(st)) for the PfRh2b deletion, similar to that observed for SNPs from the sexual stage Pfs45/48 locus, which is postulated to be under directional selection. We confirm that the region containing the PfRh2b polymorphism is a target of humoral immune responses by demonstrating antibody reactivity of endemic sera. Our analysis of inter-population divergence suggests that in contrast to the large allelic dimorphisms in EBA-175 and Msp-2, the presence or absence of the large PfRh2b deletion may not elicit frequency-dependent immune selection, but may be under positive immune selection, having important implications for the development of these proteins as vaccine candidates. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
Pfs230, a surface protein of the gametocyte/gamete of the human malaria parasite Plasmodium falciparum, is a prime candidate for a malaria transmission-blocking vaccine. Plasmodium vivax has an ortholog of Pfs230 (Pvs230); however, no aspect of Pvs230 has been studied to date. To investigate whether Pvs230 can be a vivax malaria transmission-blocking vaccine, we performed evolutionary and population genetic analysis of the Pvs230 gene (pvs230: PVX_003905). Our analysis of Pvs230 and its orthologs in eight Plasmodium species revealed two distinctive parts: an interspecies variable part (IVP) containing species-specific oligopeptide repeats at the N-terminus and a 7.5 kb interspecies conserved part (ICP) containing 14 cysteine-rich domains. Pvs230 was closely related to its orthologs, Pks230 and Pcys230, in monkey malaria parasites. Analysis of 113 pvs230 sequences obtained worldwide showed that nucleotide diversity is remarkably low in the non-repeat 8-kb region of pvs230 (theta pi = 0.00118), with 77 polymorphic nucleotide sites, 40 of which result in amino acid replacements. A signature of purifying selection but not of balancing selection was seen on pvs230. Functional and/or structural constraints may limit the level of polymorphism in pvs230. The observed limited polymorphism in pvs230 provides grounds for the utilization of Pvs230 as an effective transmission-blocking vaccine. (C) 2011 Elsevier Ltd. All rights reserved.
Abstract:
Immune evasion by Plasmodium falciparum is favored by extensive allelic diversity of surface antigens. Some of them, most notably the vaccine-candidate merozoite surface protein (MSP)-1, exhibit a poorly understood pattern of allelic dimorphism, in which all observed alleles group into two highly diverged allelic families with few or no inter-family recombinants. Here we describe contrasting levels and patterns of sequence diversity in genes encoding three MSP-1-associated surface antigens of P. falciparum, ranging from an ancient allelic dimorphism in the Msp-6 gene to a near lack of allelic divergence in Msp-9 to a more classical multi-allele polymorphism in Msp-7. Other members of the Msp-7 gene family exhibit very little polymorphism in non-repetitive regions. A comparison of P. falciparum Msp-6 sequences to an orthologous sequence from P. reichenowi provided evidence for distinct evolutionary histories of the 5′ and 3′ segments of the dimorphic region in PfMsp-6, consistent with one dimorphic lineage having arisen from recombination between now-extinct ancestral alleles. In addition, we uncovered two surprising patterns of evolution in repetitive sequence. First, in Msp-6, large deletions are associated with (nearly) identical sequence motifs at their borders. Second, a comparison of PfMsp-9 with the P. reichenowi ortholog indicated retention of a significant inter-unit diversity within an 18-base pair repeat within the coding region of P. falciparum, but homogenization in P. reichenowi. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
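The seed-driven cycle described in this abstract (similarity search, read selection, assembly, repeat) can be illustrated with a toy sketch. This is not GenSeed's code, and it is written in Python rather than Perl. The function name `extend_seed`, its parameters, and the exact suffix/prefix match used in place of a real similarity search and assembler are all assumptions made for illustration, and the sketch extends the seed in one direction only.

```python
def extend_seed(seed, reads, min_overlap=4, max_rounds=10):
    """Toy seed-driven extension (hypothetical helper, not GenSeed itself).

    Each round looks for an unused read whose prefix exactly matches the
    current contig's 3' end, then appends the read's remaining bases.
    """
    contig = seed
    remaining = list(reads)  # reads not yet incorporated into the contig
    for _ in range(max_rounds):
        chosen = None
        for read in remaining:
            # Try the longest overlap first; the read must add at least one base.
            longest = min(len(contig), len(read) - 1)
            for k in range(longest, min_overlap - 1, -1):
                if contig.endswith(read[:k]):
                    chosen = (read, k)
                    break
            if chosen:
                break
        if chosen is None:
            break  # no read extends the contig: the process has converged
        read, k = chosen
        contig += read[k:]      # "assembly" step: append the new bases
        remaining.remove(read)  # each read is used at most once
    return contig


reads = ["GGCGTTAC", "TTACCCGA", "AAAAAAA"]
print(extend_seed("ATGGCG", reads))
```

A real implementation would tolerate mismatches, extend in both directions, and build a consensus from many reads per cycle; the sketch only shows the iterative shape of the process.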
Abstract:
Leptospirosis is a worldwide zoonosis caused by members of the genus Leptospira. Although leptospires were identified as the causal agent of leptospirosis almost 100 years ago, little is known about their biology, which hinders the development of new treatment and prevention strategies. One of the several aspects of leptospiral biology not yet elucidated is the process by which outer membrane proteins (OMPs) traverse the periplasm and are inserted into the outer membrane. The crystal structure determination of the conserved hypothetical protein LIC12922 from Leptospira interrogans revealed a two-domain protein homologous to the Escherichia coli periplasmic chaperone SurA. The LIC12922 NC-domain is structurally related to the chaperone modules of E. coli SurA and trigger factor, whereas the parvulin domain is devoid of peptidyl prolyl cis-trans isomerase activity. Phylogenetic analyses suggest a relationship between LIC12922 and the chaperones PrsA, PpiD and SurA. Based on our structural and evolutionary analyses, we postulate that LIC12922 is a periplasmic chaperone involved in OMP biogenesis in Leptospira spp. Since LIC12922 homologs were identified in all spirochetal genomes sequenced to date, this assumption may have implications for OMP biogenesis studies not only in leptospires but in the entire Phylum Spirochaetes. (C) 2010 Elsevier Inc. All rights reserved.
Abstract:
Ureaplasma diversum infection in bulls may result in seminal vesiculitis, balanoposthitis and alterations in spermatozoa. In cows, it can cause placentitis, fetal alveolitis, abortion and the birth of weak calves. U. diversum ATCC 49782 (serogroup A), ATCC 49783 (serogroup C) and 34 field isolates were used for this study. These microorganisms were submitted to Polymerase Chain Reaction for 16S gene sequence determination using Taq High Fidelity, and the products were purified and bi-directionally sequenced. Using the sequence obtained, a fragment containing four hypervariable regions was selected and nucleotide polymorphisms were identified based on their position within the 16S rRNA gene. Forty-four single nucleotide polymorphisms (SNPs) were detected. The genotypic variability of the 16S rRNA gene of U. diversum isolates shows that the taxonomic classification of these organisms is likely much more complex than previously described and that 16S rRNA gene sequencing may be used to suggest an epidemiologic pattern for strains of different origin. (c) 2011 Elsevier B.V. All rights reserved.
Abstract:
This paper explores the structural continuum in CATH and the extent to which superfamilies adopt distinct folds. Although most superfamilies are structurally conserved, in some of the most highly populated superfamilies (4% of all superfamilies) there is considerable structural divergence. While relatives share a similar fold in the evolutionarily conserved core, diverse elaborations to this core can result in significant differences in the global structures. Applying similar protocols to examine the extent to which structural overlaps occur between different fold groups, it appears this effect is confined to just a few architectures and is largely due to small, recurring super-secondary motifs (e.g., alpha beta-motifs, alpha-hairpins). Although 24% of superfamilies overlap with superfamilies having different folds, only 14% of nonredundant structures in CATH are involved in overlaps. Nevertheless, the existence of these overlaps suggests that, in some regions of structure space, the fold universe should be seen as more continuous.
Abstract:
The study of pharmacokinetic properties (PK) is of great importance in drug discovery and development. In the present work, PK/DB (a new freely available database for PK) was designed with the aim of creating robust databases for pharmacokinetic studies and in silico absorption, distribution, metabolism and excretion (ADME) prediction. Comprehensive, web-based and easy to access, PK/DB manages 1203 compounds which represent 2973 pharmacokinetic measurements, including five models for in silico ADME prediction (human intestinal absorption, human oral bioavailability, plasma protein binding, blood-brain barrier and water solubility).