913 results for Sequence Motifs


Relevance:

100.00%

Abstract:

Unmethylated CpG dinucleotides in particular base contexts (CpG-S motifs) are relatively common in bacterial DNA but are rare in vertebrate DNA. B cells and monocytes have the ability to detect such CpG-S motifs that trigger innate immune defenses with production of Th1-like cytokines. Despite comparable levels of unmethylated CpG dinucleotides, DNA from serotype 12 adenovirus is immune-stimulatory, but serotype 2 is nonstimulatory and can even inhibit activation by bacterial DNA. In type 12 genomes, the distribution of CpG-flanking bases is similar to that predicted by chance. However, in type 2 adenoviral DNA the immune stimulatory CpG-S motifs are outnumbered by a 15- to 30-fold excess of CpG dinucleotides in clusters of direct repeats or with a C on the 5′ side or a G on the 3′ side. Synthetic oligodeoxynucleotides containing these putative neutralizing (CpG-N) motifs block immune activation by CpG-S motifs in vitro and in vivo. Eliminating 52 of the 134 CpG-N motifs present in a DNA vaccine markedly enhanced its Th1-like function in vivo, which was increased further by the addition of CpG-S motifs. Thus, depending on the CpG motif, prokaryotic DNA can be either immune-stimulatory or neutralizing. These results have important implications for understanding microbial pathogenesis and molecular evolution and for the clinical development of DNA vaccines and gene therapy vectors.
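The flanking-base criterion in this abstract (a C on the 5′ side or a G on the 3′ side of a CpG) can be illustrated with a short counting routine. This is only a minimal sketch under stated assumptions: the function name and the output keys are hypothetical, the "clusters of direct repeats" criterion is not modelled, and methylation status cannot be read from a plain sequence string. It is not the authors' analysis.

```python
def classify_cpg_contexts(seq):
    """Count CpG dinucleotides in `seq`, split by flanking-base context.

    A CpG is placed in the 'c_5prime_or_g_3prime' bin when the base
    immediately 5' of it is C or the base immediately 3' of it is G
    (the context the abstract associates with putative CpG-N motifs).
    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    counts = {"total": 0, "c_5prime_or_g_3prime": 0, "other": 0}
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "CG":
            continue
        counts["total"] += 1
        five_prime = seq[i - 1] if i > 0 else ""
        three_prime = seq[i + 2] if i + 2 < len(seq) else ""
        if five_prime == "C" or three_prime == "G":
            counts["c_5prime_or_g_3prime"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    print(classify_cpg_contexts("ACGTCCGAACGG"))
```

Comparing the two bins across genomes gives a rough view of the kind of excess the abstract reports, although any real comparison would also need the expected counts under a chance model.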

Relevance:

100.00%

Abstract:

In the last decade, two tools, one drawn from information theory and the other from artificial neural networks, have proven particularly useful in many different areas of sequence analysis. The work presented herein indicates that these two approaches can be joined in a general fashion to produce a very powerful search engine that is capable of locating members of a given nucleic acid sequence family in either local or global sequence searches. This program can, in turn, be queried for its definition of the motif under investigation, ranking each base in context for its contribution to membership in the motif family. In principle, the method used can be applied to any binding motif, including both DNA and RNA sequence families, given sufficient family size.

Relevance:

100.00%

Abstract:

We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called emotif (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, emotif generates a set of motifs with a wide range of specificities and sensitivities. emotif also can generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs often can represent the entire superfamily with high specificity and sensitivity. We have used emotif to generate sets of motifs from all 7,000 protein alignments in the blocks and prints databases. The resulting database, called identify (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs whose probabilities of matching a false positive range from 10⁻¹⁰ to 10⁻⁵. Highly specific motifs are well suited for searching entire proteomes while generating very few false predictions. identify assigns biological functions to 25–30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, identify assigned functions to 172 proteins of unknown function in the yeast genome.
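To give a feel for how a motif's false-positive probability can fall in the 10⁻¹⁰ to 10⁻⁵ range, the sketch below estimates the chance that a motif matches at one random sequence position. It assumes a motif is a list of allowed-residue groups, that positions are independent, and that residues follow a uniform background of 1/20; the function name and the motif representation are hypothetical, and this is not claimed to be emotif's actual scoring.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def motif_match_probability(positions, background=None):
    """Probability that a motif matches at one random sequence position.

    `positions` is a list of strings: each string lists the residues
    allowed at that motif position, and "." is a wildcard.
    `background` maps residue -> frequency; a uniform distribution is
    assumed when it is omitted (an illustrative simplification).
    """
    if background is None:
        background = {aa: 1 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    prob = 1.0
    for allowed in positions:
        if allowed == ".":
            continue  # a wildcard matches any residue, factor of 1
        prob *= sum(background[aa] for aa in set(allowed))
    return prob


if __name__ == "__main__":
    # Hypothetical 4-position motif: [ILV] . G [KR]
    p = motif_match_probability(["ILV", ".", "G", "KR"])
    print(f"{p:.2e}")
    assert math.isclose(p, (3 / 20) * (1 / 20) * (2 / 20))
```

Under these assumptions each additional fully specified position multiplies the probability by 1/20, so motifs with roughly four to eight fixed positions span about the range quoted in the abstract.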

Relevance:

100.00%

Abstract:

HLA-DR13 has been associated with resistance to two major infectious diseases of humans. To investigate the peptide binding specificity of two HLA-DR13 molecules and the effects of the Gly/Val dimorphism at position 86 of the HLA-DR beta chain on natural peptide ligands, these peptides were acid-eluted from immunoaffinity-purified HLA-DRB1*1301 and -DRB1*1302, molecules that differ only at this position. The eluted peptides were subjected to pool sequencing or individual peptide sequencing by tandem MS or Edman microsequencing. Sequences were obtained for 23 peptides from nine source proteins. Three pool sequences for each allele and the sequences of individual peptides were used to define binding motifs for each allele. Binding specificities varied only at the primary hydrophobic anchor residue, the differences being a preference for the aromatic amino acids Tyr and Phe in DRB1*1302 and a preference for Val in DRB1*1301. Synthetic analogues of the eluted peptides showed allele specificity in their binding to purified HLA-DR, and Ala-substituted peptides were used to identify the primary anchor residues for binding. The failure of some peptides eluted from DRB1*1302 (those that use aromatic amino acids as primary anchors) to bind to DRB1*1301 confirmed the different preferences for peptide anchor residues conferred by the Gly→Val change at position 86. These data suggest a molecular basis for the differential associations of HLA-DRB1*1301 and DRB1*1302 with resistance to severe malaria and clearance of hepatitis B virus infection.

Relevance:

80.00%

Abstract:

Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function initially harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weight to the n-grams that appear in fewer classes and vice versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The implementation of amino acid substitution scores for similarity detection, and the dampening factor to normalize the unbalanced datasets, have a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes with a potential application to detecting proteome-specific motifs of different organisms.
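The harvesting and dampening steps described in this abstract can be sketched roughly as follows. The abstract does not give the dampening formula, so the sketch assumes an inverse-class-frequency weight, log(1 + number of classes / number of classes containing the n-gram); the BLOSUM62 merging step and the selection threshold are omitted, and all function names are hypothetical.

```python
import math
from collections import Counter


def harvest_ngrams(seq, n_min=4, n_max=8):
    """Return every contiguous n-gram of length n_min..n_max in `seq`."""
    return [seq[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(seq) - n + 1)]


def score_ngrams(classes, n_min=4, n_max=8):
    """Score n-grams per class.

    `classes` maps a class name to a list of protein sequences.
    Score = relative frequency within the class multiplied by an
    assumed dampening factor that favours n-grams seen in few classes.
    """
    freq = {
        name: Counter(g for s in seqs for g in harvest_ngrams(s, n_min, n_max))
        for name, seqs in classes.items()
    }
    n_classes = len(classes)
    # Number of classes in which each n-gram occurs at least once.
    class_count = Counter(g for counts in freq.values() for g in counts)
    scores = {}
    for name, counts in freq.items():
        total = sum(counts.values()) or 1
        scores[name] = {
            g: (k / total) * math.log(1 + n_classes / class_count[g])
            for g, k in counts.items()
        }
    return scores


if __name__ == "__main__":
    toy = {"a": ["AAAAK"], "b": ["AAAAC"]}
    for name, table in score_ngrams(toy, 4, 4).items():
        print(name, sorted(table.items(), key=lambda kv: -kv[1]))
```

In the toy input, AAAA occurs in both classes and is down-weighted, while AAAK and AAAC each occur in a single class and score higher, which is the qualitative behaviour the abstract attributes to the dampening factor.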

Relevance:

70.00%

Abstract:

With ~1200 members, odorant receptor (OR) genes constitute the largest gene family in the mouse genome. A mature olfactory sensory neuron (OSN) is thought to express just one OR gene, and from one allele. The cell bodies of OSNs that express a given OR gene display a mosaic pattern within a particular region of the main olfactory epithelium. The mechanisms and cis-acting DNA elements that regulate the expression of one OR gene per OSN - OR gene choice - remain poorly understood. Here, we describe a reporter assay to identify minimal promoters for OR genes in transgenic mice, which are produced by the conventional method of pronuclear injection of DNA. The promoter transgenes are devoid of an OR coding sequence, and instead drive expression of the axonal marker tau-β-galactosidase. For four mouse OR genes (M71, M72, MOR23, and P3) and one human OR gene (hM72), a mosaic, OSN-specific pattern of reporter expression can be obtained in transgenic mice with contiguous DNA segments of only ~300 bp that are centered around the transcription start site (TSS). The ~150 bp region upstream of the TSS contains three conserved sequence motifs, including homeodomain (HD) binding sites. Such HD binding sites are also present in the H and P elements, DNA sequences that are known to strongly influence OR gene expression. When a 19mer encompassing an HD binding site from the P element is multimerized nine times and added upstream of a MOR23 minigene that contains the MOR23 coding region, we observe a dramatic increase in the number of transgene-expressing founders and lines and in the number of labeled OSNs. By contrast, a nine times multimerized 19mer with a mutant HD binding site does not have these effects. We hypothesize that HD binding sites in the H and P elements and in OR promoters modulate the probability of OR gene choice.

Relevance:

70.00%

Abstract:

BACKGROUND: Today, recognition and classification of sequence motifs and protein folds is a mature field, thanks to the availability of numerous comprehensive and easy to use software packages and web-based services. Recognition of structural motifs, by comparison, is less well developed and much less frequently used, possibly due to a lack of easily accessible and easy to use software. RESULTS: In this paper, we describe an extension of DeepView/Swiss-PdbViewer through which structural motifs may be defined and searched for in large protein structure databases, and we show that common structural motifs involved in stabilizing protein folds are present in evolutionarily and structurally unrelated proteins, also in deeply buried locations which are not obviously related to protein function. CONCLUSIONS: The possibility to define custom motifs and search for their occurrence in other proteins permits the identification of recurrent arrangements of residues that could have structural implications. The possibility to do so without having to maintain a complex software/hardware installation on site brings this technology to experts and non-experts alike.

Relevance:

70.00%

Abstract:

Immune evasion by Plasmodium falciparum is favored by extensive allelic diversity of surface antigens. Some of them, most notably the vaccine-candidate merozoite surface protein (MSP)-1, exhibit a poorly understood pattern of allelic dimorphism, in which all observed alleles group into two highly diverged allelic families with few or no inter-family recombinants. Here we describe contrasting levels and patterns of sequence diversity in genes encoding three MSP-1-associated surface antigens of P. falciparum, ranging from an ancient allelic dimorphism in the Msp-6 gene to a near lack of allelic divergence in Msp-9 to a more classical multi-allele polymorphism in Msp-7. Other members of the Msp-7 gene family exhibit very little polymorphism in non-repetitive regions. A comparison of P. falciparum Msp-6 sequences to an orthologous sequence from P. reichenowi provided evidence for distinct evolutionary histories of the 5′ and 3′ segments of the dimorphic region in PfMsp-6, consistent with one dimorphic lineage having arisen from recombination between now-extinct ancestral alleles. In addition, we uncovered two surprising patterns of evolution in repetitive sequence. First, in Msp-6, large deletions are associated with (nearly) identical sequence motifs at their borders. Second, a comparison of PfMsp-9 with the P. reichenowi ortholog indicated retention of a significant inter-unit diversity within an 18-base pair repeat within the coding region of P. falciparum, but homogenization in P. reichenowi. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

70.00%

Abstract:

Local protein structure prediction efforts have consistently failed to exceed approximately 70% accuracy. We characterize the degeneracy of the mapping from local sequence to local structure responsible for this failure by investigating the extent to which similar sequence segments found in different proteins adopt similar three-dimensional structures. Sequence segments 3-15 residues in length from 154 different protein families are partitioned into neighborhoods containing segments with similar sequences using cluster analysis. The consistency of the sequence-to-structure mapping is assessed by comparing the local structures adopted by sequence segments in the same neighborhood in proteins of known structure. In the 154 families, 45% and 28% of the positions occur in neighborhoods in which one and two local structures predominate, respectively. The sequence patterns that characterize the neighborhoods in the first class probably include virtually all of the short sequence motifs in proteins that consistently occur in a particular local structure. These patterns, many of which occur in transitions between secondary structural elements, are an interesting combination of previously studied and novel motifs. The identification of sequence patterns that consistently occur in one or a small number of local structures in proteins should contribute to the prediction of protein structure from sequence.

Relevance:

70.00%

Abstract:

Recent experiments have exposed significant discrepancies between experimental data and predictive models for DNA structure. These results strongly suggest that DNA structural parameters incorporated in the models are not always sufficient to account for the influence of sequence context and of specific ion effects. In an attempt to evaluate these two effects, we have investigated repetitive DNA sequences with the sequence motif GAGAG.CTCTC located in different helical phasing arrangements with respect to poly(A) tracts and GGGCCC.GGGCCC sequence motifs. Methods used are ligase-mediated cyclization and gel mobility experiments along with DNase I cutting and chemical probe studies. The results provide new evidence for curvature in poly(A) tracts. They also show that the sequence context in which bending and flexible sequence elements are found is an important aspect of sequence-dependent DNA conformation. Although dinucleotide models generally have good predictive power, this work demonstrates that in some instances sequence elements larger than the dinucleotide must be taken into account, and hence it provides a starting point for the appropriate modification and refinement of existing structural models for DNA.

Relevance:

70.00%

Abstract:

The planctomycetes are a phylum of bacteria that have a unique cell compartmentalisation and yeast-like budding cell division and peptidoglycan-less proteinaceous cell walls. We wished to further our understanding of these unique organisms at the molecular level by searching for conserved amino acid sequence motifs and domains in the proteins encoded by Rhodopirellula baltica. Using BLAST and single-linkage clustering, we have discovered several new protein domains and sequence motifs in this planctomycete. R. baltica has multiple members of the newly discovered GEFGR protein family and the ASPIC C-terminal domain family, whilst most other organisms for which whole genome sequence is available have no more than one. Many of the domains and motifs appear to be restricted to the planctomycetes. It is possible that these protein domains and motifs may have been lost or replaced in other phyla, or they may have undergone multiple duplication events in the planctomycete lineage. One of the novel motifs probably represents a novel N-terminal export signal peptide. With their unique cell biology, it may be that the planctomycete cell compartmentalisation plan in particular needs special membrane transport mechanisms. The discovery of these new domains and motifs, many of which are associated with secretion and cell-surface functions, will help to stimulate experimental work and thus enhance further understanding of this fascinating group of organisms. (C) 2004 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.
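The single-linkage clustering step mentioned in this abstract can be illustrated with a union-find structure: any two sequences joined by a qualifying hit end up in the same cluster, and clusters grow transitively. This is a minimal sketch under stated assumptions; the abstract does not say which BLAST score or E-value cut-off defined a link, so `links` is simply assumed to be a precomputed list of pairs that passed some threshold, and the function name is hypothetical.

```python
def single_linkage_clusters(items, links):
    """Group `items` into single-linkage clusters.

    `links` is an iterable of (a, b) pairs, assumed here to be pairs of
    sequences whose similarity search hit passed a chosen threshold.
    Returns a list of sets, largest cluster first.
    """
    parent = {x: x for x in items}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for x in items:
        clusters.setdefault(find(x), set()).add(x)
    return sorted(clusters.values(), key=lambda c: (-len(c), min(c)))


if __name__ == "__main__":
    proteins = ["p1", "p2", "p3", "p4"]
    hits = [("p1", "p2"), ("p2", "p3")]
    print(single_linkage_clusters(proteins, hits))
```

Here p1 and p3 share no direct hit but fall into one cluster through p2, which is the defining property of single linkage and also the reason it can chain loosely related sequences together.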

Relevance:

60.00%

Abstract:

Transposon mutagenesis and complementation studies previously identified a gene (xabB) for a large (526 kDa) polyketide-peptide synthase required for biosynthesis of albicidin antibiotics and phytotoxins in the sugarcane leaf scald pathogen Xanthomonas albilineans. A cistron immediately downstream from xabB encodes a polypeptide of 343 aa containing three conserved motifs characteristic of a family of S-adenosyl-L-methionine (SAM)-dependent O-methyltransferases. Insertional mutagenesis and complementation indicate that the product of this cistron (designated xabC) is essential for albicidin production, and that there is no other required downstream cistron. The xab promoter region is bidirectional, and insertional mutagenesis of the first open reading frame (ORF) in the divergent gene also blocks albicidin biosynthesis. This divergent ORF (designated thp) encodes a protein of 239 aa displaying high similarity to several IS21-like transposition helper proteins. The thp cistron is not located in a recognizable transposon, and is probably a remnant from a past transposition event that may have contributed to the development of the albicidin biosynthetic gene cluster. Failure of 'in trans' complementation of thp indicates that a downstream cistron transcribed with thp is required for albicidin biosynthesis. (C) 2000 Elsevier Science B.V. All rights reserved.

Relevance:

60.00%

Abstract:

Leucine-rich repeats (LRRs) are 20-29-residue sequence motifs present in a number of proteins with diverse functions. The primary function of these motifs appears to be to provide a versatile structural framework for the formation of protein-protein interactions. The past two years have seen an explosion of new structural information on proteins with LRRs. The new structures represent different LRR subfamilies and proteins with diverse functions, including GTPase-activating protein rna1p from the ribonuclease-inhibitor-like subfamily; spliceosomal protein U2A', Rab geranylgeranyltransferase, internalin B, dynein light chain 1 and nuclear export protein TAP from the SDS22-like subfamily; Skp2 from the cysteine-containing subfamily; and YopM from the bacterial subfamily. The new structural information has increased our understanding of the structural determinants of LRR proteins and our ability to model such proteins with unknown structures, and has shed new light on how these proteins participate in protein-protein interactions.

Relevance:

60.00%

Abstract:

The NS5 protein of the flavivirus Kunjin (KUN) contains conserved sequence motifs characteristic of RNA-dependent RNA polymerase (RdRp) activity. To investigate this activity in vitro, recombinant NS5 proteins with C-terminal (NS5CHis) and N-terminal (NS5NHis) hexahistidine tags were produced in baculovirus-infected insect cells and purified to near homogeneity by nickel affinity chromatography. Purified NS5CHis exhibited RdRp activity with both specific (9 kb KUN replicon) and non-specific (8.3 kb Semliki Forest virus replicon) RNA templates; this activity did not require the presence of additional viral and/or cellular cofactors. RdRp activity of purified NS5NHis protein was reduced in comparison to NS5CHis, while purified NS5NHis incorporating a GDD→GVD mutation within the polymerase active site (NS5GVD) lacked RdRp activity. RNase A digestion of the RdRp reaction products indicated that they were double-stranded and of a similar size to the KUN replicative form produced in Vero cells, thus demonstrating that the KUN NS5 protein has an intrinsic, albeit low and non-specific, RdRp activity in vitro, similar to that reported for recombinant RdRp of other flaviviruses. However, in contrast to RNA polymerases of other Flavivirus species, purified KUN NS5 polymerase produced a single, full-length replicon RNA product, thus demonstrating efficient processivity. (C) 2001 Elsevier Science B.V. All rights reserved.