992 results for Sequence Motifs


Relevance:

100.00%

Abstract:

Unmethylated CpG dinucleotides in particular base contexts (CpG-S motifs) are relatively common in bacterial DNA but are rare in vertebrate DNA. B cells and monocytes have the ability to detect such CpG-S motifs that trigger innate immune defenses with production of Th1-like cytokines. Despite comparable levels of unmethylated CpG dinucleotides, DNA from serotype 12 adenovirus is immune-stimulatory, but serotype 2 is nonstimulatory and can even inhibit activation by bacterial DNA. In type 12 genomes, the distribution of CpG-flanking bases is similar to that predicted by chance. However, in type 2 adenoviral DNA the immune stimulatory CpG-S motifs are outnumbered by a 15- to 30-fold excess of CpG dinucleotides in clusters of direct repeats or with a C on the 5′ side or a G on the 3′ side. Synthetic oligodeoxynucleotides containing these putative neutralizing (CpG-N) motifs block immune activation by CpG-S motifs in vitro and in vivo. Eliminating 52 of the 134 CpG-N motifs present in a DNA vaccine markedly enhanced its Th1-like function in vivo, which was increased further by the addition of CpG-S motifs. Thus, depending on the CpG motif, prokaryotic DNA can be either immune-stimulatory or neutralizing. These results have important implications for understanding microbial pathogenesis and molecular evolution and for the clinical development of DNA vaccines and gene therapy vectors.
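
The flanking-base comparison described above can be illustrated with a small sketch. It counts CpG dinucleotides in a DNA string and separates those with a C on the 5′ side or a G on the 3′ side from all others. The function name and the two category labels are hypothetical, and the sketch does not attempt to detect the clusters of direct repeats that the abstract also mentions, nor does it reproduce the authors' definition of CpG-S motifs.

```python
from collections import Counter


def classify_cpg_contexts(seq):
    """Count CpG dinucleotides by flanking base (illustrative categories only)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 1):
        if seq[i:i + 2] != "CG":
            continue
        # Bases immediately 5' and 3' of the CpG; empty at the sequence ends.
        five_prime = seq[i - 1] if i > 0 else ""
        three_prime = seq[i + 2] if i + 2 < len(seq) else ""
        if five_prime == "C" or three_prime == "G":
            counts["c_or_g_flanked"] += 1
        else:
            counts["other"] += 1
    return counts


example = classify_cpg_contexts("ACGTTCCGATCGGA")
```

Comparing the two counts for a genome against the counts expected from its base composition is one way to ask whether flanking bases are distributed as predicted by chance.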

Relevance:

100.00%

Abstract:

In the last decade, two tools, one drawn from information theory and the other from artificial neural networks, have proven particularly useful in many different areas of sequence analysis. The work presented herein indicates that these two approaches can be joined in a general fashion to produce a very powerful search engine that is capable of locating members of a given nucleic acid sequence family in either local or global sequence searches. This program can, in turn, be queried for its definition of the motif under investigation, ranking each base in context for its contribution to membership in the motif family. In principle, the method used can be applied to any binding motif, including both DNA and RNA sequence families, given sufficient family size.
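
The abstract does not say which information-theoretic measure is used. A common choice in motif analysis is per-position information content, and the sketch below computes it for a set of aligned sites, assuming a four-letter DNA alphabet and no small-sample correction. It is offered only as an illustration of ranking positions by their contribution to a motif, not as the method of the paper; the function name is hypothetical.

```python
import math
from collections import Counter


def information_content(aligned_sites, alphabet="ACGT"):
    """Return per-position information (bits) for equal-length aligned sites."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    bits = []
    for pos in range(length):
        column = Counter(site[pos] for site in aligned_sites)
        total = sum(column.values())
        # Shannon entropy of the observed base frequencies in this column.
        entropy = -sum((n / total) * math.log2(n / total) for n in column.values())
        bits.append(max_bits - entropy)
    return bits


profile = information_content(["ACGT", "ACGA", "ACCT", "ACTG"])
```

Fully conserved positions score 2 bits under these assumptions, and positions with more variation score less.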

Relevance:

100.00%

Abstract:

We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called emotif (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, emotif generates a set of motifs with a wide range of specificities and sensitivities. emotif also can generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs often can represent the entire superfamily with high specificity and sensitivity. We have used emotif to generate sets of motifs from all 7,000 protein alignments in the blocks and prints databases. The resulting database, called identify (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs whose probability of matching a false positive ranges from 10^−10 to 10^−5. Highly specific motifs are well suited for searching entire proteomes while generating very few false predictions. identify assigns biological functions to 25–30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, identify assigned functions to 172 proteins of unknown function in the yeast genome.
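
As a rough illustration of deriving a motif from an alignment, the sketch below turns each alignment column into a literal residue, a character class, or a wildcard, and returns a regular expression. The function name and the `max_group` cutoff are hypothetical. This is not the emotif algorithm, which the abstract describes as producing motifs across a range of specificities and sensitivities; the sketch makes no attempt to estimate either.

```python
import re


def alignment_to_regex(aligned, max_group=3):
    """Build a regex from alignment columns.

    Columns containing a gap or more than max_group distinct residues
    become a wildcard; other variable columns become a character class.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    parts = []
    for pos in range(length):
        residues = sorted({seq[pos] for seq in aligned})
        if "-" in residues or len(residues) > max_group:
            parts.append(".")
        elif len(residues) == 1:
            parts.append(residues[0])
        else:
            parts.append("[" + "".join(residues) + "]")
    return "".join(parts)


family = ["GKSTL", "GKTTL", "GRSAL", "GKSCL"]
loose = alignment_to_regex(family, max_group=3)
strict = alignment_to_regex(family, max_group=2)
```

Lowering `max_group` replaces more columns with wildcards, which makes the pattern match more sequences; a real tool would weigh that against the expected false-positive rate.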

Relevance:

100.00%

Abstract:

HLA-DR13 has been associated with resistance to two major infectious diseases of humans. To investigate the peptide binding specificity of two HLA-DR13 molecules and the effects of the Gly/Val dimorphism at position 86 of the HLA-DR beta chain on natural peptide ligands, these peptides were acid-eluted from immunoaffinity-purified HLA-DRB1*1301 and -DRB1*1302, molecules that differ only at this position. The eluted peptides were subjected to pool sequencing or individual peptide sequencing by tandem MS or Edman microsequencing. Sequences were obtained for 23 peptides from nine source proteins. Three pool sequences for each allele and the sequences of individual peptides were used to define binding motifs for each allele. Binding specificities varied only at the primary hydrophobic anchor residue, the differences being a preference for the aromatic amino acids Tyr and Phe in DRB1*1302 and a preference for Val in DRB1*1301. Synthetic analogues of the eluted peptides showed allele specificity in their binding to purified HLA-DR, and Ala-substituted peptides were used to identify the primary anchor residues for binding. The failure of some peptides eluted from DRB1*1302 (those that use aromatic amino acids as primary anchors) to bind to DRB1*1301 confirmed the different preferences for peptide anchor residues conferred by the Gly-->Val change at position 86. These data suggest a molecular basis for the differential associations of HLA-DRB1*1301 and DRB1*1302 with resistance to severe malaria and clearance of hepatitis B virus infection.

Relevance:

80.00%

Abstract:

Background: In protein sequence classification, identification of the sequence motifs or n-grams that can precisely discriminate between classes is a more interesting scientific question than the classification itself. A number of classification methods aim at accurate classification but fail to explain which sequence features indeed contribute to the accuracy. We hypothesize that sequences in lower denominations (n-grams) can be used to explore the sequence landscape and to identify class-specific motifs that discriminate between classes during classification. Discriminative n-grams are short peptide sequences that are highly frequent in one class but are either minimally present or absent in other classes. In this study, we present a new substitution-based scoring function for identifying discriminative n-grams that are highly specific to a class. Results: We present a scoring function based on discriminative n-grams that can effectively discriminate between classes. The scoring function, initially, harvests the entire set of 4- to 8-grams from the protein sequences of different classes in the dataset. Similar n-grams of the same size are combined to form new n-grams, where the similarity is defined by positive amino acid substitution scores in the BLOSUM62 matrix. Substitution has resulted in a large increase in the number of discriminatory n-grams harvested. Due to the unbalanced nature of the dataset, the frequencies of the n-grams are normalized using a dampening factor, which gives more weightage to the n-grams that appear in fewer classes and vice-versa. After the n-grams are normalized, the scoring function identifies discriminative 4- to 8-grams for each class that are frequent enough to be above a selection threshold. By mapping these discriminative n-grams back to the protein sequences, we obtained contiguous n-grams that represent short class-specific motifs in protein sequences. 
Our method fared well compared to an existing motif finding method known as Wordspy. We have validated our enriched set of class-specific motifs against the functionally important motifs obtained from the NLSdb, Prosite and ELM databases. We demonstrate that this method is generic and can thus be widely applied to detect class-specific motifs in many protein sequence classification tasks. Conclusion: The proposed scoring function and methodology are able to identify class-specific motifs using discriminative n-grams derived from the protein sequences. The use of amino acid substitution scores for similarity detection and of the dampening factor to normalize the unbalanced datasets has a significant effect on the performance of the scoring function. Our multipronged validation tests demonstrate that this method can detect class-specific motifs from a wide variety of protein sequence classes, with a potential application to detecting proteome-specific motifs of different organisms.
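
The general idea of scoring discriminative n-grams can be sketched as follows. The code harvests n-grams per class, weights each by the fraction of sequences in the class that contain it, and multiplies by a dampening term that favors n-grams seen in fewer classes. The dampening formula, the threshold, and all names are assumptions made for illustration; the abstract does not give the actual scoring function, and the BLOSUM62-based merging of similar n-grams is omitted entirely.

```python
import math
from collections import defaultdict


def ngrams(seq, n):
    """Return the set of distinct n-grams in a sequence."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def discriminative_ngrams(classes, n=4, threshold=1.0):
    """classes maps a label to a list of sequences.

    Returns a dict mapping each label to {ngram: score} for n-grams
    whose score reaches the threshold.
    """
    # Fraction of sequences in each class that contain each n-gram.
    freq = {}
    for label, seqs in classes.items():
        counts = defaultdict(int)
        for seq in seqs:
            for gram in ngrams(seq, n):
                counts[gram] += 1
        freq[label] = {g: c / len(seqs) for g, c in counts.items()}

    # Number of classes in which each n-gram appears at least once.
    class_count = defaultdict(int)
    for table in freq.values():
        for gram in table:
            class_count[gram] += 1

    n_classes = len(classes)
    result = {}
    for label, table in freq.items():
        scored = {}
        for gram, f in table.items():
            # Hypothetical dampening: larger when fewer classes share the n-gram.
            damp = math.log(1 + n_classes / class_count[gram])
            score = f * damp
            if score >= threshold:
                scored[gram] = score
        result[label] = scored
    return result


toy = {
    "nuclear": ["MKKRKAAA", "GGKKRKLL"],
    "cytosolic": ["MAAALLGG", "GGAAALTT"],
}
hits = discriminative_ngrams(toy, n=4, threshold=1.0)
```

In this toy input, only the n-gram present in every sequence of a class and absent from the other class clears the threshold.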

Relevance:

70.00%

Abstract:

The identification of sequence (amino acid or nucleotide) motifs in a particular order in biological sequences has proved to be of interest. This paper describes a computing server, SSMBS, which can locate and display the occurrences of user-defined biologically important sequence motifs (a maximum of five) present in a specific order in protein and nucleotide sequences. While the server can efficiently locate motifs specified using regular expressions, it can also find occurrences of long and complex motifs. The computation is carried out by an algorithm developed using the concepts of quantifiers in regular expressions. The web server is available to users around the clock at http://dicsoft1.physics.iisc.ernet.in/ssmbs/.

Relevance:

70.00%

Abstract:

Sequence motifs occurring in a particular order in proteins or DNA have been proved to be of biological interest. In this paper, a new method to locate the occurrences of up to five user-defined motifs in a specified order in large proteins and in nucleotide sequence databases is proposed. It has been designed using the concept of quantifiers in regular expressions and linked lists for data storage. The application of this method includes the extraction of relevant consensus regions from biological sequences. This might be useful in clustering of protein families as well as to study the correlation between positions of motifs and their functional sites in DNA sequences.
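
A minimal sketch of locating motifs in a specified order with regular-expression quantifiers is given below. It joins up to five user-supplied patterns with a lazy gap and reports the span of each motif in every match. The function name and the `max_gap` parameter are hypothetical, motifs are assumed to contain no capturing groups of their own, and the linked-list storage mentioned in the abstract is not reproduced. Only non-overlapping matches are reported, because that is how `re.finditer` scans.

```python
import re


def find_ordered_motifs(sequence, motifs, max_gap=None):
    """Find occurrences of up to five regex motifs appearing in the given order.

    Returns a list of matches; each match is a list of (start, end) spans,
    one per motif, using 0-based half-open coordinates.
    """
    if not 1 <= len(motifs) <= 5:
        raise ValueError("expected between one and five motifs")
    # Lazy quantifier so each motif binds to its nearest downstream partner.
    gap = ".*?" if max_gap is None else ".{0,%d}?" % max_gap
    pattern = gap.join("(%s)" % m for m in motifs)
    hits = []
    for match in re.finditer(pattern, sequence):
        hits.append([match.span(i + 1) for i in range(len(motifs))])
    return hits


protein = "AAGDDXXXNKXDYYYGDDQQNKAD"
ordered = find_ordered_motifs(protein, ["GDD", "NK.D"])
```

Setting `max_gap` restricts how far apart consecutive motifs may be, which is useful when the spacing between motifs is itself part of the signature.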

Relevance:

70.00%

Abstract:

The discovery of GH (Glycoside Hydrolase) 19 chitinases in Streptomyces sp. raises the possibility of the presence of these proteins in other bacterial species, since they were initially thought to be confined to higher plants. The present study mainly concentrates on the phylogenetic distribution and homology conservation in GH19 family chitinases. Extensive database searches are performed to identify the presence of GH19 family chitinases in the three major superkingdoms of life. Multiple sequence alignment of all the identified GH19 chitinase family members resulted in the identification of globally conserved residues. We further identified conserved sequence motifs across the major subgroups within the family. Estimation of evolutionary distance between the various bacterial and plant chitinases is carried out to better understand the pattern of evolution. Our study also supports the horizontal gene transfer theory, which states that GH19 chitinase genes are transferred from higher plants to bacteria. Further, the present study sheds light on the phylogenetic distribution and identifies unique sequence signatures that define the GH19 chitinase family of proteins. The identified motifs could be used as markers to delineate uncharacterized GH19 family chitinases. The estimation of evolutionary distance between chitinases identified in plants and bacteria shows that those of flowering plants are more closely related to chitinases in actinobacteria than to those identified in purple bacteria. We propose a model to elucidate the natural history of GH19 family chitinases.

Relevance:

70.00%

Abstract:

Immune evasion by Plasmodium falciparum is favored by extensive allelic diversity of surface antigens. Some of them, most notably the vaccine-candidate merozoite surface protein (MSP)-1, exhibit a poorly understood pattern of allelic dimorphism, in which all observed alleles group into two highly diverged allelic families with few or no inter-family recombinants. Here we describe contrasting levels and patterns of sequence diversity in genes encoding three MSP-1-associated surface antigens of P. falciparum, ranging from an ancient allelic dimorphism in the Msp-6 gene to a near lack of allelic divergence in Msp-9 to a more classical multi-allele polymorphism in Msp-7. Other members of the Msp-7 gene family exhibit very little polymorphism in non-repetitive regions. A comparison of P. falciparum Msp-6 sequences to an orthologous sequence from P. reichenowi provided evidence for distinct evolutionary histories of the 5′ and 3′ segments of the dimorphic region in PfMsp-6, consistent with one dimorphic lineage having arisen from recombination between now-extinct ancestral alleles. In addition, we uncovered two surprising patterns of evolution in repetitive sequence. First, in Msp-6, large deletions are associated with (nearly) identical sequence motifs at their borders. Second, a comparison of PfMsp-9 with the P. reichenowi ortholog indicated retention of significant inter-unit diversity within an 18-base pair repeat within the coding region of P. falciparum, but homogenization in P. reichenowi. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

70.00%

Abstract:

Interleukins 2 and 15 (IL-2 and IL-15) are highly differentiated but related cytokines with overlapping, yet also distinct functions, and established benefits for medical drug use. The present study identified a gene for an ancient third IL-2/15 family member in reptiles and mammals, interleukin 15-like (IL-15L), which hitherto was only reported in fish. IL-15L genes with intact open reading frames (ORFs), evidence of transcription, and a recent past of purifying selection were found for cattle, horse, sheep, pig and rabbit. In human and mouse the IL-15L ORF is incapacitated. Although deduced IL-15L proteins share only ~21% overall amino acid identity with IL-15, they share many of the IL-15 residues important for binding to receptor chain IL-15Rα, and recombinant bovine IL-15L was indeed shown to interact with IL-15Rα. Comparison of sequence motifs indicates that capacity for binding IL-15Rα is an ancestral characteristic of the IL-2/15/15L family, in accordance with a recent study which showed that in fish both IL-2 and IL-15 can bind IL-15Rα. Evidence reveals that the species lineage leading to mammals started out with three similar cytokines IL-2, IL-15 and IL-15L, and that later in evolution (1) IL-2 and IL-2Rα receptor chain acquired a new and specific binding mode and (2) IL-15L was lost in several but not all groups of mammals. The present study forms an important step forward in understanding this potent family of cytokines, and may help to improve future strategies for their application in veterinary and human medicine.

Relevance:

70.00%

Abstract:

Local protein structure prediction efforts have consistently failed to exceed approximately 70% accuracy. We characterize the degeneracy of the mapping from local sequence to local structure responsible for this failure by investigating the extent to which similar sequence segments found in different proteins adopt similar three-dimensional structures. Sequence segments 3-15 residues in length from 154 different protein families are partitioned into neighborhoods containing segments with similar sequences using cluster analysis. The consistency of the sequence-to-structure mapping is assessed by comparing the local structures adopted by sequence segments in the same neighborhood in proteins of known structure. In the 154 families, 45% and 28% of the positions occur in neighborhoods in which one and two local structures predominate, respectively. The sequence patterns that characterize the neighborhoods in the first class probably include virtually all of the short sequence motifs in proteins that consistently occur in a particular local structure. These patterns, many of which occur in transitions between secondary structural elements, are an interesting combination of previously studied and novel motifs. The identification of sequence patterns that consistently occur in one or a small number of local structures in proteins should contribute to the prediction of protein structure from sequence.

Relevance:

70.00%

Abstract:

Recent experiments have exposed significant discrepancies between experimental data and predictive models for DNA structure. These results strongly suggest that DNA structural parameters incorporated in the models are not always sufficient to account for the influence of sequence context and of specific ion effects. In an attempt to evaluate these two effects, we have investigated repetitive DNA sequences with the sequence motif GAGAG.CTCTC located in different helical phasing arrangements with respect to poly(A) tracts and GGGCCC.GGGCCC sequence motifs. Methods used are ligase-mediated cyclization and gel mobility experiments along with DNase I cutting and chemical probe studies. The results provide new evidence for curvature in poly(A) tracts. They also show that the sequence context in which bending and flexible sequence elements are found is an important aspect of sequence-dependent DNA conformation. Although dinucleotide models generally have good predictive power, this work demonstrates that in some instances sequence elements larger than the dinucleotide must be taken into account, and hence it provides a starting point for the appropriate modification and refinement of existing structural models for DNA.

Relevance:

70.00%

Abstract:

The planctomycetes are a phylum of bacteria that have a unique cell compartmentalisation and yeast-like budding cell division and peptidoglycan-less proteinaceous cell walls. We wished to further our understanding of these unique organisms at the molecular level by searching for conserved amino acid sequence motifs and domains in the proteins encoded by Rhodopirellula baltica. Using BLAST and single-linkage clustering, we have discovered several new protein domains and sequence motifs in this planctomycete. R. baltica has multiple members of the newly discovered GEFGR protein family and the ASPIC C-terminal domain family, whilst most other organisms for which whole genome sequence is available have no more than one. Many of the domains and motifs appear to be restricted to the planctomycetes. It is possible that these protein domains and motifs may have been lost or replaced in other phyla, or they may have undergone multiple duplication events in the planctomycete lineage. One of the novel motifs probably represents a novel N-terminal export signal peptide. With their unique cell biology, it may be that the planctomycete cell compartmentalisation plan in particular needs special membrane transport mechanisms. The discovery of these new domains and motifs, many of which are associated with secretion and cell-surface functions, will help to stimulate experimental work and thus enhance further understanding of this fascinating group of organisms. (C) 2004 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.

Relevance:

60.00%

Abstract:

Defining the precise promoter DNA sequence motifs where nuclear receptors and other transcription factors bind is an essential prerequisite for understanding how these proteins modulate the expression of their specific target genes. The purpose of this chapter is to provide the reader with a detailed guide with respect to the materials and the key methods required to perform this type of DNA-binding analysis. Irrespective of whether starting with purified DNA-binding proteins or somewhat crude cellular extracts, the tried-and-true procedures described here will enable one to accurately assess the capacity of specific proteins to bind to DNA as well as to determine the exact sequences and DNA contact nucleotides involved. For illustrative purposes, we primarily have used the interaction of the androgen receptor with the rat probasin proximal promoter as our model system.

Relevance:

60.00%

Abstract:

The complete nucleotide sequence of genome segment S4 of rice ragged stunt oryzavirus (RRSV, Thai-isolate) was determined. The 3823 bp sequence contains two large open reading frames (ORFs). ORF1, spanning nucleotides 12 to 3776, is capable of encoding a protein of M(r) 141,380 (P4a). The P4a amino acid sequence predicted from the nucleotide sequence contains sequence motifs conserved in RNA-dependent RNA polymerases (RDRPs). When compared for evolutionary relationships with RDRPs of other reoviruses using the amino acid sequences around the conserved GDD motif, P4a was shown to be more related to Nilaparvata lugens reovirus and reovirus serotype 3 than to rice dwarf phytoreovirus, bovine rotavirus or bluetongue virus. ORF2, spanning nucleotides 491 to 1468, is out of frame with ORF1 and is capable of encoding a protein of 36,920 (P4b). Coupled in vitro transcription-translation from cloned ORF2 in wheat germ extract confirmed the existence of ORF2, but in vivo production and possible function of P4b are yet to be determined.
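
The coordinates quoted above can be checked with simple arithmetic. The sketch below assumes the positions are 1-based and inclusive, which the abstract does not state explicitly, and the function name is hypothetical. It reports each ORF's length in nucleotides and codons, and the reading frame of its first nucleotide.

```python
def orf_summary(start, end):
    """Summarize an ORF given 1-based inclusive nucleotide coordinates."""
    length = end - start + 1
    return {
        "nt": length,               # length in nucleotides
        "codons": length // 3,      # whole codons spanned
        "remainder": length % 3,    # 0 when the span is a whole number of codons
        "frame": (start - 1) % 3,   # frame 0, 1 or 2 of the first nucleotide
    }


orf1 = orf_summary(12, 3776)
orf2 = orf_summary(491, 1468)
same_frame = orf1["frame"] == orf2["frame"]
```

Under these assumptions, both spans are whole numbers of codons (3765 and 978 nucleotides), and the two start positions fall in different frames, in line with the statement that ORF2 is out of frame with ORF1.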