166 results for Sequence Motifs
at Indian Institute of Science - Bangalore - India
Abstract:
The identification of sequence (amino acid or nucleotide) motifs occurring in a particular order in biological sequences has proved to be of interest. This paper describes a computing server, SSMBS, which can locate and display the occurrences of user-defined biologically important sequence motifs (a maximum of five) present in a specific order in protein and nucleotide sequences. While the server can efficiently locate motifs specified using regular expressions, it can also find occurrences of long and complex motifs. The computation is carried out by an algorithm developed using the concepts of quantifiers in regular expressions. The web server is available to users around the clock at http://dicsoft1.physics.iisc.ernet.in/ssmbs/.
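The ordered-motif search described in this abstract can be illustrated with a minimal Python sketch. It is not the SSMBS algorithm itself, which the abstract does not spell out; it only shows how lazy quantifiers can chain up to five user-defined motifs in a fixed order. The function name `find_ordered_motifs`, the `max_gap` parameter, and the toy sequences are all assumptions made for illustration.

```python
import re

def find_ordered_motifs(sequence, motifs, max_gap=None):
    """Return the (start, end) span of each motif if all occur in the given order.

    Hypothetical helper: motifs are regular expressions, joined by a lazy
    quantifier so that each motif is matched as early as possible after
    the previous one. Returns None when no ordered occurrence exists.
    """
    if not 1 <= len(motifs) <= 5:
        raise ValueError("expected between one and five motifs")
    # Lazy gap between consecutive motifs; optionally bounded in length.
    gap = ".*?" if max_gap is None else ".{0,%d}?" % max_gap
    # Named groups keep the lookup stable even if a motif has its own groups.
    pattern = gap.join("(?P<m%d>%s)" % (i, m) for i, m in enumerate(motifs))
    match = re.search(pattern, sequence)
    if match is None:
        return None
    return [match.span("m%d" % i) for i in range(len(motifs))]

# Toy nucleotide sequence (made up for this example).
print(find_ordered_motifs("AAGTCCATGGGTTAA", ["GT", "ATG", "TAA"]))
# -> [(2, 4), (6, 9), (12, 15)]
```

The sketch reports only the leftmost ordered occurrence; enumerating every occurrence, as a display server would need to, requires repeating the search from later start positions.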
Abstract:
Sequence motifs occurring in a particular order in proteins or DNA have proved to be of biological interest. In this paper, a new method is proposed to locate the occurrences of up to five user-defined motifs in a specified order in large proteins and in nucleotide sequence databases. It has been designed using the concept of quantifiers in regular expressions, with linked lists for data storage. Applications of this method include the extraction of relevant consensus regions from biological sequences. This might be useful for clustering protein families as well as for studying the correlation between the positions of motifs and their functional sites in DNA sequences.
Abstract:
The discovery of GH (Glycoside Hydrolase) 19 chitinases in Streptomyces sp. raises the possibility of the presence of these proteins in other bacterial species, since they were initially thought to be confined to higher plants. The present study mainly concentrates on the phylogenetic distribution and homology conservation in GH19 family chitinases. Extensive database searches are performed to identify the presence of GH19 family chitinases in the three major superkingdoms of life. Multiple sequence alignment of all the identified GH19 chitinase family members resulted in the identification of globally conserved residues. We further identified conserved sequence motifs across the major subgroups within the family. Estimation of the evolutionary distance between the various bacterial and plant chitinases is carried out to better understand the pattern of evolution. Our study also supports the horizontal gene transfer theory, which states that GH19 chitinase genes are transferred from higher plants to bacteria. Further, the present study sheds light on the phylogenetic distribution and identifies unique sequence signatures that define the GH19 chitinase family of proteins. The identified motifs could be used as markers to delineate uncharacterized GH19 family chitinases. The estimation of evolutionary distance between chitinases identified in plants and bacteria shows that the chitinases of flowering plants are more closely related to those in actinobacteria than to those identified in purple bacteria. We propose a model to elucidate the natural history of GH19 family chitinases.
Abstract:
Background: The hot dog fold has been found in more than sixty proteins since the first report of its existence about a decade ago. The fold appears to have a strong association with fatty acid biosynthesis, its regulation and metabolism, as the proteins with this fold are predominantly coenzyme A-binding enzymes with a variety of substrates located at their active sites. Results: We have analyzed the structural features and sequences of proteins having the hot dog fold. This study reveals that though the basic architecture of the fold is well conserved in these proteins, significant differences exist in their sequence, nature of substrate and oligomerization. Segments with certain conserved sequence motifs seem to play crucial structural and functional roles in various classes of these proteins. Conclusion: The analysis led to predictions regarding the functional classification and identification of possible catalytic residues of a number of hot dog fold-containing hypothetical proteins whose structures were determined in high throughput structural genomics projects.
Abstract:
The existing internet computing resource, Biomolecules Segment Display Device (BSDD), has been updated with several additional useful features. An advanced option is provided to superpose the structural motifs obtained from a search on the Protein Data Bank (PDB) in order to see if the three-dimensional structures adopted by identical or similar sequence motifs are the same. Furthermore, options to display structural aspects like inter- and intra-molecular interactions, ion-pairs, disulphide bonds, etc. have been provided. The updated resource is interfaced with an up-to-date copy of the public domain PDB as well as with 25% and 90% non-redundant sets of protein structures. Further, users can upload three-dimensional atomic coordinates (PDB format) from the client machine. A free molecular graphics program, JMol, is interfaced with it to display the three-dimensional structures.
Abstract:
The complete genome of the baker's yeast S. cerevisiae was analyzed for the presence of polypurine/polypyrimidine (poly[pu/py]) repeats and their occurrences were classified on the basis of their location within and outside open reading frames (ORFs). The analysis reveals that such sequence motifs are present abundantly both in coding as well as noncoding regions. Clear positional preferences are seen when these tracts occur in noncoding regions. These motifs appear to occur predominantly at a unit nucleosomal length both upstream and downstream of ORFs. Moreover, there is a biased distribution of polypurines in the coding strands when these motifs occur within open reading frames. The significance of the biased distribution is discussed with reference to the occurrence of these motifs in other known mRNA sequences and expressed sequence tags. A model for cis regulation of gene expression is proposed based on the ability of these motifs to form an intermolecular triple helix structure when present within the coding region and/or to modulate nucleosome positioning via enhanced histone affinity when present outside coding regions.
Abstract:
In mealybugs, chromatin condensation is related to both genomic imprinting and sex determination. The paternal chromosomal complement is condensed and genetically inactive in sons but not in daughters. During a study of chromatin organization in Planococcus lilacinus, digestion with micrococcal nuclease showed that 3% to 5% of the male genome is resistant to the enzyme. This Nuclease Resistant Chromatin (NRC) apparently has a nucleosomal organization. Southern hybridization of genomic DNA suggests that NRC sequences are present in both sexes and occur throughout the genome. Cloned NRC DNA is A+T-rich with stretches of adenines similar to those present in mouse alpha-satellite sequences. NRC DNA also contains sequence motifs that are typically associated with the nuclear matrix. Salt-fractionation experiments showed that NRC sequences are matrix associated. These observations are discussed in relation to the unusual cytological features of mealybug chromosomes, including the possible existence of multiple centres of inactivation.
Abstract:
Sixteen million nucleotides of genome sequence from various organisms have been analysed to detect and study the extent of occurrence of simple repetitive sequences. Two sequence motifs, (TG/CA)n and (CT/AG)n, capable of adopting unusual DNA structures (the left-handed Z-conformation and the triple-helical conformation, respectively), are found to be abundant in rodent and human genomes, but almost completely absent in bacterial genomes. (TG/CA)n and (CT/AG)n sequences are present mostly in the introns or 5'/3' flanking regions of genes. The presence of such repeat motifs in the genomic sequences of higher eukaryotes has been correlated with their possible functional significance in nucleosome organization, recombination and gene expression.
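As a rough illustration of how simple repeats such as (TG/CA)n can be detected, the following Python sketch scans one strand for runs of TG or CA (a CA run on the scanned strand corresponds to a TG run on the complementary strand). This is not the method used in the study. The function name, the `min_repeats` threshold of 4, and the toy sequence are assumptions chosen for the example.

```python
import re

def find_dinucleotide_repeats(seq, units=("TG", "CA"), min_repeats=4):
    """Return (unit, start, repeat_count) for each run of a repeated unit.

    Hypothetical helper: a run counts only if the unit is repeated at
    least min_repeats times in tandem. Results are sorted by position.
    """
    seq = seq.upper()
    hits = []
    for unit in units:
        # Non-capturing group with a lower-bounded quantifier: (?:TG){4,}
        pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
        for m in pattern.finditer(seq):
            hits.append((unit, m.start(), (m.end() - m.start()) // len(unit)))
    return sorted(hits, key=lambda hit: hit[1])

# Toy sequence (made up): a (TG)4 run followed by a (CA)5 run.
print(find_dinucleotide_repeats("AATGTGTGTGCCCACACACACAGG"))
# -> [('TG', 2, 4), ('CA', 12, 5)]
```

Passing `units=("CT", "AG")` scans for the second repeat family mentioned in the abstract in the same way.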
Abstract:
Unlike in most eukaryotes, a kinetochore is fully assembled early in the cell cycle in the budding yeasts Saccharomyces cerevisiae and Candida albicans. These kinetochores are clustered together throughout the cell cycle. Kinetochore assembly on point centromeres of S. cerevisiae is considered to be a step-wise process that initiates with binding of inner kinetochore proteins on specific centromere DNA sequence motifs. In contrast, kinetochore formation in C. albicans, which carries regional centromeres 3-5 kb long, has been shown to be a sequence-independent but epigenetically regulated event. In this study, we investigated the process of kinetochore assembly/disassembly in C. albicans. Localization dependence of various kinetochore proteins studied by confocal microscopy and chromatin immunoprecipitation (ChIP) assays revealed that assembly of a kinetochore is a highly coordinated and interdependent event. Partial depletion of an essential kinetochore protein affects integrity of the kinetochore cluster. Further protein depletion results in complete collapse of the kinetochore architecture. In addition, GFP-tagged kinetochore proteins confirmed similar time-dependent disintegration upon gradual depletion of an outer kinetochore protein (Dam1). The loss of integrity of a kinetochore formed on centromeric chromatin was demonstrated by reduced binding of CENP-A and CENP-C at the centromeres. Most strikingly, Western blot analysis revealed that gradual depletion of any of these essential kinetochore proteins results in concomitant reduction in cellular protein levels of CENP-A. We further demonstrated that centromere-bound CENP-A is protected from proteasome-mediated degradation. Based on these results, we propose that a coordinated interdependent circuitry of several evolutionarily conserved essential kinetochore proteins ensures integrity of a kinetochore formed on the foundation of CENP-A-containing centromeric chromatin.
Abstract:
Large numbers of Plasmodium genes have been predicted to have introns. However, little information exists on the splicing mechanisms in this organism. Here, we describe the DExD/DExH-box containing Pre-mRNA processing proteins (Prps), PfPrp2p, PfPrp5p, PfPrp16p, PfPrp22p, PfPrp28p, PfPrp43p and PfBrr2p, present in the Plasmodium falciparum genome and characterize the role of one of these factors, PfPrp16p. It is a member of the DEAH-box protein family with nine collinear sequence motifs, a characteristic of helicase proteins. Experiments with the recombinantly expressed and purified PfPrp16 helicase domain revealed binding to RNA, hydrolysis of ATP as well as catalytic helicase activities. Expression of the helicase domain with the C-terminal helicase-associated domain (HA2) reduced these activities considerably, indicating that the helicase-associated domain may regulate PfPrp16 function. Localization studies with the PfPrp16 GFP transgenic lines suggested a role of its N-terminal domain (1-80 amino acids) in nuclear targeting. Immunodepletion of PfPrp16p from nuclear extracts of parasite cultures blocked the second catalytic step of an in vitro constituted splicing reaction, suggesting a role for PfPrp16p in splicing catalysis. Further, we show by complementation assay in yeast that a chimeric yeast-Plasmodium Prp16 protein, not the full-length PfPrp16, can rescue the yeast prp16 temperature-sensitive mutant. These results suggest that although the role of Prp16p in catalytic step II is highly conserved among Plasmodium, human and yeast, subtle differences exist with regard to its associated factors or its assembly with spliceosomes. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Gene expression is the most fundamental biological process, which is essential for phenotypic variation. It is regulated by various external (environment and evolution) and internal (genetic) factors. The level of gene expression depends on promoter architecture, along with other external factors. The presence of sequence motifs, such as transcription factor binding sites (TFBSs) and the TATA-box, or DNA methylation in vertebrates has been implicated in the regulation of expression of some genes in eukaryotes, but a large number of genes lack these sequences. On the other hand, several experimental and computational studies have shown that promoter sequences possess some special structural properties, such as low stability, less bendability, low nucleosome occupancy, and more curvature, which are prevalent across all organisms. These structural features may play a role in transcription initiation and regulation of gene expression. We have studied the relationship between the structural features of promoter DNA, promoter directionality and gene expression variability in S. cerevisiae. This relationship has been analyzed for seven different measures of gene expression variability, along with two different regulatory effect measures. We find that a few of the variability measures of gene expression are linked to DNA structural properties, nucleosome occupancy, TATA-box presence, and bidirectionality of promoter regions. Interestingly, gene responsiveness is most intimately correlated with DNA structural features and promoter architecture.
Abstract:
EcoP15I DNA methyltransferase recognizes the sequence 5'-CAGCAG-3' and transfers a methyl group to N-6 of the second adenine residue in the recognition sequence. All N-6 adenine methyltransferases contain two highly conserved sequences, FxGxG (motif I), postulated to form part of the S-adenosyl-L-methionine binding site and (D/N/S)PP(Y/F) (motif IV) involved in catalysis. We have altered the second glycine residue in motif I to arginine and serine, and substituted tyrosine in motif IV with tryptophan in EcoP15I DNA methyltransferase, using site-directed mutagenesis. The mutant enzymes were overexpressed, purified and characterized by biochemical methods. The mutations in motif I completely abolished AdoMet binding but left target DNA recognition unaltered. Although the mutation in motif IV resulted in loss of enzyme activity, we observed enhanced crosslinking of S-adenosyl-L-methionine and DNA. This implies that DNA and AdoMet binding sites are close to motif IV. Taken together, these results reinforce the importance of motif I in AdoMet binding and motif IV in catalysis. Additionally, limited proteolysis and UV crosslinking experiments with EcoP15I DNA methyltransferase imply that DNA binds in a cleft formed by two domains in the protein. Methylation protection analysis provides evidence for the fact that EcoP15I DNA MTase makes contacts in the major groove of its substrate DNA. Interestingly, hypermethylation of the guanine residue next to the target adenine residue indicates that the protein probably flips out the target adenine residue. (C) 1996 Academic Press Limited
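The two conserved motifs named in this abstract, FxGxG (motif I) and (D/N/S)PP(Y/F) (motif IV), translate directly into regular expressions, where x is any residue and a slash-separated set becomes a character class. The Python sketch below shows that translation. The `scan_motifs` helper and the peptide used in the example are made up for illustration and are not taken from the EcoP15I sequence.

```python
import re

# Motif definitions as written in the abstract, expressed as regexes.
MOTIF_I = re.compile(r"F.G.G")         # FxGxG, x = any residue
MOTIF_IV = re.compile(r"[DNS]PP[YF]")  # (D/N/S)PP(Y/F)

def scan_motifs(protein):
    """Return 0-based start positions of each motif (non-overlapping matches)."""
    return {
        "motif_I": [m.start() for m in MOTIF_I.finditer(protein)],
        "motif_IV": [m.start() for m in MOTIF_IV.finditer(protein)],
    }

# Hypothetical peptide, invented for this example only.
print(scan_motifs("MAFAGSGKTLDPPYAA"))
# -> {'motif_I': [2], 'motif_IV': [10]}
```

Such short patterns match by chance in many unrelated proteins, so a hit is only a candidate site, not evidence of methyltransferase function.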
Abstract:
Sialic acids form a large family of 9-carbon monosaccharides and are integral components of glycoconjugates. They are known to bind to a wide range of receptors belonging to diverse sequence families and fold classes and are key mediators in a plethora of cellular processes. Thus, it is of great interest to understand the features that give rise to such a recognition capability. Structural analyses using a non-redundant data set of known sialic acid binding proteins were carried out, including exhaustive binding site comparisons and site alignments using in-house algorithms, followed by clustering and tree computation, leading to the derivation of sialic acid recognition principles. Although the proteins in the data set belong to several sequence and structure families, their binding sites could be grouped into only six types. Structural comparison of the binding sites indicates that all sites contain one or more different combinations of key structural features over a common scaffold. The six binding site types thus serve as structural motifs for recognizing sialic acid. Scanning the motifs against a non-redundant set of binding sites from the PDB indicated the motifs to be specific for sialic acid recognition. Knowledge of determinants obtained from this study will be useful for detecting function in unknown proteins. As an example analysis, a genome-wide scan for the motifs in structures of the Mycobacterium tuberculosis proteome identified 17 hits that contain combinations of the features, suggesting a possible function of sialic acid binding by these proteins.
Abstract:
Regulatory information for transcription initiation is present in a stretch of genomic DNA, called the promoter region, that is located upstream of the transcription start site (TSS) of the gene. The promoter region interacts with different transcription factors and RNA polymerase to initiate transcription and contains short stretches of transcription factor binding sites (TFBSs), as well as structurally unique elements. Recent experimental and computational analyses of promoter sequences show that they often have non-B-DNA structural motifs, as well as some conserved structural properties, such as stability, bendability, nucleosome positioning preference and curvature, across a class of organisms. Here, we briefly describe these structural features, the differences observed in various organisms and their possible role in regulation of gene expression.
Abstract:
α-helices are amongst the most common secondary structural elements seen in membrane proteins and are packed in the form of helix bundles. These α-helices encounter varying external environments (hydrophobic, hydrophilic) that may influence the sequence preferences at their N and C-termini. The role of the external environment in stabilization of the helix termini in membrane proteins is still unknown. Here we analyze α-helices in a high-resolution dataset of integral α-helical membrane proteins and establish that their sequence and conformational preferences differ from those in globular proteins. We specifically examine these preferences at the N and C-termini in helices initiating/terminating inside the membrane core as well as in linkers connecting these transmembrane helices. We find that the sequence preferences and structural motifs at capping (Ncap and Ccap) and near-helical (N' and C') positions are influenced by a combination of features including the membrane environment and the innate helix initiation and termination property of residues forming structural motifs. We also find that a large number of helix termini which do not form any particular capping motif are stabilized by formation of hydrogen bonds and hydrophobic interactions contributed from the neighboring helices in the membrane protein. We further validate the sequence preferences obtained from our analysis with data from an ultradeep sequencing study that identifies evolutionarily conserved amino acids in the rat neurotensin receptor. The results from our analysis provide insights for the secondary structure prediction, modeling and design of membrane proteins. Proteins 2014; 82:3420-3436. (c) 2014 Wiley Periodicals, Inc.