Biblioteca Digital

208 results for Genomic sequence database

Identification of mycobacterial lectins from genomic data

Relevance: 20.00%

Abstract:

Sixty-four sequences containing lectin domains with homologs of known three-dimensional structure were identified through a search of mycobacterial genomes. They appear to belong to the β-prism II, the C-type, the Microcystis viridis (MV), and the β-trefoil lectin folds. The first three always occur in conjunction with the LysM, the PI-PLC, and the β-grasp domains, respectively, while mycobacterial β-trefoil lectins are unaccompanied by any other domain. Thirty heparin binding hemagglutinins (HBHA), already annotated, have also been included in the study although they have no homologs of known three-dimensional structure. The biological role of HBHA has been well characterized. A comparison between the sequences of the lectin from pathogenic and nonpathogenic mycobacteria provides insights into the carbohydrate binding region of the molecule, but the structure of the molecule is yet to be determined. A reasonable picture of the structural features of other mycobacterial proteins containing one or the other of the four lectin domains can be gleaned through the examination of homologous proteins, although the structure of none of them is available. Their biological role is also yet to be elucidated. The work presented here is among the first steps towards exploring the almost unexplored area of the structural biology of mycobacterial lectins. Proteins 2013. (c) 2012 Wiley Periodicals, Inc.

Tethering preferences of domain families co-occurring in multi-domain proteins

Relevance: 20.00%

Abstract:

Genomic data of several organisms have revealed the presence of a vast repertoire of multi-domain proteins. The role played by individual domains in a multi-domain protein has a profound influence on the overall function of the protein. In the present analysis an attempt has been made to better understand the tethering preferences of domain families that occur in multi-domain proteins. The analysis has been carried out on an exhaustive dataset of 2 961 898 protein sequences from 930 organisms, of which 741 274 proteins contain at least two domain families. For every domain family, the number of other domain families with which it co-occurs within a protein in this dataset has been enumerated and is referred to as the tethering number of the domain family. It was found that, in the general dataset, the AAA ATPase family and the family of Ser/Thr kinases have the highest tethering numbers, 450 and 444 respectively. Further analysis reveals a significant correlation between the number of members in a family and its tethering number. A positive correlation was also observed between the extent of sequence and functional diversity within a family and its tethering number. Domain families that are present ubiquitously in diverse organisms tend to have large tethering numbers, while organism/kingdom-specific families have low tethering numbers. Thus, the analysis uncovers how domain families recombine and evolve to give rise to multi-domain proteins.
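
The tethering number defined in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (a mapping from a protein identifier to its list of domain-family names) and the function name `tethering_numbers` are assumptions made for illustration only.

```python
from collections import defaultdict


def tethering_numbers(proteins):
    """Count, for each domain family, the distinct other families it co-occurs with.

    `proteins` maps a protein identifier to the list of domain-family names
    annotated on that protein (a hypothetical input layout).
    """
    partners = defaultdict(set)
    for families in proteins.values():
        unique = set(families)  # repeats of the same family within a protein count once
        for family in unique:
            # Every other family on the same protein is a tethering partner.
            partners[family].update(unique - {family})
    return {family: len(others) for family, others in partners.items()}
```

A family that only ever appears alone gets a tethering number of 0, and a family repeated within one protein is not counted as its own partner.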

DoSA: Database of Structural Alignments

Relevance: 20.00%

Abstract:

Protein structure alignment is a crucial step in protein structure-function analysis. Despite the advances in protein structure alignment algorithms, some of the local conformationally similar regions are mislabeled as structurally variable regions (SVRs). These regions are not well superimposed because of differences in their spatial orientations. The Database of Structural Alignments (DoSA) addresses this gap in identification of local structural similarities obscured in global protein structural alignments by realigning SVRs using an algorithm based on protein blocks. A set of protein blocks is a structural alphabet that abstracts protein structures into 16 unique local structural motifs. DoSA provides unique information about 159 780 conformationally similar and 56 140 conformationally dissimilar SVRs in 74 705 pairwise structural alignments of homologous proteins. The information provided on conformationally similar and dissimilar SVRs can be helpful to model loop regions. It is also conceivable that conformationally similar SVRs with conserved residues could potentially contribute toward functional integrity of homologues, and hence identifying such SVRs could be helpful in understanding the structural basis of protein function.

Mutation analysis of the SLC4A11 gene in Indian families with congenital hereditary endothelial dystrophy 2 and a review of the literature

Relevance: 20.00%

Abstract:

Purpose: Congenital hereditary endothelial dystrophy 2 (CHED2) is an autosomal recessive disorder caused by mutations in the solute carrier family 4, sodium borate transporter, member 11 (SLC4A11) gene. The purpose of this study was to identify the genetic cause of CHED2 in six Indian families and catalog all known mutations in the SLC4A11 gene. Methods: Peripheral blood samples were collected from individuals of the families with CHED2 and used in genomic DNA isolation. PCR primers were used to amplify the entire coding region including intron-exon junctions of SLC4A11. Amplicons were subsequently sequenced to identify the mutations. Results: DNA sequence analysis of the six families identified four novel (viz., p.Thr262Ile, p.Gly417Arg, p.Cys611Arg, and p.His724Asp) mutations and one known p.Arg869His homozygous mutation in the SLC4A11 gene. The mutation p.Gly417Arg was identified in two families. Conclusions: This study increases the mutation spectrum of the SLC4A11 gene. A review of the literature showed that the total number of mutations in the SLC4A11 gene described to date is 78. Most of the mutations are missense, followed by insertions-deletions. The present study will be helpful in genetic diagnosis of the families reported here.

The sequence and structure of snake gourd (Trichosanthes anguina) seed lectin, a three-chain nontoxic homologue of type II RIPs

Relevance: 20.00%

Abstract:

The sequence and structure of snake gourd seed lectin (SGSL), a nontoxic homologue of type II ribosome-inactivating proteins (RIPs), have been determined by mass spectrometry and X-ray crystallography, respectively. As in type II RIPs, the molecule consists of a lectin chain made up of two beta-trefoil domains. The catalytic chain, which is connected through a disulfide bridge to the lectin chain in type II RIPs, is cleaved into two in SGSL. However, the integrity of the three-dimensional structure of the catalytic component of the molecule is preserved. This is the first time that a three-chain RIP or RIP homologue has been observed. A thorough examination of the sequence and structure of the protein and of its interactions with the bound methyl-alpha-galactose indicates that the nontoxicity of SGSL results from a combination of changes in the catalytic and the carbohydrate-binding sites. Detailed analyses of the sequences of type II RIPs of known structure and their homologues with unknown structure provide valuable insights into the evolution of this class of proteins. They also indicate some variability in carbohydrate-binding sites, which appears to contribute to the different levels of toxicity exhibited by lectins from various sources.

RepEx: Repeat extractor for biological sequences

Relevance: 20.00%

Abstract:

Genomic sequences are far from random; they are made up of systematically ordered, information-rich patterns. These repeated sequence patterns have been widely used because of their fundamental importance in understanding genome function and organization. To this end, a comprehensive toolkit, RepEx, has been developed which extracts repeat (inverted, everted and mirror) patterns from the given genome sequence(s) without any constraints. The toolkit can also be used to fetch the inverted repeats present in protein sequence(s). Further, it is capable of extracting exact and degenerate repeats with user-defined spacer intervals. It is remarkably more precise and sensitive when compared to the existing tools. An example with comprehensive case studies and a performance evaluation of the proposed toolkit has been presented to authenticate its efficiency and accuracy. (C) 2013 Elsevier Inc. All rights reserved.
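
To make the repeat types concrete, here is a minimal Python sketch of an exact repeat search with a user-defined spacer range. It is not RepEx's algorithm. The function name, parameters, and the working definitions are assumptions for illustration: an inverted repeat is taken to be an arm followed downstream by its reverse complement, and a mirror repeat an arm followed by its plain reverse. Everted and degenerate repeats are not covered.

```python
# Complement table for DNA; characters outside ACGT (e.g. N) are left unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_repeats(seq, arm, min_spacer=0, max_spacer=10, kind="inverted"):
    """Find exact inverted or mirror repeats by brute force.

    Returns (left_start, right_start, spacer) tuples with 0-based positions,
    where each arm is `arm` characters long.
    """
    if kind not in ("inverted", "mirror"):
        raise ValueError("kind must be 'inverted' or 'mirror'")
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm + 1):
        left = seq[i:i + arm]
        # The right arm we are looking for depends on the repeat type.
        target = reverse_complement(left) if kind == "inverted" else left[::-1]
        for spacer in range(min_spacer, max_spacer + 1):
            j = i + arm + spacer
            if j + arm > len(seq):
                break  # the right arm would run past the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i, j, spacer))
    return hits
```

For instance, `find_repeats("GAATTC", arm=3)` reports the palindromic EcoRI site as a single inverted repeat with a spacer of 0. The brute-force scan is quadratic in the worst case, so it is suitable only for short sequences.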

Archeological and Historical Database on the Medieval Earthquakes of the Central Himalaya: Ambiguities and Inferences

Relevance: 20.00%

Common recognition principles across diverse sequence and structural families of sialic acid binding proteins

Relevance: 20.00%

Abstract:

Sialic acids form a large family of 9-carbon monosaccharides and are integral components of glycoconjugates. They are known to bind to a wide range of receptors belonging to diverse sequence families and fold classes and are key mediators in a plethora of cellular processes. Thus, it is of great interest to understand the features that give rise to such a recognition capability. Structural analyses were carried out on a non-redundant data set of known sialic acid binding proteins. These included exhaustive binding site comparisons and site alignments using in-house algorithms, followed by clustering and tree computation, and led to the derivation of sialic acid recognition principles. Although the proteins in the data set belong to several sequence and structure families, their binding sites could be grouped into only six types. Structural comparison of the binding sites indicates that all sites contain one or more different combinations of key structural features over a common scaffold. The six binding site types thus serve as structural motifs for recognizing sialic acid. Scanning the motifs against a non-redundant set of binding sites from PDB indicated the motifs to be specific for sialic acid recognition. Knowledge of determinants obtained from this study will be useful for detecting function in unknown proteins. As an example analysis, a genome-wide scan for the motifs in structures of the Mycobacterium tuberculosis proteome identified 17 hits that contain combinations of the features, suggesting a possible function of sialic acid binding by these proteins.

Phylogeography unplugged: comparative surveys in the genomic era

Relevance: 20.00%

Abstract:

In March 2012, the authors met at the National Evolutionary Synthesis Center (NESCent) in Durham, North Carolina, USA, to discuss approaches and cooperative ventures in Indo-Pacific phylogeography. The group emerged with a series of findings: (1) Marine population structure is complex, but single locus mtDNA studies continue to provide powerful first assessment of phylogeographic patterns. (2) These patterns gain greater significance/power when resolved in a diversity of taxa. New analytical tools are emerging to address these analyses with multi-taxon approaches. (3) Genome-wide analyses are warranted if selection is indicated by surveys of standard markers. Such indicators can include discordance between genetic loci, or between genetic loci and morphology. Phylogeographic information provides a valuable context for studies of selection and adaptation. (4) Phylogeographic inferences are greatly enhanced by an understanding of the biology and ecology of study organisms. (5) Thorough, range-wide sampling of taxa is the foundation for robust phylogeographic inference. (6) Congruent geographic and taxonomic sampling by the Indo-Pacific community of scientists would facilitate better comparative analyses. The group concluded that at this stage of technology and software development, judicious rather than wholesale application of genomics appears to be the most robust course for marine phylogeographic studies. Therefore, our group intends to affirm the value of traditional ("unplugged") approaches, such as those based on mtDNA sequencing and microsatellites, along with essential field studies, in an era with increasing emphasis on genomic approaches.

Specific Sequence of a Beta Turn in Human La Protein May Contribute to Species Specificity of Hepatitis C Virus

Relevance: 20.00%

Abstract:

Human La protein is known to be an essential host factor for translation and replication of hepatitis C virus (HCV) RNA. Previously, we have demonstrated that residues responsible for interaction of human La protein with the HCV internal ribosomal entry site (IRES) around the initiator AUG within stem-loop IV form a beta-turn in the RNA recognition motif (RRM) structure. In this study, sequence alignment and mutagenesis suggest that the HCV RNA-interacting beta-turn is conserved only in humans and chimpanzees, the species primarily known to be infected by HCV. A 7-mer peptide corresponding to the HCV RNA-interacting region of human La inhibits HCV translation, whereas another peptide corresponding to the mouse La sequence was unable to do so. Furthermore, IRES-mediated translation was found to be significantly high in the presence of recombinant human La protein in vitro in rabbit reticulocyte lysate. We observed enhanced replication with HCV subgenomic and full-length replicons upon overexpression of either human La protein or a chimeric mouse La protein harboring a human La beta-turn sequence in mouse cells. Taken together, our results raise the possibility of creating an immunocompetent HCV mouse model using human-specific cell entry factors and a humanized form of La protein.

PLIC: protein-ligand interaction clusters

Relevance: 20.00%

Abstract:

Most biological processes are governed through specific protein-ligand interactions. Discerning different components that contribute toward a favorable protein-ligand interaction could contribute significantly toward better understanding protein function, rationalizing drug design and obtaining design principles for protein engineering. The Protein Data Bank (PDB) currently hosts the structures of approximately 68 000 protein-ligand complexes. Although several databases exist that classify proteins according to sequence and structure, a mere handful of them annotate and classify protein-ligand interactions and provide information on different attributes of molecular recognition. In this study, an exhaustive comparison of all the biologically relevant ligand-binding sites (84 846 sites) has been conducted using PocketMatch: a rapid, parallel, in-house algorithm. PocketMatch quantifies the similarity between binding sites based on structural descriptors and residue attributes. A similarity network was constructed using binding sites whose PocketMatch scores exceeded a high similarity threshold (0.80). The binding site similarity network was clustered into discrete sets of similar sites using the Markov clustering (MCL) algorithm. Furthermore, various computational tools have been used to study different attributes of interactions within the individual clusters. The attributes can be roughly divided into (i) binding site characteristics including pocket shape, nature of residues and interaction profiles with different kinds of atomic probes, (ii) atomic contacts consisting of various types of polar, hydrophobic and aromatic contacts along with binding site water molecules that could play crucial roles in protein-ligand interactions and (iii) binding energetics involved in interactions derived from scoring functions developed for docking. For each ligand-binding site in each protein in the PDB, site similarity information, the clusters it belongs to and a description of site attributes are provided as a relational database, protein-ligand interaction clusters (PLIC).
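
The network construction and clustering steps described in this abstract can be sketched in plain Python. This is a toy illustration under stated assumptions, not the PLIC pipeline: the pairwise scores are supplied by the caller as a dictionary (PocketMatch itself is not invoked), edges above the threshold are treated as unweighted, and the name `cluster_sites`, the inflation value of 2.0 and the convergence settings are hypothetical choices. The loop follows the usual Markov clustering scheme of alternating expansion (matrix squaring) and inflation (element-wise powering followed by column normalisation).

```python
def _normalise_columns(m):
    """Scale each column of a square matrix so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def _matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cluster_sites(scores, n_sites, threshold=0.80, inflation=2.0,
                  max_iter=100, tol=1e-9):
    """Cluster sites 0..n_sites-1 from pairwise similarity scores.

    `scores` maps (site_a, site_b) pairs to a similarity score; only pairs
    scoring above `threshold` become (unweighted) edges of the network.
    """
    if n_sites == 0:
        return []
    # Adjacency matrix with self-loops, as is customary for MCL.
    m = [[1.0 if i == j else 0.0 for j in range(n_sites)] for i in range(n_sites)]
    for (a, b), score in scores.items():
        if a != b and score > threshold:
            m[a][b] = m[b][a] = 1.0
    m = _normalise_columns(m)
    for _ in range(max_iter):
        expanded = _matmul(m, m)  # expansion: two-step random walk
        inflated = _normalise_columns(
            [[value ** inflation for value in row] for row in expanded]
        )  # inflation: strengthen strong flows, weaken weak ones
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n_sites) for j in range(n_sites))
        m = inflated
        if delta < tol:
            break
    # Rows that retain flow are attractors; their nonzero columns form a cluster.
    clusters = set()
    for i in range(n_sites):
        members = frozenset(j for j in range(n_sites) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)
```

The dense pure-Python matrices make this cubic per iteration, so it only suits small examples; a network of tens of thousands of sites would call for a sparse implementation.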

Comparative Study of Protein Unfolding in Aqueous Urea and Dimethyl Sulfoxide Solutions: Surface Polarity, Solvent Specificity, and Sequence of Secondary Structure Melting

Relevance: 20.00%

Abstract:

Elucidation of possible pathways between folded (native) and unfolded states of a protein is a challenging task, as the intermediates are often hard to detect. Here, we alter the solvent environment in a controlled manner by choosing two different cosolvents of water, urea and dimethyl sulfoxide (DMSO), and study unfolding of four different proteins to understand the respective sequence of melting by computer simulation methods. We indeed find interesting differences in the sequence of melting of alpha helices and beta sheets in these two solvents. For example, in 8 M urea solution, beta-sheet parts of a protein are found to unfold preferentially, followed by the unfolding of alpha helices. In contrast, 8 M DMSO solution unfolds alpha helices first, followed by the separation of beta sheets for the majority of proteins. The sequence of unfolding events in four different alpha/beta proteins and also in chicken villin head piece (HP-36), in both urea and DMSO solutions, demonstrates that the unfolding pathways are determined jointly by relative exposure of polar and nonpolar residues of a protein and the mode of molecular action of a solvent on that protein.

Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC)

Relevance: 20.00%

Abstract:

USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have also been presently collected from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460 sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community. (C) 2014 Acoustical Society of America.

An extended Shine-Dalgarno sequence in mRNA functionally bypasses a vital defect in initiator tRNA

Relevance: 20.00%

Abstract:

Initiator tRNAs are special in their direct binding to the ribosomal P-site due to the hallmark occurrence of the three consecutive G-C base pairs (3GC pairs) in their anticodon stems. How the 3GC pairs function in this role has remained unsolved. We show that mutations in either the mRNA or 16S rRNA leading to extended interaction between the Shine-Dalgarno (SD) and anti-SD sequences compensate for the vital need of the 3GC pairs in tRNA(fMet) for its function in Escherichia coli. In vivo, the 3GC mutant tRNA(fMet) occurred less abundantly in 70S ribosomes but normally on 30S subunits. However, the extended SD:anti-SD interaction increased its occurrence in 70S ribosomes. We propose that the 3GC pairs play a critical role in tRNA(fMet) retention in the ribosome during the conformational changes that mark the transition of the 30S preinitiation complex into the elongation-competent 70S complex. Furthermore, treating cells with kasugamycin, decreasing ribosome recycling factor (RRF) activity or increasing initiation factor 2 (IF2) levels enhanced initiation with the 3GC mutant tRNA(fMet), suggesting that the 70S mode of initiation is less dependent on the 3GC pairs in tRNA(fMet).

An Unusual Ring-Contraction/Rearrangement Sequence for Making Functionalized Di- and Triquinanes

Relevance: 20.00%

Abstract:

A novel ring contraction/rearrangement sequence leading to functionalized 2,8-oxymethano-bridged di- and triquinane compounds is observed in the reaction of various substituted 1-methyl-4-isopropenyl-6-oxabicyclo[3.2.1]octan-8-ones with Lewis acids. The reaction is unprecedented for the synthesis of di- and triquinane frameworks.
