999 results for Backbone-cyclized Proteins Database
Abstract:
Circular proteins are a recently discovered phenomenon. They presumably evolved to confer advantages over ancestral linear proteins while maintaining the intrinsic biological functions of those proteins. In general, these advantages include a reduced sensitivity to proteolytic cleavage and enhanced stability. In one remarkable family of circular proteins, the cyclotides, the cyclic backbone is additionally braced by a knotted arrangement of disulfide bonds that confers additional stability and topological complexity upon the family. This article describes the discovery, structure, function and biosynthesis of the currently known circular proteins. The discovery of naturally occurring circular proteins in the past few years has been complemented by new chemical and biochemical methods to make synthetic circular proteins; these are also briefly described.
Abstract:
A proportion of melanoma-prone individuals in both familial and non-familial contexts has been shown to carry inactivating mutations in either CDKN2A or, rarely, CDK4. CDKN2A is a complex locus that encodes two unrelated proteins from alternately spliced transcripts that are read in different frames. The alpha transcript (exons 1alpha, 2, and 3) produces the p16INK4A cyclin-dependent kinase inhibitor, while the beta transcript (exons 1beta and 2) is translated as p14ARF, a stabilizing factor of p53 levels through binding to MDM2. Mutations in exon 2 can impair both polypeptides, and insertions and deletions in exons 1alpha, 1beta, and 2 can theoretically generate p16INK4A-p14ARF fusion proteins. No online database currently takes into account all the consequences of these genotypes, a situation compounded by some problematic previous annotations of CDKN2A-related sequences and descriptions of their mutations. As an initiative of the international Melanoma Genetics Consortium, we have therefore established a database of germline variants observed in all loci implicated in familial melanoma susceptibility. Such a comprehensive, publicly accessible database is an essential foundation for research on melanoma susceptibility and its clinical application. Our database serves two types of data as defined by HUGO. The core dataset includes the nucleotide variants on the genomic and transcript levels, amino acid variants, and citation. The ancillary dataset includes keyword description of events at the transcription and translation levels and epidemiological data. The application that handles users' queries was designed in the model-view-controller architecture and was implemented in Java. The object-relational database schema was deduced using functional dependency analysis. We hereby present our first functional prototype of eMelanoBase. The service is accessible via the URL www.wmi.usyd.edu.au:8080/melanoma.html.
Abstract:
We have developed a computational strategy to identify the set of soluble proteins secreted into the extracellular environment of a cell. Within the protein sequences predominantly derived from the RIKEN representative transcript and protein set, we identified 2033 unique soluble proteins that are potentially secreted from the cell. These proteins contain a signal peptide required for entry into the secretory pathway and lack any transmembrane domains or intracellular localization signals. This class of proteins, which we have termed the mouse secretome, included >500 novel proteins and 92 proteins
Abstract:
The pathogenesis-related (PR) protein superfamily is widely distributed in the animal, plant, and fungal kingdoms and is implicated in human brain tumor growth and plant pathogenesis. The precise biological activity of PR proteins, however, has remained elusive. Here we report the characterization, cloning and structural homology modeling of Tex31 from the venom duct of Conus textile. Tex31 was isolated to >95% purity by activity-guided fractionation using a para-nitroanilide substrate based on the putative cleavage site residues found in the propeptide precursor of conotoxin TxVIA. Tex31 requires four residues including a leucine N-terminal of the cleavage site for efficient substrate processing. The sequence of Tex31 was determined using two degenerate PCR primers designed from N-terminal and tryptic digest Edman sequences. A BLAST search revealed that Tex31 was a member of the PR protein superfamily and most closely related to the CRISP family of mammalian proteins that have a cysteine-rich C-terminal tail. A homology model constructed from two PR proteins revealed that the likely catalytic residues in Tex31 fall within a structurally conserved domain found in PR proteins. Thus, it is possible that other PR proteins may also be substrate-specific proteases.
Abstract:
Dissertation presented to obtain the Doutoramento (Ph.D.) degree in Biochemistry at the Instituto de Tecnologia Química e Biológica da Universidade Nova de Lisboa
Abstract:
Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1,000,000 hits from 462,500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.
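As a rough illustration of what a regular-expression signature looks like, the sketch below converts a PROSITE-style pattern into a Python regular expression and scans a sequence with it. The function names `prosite_to_regex` and `find_sites` are hypothetical and are not part of InterPro or PROSITE tooling. The sketch covers only the basic pattern syntax (`x`, `[...]`, `{...}`, repeat counts and the `<`/`>` anchors), and the example sequence is made up.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex (basic syntax only)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repetition such as (3) or (2,4).
        repeat = ""
        m = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if m:
            element, repeat = m.group(1), "{" + m.group(2) + "}"
        if element == "x":               # any residue
            core = "."
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # excluded residues
        else:
            core = element               # single residue or [ABC] set
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_sites(pattern: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [m.start() for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation site pattern, applied to an invented sequence.
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_sites("N-{P}-[ST]-{P}", "MNGTANPSQ"))
```

Profiles, fingerprints and hidden Markov models need dedicated software and are not covered by this sketch.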
Abstract:
Amino acids form the building blocks of all proteins. Naturally occurring amino acids are restricted to a few tens of sidechains, even when considering post-translational modifications and rare amino acids such as selenocysteine and pyrrolysine. However, the potential chemical diversity of amino acid sidechains is nearly infinite. Exploiting this diversity by using non-natural sidechains to expand the building blocks of proteins and peptides has recently found widespread applications in biochemistry, protein engineering and drug design. Despite these applications, there is currently no unified online bioinformatics resource for non-natural sidechains. With the SwissSidechain database (http://www.swisssidechain.ch), we offer a central and curated platform about non-natural sidechains for researchers in biochemistry, medicinal chemistry, protein engineering and molecular modeling. SwissSidechain provides biophysical, structural and molecular data for hundreds of commercially available non-natural amino acid sidechains, both in l- and d-configurations. The database can be easily browsed by sidechain names, families or physico-chemical properties. We also provide plugins to seamlessly insert non-natural sidechains into peptides and proteins using molecular visualization software, as well as topologies and parameters compatible with molecular mechanics software.
Abstract:
Lipopolysaccharides (LPS, endotoxins) are main constituents of the outer membranes of Gram-negative bacteria, with the 'endotoxic principle' lipid A anchoring LPS into the membrane. When LPS is removed from the bacteria by the action of the immune system or simply by cell division, it may interact strongly with immunocompetent cells such as mononuclear cells. This interaction may lead, depending on the LPS concentration, to beneficial (at low concentrations) or pathophysiological (at high concentrations) reactions, the latter frequently causing the septic shock syndrome. There is a variety of endogenous LPS-binding proteins. To this class belong lactoferrin (LF) and hemoglobin (Hb), which have been shown to suppress and enhance the LPS-induced cytokine secretion in mononuclear cells, respectively. To elucidate the interaction mechanisms of endotoxins with these proteins, we have investigated in an infrared reflection-absorption spectroscopy (IRRAS) study the interaction of LPS or lipid A monolayers at the air/water interface with LF and Hb proteins, injected into the aqueous subphase. The data are clearly indicative of completely different interaction mechanisms of the endotoxins with the proteins, with the LF acting only at the LPS backbone, whereas Hb incorporates into the lipid monolayer. These data allow an understanding of the different reactivities in the biomedicinal systems.
Abstract:
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created in 1999 as a means of amalgamating the major protein signature databases into one comprehensive resource. PROSITE, Pfam, PRINTS, ProDom, SMART and TIGRFAMs have been manually integrated and curated and are available in InterPro for text- and sequence-based searching. The results are provided in a single format that rationalises the results that would be obtained by searching the member databases individually. The latest release of InterPro contains 5629 entries describing 4280 families, 1239 domains, 95 repeats and 15 post-translational modifications. Currently, the combined signatures in InterPro cover more than 74% of all proteins in SWISS-PROT and TrEMBL, an increase of nearly 15% since the inception of InterPro. New features of the database include improved searching capabilities and enhanced graphical user interfaces for visualisation of the data. The database is available via a webserver (http://www.ebi.ac.uk/interpro) and anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
Abstract:
We present strategies for chemical shift assignments of large proteins by magic-angle spinning solid-state NMR, using the 21-kDa disulfide-bond-forming enzyme DsbA as prototype. Previous studies have demonstrated that complete de novo assignments are possible for proteins up to approximately 17 kDa, and partial assignments have been performed for several larger proteins. Here we show that combinations of isotopic labeling strategies, high field correlation spectroscopy, and three-dimensional (3D) and four-dimensional (4D) backbone correlation experiments yield highly confident assignments for more than 90% of backbone resonances in DsbA. Samples were prepared as nanocrystalline precipitates by a dialysis procedure, resulting in heterogeneous linewidths below 0.2 ppm. Thus, high magnetic fields, selective decoupling pulse sequences, and sparse isotopic labeling all improved spectral resolution. Assignments by amino acid type were facilitated by particular combinations of pulse sequences and isotopic labeling; for example, transferred echo double resonance experiments enhanced sensitivity for Pro and Gly residues; [2-(13)C]glycerol labeling clarified Val, Ile, and Leu assignments; in-phase anti-phase correlation spectra enabled interpretation of otherwise crowded Glx/Asx side-chain regions; and 3D NCACX experiments on [2-(13)C]glycerol samples provided unique sets of aromatic (Phe, Tyr, and Trp) correlations. Together with high-sensitivity CANCOCA 4D experiments and CANCOCX 3D experiments, unambiguous backbone walks could be performed throughout the majority of the sequence. At 189 residues, DsbA represents the largest monomeric unit for which essentially complete solid-state NMR assignments have so far been achieved. These results will facilitate studies of nanocrystalline DsbA structure and dynamics and will enable analysis of its 41-kDa covalent complex with the membrane protein DsbB, for which we demonstrate a high-resolution two-dimensional (13)C-(13)C spectrum.
Abstract:
Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes have been published has generated new challenges, notably at the level of genome annotation. Even if gene predictors and annotation software are more and more efficient, the ultimate validation is still the observation of the predicted gene product(s). Mass-spectrometry-based proteomics provides the necessary high-throughput technology to provide evidence of protein presence and, from the identified sequences, confirmation or invalidation of predicted annotations. We review here different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start from the strengths and weaknesses of the different database construction strategies, based on different genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data), which are then used for matching mass spectra to peptides and proteins. We also review the important points to be considered for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back to the original genomic information.
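One common way to build a search database from a whole genome is six-frame translation: three reading frames on each strand. The sketch below is an illustration only and is not taken from the reviewed tools. It assumes the standard genetic code, and the names `translate`, `reverse_complement` and `six_frame_translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna: str) -> dict[str, str]:
    """Translate a DNA sequence in all six reading frames."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translate("ATGGCCTAA").items():
        print(frame, protein)
```

A real pipeline would usually also split each frame at stop codons (`*`) and discard very short segments before handing the sequences to a search engine.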
Abstract:
The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360,000 taxa, this resource has increased 2-fold over the last 2 years. It has benefited from a wealth of checks to improve annotation correctness and consistency, and now supplies greater information content enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.
Abstract:
Purpose: To describe a novel in silico method to gather and analyze data from high-throughput heterogeneous experimental procedures, i.e. gene and protein expression arrays. Methods: Each microarray is assigned to a database which handles common data (names, symbols, antibody codes, probe IDs, etc.). Links between pieces of information are automatically generated from knowledge obtained in freely accessible databases (NCBI, Swissprot, etc.). Requests can be made from any point of entry and the displayed result is fully customizable. Results: The initial database has been loaded with two sets of data. The first set originates from an Affymetrix-based retinal profiling performed in an RPE65 knock-out mouse model of Leber's congenital amaurosis. The second set was generated from a Kinexus microarray experiment done on the retinas from the same mouse model. Queries display wild-type versus knock-out expression at several time points for both genes and proteins. Conclusions: This freely accessible database allows for easy consultation of data and facilitates data mining by integrating experimental data and biological pathways.
Abstract:
Evolution of proteins after whole-genome duplication
Gene and genome duplication are considered major mechanisms in the creation of new functions in genomes, or in the refinement of networks by the division of function among more genes. In animals, the best demonstrated whole genome duplication occurred at the origin of Teleost fishes. This makes fishes an ideal model to study the consequences of genome duplication, particularly since we have a good sampling of genome sequences, abundant functional information, and a very well studied outgroup: the tetrapods (including human). More specifically, I studied the consequences of duplication on proteins using evolutionary models to infer adaptive events. I analysed the influence of positive selection in vertebrate genes, by contrasting singleton genes and duplicated genes. The conclusion of the analyses was threefold: (i) positive selection affects diverse phylogenetic branches and diverse gene categories during vertebrate evolution; (ii) it concerns only a small proportion of sites (1%-5%); and (iii) whole genome duplication had no detectable impact on the prevalence of this positive selection. I also studied evolution at the amino acid level with different methods to detect functional shifts (covarion process and constant-but-different process). As in my previous research, I found similar numbers of functional shifts between duplicates and between orthologs. The accepted framework for studies of molecular evolution is that orthologs share the same function, whereas the function of paralogs diverges. This framework gives a special place to gene duplication in evolution, as the main mechanism for generating novelty. With my previous results showing that duplication and speciation are not so different, we investigated the literature to question the evidence for similar or divergent evolution of gene function after duplication relative to speciation genes. This led us to propose a more rigorous design of future studies of gene duplication. Finally, based on my automated protocol, we built a database of positive selection in vertebrates' genes, Selectome. This database is freely available on the web and will help future evolutionary as well as biochemical studies.