986 results for protein database


Relevance:

70.00%

Abstract:

During 11-12 August 2014, a Protein Bioinformatics and Community Resources Retreat was held at the Wellcome Trust Genome Campus in Hinxton, UK. This meeting brought together the principal investigators of several specialized protein resources (such as CAZy, TCDB and MEROPS) as well as those from protein databases from the large Bioinformatics centres (including UniProt and RefSeq). The retreat was divided into five sessions: (1) key challenges, (2) the databases represented, (3) best practices for maintenance and curation, (4) information flow to and from large data centers and (5) communication and funding. An important outcome of this meeting was the creation of a Specialist Protein Resource Network that we believe will improve coordination of the activities of its member resources. We invite further protein database resources to join the network and continue the dialogue.

Relevance:

70.00%

Abstract:

DBMODELING is a relational database of annotated comparative protein structure models and their metabolic pathway characterization. It is focused on enzymes identified in the genomes of Mycobacterium tuberculosis and Xylella fastidiosa. The main goal of the present database is to provide structural models to be used in docking simulations and drug design. However, since the accuracy of structural models is highly dependent on sequence identity between template and target, it is necessary to make clear to the user that only models which show high structural quality should be used in such efforts. Molecular modeling of these genomes generated a database in which all structural models were built using alignments with more than 30% sequence identity, yielding models of medium and high accuracy. All models in the database are publicly accessible at http://www.biocristalografia.df.ibilce.unesp.br/tools. The DBMODELING user interface provides user-friendly menus, so that all information can be printed in one stop from any web browser. Furthermore, DBMODELING also provides a docking interface, which allows the user to carry out geometric docking simulations against the molecular models available in the database. There are three other important homology model databases: MODBASE, SWISSMODEL, and GTOP. The main applications of these databases are described in the present article. © 2007 Bentham Science Publishers Ltd.
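The 30% sequence identity threshold mentioned in this abstract can be illustrated with a minimal Python sketch. The function names and the convention of counting only gap-free alignment columns are assumptions made for illustration; the abstract does not say how DBMODELING computes identity.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs are rows of the same pairwise alignment,
    with '-' marking gaps. Other identity conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def passes_identity_cutoff(aligned_target: str, aligned_template: str,
                           cutoff: float = 30.0) -> bool:
    """True when the alignment has strictly more than `cutoff` percent identity."""
    return percent_identity(aligned_target, aligned_template) > cutoff
```

A call such as `passes_identity_cutoff(target_row, template_row)` would then decide whether a template is kept for model building under this assumed convention.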

Relevance:

70.00%

Abstract:

Background: The functional and structural characterisation of enzymes that belong to microbial metabolic pathways is very important for structure-based drug design. The main interest in studying shikimate pathway enzymes involves the fact that they are essential for bacteria but do not occur in humans, making them selective targets for design of drugs that do not directly impact humans. Description: The ShiKimate Pathway DataBase (SKPDB) is a relational database applied to the study of shikimate pathway enzymes in microorganisms and plants. The current database is updated regularly with the addition of new data; there are currently 8902 enzymes of the shikimate pathway from different sources. The database contains extensive information on each enzyme, including detailed descriptions about sequence, references, and structural and functional studies. All files (primary sequence, atomic coordinates and quality scores) are available for downloading. The modeled structures can be viewed using the Jmol program. Conclusions: The SKPDB provides a large number of structural models to be used in docking simulations, virtual screening initiatives and drug design. It is freely accessible at http://lsbzix.rc.unesp.br/skpdb/. © 2010 Arcuri et al; licensee BioMed Central Ltd.

Relevance:

70.00%

Abstract:

Mannitol is the most abundant sugar alcohol in nature, occurring in bacteria, fungi, lichens, and many species of vascular plants. Celery (Apium graveolens L.), a plant that forms mannitol photosynthetically, has high photosynthetic rates thought to result from intrinsic differences in the biosynthesis of hexitols vs. sugars. Celery also exhibits high salt tolerance due to the function of mannitol as an osmoprotectant. A mannitol catabolic enzyme that oxidizes mannitol to mannose (mannitol dehydrogenase, MTD) has been identified. In celery plants, MTD activity and tissue mannitol concentration are inversely related. MTD provides the initial step by which translocated mannitol is committed to central metabolism and, by regulating mannitol pool size, is important in regulating salt tolerance at the cellular level. We have now isolated, sequenced, and characterized an Mtd cDNA from celery. Analyses showed that Mtd RNA was more abundant in cells grown on mannitol and less abundant in salt-stressed cells. A protein database search revealed that the previously described ELI3 pathogenesis-related proteins from parsley and Arabidopsis are MTDs. Treatment of celery cells with salicylic acid resulted in increased MTD activity and RNA. Increased MTD activity results in an increased ability to utilize mannitol. Among other effects, this may provide an additional source of carbon and energy for response to pathogen attack. These responses of the primary enzyme controlling mannitol pool size reflect the importance of mannitol metabolism in plant responses to divergent types of environmental stress.

Relevance:

70.00%

Abstract:

To determine functional dependencies between two proteins, both represented as 3D structures, it is essential that they have one or more matching structural regions, called patches. As 3D structures for proteins are large, complex and constantly evolving, it is computationally expensive and very time-consuming to identify possible locations and sizes of patches for a given protein against a large protein database. In this paper, we use a vector space based representation for protein structures, where a patch is formed by the vectors within the region. Based on our previous work, a compact representation of the patch, named the patch signature, is applied here. A similarity measure for two patches is then derived from their signatures. To achieve fast patch matching in large protein databases, a match-and-expand strategy is proposed. Given a query patch, a set of small k-sized matching patches, called candidate patches, is generated in the match stage. The candidate patches are further filtered by enlarging k in the expand stage. Our extensive experimental results demonstrate encouraging performance on this biologically critical but previously computationally prohibitive problem.
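The match-and-expand idea described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not define the patch signature or the similarity measure, so the toy signature used here (sorted distances to the patch centroid), the `tol` threshold and all function names are hypothetical stand-ins rather than the authors' method.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def patch(points: Sequence[Point], center_idx: int, k: int) -> List[Point]:
    """The k points nearest to points[center_idx], the center itself included."""
    center = points[center_idx]
    return sorted(points, key=lambda p: math.dist(p, center))[:k]


def signature(patch_points: Sequence[Point]) -> List[float]:
    """Toy signature (an assumption): sorted distances to the patch centroid."""
    n = len(patch_points)
    centroid = tuple(sum(p[i] for p in patch_points) / n for i in range(3))
    return sorted(math.dist(p, centroid) for p in patch_points)


def signature_distance(sig_a: Sequence[float], sig_b: Sequence[float]) -> float:
    """Largest element-wise difference between two equally sized signatures."""
    return max(abs(a - b) for a, b in zip(sig_a, sig_b))


def match_and_expand(query: Sequence[Point], q_center: int,
                     target: Sequence[Point],
                     k_start: int, k_max: int, tol: float) -> List[int]:
    """Return indices of target centers whose patches match the query patch."""
    # Match stage: cheap comparison of small k_start-sized patches.
    q_sig = signature(patch(query, q_center, k_start))
    candidates = [
        i for i in range(len(target))
        if signature_distance(q_sig, signature(patch(target, i, k_start))) <= tol
    ]
    # Expand stage: enlarge k step by step and drop candidates that stop matching.
    k = k_start
    while candidates and k < k_max:
        k += 1
        q_sig = signature(patch(query, q_center, k))
        candidates = [
            i for i in candidates
            if signature_distance(q_sig, signature(patch(target, i, k))) <= tol
        ]
    return candidates
```

The design point the sketch tries to capture is that the expensive comparisons at larger k are only run on the candidates that survived the cheap small-k match stage.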

Relevance:

60.00%

Abstract:

Background: Invasive species pose a significant threat to global economies, agriculture and biodiversity. Despite progress towards understanding the ecological factors associated with plant invasions, limited genomic resources have made it difficult to elucidate the evolutionary and genetic factors responsible for invasiveness. This study presents the first expressed sequence tag (EST) collection for Senecio madagascariensis, a globally invasive plant species. Methods: We used pyrosequencing of one normalized and two subtractive libraries, derived from one native and one invasive population, to generate an EST collection. ESTs were assembled into contigs, annotated by BLAST comparison with the NCBI non-redundant protein database and assigned gene ontology (GO) terms from the Plant GO Slim ontologies. Key Results: Assembly of the 221 746 sequence reads resulted in 12 442 contigs. Over 50 % (6183) of 12 442 contigs showed significant homology to proteins in the NCBI database, representing approx. 4800 independent transcripts. The molecular transducer GO term was significantly over-represented in the native (South African) subtractive library compared with the invasive (Australian) library. Based on NCBI BLAST hits and literature searches, 40 % of the molecular transducer genes identified in the South African subtractive library are likely to be involved in response to biotic stimuli, such as fungal, bacterial and viral pathogens. Conclusions: This EST collection is the first representation of the S. madagascariensis transcriptome and provides an important resource for the discovery of candidate genes associated with plant invasiveness. The over-representation of molecular transducer genes associated with defence responses in the native subtractive library provides preliminary support for aspects of the enemy release and evolution of increased competitive ability hypotheses in this successful invasive.
This study highlights the contribution of next-generation sequencing to better understanding the molecular mechanisms underlying ecological hypotheses that are important in successful plant invasions.

Relevance:

60.00%

Abstract:

Background: The koala, Phascolarctos cinereus, is a biologically unique and evolutionarily distinct Australian arboreal marsupial. The goal of this study was to sequence the transcriptome from several tissues of two geographically separate koalas, and to create the first comprehensive catalog of annotated transcripts for this species, enabling detailed analysis of the unique attributes of this threatened native marsupial, including infection by the koala retrovirus. Results: RNA-Seq data was generated from a range of tissues from one male and one female koala and assembled de novo into transcripts using Velvet-Oases. Transcript abundance in each tissue was estimated. Transcripts were searched for likely protein-coding regions and a non-redundant set of 117,563 putative protein sequences was produced. In similarity searches there were 84,907 (72%) sequences that aligned to at least one sequence in the NCBI nr protein database. The best alignments were to sequences from other marsupials. After applying a reciprocal best hit requirement of koala sequences to those from tammar wallaby, Tasmanian devil and the gray short-tailed opossum, we estimate that our transcriptome dataset represents approximately 15,000 koala genes. The marsupial alignment information was used to look for potential gene duplications and we report evidence for copy number expansion of the alpha amylase gene, and of an aldehyde reductase gene. Koala retrovirus (KoRV) transcripts were detected in the transcriptomes. These were analysed in detail and the structure of the spliced envelope gene transcript was determined. There was appreciable sequence diversity within KoRV, with 233 sites in the KoRV genome showing small insertions/deletions or single nucleotide polymorphisms. Both koalas had sequences from the KoRV-A subtype, but the male koala transcriptome has, in addition, sequences more closely related to the KoRV-B subtype. This is the first report of a KoRV-B-like sequence in a wild population.
Conclusions: This transcriptomic dataset is a useful resource for molecular genetic studies of the koala, for evolutionary genetic studies of marsupials, for validation and annotation of the koala genome sequence, and for investigation of koala retrovirus. Annotated transcripts can be browsed and queried at http://koalagenome.org
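The reciprocal best hit requirement mentioned in this abstract can be sketched in a few lines of Python. The input format (tuples of query ID, subject ID and score) and the function names are assumptions for illustration; the abstract does not describe the authors' actual pipeline or scoring.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, score); higher score is better


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query ID to its single highest-scoring subject ID."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit],
                         b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)
```

In practice the two hit lists would come from searching species A against species B and B against A; only pairs that pick each other as top hit are retained as putative orthologs.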

Relevance:

60.00%

Abstract:

Expressed sequence tags (ESTs) have proved to be a useful tool for discovering and identifying functional genes, especially in species whose genetic information is unavailable. A total of 180 ESTs were generated from a cDNA library of gametophytic Gracilaria lemaneiformis in this study. These clones cluster into 151 groups, among which 8 groups are highly homologous to chloroplast genes and are abundant in the library. A search of the red algal EST database found 22 groups that match registered ESTs of Rhodophyta and 6 that match ESTs of Gracilaria. A search of the protein database reveals that 73 non-redundant clones have significant similarity to known sequences, the majority of which are involved in photosynthesis, DNA transcription or translation, while 6, 4 and 3 clones are associated with growth or development, signal transduction and stress or defense response, respectively.

Relevance:

60.00%

Abstract:

Bacteriophages, viruses infecting bacteria, are uniformly present in any location where there are high numbers of bacteria, both in the external environment and the human body. Knowledge of their diversity is limited by the difficulty of culturing the host species and by the lack of a universal marker gene present in all viruses. Metagenomics is a powerful tool that can be used to analyse viral communities in their natural environments. The aim of this study was to investigate diverse populations of uncultured viruses from clinical (sputum from a patient with cystic fibrosis, CF) and environmental samples (sludge from a dairy food wastewater treatment plant) containing rich bacterial populations using genetic and metagenomic analyses. Metagenomic sequencing of viruses obtained from these samples revealed that the majority of the metagenomic reads (97-99%) were novel when compared to the NCBI protein database using BLAST. A large proportion of assembled contigs were assignable as novel phages or uncharacterised prophages, the next largest assignable group being single-stranded eukaryotic virus genomes. Sputum from a cystic fibrosis patient contained DNA typical of phages of bacteria that are traditionally involved in CF lung infections and other bacteria that are part of the normal oral flora. The only eukaryotic virus detected in the CF sputum was Torque Teno virus (TTV). A substantial number of assigned sequences from dairy wastewater could be affiliated with phages of bacteria that are typically found in the soil and aquatic environments, including wastewater. Eukaryotic viral sequences were dominated by plant pathogens from the Geminiviridae and Nanoviridae families, and animal pathogens from the Circoviridae family. Antibiotic resistance genes were detected in both metagenomes, suggesting phages could be a source of transmissible antimicrobial resistance.
Overall, diversity of viruses in the CF sputum was low, with 89 distinct viral genotypes predicted, and higher (409 genotypes) in the wastewater. Function-based screening of a metagenomic library constructed from DNA extracted from dairy food wastewater viruses revealed candidate promoter sequences that have the ability to drive expression of GFP in a promoter-trap vector in Escherichia coli. The majority of the cloned DNA sequences selected by the assay were related to ssDNA circular eukaryotic viruses and phages, which formed a minority of the metagenome assembly, and many lacked any significant homology to known database sequences. Natural diversity of bacteriophages in wastewater samples was also examined by PCR amplification of the major capsid protein sequences, conserved within T4-type bacteriophages from the Myoviridae family. Phylogenetic analysis of capsid sequences revealed that dairy wastewater contained mainly diverse and uncharacterized phages, while some showed a high level of similarity with phages from geographically distant environments.

Relevance:

60.00%

Abstract:

Coconut (Cocos nucifera L.) is a major plantation crop, which ensures income for millions of people in the tropical region. Detailed molecular studies on zygotic embryo development would provide valuable clues for the identification of molecular markers to improve somatic embryogenesis. Since there is no ongoing genome project for this species, analysis of coconut expressed sequence tags (ESTs) would be an interesting approach to identify important coconut embryo specific genes as well as other functional genes in different biochemical pathways. The goal of this study was to analyse the ESTs by examining the transcriptome data of the different embryo tissue types together with one somatic tissue. Here, four cDNA libraries from immature embryo, mature embryo, microspore derived embryo and mature leaves were constructed. cDNA was sequenced by the Roche-454 GS-FLX system and assembled into 32621 putative unigenes and 155017 singletons. Of these unigenes, 18651 had significant sequence similarity to entries in the non-redundant protein database, of which 16153 were assigned to one or more gene ontology categories. Homologues of genes responsible for embryo development, such as chitinase, beta-1,3-glucanase, ATP synthase CF0 subunit, thaumatin-like protein and metallothionein-like protein, were identified in the embryo EST collection. Of the unigenes, 6694 were mapped into 139 KEGG pathways including carbohydrate metabolism, energy metabolism, lipid metabolism, amino acid metabolism and nucleotide metabolism. This collection of 454-derived EST data generated from different tissue types provides a significant resource for genome wide studies and gene discovery of coconut, a non-model species.

Relevance:

60.00%

Abstract:

Classifying proteins into structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand for automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods have been chosen because they represent different paradigms of learning and have been widely used in the Bioinformatics literature. To improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work presents the problem of imbalanced classes, artificial techniques for class balance (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to minimize this problem. In order to evaluate the ML methods, a cross-validation procedure is applied, where the accuracy of the classifiers is measured using the mean classification error rate on independent test sets. These means are compared pairwise using a hypothesis test to evaluate whether there is a statistically significant difference between them.
With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. The multi-classification systems (homogeneous and heterogeneous) showed, in general, a superior or similar performance when compared to that achieved by the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as meta classifier. The Voting method, despite its simplicity, has proved adequate for solving the problem presented in this work. The techniques for class balance, on the other hand, have not produced a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique has proved to be more appropriate
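Two of the simpler techniques named in this abstract, random undersampling for class balance and majority voting across classifiers, can be sketched in plain Python. The function names, the fixed seed and the tie-breaking behaviour are assumptions for illustration; the abstract does not give the authors' implementation.

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def random_undersample(samples: Sequence, labels: Sequence[Hashable],
                       seed: int = 0) -> Tuple[List, List]:
    """Randomly drop samples so every class keeps as many items as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(items) for items in by_class.values())
    balanced = []
    for y, items in by_class.items():
        # rng.sample draws n_min distinct items without replacement.
        balanced.extend((x, y) for x in rng.sample(items, n_min))
    rng.shuffle(balanced)
    return [x for x, _ in balanced], [y for _, y in balanced]


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-classifier label lists into one label per sample.

    predictions[i][j] is classifier i's label for sample j. When votes tie,
    the label that appears first among the tied ones is returned.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]
```

Undersampling would be applied to the training folds only, and `majority_vote` would be fed the test-set predictions of the individual classifiers.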

Relevance:

60.00%

Abstract:

Honey bee venom toxins trigger immunological, physiological, and neurological responses within victims. The high occurrence of bee attacks involving potentially fatal toxic and allergic reactions in humans and the prospect of developing novel pharmaceuticals make honey bee venom an attractive target for proteomic studies. Using label-free quantification, we compared the proteome and phosphoproteome of the venom of Africanized honeybees with that of two European subspecies, namely Apis mellifera ligustica and A. m. carnica. From the total of 51 proteins, 42 were common to all three subspecies. Remarkably, the toxins melittin and icarapin were phosphorylated. In all venoms, icarapin was phosphorylated at the 205Ser residue, which is located in close proximity to its known antigenic site. Melittin, the major toxin of honeybee venoms, was phosphorylated in all venoms at the 10Thr and 18Ser residues. 18Ser-phosphorylated melittin, the major of its two phosphorylated forms, was less toxic than the native peptide. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

Relevance:

60.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

60.00%

Abstract:

Background: Sugarcane (Saccharum spp.) has become an increasingly important crop for its leading role in biofuel production. The high sugar content species S. officinarum is an octoploid without known diploid or tetraploid progenitors. Commercial sugarcane cultivars are hybrids between S. officinarum and wild species S. spontaneum with ploidy at ~12×. The complex autopolyploid sugarcane genome has not been characterized at the DNA sequence level. Results: The microsynteny between sugarcane and sorghum was assessed by comparing 454 pyrosequences of 20 sugarcane bacterial artificial chromosomes (BACs) with sorghum sequences. These 20 BACs were selected by hybridization of 1961 single copy sorghum overgo probes to the sugarcane BAC library with one sugarcane BAC corresponding to each of the 20 sorghum chromosome arms. The genic regions of the sugarcane BACs shared an average of 95.2% sequence identity with sorghum, and the sorghum genome was used as a template to order sequence contigs covering 78.2% of the 20 BAC sequences. About 53.1% of the sugarcane BAC sequences are aligned with sorghum sequence. The unaligned regions contain non-coding and repetitive sequences. Within the aligned sequences, 209 genes were annotated in sugarcane and 202 in sorghum. Seventeen genes appeared to be sugarcane-specific and all were validated by sugarcane ESTs, while 12 appeared sorghum-specific but only one was validated by sorghum ESTs. Twelve of the 17 sugarcane-specific genes have no match in the non-redundant protein database in GenBank, perhaps encoding proteins for sugarcane-specific processes. The sorghum orthologous regions appeared to have expanded relative to sugarcane, mostly by the increase of retrotransposons. Conclusions: The sugarcane and sorghum genomes are mostly collinear in the genic regions, and the sorghum genome can be used as a template for assembling much of the genic DNA of the autopolyploid sugarcane genome.
The comparable gene density between sugarcane BACs and corresponding sorghum sequences defied the notion that polyploid species might have a faster pace of gene loss due to the redundancy of multiple alleles at each locus.

Relevance:

60.00%

Abstract:

In this work, the bioinformatic methods of homology modeling and molecular modeling were used to predict and analyze the three-dimensional structures of a wide variety of proteins. Experimental findings from laboratory work were used to increase the accuracy of the homology models. The modeling results were in turn used to propose new experiments. Using the models built here and known crystal structures from the Protein Data Bank (PDB), the structure-function relationship of various tyrosinases was investigated. These included the tyrosinase of the bacterium Streptomyces as well as the tyrosinase of the house mouse. The comparative structural analyses of the tyrosinases yielded mechanisms for the monophenol hydroxylase activity of tyrosinases and for the import of copper ions into the active site. It was possible to prove that blockage of the CuA site is indeed the reason for the difference in activity between tyrosinases and catechol oxidases. With the mouse tyrosinase, a complete structural model of a mammalian tyrosinase was built for the first time, one that can explain the mechanisms of known albino mutations at the molecular level. The insights gained from the resulting 3D model into the importance of particular amino acids for function were tested and confirmed by site-directed mutagenesis of recombinantly produced mouse tyrosinase. In addition, the structure of the tyrosinase of the crustacean Palinurus elephas was elucidated by a low-resolution 3D reconstruction from electron microscopy images. The second major topic comprises the structural analysis of the light-harvesting complexes LHCI-730 and LHCII. In the case of LHCII, the oligomerization state of the LHC molecules could be correlated with discrete conformations of the N-terminus.
Here too, a combination of homology modeling and an experimental method, electron spin resonance measurement, was used. The change in the oligomerization state of LHCII controls the energy flow to photosystems PS I and PS II. Furthermore, a complete model of LHCI-730 was built in order to investigate the effects of site-directed mutagenesis on dimerization behavior. Based on this model, the interactions between the monomers Lhca1 and Lhca4 were evaluated and potential binding partners were identified.