Biblioteca Digital

949 results for Sequence-based PCR

Estimating the probability for a protein to have a new fold: A statistical computational model

Relevance:

80.00%

Publisher:

Abstract:

Structural genomics aims to solve a large number of protein structures that represent the protein space. Currently an exhaustive solution for all structures seems prohibitively expensive, so the challenge is to define a relatively small set of proteins with new, currently unknown folds. This paper presents a method that assigns each protein a probability of having an unsolved fold. The method makes extensive use of ProtoMap, a sequence-based classification, and SCOP, a structure-based classification. According to ProtoMap, the protein space encodes the relationship among proteins as a graph whose vertices correspond to 13,354 clusters of proteins. A representative fold for a cluster with at least one solved protein is determined after superposition of all SCOP (release 1.37) folds onto ProtoMap clusters. Distances within the ProtoMap graph are computed from each representative fold to the neighboring folds. The distribution of these distances is used to create a statistical model for distances among those folds that are already known and those that have yet to be discovered. The distributions of distances for solved and unsolved proteins differ significantly. This difference makes it possible to use Bayes' rule to derive a statistical estimate of the probability that any given protein has an as-yet-undetermined fold. Proteins that score the highest probability of representing a new fold constitute the target list for structural determination. Our predicted probabilities for unsolved proteins correlate very well with the proportion of new folds among recently solved structures (new SCOP 1.39 records) that are disjoint from our original training set.
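
The Bayes' rule step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name, the two likelihood inputs, and the prior are hypothetical placeholders, and the abstract does not say how the distance distributions are estimated.

```python
def posterior_new_fold(likelihood_new, likelihood_known, prior_new):
    """Posterior P(new fold | observed graph distance) via Bayes' rule.

    likelihood_new   -- P(distance | protein has a new fold)      (assumed input)
    likelihood_known -- P(distance | protein has a known fold)    (assumed input)
    prior_new        -- prior probability of a new fold           (assumed input)
    """
    if not 0.0 <= prior_new <= 1.0:
        raise ValueError("prior_new must lie in [0, 1]")
    numerator = likelihood_new * prior_new
    # Total probability of the observed distance under both hypotheses.
    evidence = numerator + likelihood_known * (1.0 - prior_new)
    if evidence == 0.0:
        raise ValueError("both likelihoods are zero; posterior is undefined")
    return numerator / evidence
```

With equal priors, a distance that is three times more likely under the new-fold hypothesis than under the known-fold hypothesis gives a posterior of about 0.75.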

The InterPro database, an integrated documentation resource for protein families, domains and functional sites

Relevance:

80.00%

Publisher:

Abstract:

Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1 000 000 hits from 462 500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.

The [(G/C)3NN]n motif: a common DNA repeat that excludes nucleosomes.

Relevance:

80.00%

Publisher:

Abstract:

Nucleosomes, the basic structural elements of chromosomes, consist of 146 bp of DNA coiled around an octamer of histone proteins, and their presence can strongly influence gene expression. Considerations of the anisotropic flexibility of nucleotide triplets containing 3 cytosines or guanines suggested that a [5'(G/C)3 NN3']n motif might resist wrapping around a histone octamer. To test this, DNAs were constructed containing a 5'-CCGNN-3' pentanucleotide repeat with the Ns varied. Using in vitro nucleosome reconstitution and electron microscopy, a plasmid with 48 contiguous CCGNN repeats strongly excluded nucleosomes in the repeat region. Competitive reconstitution gel retardation experiments using DNA fragments containing 12, 24, or 48 CCGNN repeats showed that the propensity to exclude nucleosomes increased with the length of the repeat. Analysis showed that a 268-bp DNA containing a (CCGNN)48 block is 4.9 +/- 0.6-fold less efficient in nucleosome assembly than a similar length pUC19 fragment and approximately 78-fold less efficient than a similar length (CTG)n sequence, based on results from previous studies. Computer searches against the GenBank database for matches with a [(G/C)3NN]48 sequence revealed numerous examples that frequently were present in the control regions of "TATA-less" genes, including the human ETS-2 and human dihydrofolate reductase genes. In both cases the (G/C)3NN repeat, present in the promoter region, co-maps with loci previously shown to be nuclease hypersensitive sites.
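
A sequence search for the [(G/C)3NN]n motif can be sketched with a regular expression. This is only an illustration under stated assumptions: the function name and the `min_repeats` parameter are hypothetical, only one strand is scanned, and the abstract does not describe how the GenBank searches were actually performed.

```python
import re

def find_gc3nn_repeats(sequence, min_repeats=3):
    """Return (start, end, match) for runs of at least min_repeats
    contiguous (G/C)3NN pentanucleotides on the given strand."""
    # [GC]{3} = three G or C bases; [ACGT]{2} = the two unconstrained N bases.
    pattern = re.compile(r"(?:[GC]{3}[ACGT]{2}){%d,}" % min_repeats)
    return [(m.start(), m.end(), m.group())
            for m in pattern.finditer(sequence.upper())]
```

For example, `find_gc3nn_repeats("ATAT" + "CCGAT" * 4 + "TTTT")` reports a single run covering positions 4 to 24, while a sequence with only two repeats yields no match at the default threshold.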

Conservation of synteny between the genome of the pufferfish (Fugu rubripes) and the region on human chromosome 14 (14q24.3) associated with familial Alzheimer disease (AD3 locus)

Relevance:

80.00%

Publisher:

Abstract:

The genome of the pufferfish (Fugu rubripes) (400 Mb) is approximately 7.5 times smaller than the human genome, but it has a similar gene repertoire to that of man. If regions of the two genomes exhibited conservation of gene order (i.e., were syntenic), it should be possible to reduce dramatically the effort required for identification of candidate genes in human disease loci by sequencing syntenic regions of the compact Fugu genome. We have demonstrated that three genes (dihydrolipoamide succinyltransferase, S31iii125, and S20i15), which are linked to FOS in the familial Alzheimer disease locus (AD3) on human chromosome 14, have homologues in the Fugu genome adjacent to Fugu cFOS. The relative gene order of cFOS, S31iii125, and S20i15 was the same in both genomes, but in Fugu these three genes lay within a 12.4-kb region, compared to >600 kb in the human AD3 locus. These results demonstrate the conservation of synteny between the genomes of Fugu and man and highlight the utility of this approach for sequence-based identification of genes in human disease loci.

Molecular cloning and expression of cDNAs encoding human alpha-mannosidase II and a previously unrecognized alpha-mannosidase IIx isozyme.

Relevance:

80.00%

Publisher:

Abstract:

Golgi alpha-mannosidase II (alpha-MII) is an enzyme involved in the processing of N-linked glycans. Using a previously isolated murine cDNA clone as a probe, we have isolated cDNA clones encompassing the human alpha-MII cDNA open reading frame and initiated isolation of human genomic clones. During the isolation of genomic clones, genes related to that encoding alpha-MII were isolated. One such gene was found to encode an isozyme, designated alpha-MIIx. A 5-kb cDNA clone encoding alpha-MIIx was then isolated from a human melanoma cDNA library. However, comparison between alpha-MIIx and alpha-MII cDNAs suggested that the cloned cDNA encodes a truncated polypeptide with 796 amino acid residues, while alpha-MII consists of 1144 amino acid residues. To reevaluate the sequence of alpha-MIIx cDNA, polymerase chain reaction (PCR) was performed with lymphocyte mRNAs. Comparison of the sequence of PCR products with the alpha-MIIx genomic sequence revealed that alternative splicing of the alpha-MIIx transcript can result in an additional transcript encoding a 1139-amino acid polypeptide. Northern analysis showed transcription of alpha-MIIx in various tissues, suggesting that the alpha-MIIx gene is a housekeeping gene. COS cells transfected with alpha-MIIx cDNA containing the full-length open reading frame showed an increase of alpha-mannosidase activity. The alpha-MIIx gene was mapped to human chromosome 15q25, whereas the alpha-MII gene was mapped to 5q21-22.

Core derived downhole logs for holes CRP-2 and CRP-2A

Relevance:

80.00%

Publisher:

Abstract:

In northern McMurdo Sound (Ross Sea, Antarctica), the CRP-2/2A drillhole targeted the western margin of the Victoria Land Basin to investigate Neogene to Palaeogene climatic and tectonic history by obtaining continuous core and downhole logs. Well logging of CRP-2/2A has provided a complete and comprehensive dataset of in situ geophysical measurements. This paper describes the evaluation and interpretation of the downhole logging data using multivariate statistical methods. Two major types of multivariate statistical methods were used, each yielding a different perspective: (1) Factor analysis was used as an objective tool for classification of the drilled sequence based on physical and chemical properties. The factor logs mirror the basic geological controls (i.e., grain size, porosity, clay mineralogy) behind the measured geophysical properties, thereby making them easier to interpret geologically. (2) Cluster analysis of the logs groups similar downhole geophysical properties into one cluster, delineating individual logging or sedimentological units. These objectively and independently defined units, or statistical electrofacies, are helpful in differentiating lithological and sedimentological characterisations (e.g. grain size, provenance). The multivariate statistical methods of factor and cluster analysis proved to be powerful tools for fast, reliable, and objective characterisation of downhole geophysical properties at CRP-2/2A, resulting in interpretations which are consistent with sedimentological findings.

Prediction of protein B-factor profiles

Relevance:

80.00%

Publisher:

Abstract:

The polypeptide backbones and side chains of proteins are constantly moving due to thermal motion and the kinetic energy of the atoms. The B-factors of protein crystal structures reflect the fluctuation of atoms about their average positions and provide important information about protein dynamics. Computational approaches to predict thermal motion are useful for analyzing the dynamic properties of proteins with unknown structures. In this article, we utilize a novel support vector regression (SVR) approach to predict the B-factor distribution (B-factor profile) of a protein from its sequence. We explore schemes for encoding sequences and various settings for the parameters used in SVR. Based on a large dataset of high-resolution proteins, our method predicts the B-factor distribution with a Pearson correlation coefficient (CC) of 0.53. In addition, our method predicts the B-factor profile with a CC of at least 0.56 for more than half of the proteins. Our method also performs well for classifying residues (rigid vs. flexible). For almost all predicted B-factor thresholds, prediction accuracies (percent of correctly predicted residues) are greater than 70%. These results exceed the best results of other sequence-based prediction methods. (C) 2005 Wiley-Liss, Inc.
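
The Pearson correlation coefficient (CC) used in this abstract to score predicted against observed B-factor profiles can be computed as in the following minimal sketch. The function name and input lists are hypothetical, and the sketch covers only the evaluation metric, not the SVR predictor itself.

```python
import math

def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of centred products; the 1/n factors cancel in the ratio.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0.0 or var_o == 0.0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_p * var_o)
```

A per-protein CC would be obtained by calling the function once per protein, with one value per residue in each list.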

Prediction of HLA-DQ3.2 beta ligands: evidence of multiple registers in class II binding peptides

Relevance:

80.00%

Publisher:

Abstract:

Motivation: While processing of MHC class II antigens for presentation to helper T-cells is essential for normal immune response, it is also implicated in the pathogenesis of autoimmune disorders and hypersensitivity reactions. Sequence-based computational techniques for predicting HLA-DQ binding peptides have encountered limited success, with few prediction techniques developed using three-dimensional models. Methods: We describe a structure-based prediction model for modeling peptide-DQ3.2 beta complexes. We have developed a rapid and accurate protocol for docking candidate peptides into the DQ3.2 beta receptor and a scoring function to discriminate binders from the background. The scoring function was rigorously trained, tested and validated using experimentally verified DQ3.2 beta binding and non-binding peptides obtained from biochemical and functional studies. Results: Our model predicts DQ3.2 beta binding peptides with high accuracy [area under the receiver operating characteristic (ROC) curve A(ROC) > 0.90], compared with experimental data. We investigated the binding patterns of DQ3.2 beta peptides and illustrate that several registers exist within a candidate binding peptide. Further analysis reveals that peptides with multiple registers occur predominantly for high-affinity binders.
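
The area under the ROC curve reported in this abstract can be computed from binder and non-binder scores alone, as in the minimal sketch below. It assumes that a higher score indicates a more likely binder; the function name and inputs are hypothetical, and the authors' scoring function is not reproduced here.

```python
def roc_auc(binder_scores, nonbinder_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen binder scores higher
    than a randomly chosen non-binder (ties count as one half).
    """
    if not binder_scores or not nonbinder_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for b in binder_scores:
        for n in nonbinder_scores:
            if b > n:
                wins += 1.0
            elif b == n:
                wins += 0.5
    return wins / (len(binder_scores) * len(nonbinder_scores))
```

The pairwise form is quadratic in the number of peptides, which is acceptable for small validation sets; a rank-based computation would be preferable for large ones.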

Finding negative event oriented patterns in long temporal sequences

Relevance:

80.00%

Publisher:

Abstract:

Pattern discovery in a long temporal event sequence is of great importance in many application domains. Most previous work focuses on identifying positive associations among time-stamped event types. In this paper, we introduce the problem of defining and discovering negative associations that, like positive rules, may also serve as a source of knowledge discovery. In general, an event-oriented pattern is a pattern that is associated with a selected type of event, called a target event. As a counterpart to previous research, we identify patterns that have a negative relationship with the target events. A set of criteria is defined to evaluate the interestingness of patterns associated with such negative relationships. For counting the frequency of a pattern, we propose a new approach, called unique minimal occurrence, which guarantees that the Apriori property holds for all patterns in a long sequence. Based on the interestingness measures, algorithms are proposed to discover potentially interesting patterns for this negative rule problem. Finally, experiments are carried out on a real application.

Introducing uncertainty into pattern discovery in temporal event sequences

Relevance:

80.00%

Publisher:

Abstract:

Pattern discovery in temporal event sequences is of great importance in many application domains, such as telecommunication network fault analysis. In reality, not every type of event has an accurate timestamp. Some events, defined as inaccurate events, may only have an interval as their possible time of occurrence. The existence of inaccurate events may cause uncertainty in event ordering. The traditional support model cannot deal with this uncertainty, which would cause some interesting patterns to be missed. A new concept, precise support, is introduced to evaluate the probability that a pattern is contained in a sequence. Based on this new metric, we define the uncertainty model and present an algorithm to discover interesting patterns in a sequence database that has one type of inaccurate event. In our model, the number of types of inaccurate events can readily be extended to k, although at the cost of increased computational complexity.

Toward bacterial protein sub-cellular location prediction: single-class discriminant models for all gram- and gram+ compartments

Relevance:

80.00%

Publisher:

Abstract:

Based on Bayesian Networks, methods were created that address protein sequence-based bacterial subcellular location prediction. Distinct predictive algorithms for the eight bacterial subcellular locations were created. Several variant methods were explored. These variations included differences in the number of residues considered within the query sequence - which ranged from the N-terminal 10 residues to the whole sequence - and residue representation - which took the form of amino acid composition, percentage amino acid composition, or normalised amino acid composition. The accuracies of the best performing networks were then compared to PSORTB. All individual location methods outperform PSORTB except for the Gram+ cytoplasmic protein predictor, for which accuracies were essentially equal, and for outer membrane protein prediction, where PSORTB outperforms the binary predictor. The method described here is an important new approach to method development for subcellular location prediction. It is also a new, potentially valuable tool for candidate subunit vaccine selection.
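
Two of the residue representations named in this abstract, amino acid composition and percentage amino acid composition over an N-terminal window or the whole sequence, can be sketched as follows. The function names and the `n_terminal` parameter are hypothetical; the abstract does not define the normalised composition, so it is left out, and the Bayesian network itself is not shown.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(sequence, n_terminal=None):
    """Count each standard residue in the first n_terminal residues,
    or in the whole sequence when n_terminal is None."""
    window = sequence.upper()
    if n_terminal is not None:
        window = window[:n_terminal]
    counts = Counter(window)
    # Non-standard symbols (e.g. X) are ignored in this sketch.
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}

def percentage_composition(sequence, n_terminal=None):
    """Residue counts expressed as a percentage of the counted residues."""
    counts = composition(sequence, n_terminal)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues in the selected window")
    return {aa: 100.0 * c / total for aa, c in counts.items()}
```

Calling `composition(seq, n_terminal=10)` gives the feature vector for the N-terminal 10 residues, the shortest window mentioned in the abstract.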

Towards in silico prediction of immunogenic epitopes

Relevance:

80.00%

Publisher:

Abstract:

As torrents of new data now emerge from microbial genomics, bioinformatic prediction of immunogenic epitopes remains challenging but vital. In silico methods often produce paradoxically inconsistent results: good prediction rates on certain test sets but not others. The inherent complexity of immune presentation and recognition processes complicates epitope prediction. Two encouraging developments – data driven artificial intelligence sequence-based methods for epitope prediction and molecular modeling methods based on three-dimensional protein structures – offer hope for the future.

Colletotrichum species in Australia

Relevance:

80.00%

Publisher:

Abstract:

Forty-four species of Colletotrichum are confirmed as present in Australia based on DNA sequencing analyses. Many of these species were identified directly as a result of two workshops organised by the Subcommittee on Plant Health Diagnostics in Australia in 2015 that covered morphological and molecular approaches to identification of Colletotrichum. There are several other species of Colletotrichum reported from Australia that remain to be substantiated by DNA sequence-based methods. This body of work aims to provide a basis from which to critically examine a number of isolates of Colletotrichum deposited in Australian culture collections.

Archaea appear to dominate the microbiome of Inflatella pellicula deep sea sponges

Relevance:

80.00%

Publisher:

Abstract:

Microbes associated with marine sponges play significant roles in host physiology. Remarkable levels of microbial diversity have been observed in sponges worldwide through both culture-dependent and culture-independent studies. Most studies have focused on the structure of the bacterial communities in sponges and have involved sponges sampled from shallow waters. Here, we used pyrosequencing of 16S rRNA genes to compare the bacterial and archaeal communities associated with two individuals of the marine sponge Inflatella pellicula from the deep sea, sampled from a depth of 2,900 m, a depth which far exceeds any previous sequence-based report of sponge-associated microbial communities. Sponge-microbial communities were also compared to the microbial community in the surrounding seawater. Sponge-associated microbial communities were dominated by archaeal sequencing reads with a single archaeal OTU, comprising ∼60% and ∼72% of sequences, being observed from Inflatella pellicula. Archaeal sequencing reads were less abundant in seawater (∼11% of sequences). Sponge-associated microbial communities were less diverse and less even than any other sponge-microbial community investigated to date with just 210 and 273 OTUs (97% sequence identity) identified in sponges, with 4 and 6 dominant OTUs comprising ∼88% and ∼89% of sequences, respectively. Members of the candidate phyla SAR406, NC10 and ZB3 are reported here from sponges for the first time, increasing the number of bacterial phyla or candidate divisions associated with sponges to 43. A minor cohort from both sponge samples (∼0.2% and ∼0.3% of sequences) was not classified to phylum level. A single OTU, common to both sponge individuals, dominates these unclassified reads and shares sequence homology with a sponge-associated clone which itself has no known close relative and may represent a novel taxon.

Innovation in aquaculture. Modeling the adoption and implementation processes of innovations in the Italian aquaculture sector

Relevance:

80.00%

Publisher:

Abstract:

Since the turn of the century, fisheries have maintained a steady growth rate, while aquaculture has expanded more rapidly. Aquaculture can offer EU consumers more diverse, healthy, and sustainable food options, some of which are more popular elsewhere. To develop the sector, the EU is investing heavily. The EU supports innovative projects that promote the sustainable development of seafood sectors and food security. Priority 3 promotes sector development through innovation dissemination. This doctoral dissertation examined innovation transfer in the Italian aquaculture sector, specifically the adoption of innovative tools, using a theoretical model to better understand the complexity of these processes. The work focused on innovation adoption, emphasising that it is the end of a well-defined process. The Awareness Knowledge Adoption Implementation Effectiveness (AKAIE) model was created to better analyse post-adoption phases and to evaluate the implementation and impact of technology adoption. To identify AKAIE drivers and barriers, aquaculture actors were consulted. Their perspectives were examined through "perceived complexity", that is, barriers to adoption that are strongly influenced by contextual factors (i.e. socio-economic, institutional and cultural ones). The new model contextualises the sequence based on technologies, entrepreneur traits, corporate and institutional contexts, and complexity perception, which is the sequence's central node. Technology adoption can also be studied by examining complexity perceptions along the AKAIE sequence. This study proposes a new model to evaluate the diffusion of a given technology, giving policymakers the ability to act promptly throughout the process. The development of responsible policies for evaluating the effectiveness of innovation is more necessary than ever, especially to orient strategies and interventions in the face of major scenarios of change.
