24 results for Multiple Sequence Alignment

in CentAUR: Central Archive University of Reading - UK


Relevance: 80.00%

Abstract:

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and that it serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
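
The per-site summation described in this abstract can be illustrated with a minimal sketch. It assumes the per-site likelihoods under each branch-length set have already been computed elsewhere, and it assumes each set carries a mixture weight (the abstract does not state how the sets are weighted). The function name `mixture_log_likelihood` and its arguments are hypothetical and are not taken from the authors' software.

```python
import math


def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a branch-length mixture.

    site_likelihoods[i][j] is the likelihood of site i under branch-length
    set j (assumed to be precomputed on one fixed tree topology).
    weights[j] is the assumed mixture weight of branch-length set j.
    """
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixture weights must sum to 1")
    total = 0.0
    for per_set in site_likelihoods:
        if len(per_set) != len(weights):
            raise ValueError("one likelihood per branch-length set is required")
        # Sum the site likelihood over all branch-length sets.
        site_likelihood = sum(w * lik for w, lik in zip(weights, per_set))
        # Sites are treated as independent, so log-likelihoods add.
        total += math.log(site_likelihood)
    return total


# Two sites, two branch-length sets, equal weights (made-up numbers).
example = mixture_log_likelihood([[0.2, 0.4], [0.1, 0.3]], [0.5, 0.5])
```

Because each site takes its own weighted sum, a site that fits the first branch-length set poorly can still contribute a high likelihood through the second set, which is how different sites can follow different rates across the tree.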

Relevance: 80.00%

Abstract:

One hundred and nine lactic acid bacterial strains (56 bifidobacteria-like and 53 lactobacilli-like) were isolated from faecal samples donated by healthy elderly individuals (>65 years old). Isolates were identified to species level by phenotypic analysis (by API) and by 16S rDNA sequencing. Eleven species of Lactobacillus and six species of Bifidobacterium were identified. The most frequently isolated lactobacillus was L. fermentum and the most frequently isolated bifidobacterium was closely related to B. infantis by 16S rDNA sequence alignment. The isolates were characterized for their antimicrobial activity against Clostridium difficile, enteropathogenic Escherichia coli (EPEC), verocytotoxigenic E. coli (VTEC) and Campylobacter jejuni. The lactobacilli displayed variations in their antimicrobial activity with few strains showing inhibitory activity against all pathogens. The bifidobacteria displayed higher levels of inhibitory activity against C. jejuni and Cl. difficile than against the E. coli strains. Keywords: Lactobacillus, Bifidobacterium, elderly, gastrointestinal microbiota, inhibition, Clostridium difficile, enteropathogenic Escherichia coli (EPEC), verocytotoxigenic E. coli (VTEC), Campylobacter jejuni.

Relevance: 80.00%

Abstract:

A number of new and newly improved methods for predicting protein structure developed by the Jones–University College London group were used to make predictions for the CASP6 experiment. Structures were predicted with a combination of fold recognition methods (mGenTHREADER, nFOLD, and THREADER) and a substantially enhanced version of FRAGFOLD, our fragment assembly method. Attempts at automatic domain parsing were made using DomPred and DomSSEA, which are based on a secondary structure parsing algorithm and additionally for DomPred, a simple local sequence alignment scoring function. Disorder prediction was carried out using a new SVM-based version of DISOPRED. Attempts were also made at domain docking and “microdomain” folding in order to build complete chain models for some targets.

Relevance: 80.00%

Abstract:

Motivation: In order to enhance genome annotation, the fully automatic fold recognition method GenTHREADER has been improved and benchmarked. The previous version of GenTHREADER consisted of a simple neural network which was trained to combine sequence alignment score, length information and energy potentials derived from threading into a single score representing the relationship between two proteins, as designated by CATH. The improved version incorporates PSI-BLAST searches, which have been jumpstarted with structural alignment profiles from FSSP, and now also makes use of PSIPRED predicted secondary structure and bi-directional scoring in order to calculate the final alignment score. Pairwise potentials and solvation potentials are calculated from the given sequence alignment which are then used as inputs to a multi-layer, feed-forward neural network, along with the alignment score, alignment length and sequence length. The neural network has also been expanded to accommodate the secondary structure element alignment (SSEA) score as an extra input and it is now trained to learn the FSSP Z-score as a measurement of similarity between two proteins. Results: The improvements made to GenTHREADER increase the number of remote homologues that can be detected with a low error rate, implying higher reliability of score, whilst also increasing the quality of the models produced. We find that up to five times as many true positives can be detected with low error rate per query. Total MaxSub score is doubled at low false positive rates using the improved method.

Relevance: 80.00%

Abstract:

The authors examine the housing pathways of young people in the UK in the years 1999 to 2008, and consider the changing nature of these pathways in the run-up to 2020. They employ a highly innovative methodology, which begins with the identification and description of key drivers likely to affect young people’s housing circumstances in the future. The empirical identification and analysis of housing pathways is then achieved using multiple-sequence analysis and cluster analysis of the British Household Panel Survey, contextualised by qualitative interviews with a large sample of young people. The authors describe how the interactions between the meanings, perceptions, and aspirations of young people, and the opportunities and constraints imposed by the drivers, are having a major impact on young people’s housing pathways, resulting in considerable housing policy challenges, particularly in relation to the private rented sector.

Relevance: 40.00%

Abstract:

The self-assembly and hydrogelation properties of two Fmoc-tripeptides [Fmoc = N-(fluorenyl-9-methoxycarbonyl)] are investigated in borate buffer and other basic solutions. A remarkable difference in self-assembly properties is observed comparing Fmoc-VLK(Boc) with Fmoc-K(Boc)LV, both containing K protected by Nε-tert-butyloxycarbonate (Boc). In borate buffer, the former peptide forms highly anisotropic fibrils which show local alignment, and the hydrogels show flow-aligning properties. In contrast, Fmoc-K(Boc)LV forms highly branched fibrils that produce isotropic hydrogels with a much higher modulus (G' > 10^4 Pa), and lower concentration for hydrogel formation. The distinct self-assembled structures are ascribed to conformational differences, as revealed by secondary structure probes (CD, FTIR, Raman spectroscopy) and X-ray diffraction. Fmoc-VLK(Boc) forms well-defined β-sheets with a cross-β X-ray diffraction pattern, whereas Fmoc-K(Boc)LV forms unoriented assemblies with multiple stacked sheets. Interchange of the K and V residues when inverting the tripeptide sequence thus leads to substantial differences in self-assembled structures, suggesting a promising approach to control hydrogel properties.

Relevance: 40.00%

Abstract:

Motivation: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. Results: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.

Relevance: 30.00%

Abstract:

Motivation: Intrinsic protein disorder is functionally implicated in numerous biological roles and is, therefore, ubiquitous in proteins from all three kingdoms of life. Determining the disordered regions in proteins presents a challenge for experimental methods and so recently there has been much focus on the development of improved predictive methods. In this article, a novel technique for disorder prediction, called DISOclust, is described, which is based on the analysis of multiple protein fold recognition models. The DISOclust method is rigorously benchmarked against the top five methods from the CASP7 experiment. In addition, the optimal consensus of the tested methods is determined and the added value from each method is quantified. Results: The DISOclust method is shown to add the most value to a simple consensus of methods, even in the absence of target sequence homology to known structures. A simple consensus of methods that includes DISOclust can significantly outperform all of the previous individual methods tested.
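
One way to picture a "simple consensus of methods" is to average the per-residue disorder scores from several predictors. The sketch below makes that assumption. The abstract does not specify how the consensus was formed, so the averaging rule, the 0.5 threshold, and the name `consensus_disorder` are all hypothetical.

```python
def consensus_disorder(predictions, threshold=0.5):
    """Average per-residue disorder scores from several predictors.

    predictions is a list of score lists, one per method, each holding one
    score in [0, 1] per residue. Returns the mean score per residue and a
    disordered/ordered call per residue (assumed cut-off: threshold).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same residues")
    # zip(*predictions) groups the scores of all methods residue by residue.
    mean_scores = [sum(column) / len(predictions) for column in zip(*predictions)]
    calls = [score >= threshold for score in mean_scores]
    return mean_scores, calls


# Two hypothetical methods scoring a three-residue protein.
scores, calls = consensus_disorder([[0.9, 0.2, 0.6], [0.7, 0.4, 0.2]])
```

A method "adds value" to such a consensus if including its scores improves agreement with experimentally determined disordered regions, which is the comparison the abstract reports.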

Relevance: 30.00%

Abstract:

We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.

Relevance: 30.00%

Abstract:

Diversity in the chloroplast genome of 171 accessions representing the Brassica 'C' (n = 9) genome, including domesticated and wild B. oleracea and nine inter-fertile related wild species, was investigated using six chloroplast SSR (microsatellite) markers. The lack of diversity detected among 105 cultivated and wild accessions of B. oleracea contrasted starkly with that found within its wild relatives. The vast majority of B. oleracea accessions shared a single haplotype, whereas as many as six haplotypes were detected in two wild species, B. villosa Biv. and B. cretica Lam. The SSRs proved to be highly polymorphic across haplotypes, with calculated genetic diversity values (H) of 0.23-0.87. In total, 23 different haplotypes were detected in C genome species, with an additional five haplotypes detected in B. rapa L. (A genome, n = 10) and another in B. nigra L. (B genome, n = 8). The low chloroplast diversity of B. oleracea is not suggestive of multiple domestication events. The predominant B. oleracea haplotype was also common in B. incana Ten. and present in low frequencies in B. villosa, B. macrocarpa Guss., B. rupestris Raf. and B. cretica. The chloroplast SSRs reveal a wealth of diversity within wild Brassica species that will facilitate further evolutionary and phylogeographic studies of this important crop genus.
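
A genetic diversity value such as H can be computed from haplotype or allele frequencies. The sketch below uses the common gene-diversity formula H = 1 - sum(p_i^2), where p_i is the frequency of the i-th variant. The abstract does not say which estimator the authors used, so this formula and the name `gene_diversity` are assumptions for illustration only.

```python
from collections import Counter


def gene_diversity(variants):
    """Gene diversity H = 1 - sum(p_i ** 2) over observed variants.

    variants is a list with one haplotype (or allele) label per accession.
    This is the uncorrected estimator; no small-sample correction is applied.
    """
    total = len(variants)
    if total == 0:
        raise ValueError("at least one accession is required")
    counts = Counter(variants)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


# Made-up labels: a sample dominated by one haplotype versus an even split.
low = gene_diversity(["h1", "h1", "h1", "h1"])
even = gene_diversity(["h1", "h1", "h2", "h2"])
```

Under this formula a sample in which nearly every accession shares one haplotype gives a value close to 0, while many haplotypes at similar frequencies push the value toward 1.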

Relevance: 30.00%

Abstract:

We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.

Relevance: 30.00%

Abstract:

Specific monomer sequences in aromatic copolyimides are recognized through their π-stacking and hydrogen-bonding interactions with a sterically and electronically complementary molecular tweezer. These interactions enable the tweezer molecule to read monomer sequences comprising up to 27 aromatic rings by multiple adjacent binding to neighboring sites on the polymer chain.

Relevance: 30.00%

Abstract:

A novel type of tweezer molecule containing electron-rich 2-pyrenyloxy arms has been designed to exploit intramolecular hydrogen bonding in stabilising a preferred conformation for supramolecular complexation to complementary sequences in aromatic copolyimides. This tweezer-conformation is demonstrated by single-crystal X-ray analyses of the tweezer molecule itself and of its complex with an aromatic diimide model-compound. In terms of its ability to bind selectively to polyimide chains, the new tweezer molecule shows very high sensitivity to sequence effects. Thus, even low concentrations of tweezer relative to diimide units (<2.5 mol%) are sufficient to produce dramatic, sequence-related splittings of the pyromellitimide proton NMR resonances. These induced resonance-shifts arise from ring-current shielding of pyromellitimide protons by the pyrenyloxy arms of the tweezer-molecule, and the magnitude of such shielding is a function of the tweezer-binding constant for any particular monomer sequence. Recognition of both short-range and long-range sequences is observed, the latter arising from cumulative ring-current shielding of diimide protons by tweezer molecules binding at multiple adjacent sites on the copolymer chain.

Relevance: 30.00%

Abstract:

Pyrene-based molecular tweezers show sequence-specific binding to aromatic polyimides through sterically-controlled donor-acceptor π-stacking and hydrogen bonding; ¹H NMR spectra of tweezer-complexes with polyimides having different sequence-restrictions show conclusively that the detection of long range sequence-information results from multiple tweezer-binding at adjacent imide residues.