15 results for Alignments.

in CentAUR: Central Archive University of Reading - UK


Relevance: 20.00%

Abstract:

MOTIVATION: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering- or consensus-based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however, they are often CPU intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ - a novel MQAP that compares 3D models of proteins without the need for CPU-intensive structural alignments by utilising the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead. RESULTS: The ModFOLDclustQ method is competitive with leading clustering-based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over 5 times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering-based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction. AVAILABILITY: The ModFOLDclustQ and ModFOLDclust2 methods are available to download from: http://www.reading.ac.uk/bioinf/downloads/ CONTACT: l.j.mcguffin@reading.ac.uk.
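The idea of comparing two models without a structural alignment can be illustrated with a small sketch. It assumes a Gaussian-weighted comparison of internal C-alpha distances, which is one common way to define a Q-style score; the function name `q_score`, the `sigma` parameter and the exact formula are illustrative assumptions and are not taken from the ModFOLDclustQ source.

```python
import math


def q_score(coords_a, coords_b, sigma=1.0):
    """Alignment-free similarity between two models of the same sequence.

    coords_a, coords_b: lists of (x, y, z) C-alpha coordinates in the same
    residue order. The Gaussian form and sigma are assumptions made for
    illustration. Returns a value in (0, 1]; 1.0 means every internal
    distance matches.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("models must cover the same residues")
    n = len(coords_a)
    if n < 2:
        raise ValueError("need at least two residues")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Internal distances do not change under rotation or translation,
            # so no superposition of the two models is needed.
            dist_a = math.dist(coords_a[i], coords_a[j])
            dist_b = math.dist(coords_b[i], coords_b[j])
            total += math.exp(-((dist_a - dist_b) ** 2) / (2 * sigma ** 2))
            pairs += 1
    return total / pairs
```

Because only intra-model distances are used, a translated or rotated copy of a model scores 1.0 against the original, which is why this kind of measure avoids the cost of repeated structural superposition.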

Relevance: 20.00%

Abstract:

Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (~30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.

Relevance: 20.00%

Abstract:

This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.

Relevance: 10.00%

Abstract:

We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.

Relevance: 10.00%

Abstract:

The 3' untranslated regions (3'UTRs) of flaviviruses are reviewed and analyzed in relation to short sequences conserved as direct repeats (DRs). Previously, alignments of the 3'UTRs have been constructed for three of the four recognized flavivirus groups, namely mosquito-borne, tick-borne, and nonclassified flaviviruses (MBFV, TBFV, and NCFV, respectively). This revealed (1) six long repeat sequences (LRSs) in the 3'UTR and open-reading frame (ORF) of the TBFV, (2) duplication of the 3'UTR of the NCFV by intramolecular recombination, and (3) the possibility of a common origin for all DRs within the MBFV. We have now extended this analysis and review it in the context of all previous published analyses. This has been achieved by constructing a robust alignment between all flaviviruses using the published DRs and secondary RNA structures as "anchors" to reveal additional homologies along the 3'UTR. This approach identified nucleotide regions within the MBFV, NKV (no-known vector viruses), and NCFV 3'UTRs that are homologous to different LRSs in the TBFV 3'UTR and ORF. The analysis revealed that some of the DRs and secondary RNA structures described individually within each flavivirus group share common evolutionary origins. The 3'UTR of flaviviruses, and possibly the ORF, therefore probably evolved through multiple duplication of an RNA domain, homologous to the LRS previously identified only in the TBFV. The short DRs in all virus groups appear to represent the evolutionary remnants of these domains rather than resulting from new duplications. The relevance of these flavivirus DRs to evolution, diversity, 3'UTR enhancer function, and virus transmission is reviewed.

Relevance: 10.00%

Abstract:

Flavivirus replication is mediated by interactions between complementary ssRNA sequences of the 5'- and 3'-termini that form dsRNA cyclisation stems or panhandles, varying in length, sequence and specific location in the mosquito-borne, tick-borne, non-vectored and non-classified flaviviruses. In this manuscript we manually aligned the flavivirus 5'UTRs and adjacent capsid genes and revealed significantly more homology than has hitherto been identified. Analysis of the alignments revealed that the panhandles represent evolutionary remnants of a long cyclisation domain that probably emerged through duplication of one of the UTR termini.

Relevance: 10.00%

Abstract:

Summary: The program LVB seeks parsimonious phylogenies from nucleotide alignments, using the simulated annealing heuristic. LVB runs fast and gives high quality results.

Relevance: 10.00%

Abstract:

Patterns of substitution in chloroplast encoded trnL-F regions were compared between species of Actaea (Ranunculales), Digitalis (Scrophulariales), Drosera (Caryophyllales), Panicoideae (Poales), the small chromosome species clade of Pelargonium (Geraniales), each representing a different order of flowering plants, and Huperzia (Lycopodiales). In total, the study included 265 taxa, each with >900-bp sequences, totaling 0.24 Mb. Both pairwise and phylogeny-based comparisons were used to assess nucleotide substitution patterns. In all six groups, we found that transition/transversion ratios, as estimated by maximum likelihood on most-parsimonious trees, ranged between 0.8 and 1.0 for ingroups. These values occurred both at low sequence divergences, where substitutional saturation, i.e., multiple substitutions having occurred at the same (homologous) nucleotide position, was not expected, and at higher levels of divergence. This suggests that the angiosperm trnL-F regions evolve in a pattern different from that generally observed for nuclear and animal mtDNA (transition/transversion ratio ≥ 2). Transition/transversion ratios in the intron and the spacer region differed in all alignments compared, yet base compositions between the regions were highly similar in all six groups. A<->C transversions were significantly less frequent than the other four substitution types. This correlates with results from studies on fidelity mechanisms in DNA replication that predict A<->T and G<->C transversions to be least likely to occur. It therefore strengthens confidence in the link between mutation bias at the polymerase level and the actual fixation of substitutions as recorded on evolutionary trees, and concomitantly, in the neutrality of nucleotide substitutions as phylogenetic markers.
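The pairwise side of such a comparison can be sketched as a simple count of observed differences between two aligned sequences. This is only an illustration: the study estimated ratios by maximum likelihood on trees, whereas the hypothetical `ts_tv_counts` function below counts raw differences and does not correct for multiple substitutions at the same site.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
NUCLEOTIDES = PURINES | PYRIMIDINES


def ts_tv_counts(seq1, seq2):
    """Count observed transitions and transversions between two aligned
    DNA sequences of equal length.

    Columns containing a gap or an ambiguity code are skipped.
    Returns a (transitions, transversions) tuple.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must come from the same alignment")
    transitions = 0
    transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in NUCLEOTIDES or b not in NUCLEOTIDES:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition;
        # any purine<->pyrimidine change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions
```

Dividing the first count by the second gives an uncorrected transition/transversion ratio for the pair, provided at least one transversion was observed.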

Relevance: 10.00%

Abstract:

An alternative approach to understanding innovation is made using two intersecting ideas. The first is that successful innovation requires consideration of the social and organizational contexts in which it is located. The complex context of construction work is characterized by inter-organizational collaboration, a project-based approach and power distributed amongst collaborating organizations. The second is that innovations can be divided into two modes: ‘bounded’, where the implications of innovation are restricted within a single, coherent sphere of influence, and ‘unbounded’, where the effects of implementation spill over beyond this. Bounded innovations are adequately explained within the construction literature. However, less discussed are unbounded innovations, where many firms' collaboration is required for successful implementation, even though many innovations can be considered unbounded within construction's inter-organizational context. It is argued that unbounded innovations require an approach to understand and facilitate the interactions both within a range of actors and between the actors and technological artefacts. The insights from a sociology of technology approach can be applied to the multiplicity of negotiations and alignments that constitute the implementation of unbounded innovation. The utility of concepts from the sociology of technology, including ‘system building’ and ‘heterogeneous engineering’, is demonstrated by applying them to an empirical study of an unbounded innovation on a major construction project (the new terminal at Heathrow Airport, London, UK). This study suggests that ‘system building’ contains outcomes that are not only transformations of practices, processes and systems, but also the potential transformation of technologies themselves.

Relevance: 10.00%

Abstract:

If secondary structure predictions are to be incorporated into fold recognition methods, an assessment of the effect of specific types of errors in predicted secondary structures on the sensitivity of fold recognition should be carried out. Here, we present a systematic comparison of different secondary structure prediction methods by measuring frequencies of specific types of error. We carry out an evaluation of the effect of specific types of error on secondary structure element alignment (SSEA), a baseline fold recognition method. The results of this evaluation indicate that missing out whole helix or strand elements, or predicting the wrong type of element, is more detrimental than predicting the wrong lengths of elements or overpredicting helix or strand. We also suggest that SSEA scoring is an effective method for assessing accuracy of secondary structure prediction and perhaps may also provide a more appropriate assessment of the “usefulness” and quality of predicted secondary structure, if secondary structure alignments are to be used in fold recognition.

Relevance: 10.00%

Abstract:

Motivation: Modelling the 3D structures of proteins can often be enhanced if more than one fold template is used during the modelling process. However, in many cases, this may also result in poorer model quality for a given target or alignment method. There is a need for modelling protocols that can both consistently and significantly improve 3D models and provide an indication of when models might not benefit from the use of multiple target-template alignments. Here, we investigate the use of both global and local model quality prediction scores produced by ModFOLDclust2, to improve the selection of target-template alignments for the construction of multiple-template models. Additionally, we evaluate clustering the resulting population of multi- and single-template models for the improvement of our IntFOLD-TS tertiary structure prediction method. Results: We find that using accurate local model quality scores to guide alignment selection is the most consistent way to significantly improve models for each of the sequence to structure alignment methods tested. In addition, using accurate global model quality for re-ranking alignments, prior to selection, further improves the majority of multi-template modelling methods tested. Furthermore, subsequent clustering of the resulting population of multiple-template models significantly improves the quality of selected models compared with the previous version of our tertiary structure prediction method, IntFOLD-TS.

Relevance: 10.00%

Abstract:

Whole-genome transcriptome profiling is revealing how biological systems are regulated at the transcriptional level. This study reports the development of a robust method to profile and compare the transcriptomes of two nonmodel plant species, Thlaspi caerulescens, a zinc (Zn) hyperaccumulator, and Thlaspi arvense, a nonhyperaccumulator, using Affymetrix Arabidopsis thaliana ATH1-121501 GeneChip® arrays (Affymetrix, Santa Clara, CA, USA). Transcript abundance was quantified in the shoots of agar- and compost-grown plants of both species. Analyses were optimized using a genomic DNA (gDNA)-based probe-selection strategy based on the hybridization efficiency of Thlaspi gDNA with corresponding A. thaliana probes. In silico alignments of GeneChip® probes with Thlaspi gene sequences, and quantitative real-time PCR, confirmed the validity of this approach. Approximately 5000 genes were differentially expressed in the shoots of T. caerulescens compared with T. arvense, including genes involved in Zn transport and compartmentalization. Future functional analyses of genes identified as differentially expressed in the shoots of these closely related species will improve our understanding of the molecular mechanisms of Zn hyperaccumulation.

Relevance: 10.00%

Abstract:

In the eighteenth century the printing of Greek texts continued to be central to scholarship and discourse. The typography of Greek texts could be characterised as a continuation of French models from the sixteenth century, with a gradual dilution of the complexity of ligatures and abbreviations, mostly through printers in the Low Countries. In Britain, Greek printing was dominated by the university presses, which conservatively reproduced the continental models, exemplified by Oxford's Fell types, which were Dutch adaptations of earlier French models. Hindsight allows us to identify a meaningful development in the Greek types cut by Alexander Wilson for the Foulis Press in Glasgow, but we can argue that in the middle of the eighteenth century, when Baskerville was considering Greek printing, the typographic environment was ripe for a new style of Greek types. The opportunity to cut the types for a New Testament (in a twin edition that included a generous octavo and a large quarto version) would seem perfect for showcasing Baskerville's capacity for innovation. His Greek type maintained the cursive ductus of earlier models, but abandoned complex ligatures and any hint of scribal flourish. He homogenised the modulation of the letter strokes and the treatment of terminals, and normalised the horizontal alignments of all letters. Although the strokes are in some letters too delicate, the narrow set of the style composes a consistent, uniform texture that is a clean break from contemporaneous models. The argument is made that this is the first Greek typeface that can be described as fully typographic in the context of the technology of the time. It sets a pattern that was to be followed, without acknowledgement, by Richard Porson nearly half a century later. The typeface received little praise from typographic historians, and was condemned by Victor Scholderer in his retrospective of Greek typography. 
A survey of typeface reviews in the surrounding decades establishes that the commentators were mostly reproducing the views of an arbitrary typographic orthodoxy, for which only types with direct references to Renaissance models were acceptable. In these comments we detect a bias against someone considered an arriviste in the scholarly printing establishment, as well as a conservative attitude to typographic innovation.

Relevance: 10.00%

Abstract:

Evolved resistance to fungicides is a major problem limiting our ability to control agricultural, medical and veterinary pathogens and is frequently associated with substitutions in the amino acid sequence of the target protein. The convention for describing amino-acid substitutions is to cite the wild type amino acid, the codon number and the new amino acid, using the one letter amino acid code. It has frequently been observed that orthologous amino acid mutations have been selected in different species by fungicides from the same mode of action class, but the amino acids have different numbers. These differences in numbering arise from the different lengths of the proteins in each species. The purpose of the current paper is to propose a system for unifying the labelling of amino acids in fungicide target proteins. To do this we have produced alignments between fungicide target proteins of relevant species fitted to a well-studied “archetype” species. Orthologous amino acids in all species are then assigned numerical “labels” based on the position of the amino acid in the archetype protein.
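The labelling step described above can be sketched as a walk along a pairwise alignment, carrying the archetype's residue number over to the aligned residue in another species. The function name `archetype_labels`, the `-` gap character and the toy sequences in the usage note are hypothetical and used only for illustration.

```python
def archetype_labels(archetype_aln, target_aln, gap="-"):
    """Label residues of a target protein with archetype positions.

    Both arguments are rows of the same alignment (equal length, gaps
    marked with `gap`). Returns a list of tuples
    (target_position, target_residue, archetype_position, archetype_residue)
    for every column in which both sequences have a residue.
    Positions are 1-based, as in the usual substitution notation.
    """
    if len(archetype_aln) != len(target_aln):
        raise ValueError("rows must come from the same alignment")
    labels = []
    arch_pos = 0
    target_pos = 0
    for arch_res, target_res in zip(archetype_aln, target_aln):
        # Advance each sequence's own counter only when it has a residue.
        if arch_res != gap:
            arch_pos += 1
        if target_res != gap:
            target_pos += 1
        if arch_res != gap and target_res != gap:
            labels.append((target_pos, target_res, arch_pos, arch_res))
    return labels
```

With the made-up rows `"MKT-AYL"` (archetype) and `"M-TGAFL"` (target), the target's second residue, T, is aligned to archetype position 3 and so receives the label 3, even though its own codon number is 2.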