963 results for Multiple sequence alignment
Abstract:
Motivation: Intrinsic protein disorder is functionally implicated in numerous biological roles and is, therefore, ubiquitous in proteins from all three kingdoms of life. Determining the disordered regions in proteins presents a challenge for experimental methods and so recently there has been much focus on the development of improved predictive methods. In this article, a novel technique for disorder prediction, called DISOclust, is described, which is based on the analysis of multiple protein fold recognition models. The DISOclust method is rigorously benchmarked against the top five methods from the CASP7 experiment. In addition, the optimal consensus of the tested methods is determined and the added value from each method is quantified. Results: The DISOclust method is shown to add the most value to a simple consensus of methods, even in the absence of target sequence homology to known structures. A simple consensus of methods that includes DISOclust can significantly outperform all of the previous individual methods tested.
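The abstract does not say how the "simple consensus of methods" is formed. As a minimal sketch, assuming each method returns a per-residue disorder score between 0 and 1 and that the consensus is an unweighted mean followed by a cutoff, it might look like the following. The function name, the method labels and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import Dict, List


def consensus_disorder(scores: Dict[str, List[float]],
                       threshold: float = 0.5) -> List[bool]:
    """Label residues as disordered from the mean score across methods.

    Assumption: an unweighted mean with a fixed cutoff; the abstract
    does not specify the actual consensus scheme.
    """
    methods = list(scores.values())
    if not methods:
        raise ValueError("at least one method is required")
    length = len(methods[0])
    if any(len(m) != length for m in methods):
        raise ValueError("all methods must score the same residues")
    # zip(*methods) yields one tuple of scores per residue position.
    means = [sum(column) / len(methods) for column in zip(*methods)]
    return [mean >= threshold for mean in means]


# Two hypothetical predictors scoring a three-residue fragment.
example = {
    "method_a": [0.9, 0.2, 0.6],
    "method_b": [0.7, 0.4, 0.2],
}
labels = consensus_disorder(example)
```

Quantifying the added value of one method would then amount to comparing the accuracy of the consensus with and without that method's scores.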
Abstract:
We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.
Abstract:
Diversity in the chloroplast genome of 171 accessions representing the Brassica 'C' (n = 9) genome, including domesticated and wild B. oleracea and nine inter-fertile related wild species, was investigated using six chloroplast SSR (microsatellite) markers. The lack of diversity detected among 105 cultivated and wild accessions of B. oleracea contrasted starkly with that found within its wild relatives. The vast majority of B. oleracea accessions shared a single haplotype, whereas as many as six haplotypes were detected in two wild species, B. villosa Biv. and B. cretica Lam. The SSRs proved to be highly polymorphic across haplotypes, with calculated genetic diversity values (H) of 0.23-0.87. In total, 23 different haplotypes were detected in C genome species, with an additional five haplotypes detected in B. rapa L. (A genome, n = 10) and another in B. nigra L. (B genome, n = 8). The low chloroplast diversity of B. oleracea is not suggestive of multiple domestication events. The predominant B. oleracea haplotype was also common in B. incana Ten. and present in low frequencies in B. villosa, B. macrocarpa Guss., B. rupestris Raf. and B. cretica. The chloroplast SSRs reveal a wealth of diversity within wild Brassica species that will facilitate further evolutionary and phylogeographic studies of this important crop genus.
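The abstract reports genetic diversity values (H) per marker without giving the estimator. A common choice is Nei's gene diversity, H = 1 - sum(p_i^2) over allele frequencies p_i, optionally scaled by n/(n-1) for small samples. The sketch below assumes that definition; the function name and the allele labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def gene_diversity(alleles: Sequence[Hashable], unbiased: bool = False) -> float:
    """Nei's gene diversity for one marker, H = 1 - sum(p_i ** 2).

    Assumption: this is the estimator behind the reported H values;
    the abstract does not state which one was used.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele observation is required")
    counts = Counter(alleles)
    h = 1.0 - sum((count / n) ** 2 for count in counts.values())
    if unbiased:
        if n < 2:
            raise ValueError("the unbiased estimator needs n >= 2")
        h *= n / (n - 1)  # small-sample correction
    return h


# Hypothetical allele sizes (bp) scored at one chloroplast SSR locus.
h_locus = gene_diversity([101, 101, 103, 103])
```

A locus fixed for a single allele gives H = 0, consistent with the low diversity described for B. oleracea.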
Abstract:
We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
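In a mixture model of this kind, each site's likelihood is usually a weighted sum over the component models, and the alignment log-likelihood is the sum of the logs of those per-site values. The sketch below illustrates only that combination step, assuming the per-site likelihoods under each component have already been computed on a fixed tree. The function name and the numbers are hypothetical; this is not the authors' program.

```python
import math
from typing import Sequence


def mixture_log_likelihood(site_likelihoods: Sequence[Sequence[float]],
                           weights: Sequence[float]) -> float:
    """Sum over sites of log(sum_j w_j * L_ij).

    site_likelihoods[i][j] is the likelihood of site i under component
    model j (assumed precomputed); weights[j] are mixture weights.
    """
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixture weights must sum to 1")
    total = 0.0
    for per_component in site_likelihoods:
        if len(per_component) != len(weights):
            raise ValueError("one likelihood per component is required")
        total += math.log(sum(w * l for w, l in zip(weights, per_component)))
    return total


# Two sites, two hypothetical component models.
likelihoods = [[0.2, 0.4], [0.1, 0.3]]
mixed = mixture_log_likelihood(likelihoods, [0.5, 0.5])
# Putting all weight on one component recovers the homogeneous model.
homogeneous = mixture_log_likelihood(likelihoods, [1.0, 0.0])
```

Because the homogeneous model is the special case with a single non-zero weight, a mixture fitted by maximum likelihood cannot score below it, which matches the claim in the abstract.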
Abstract:
Specific monomer sequences in aromatic copolyimides are recognized through their π-stacking and hydrogen-bonding interactions with a sterically and electronically complementary molecular tweezer. These interactions enable the tweezer molecule to read monomer sequences comprising up to 27 aromatic rings by multiple adjacent binding to neighboring sites on the polymer chain.
Abstract:
A novel type of tweezer molecule containing electron-rich 2-pyrenyloxy arms has been designed to exploit intramolecular hydrogen bonding in stabilising a preferred conformation for supramolecular complexation to complementary sequences in aromatic copolyimides. This tweezer-conformation is demonstrated by single-crystal X-ray analyses of the tweezer molecule itself and of its complex with an aromatic diimide model-compound. In terms of its ability to bind selectively to polyimide chains, the new tweezer molecule shows very high sensitivity to sequence effects. Thus, even low concentrations of tweezer relative to diimide units (<2.5 mol%) are sufficient to produce dramatic, sequence-related splittings of the pyromellitimide proton NMR resonances. These induced resonance-shifts arise from ring-current shielding of pyromellitimide protons by the pyrenyloxy arms of the tweezer-molecule, and the magnitude of such shielding is a function of the tweezer-binding constant for any particular monomer sequence. Recognition of both short-range and long-range sequences is observed, the latter arising from cumulative ring-current shielding of diimide protons by tweezer molecules binding at multiple adjacent sites on the copolymer chain.
Abstract:
Pyrene-based molecular tweezers show sequence-specific binding to aromatic polyimides through sterically-controlled donor-acceptor π-stacking and hydrogen bonding; ¹H NMR spectra of tweezer-complexes with polyimides having different sequence-restrictions show conclusively that the detection of long-range sequence-information results from multiple tweezer-binding at adjacent imide residues.
Abstract:
We investigated infants' sensitivity to spatiotemporal structure. In Experiment 1, circles appeared in a statistically defined spatial pattern. At test, 11-month-olds, but not 8-month-olds, looked longer at a novel spatial sequence. Experiment 2 presented different color/shape stimuli, but only the location sequence was violated during test; 8-month-olds preferred the novel spatial structure, but 5-month-olds did not. In Experiment 3, the locations but not color/shape pairings were constant at test; 5-month-olds showed a novelty preference. Experiment 4 examined "online learning": We recorded eye movements of 8-month-olds watching a spatiotemporal sequence. Saccade latencies to predictable locations decreased. We argue that temporal order statistics involving informative spatial relations become available to infants during the first year after birth, assisted by multiple cues.
Abstract:
The alignment of model amyloid peptide YYKLVFFC is investigated in bulk and at a solid surface using a range of spectroscopic methods employing polarized radiation. The peptide is based on a core sequence of the amyloid β (Aβ) peptide, KLVFF. The attached tyrosine and cysteine units are exploited to yield information on alignment and possible formation of disulfide or dityrosine links. Polarized Raman spectroscopy on aligned stalks provides information on tyrosine orientation, which complements data from linear dichroism (LD) on aqueous solutions subjected to shear in a Couette cell. LD provides a detailed picture of alignment of peptide strands and aromatic residues and was also used to probe the kinetics of self-assembly. This suggests initial association of phenylalanine residues, followed by subsequent registry of strands and orientation of tyrosine residues. X-ray diffraction (XRD) data from aligned stalks are used to extract orientational order parameters from the 0.48 nm reflection in the cross-β pattern, from which an orientational distribution function is obtained. X-ray diffraction on solutions subject to capillary flow confirmed orientation in situ at the level of the cross-β pattern. The information on fibril and tyrosine orientation from polarized Raman spectroscopy is compared with results from NEXAFS experiments on samples prepared as films on silicon. This indicates fibrils are aligned parallel to the surface, with phenyl ring normals perpendicular to the surface. Possible disulfide bridging leading to peptide dimer formation was excluded by Raman spectroscopy, whereas dityrosine formation was probed by fluorescence experiments and was found not to occur except under alkaline conditions. Congo red binding was found not to influence the cross-β XRD pattern.
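The abstract does not define the orientational order parameters it extracts. The most widely used one is the second-order Legendre average, P2 = <(3 cos^2(theta) - 1) / 2>, where theta is the angle between a molecular axis and the alignment direction. The sketch below assumes that definition and computes it from a plain list of angles rather than from an XRD intensity profile; the function name and inputs are hypothetical.

```python
import math
from typing import Iterable


def order_parameter_p2(angles_deg: Iterable[float]) -> float:
    """Mean second-order Legendre polynomial of cos(theta).

    Assumption: unweighted angle samples; an XRD analysis would instead
    weight each angle by the measured azimuthal intensity.
    """
    values = [(3.0 * math.cos(math.radians(a)) ** 2 - 1.0) / 2.0
              for a in angles_deg]
    if not values:
        raise ValueError("at least one angle is required")
    return sum(values) / len(values)


# Perfect alignment along the director gives P2 = 1.
p2_aligned = order_parameter_p2([0.0, 0.0, 0.0])
# Axes all perpendicular to the director give P2 = -0.5.
p2_perpendicular = order_parameter_p2([90.0])
```

An isotropic sample averages to P2 = 0, so values between 0 and 1 indicate partial alignment.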
Abstract:
Sequence-specific binding is demonstrated between pyrene-based tweezer molecules and soluble, high molar mass copolyimides. The binding involves complementary π-π stacking interactions, polymer chain-folding, and hydrogen bonding and is extremely sensitive to the steric environment around the pyromellitimide binding-site. A detailed picture of the intermolecular interactions involved has been obtained through single-crystal X-ray studies of tweezer complexes with model diimides. Ring-current magnetic shielding of polyimide protons by the pyrene "arms" of the tweezer molecule induces large complexation shifts of the corresponding ¹H NMR resonances, enabling specific triplet sequences to be identified by their complexation shifts. Extended comonomer sequences (triplets of triplets in which the monomer residues differ only by the presence or absence of a methyl group) can be "read" by a mechanism which involves multiple binding of tweezer molecules to adjacent diimide residues within the copolymer chain. The adjacent-binding model for sequence recognition has been validated by two conceptually different sets of tweezer binding experiments. One approach compares sequence-recognition events for copolyimides having either restricted or unrestricted triple-triplet sequences, and the other makes use of copolymers containing both strongly binding and completely nonbinding diimide residues. In all cases the nature and relative proportions of triple-triplet sequences predicted by the adjacent-binding model are fully consistent with the observed ¹H NMR data.
Abstract:
The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed.
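The success criterion described above (correct domain number, and boundaries within ±20 residues) can be sketched as a small check. This is an illustration under assumptions, not the authors' scoring code: boundaries are taken to be residue indices, predicted and observed boundaries are paired in sequence order, and the function name is hypothetical.

```python
from typing import Sequence


def prediction_correct(predicted: Sequence[int],
                       observed: Sequence[int],
                       tolerance: int = 20) -> bool:
    """Return True if domain count and all boundaries match.

    An equal number of boundaries implies an equal number of domains.
    Assumption: boundaries are compared pairwise after sorting.
    """
    if len(predicted) != len(observed):
        return False  # wrong domain number
    return all(abs(p - o) <= tolerance
               for p, o in zip(sorted(predicted), sorted(observed)))


# A two-domain chain with one observed boundary at residue 115.
hit = prediction_correct([100], [115])
miss = prediction_correct([100], [125])
```

Under this check a prediction can get the domain number right and still fail on boundary placement, which is consistent with the gap between the 73.3% and 24% figures reported above.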
Abstract:
Escherichia coli, the most common cause of bacteraemia in humans in the UK, can also cause serious diseases in animals. However, the population structure, virulence and antimicrobial resistance genes of those from extraintestinal organs of livestock animals are poorly characterised. The aims of this study were to investigate the diversity of these isolates from livestock animals and to understand if there was any correlation between the virulence and antimicrobial resistance genes and the genetic backbone of the bacteria and if these isolates were similar to those isolated from humans. Here 39 E. coli isolates from liver (n=31), spleen (n=5) and blood (n=3) of cattle (n=34), sheep (n=3), chicken (n=1) and pig (n=1) were assigned to 19 serogroups with O8 being the most common (n=7), followed by O101, O20 (both n=3) and O153 (n=2). They belong to 29 multi-locus sequence types, 20 clonal complexes with ST23 (n=7), ST10 (n=6), ST117 and ST155 (both n=3) being most common and were distributed among phylogenetic group A (n=16), B1 (n=12), B2 (n=2) and D (n=9). The pattern of a subset of putative virulence genes was different in almost all isolates. No correlation between serogroups, animal hosts, MLST types, virulence and antimicrobial resistance genes was identified. The distributions of clonal complexes and virulence genes were similar to other extraintestinal or commensal E. coli from humans and other animals, suggesting a zoonotic potential. The diverse and various combinations of virulence genes implied that the infections were caused by different mechanisms and infection control will be challenging.
Abstract:
AIMS/HYPOTHESIS: The PPARGC1A gene coactivates multiple nuclear transcription factors involved in cellular energy metabolism and vascular stasis. In the present study, we genotyped 35 tagging polymorphisms to capture all common PPARGC1A nucleotide sequence variations and tested for association with metabolic and cardiovascular traits in 2,101 Danish and Estonian boys and girls from the European Youth Heart Study, a multicentre school-based cross-sectional cohort study. METHODS: Fasting plasma glucose concentrations, anthropometric variables and blood pressure were measured. Habitual physical activity and aerobic fitness were objectively assessed using uniaxial accelerometry and a maximal aerobic exercise stress test on a bicycle ergometer, respectively. RESULTS: In adjusted models, nominally significant associations were observed for BMI (rs10018239, p = 0.039), waist circumference (rs7656250, p = 0.012; rs8192678 [Gly482Ser], p = 0.015; rs3755863, p = 0.02; rs10018239, beta = -0.01 cm per minor allele copy, p = 0.043), systolic blood pressure (rs2970869, p = 0.018) and fasting glucose concentrations (rs11724368, p = 0.045). Stronger associations were observed for aerobic fitness (rs7656250, p = 0.005; rs13117172, p = 0.008) and fasting glucose concentrations (rs7657071, p = 0.002). None remained significant after correcting for the number of statistical comparisons. We proceeded by testing for gene × physical activity interactions for the polymorphisms that showed nominal evidence of association in the main effect models. None of these tests was statistically significant. CONCLUSIONS/INTERPRETATION: Variants at PPARGC1A may influence several metabolic traits in this European paediatric cohort. However, variation at PPARGC1A is unlikely to have a major impact on cardiovascular or metabolic health in these children.
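The abstract does not name the multiple-testing correction or the total number of comparisons. As an illustration only, the sketch below assumes a Bonferroni correction and counts just the 35 polymorphisms for a single trait, which understates the real number of tests. The function name is hypothetical; the three p-values are the strongest ones quoted above.

```python
from typing import List, Optional, Sequence


def bonferroni_significant(p_values: Sequence[float],
                           alpha: float = 0.05,
                           n_tests: Optional[int] = None) -> List[bool]:
    """Flag p-values that pass a Bonferroni-adjusted threshold.

    Assumption: Bonferroni is used here for illustration; the study's
    actual correction method is not stated in the abstract.
    """
    m = n_tests if n_tests is not None else len(p_values)
    if m < 1:
        raise ValueError("the number of tests must be at least 1")
    threshold = alpha / m
    return [p <= threshold for p in p_values]


# Strongest reported associations, tested against 0.05 / 35 (about 0.0014).
flags = bonferroni_significant([0.002, 0.005, 0.008], n_tests=35)
```

Even under this lenient count, the smallest reported p-value (0.002) does not pass the adjusted threshold, in line with the statement that no association survived correction.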
Abstract:
OBJECTIVES: In 2009, CTX-M Enterobacteriaceae and Salmonella isolates were recovered from a UK pig farm, prompting studies into the dissemination of the resistance and to establish any relationships between the isolates. METHODS: PFGE was used to elucidate clonal relationships between isolates whilst plasmid profiling, restriction analysis, sequencing and PCR were used to characterize the CTX-M-harbouring plasmids. RESULTS: Escherichia coli, Klebsiella pneumoniae and Salmonella 4,5,12:i:- and Bovismorbificans resistant to cefotaxime (n = 65) were recovered and 63 were shown by PCR to harbour a group 1 CTX-M gene. The harbouring hosts were diverse, but the group 1 CTX-M plasmids were common. Three sequenced CTX-M plasmids from E. coli, K. pneumoniae and Salmonella enterica serotype 4,5,12:i:- were identical except for seven mutations and highly similar to IncI1 plasmid ColIb-P9. Two antimicrobial resistance regions were identified: one inserted upstream of yacABC harbouring ISCR2 transposases, sul2 and floR; and the other inserted within shfB of the pilV shufflon harbouring the ISEcp1 transposase followed by blaCTX-M-1. CONCLUSIONS: These data suggest that an ST108 IncI1 plasmid encoding a blaCTX-M-1 gene had disseminated across multiple genera on this farm, an example of horizontal gene transfer of the blaCTX-M-1 gene.
Abstract:
This article focuses on the identity accounts of a group of Chinese children who attend a heritage language school. Bakhtin’s concepts of ideological becoming, and authoritative and internally persuasive discourse, frame our exploration. Taking a dialogic view of language and learning raises questions about schools as socializing spaces and ideological environments. The children in this inquiry articulate their own ideological patterns of alignment. Those patterns, and the children's code switching, seem mostly determined by their socialization, language affiliations, friendship patterns, family situations, and legal access to particular schools. Five patterns of ideological becoming are presented. The children’s articulated preferences indicate that they assert their own ideological stances towards prevailing authoritative discourses, give voice to their own sense of agency and internally persuasive discourses, and respond to the ideological resources that mediate their linguistic repertoires.