19 results for SEQUENCE EVOLUTION
in CentAUR: Central Archive University of Reading - UK
Abstract:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon, known as heterotachy, can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and that it serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
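Note: the per-site summation described in this abstract can be illustrated with a minimal sketch. It assumes that each branch-length set k carries a mixture weight w_k (weights summing to 1) and that the per-site log-likelihoods under each set have already been computed elsewhere, for example by a pruning algorithm on the fixed topology. The function name, the argument layout and the use of weights are assumptions made for illustration, not details taken from the authors' program.

```python
import math


def mixture_log_likelihood(site_loglik, weights):
    """Total log-likelihood of an alignment under a branch-length mixture.

    site_loglik[k][i] is the log-likelihood of site i under branch-length
    set k (hypothetical input, computed elsewhere on the same topology).
    weights[k] is the assumed mixture weight of branch-length set k.
    """
    if len(site_loglik) != len(weights):
        raise ValueError("one weight is required per branch-length set")
    if not math.isclose(sum(weights), 1.0, abs_tol=1e-9):
        raise ValueError("mixture weights must sum to 1")

    n_sites = len(site_loglik[0])
    total = 0.0
    for i in range(n_sites):
        # log(w_k) + log L_k(site i) for every set with non-zero weight
        terms = [
            math.log(w) + component[i]
            for w, component in zip(weights, site_loglik)
            if w > 0.0
        ]
        # log-sum-exp: sum the likelihoods over sets without underflow
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


# Two branch-length sets, three sites; the values are made up.
example = mixture_log_likelihood(
    [[-4.1, -2.0, -7.3], [-3.9, -5.5, -2.2]],
    [0.6, 0.4],
)
```

Each site contributes the weighted sum of its likelihoods over the sets, so a site that fits one branch-length set poorly can still be explained well by another. The sum is taken in log space because per-site likelihoods on real alignments are often too small to multiply directly.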
Abstract:
We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.
Abstract:
Inversions breaking the 1041 bp int1h-1 or the 9.5-kb int22h-1 sequence of the F8 gene cause hemophilia A in 1/30,000 males. These inversions are due to homologous recombination between the above sequences and their inverted copies on the same DNA molecule, respectively, int1h-2 and int22h-2 or int22h-3. We find that (1) int1h and int22h duplicated more than 25 million years ago; (2) the identity of the copies (>99%) of these sequences in humans and other primates is due to gene conversion; (3) gene conversion is most frequent in the internal regions of int22h; (4) breakpoints of int22h-related inversions also tend to involve the internal regions of int22h; (5) sequence variations in a sample of human X chromosomes defined eight haplotypes of int22h-1 and 27 of int22h-2 plus int22h-3; (6) the latter two sequences, which lie, respectively, 500 and 600 kb telomeric to int22h-1 are five-fold more identical when in cis than when in trans, thus suggesting that gene conversion may be predominantly intrachromosomal; (7) int1h, int22h, and flanking sequences evolved at a rate of about 0.1% substitutions per million years during the divergence between humans and other primates, except for int1h during the human-chimpanzee divergence, when its rate of evolution was significantly lower. This is reminiscent of the slower evolution of palindrome arms in the male specific regions of the Y chromosome and we propose, as an explanation, that intrachromosomal gene conversion and cosegregation of the duplicated regions favors retention of the ancestral sequence and thus reduces the evolution rate.
Abstract:
The cupin superfamily is a group of functionally diverse proteins that are found in all three kingdoms of life, Archaea, Eubacteria, and Eukaryota. These proteins have a characteristic signature domain comprising two histidine-containing motifs separated by an intermotif region of variable length. This domain consists of six beta strands within a conserved beta barrel structure. Most cupins, such as microbial phosphomannose isomerases (PMIs), AraC-type transcriptional regulators, and cereal oxalate oxidases (OXOs), contain only a single domain, whereas others, such as seed storage proteins and oxalate decarboxylases (OXDCs), are bi-cupins with two pairs of motifs. Although some cupins have known functions and have been characterized at the biochemical level, the majority are known only from gene cloning or sequencing projects. In this study, phylogenetic analyses were conducted on the conserved domain to investigate the evolution and structure/function relationships of cupins, with an emphasis on single-domain plant germin-like proteins (GLPs). An unrooted phylogeny of cupins from a wide spectrum of evolutionary lineages identified three main clusters, microbial PMIs, OXDCs, and plant GLPs. The sister group to the plant GLPs in the global analysis was then used to root a phylogeny of all available plant GLPs. The resulting phylogeny contained three main clades, classifying the GLPs into distinct subfamilies. It is suggested that these subfamilies correlate with functional categories, one of which contains the bifunctional barley germin that has both OXO and superoxide dismutase (SOD) activity. It is proposed that GLPs function primarily as SODs, enzymes that protect plants from the effects of oxidative stress. Closer inspection of the DNA sequence encoding the intermotif region in plant GLPs showed global conservation of thymine in the second codon position, a character associated with hydrophobic residues.
Since many of these proteins are multimeric and enzymatically inactive in their monomeric state, this conservation of hydrophobicity is thought to be associated with the need to maintain the various monomer-monomer interactions. The type of structure-based predictive analysis presented in this paper is an important approach for understanding gene function and evolution in an era when genomes from a wide range of organisms are being sequenced at a rapid rate.
Abstract:
Germin and germin-like proteins (GLPs) are encoded by a family of genes found in all plants. They are part of the cupin superfamily of biochemically diverse proteins, a superfamily that has a conserved tertiary structure, though with limited similarity in primary sequence. The subgroups of GLPs have different enzyme functions that include the two hydrogen peroxide-generating enzymes, oxalate oxidase (OxO) and superoxide dismutase. This review summarizes the sequence and structural details of GLPs and also discusses their evolutionary progression, particularly their amplification in gene number during the evolution of the land plants. In terms of function, the GLPs are known to be differentially expressed during specific periods of plant growth and development, a pattern of evolutionary subfunctionalization. They are also implicated in the response of plants to biotic (viruses, bacteria, mycorrhizae, fungi, insects, nematodes, and parasitic plants) and abiotic (salt, heat/cold, drought, nutrient, and metal) stress. Most detailed data come from studies of fungal pathogenesis in cereals. This involvement with the protection of plants from environmental stress of various types has led to numerous plant breeding studies that have found links between GLPs and QTLs for disease and stress resistance. In addition the OxO enzyme has considerable commercial significance, based principally on its use in the medical diagnosis of oxalate concentration in plasma and urine. Finally, this review provides information on the nutritional importance of these proteins in the human diet, as several members are known to be allergenic, a feature related to their thermal stability and evolutionary connection to the seed storage proteins, also members of the cupin superfamily.
Abstract:
A longstanding debate in evolutionary biology concerns whether species diverge gradually through time or by rapid punctuational bursts at the time of speciation. The theory of punctuated equilibrium states that evolutionary change is characterised by short periods of rapid evolution followed by longer periods of stasis in which no change occurs. Despite years of work seeking evidence for punctuational change in the fossil record, the theory remains contentious. Further, there is little consensus as to the size of the contribution of punctuational changes to overall evolutionary divergence. Here we review recent developments which show that punctuational evolution is common and widespread in gene sequence data.
Abstract:
Flavivirus replication is mediated by interactions between complementary ssRNA sequences of the 5'- and 3'-termini that form dsRNA cyclisation stems or panhandles, varying in length, sequence and specific location in the mosquito-borne, tick-borne, non-vectored and non-classified flaviviruses. In this manuscript we manually aligned the flavivirus 5'UTRs and adjacent capsid genes and revealed significantly more homology than has hitherto been identified. Analysis of the alignments revealed that the panhandles represent evolutionary remnants of a long cyclisation domain that probably emerged through duplication of one of the UTR termini.
Abstract:
Some families of mammalian interspersed repetitive DNA, such as the Alu SINE sequence, appear to have evolved by the serial replacement of one active sequence with another, consistent with there being a single source of transposition: the "master gene." Alternative models, in which multiple source sequences are simultaneously active, have been called "transposon models." Transposon models differ in the proportion of elements that are active and in whether inactivation occurs at the moment of transposition or later. Here we examine the predictions of various types of transposon model regarding the patterns of sequence variation expected at an equilibrium between transposition, inactivation, and deletion. Under the master gene model, all bifurcations in the true tree of elements occur in a single lineage. We show that this property will also hold approximately for transposon models in which most elements are inactive and where at least some of the inactivation events occur after transposition. Such tree shapes are therefore not conclusive evidence for a single source of transposition.
Abstract:
The cephalochordate amphioxus is the best available proxy for the last common invertebrate ancestor of the vertebrates. During the last decade, the developmental genetics of amphioxus have been extensively examined for insights into the evolutionary origin and early evolution of the vertebrates. Comparisons between expression domains of homologous genes in amphioxus and vertebrates have strengthened proposed homologies between specific body parts. Molecular genetic studies have also highlighted parallels in the developmental mechanisms of amphioxus and vertebrates. In both groups, a similar nested pattern of Hox gene expression is involved in rostrocaudal patterning of the neural tube, and homologous genes also appear to be involved in dorsoventral neural patterning. Studies of amphioxus molecular biology have also hinted that the protochordate ancestor of the vertebrates included cell populations that modified their developmental genetic pathways during early vertebrate evolution to yield definitive neural crest and neurogenic placodes. We also discuss how the application of expressed sequence tag and gene-mapping approaches to amphioxus has combined with developmental studies to advance our understanding of chordate genome evolution. We conclude by considering the potential offered by the sequencing of the amphioxus genome, which was completed in late 2004.
Abstract:
We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
Abstract:
Objectives: Influenza A H3N2 viruses isolated recently have characteristic receptor binding properties that may decrease susceptibility to neuraminidase inhibitor drugs. A panel of clinical isolates and recombinant viruses generated by reverse genetics were characterized and tested for susceptibility to zanamivir. Methods: Plaque reduction assays and neuraminidase enzyme inhibition assays were used to assess susceptibility to zanamivir. Receptor binding properties of the viruses were characterized by differential agglutination of red blood cells (RBCs) from different species. Sequence analysis of the haemagglutinin (HA) and neuraminidase (NA) genes was carried out. Results: Characterization of a panel of H3N2 clinical isolates from 1968 to 2000 showed a gradual decrease in agglutination of chicken and guinea pig RBCs over time, although all isolates could agglutinate turkey RBCs equally. Sequence analysis of the HA and NA genes identified mutations in conserved residues of the HA1 receptor binding site, in particular Leu-226 → Ile-226/Val-226, and modification of potential glycosylation site motifs. This may be indicative of changes in virus binding to sialic acid (SA) receptors in recent years. Although recent isolates had reduced susceptibility to zanamivir in MDCK cell based plaque reduction assays, no difference was found in an NA enzyme-inhibition assay. Assays with recombinant isogenic viruses showed that the recent HA, but not the NA, conferred reduced susceptibility to zanamivir. Conclusion: This study demonstrates that recent clinical isolates of influenza A H3N2 virus no longer agglutinate chicken RBCs, but despite significant receptor binding changes as a result of changes in HA, there was little variation in sensitivity of the NA to zanamivir.
Abstract:
The vertebrate Zic gene family encodes C2H2 zinc finger transcription factors closely related to the Gli proteins. Zic genes are expressed in multiple areas of developing vertebrate embryos, including the dorsal neural tube where they act as potent neural crest inducers. Here we describe the characterization of a Zic ortholog from the amphioxus Branchiostoma floridae and further describe the expression of a Zic ortholog from the ascidian Ciona intestinalis. Molecular phylogenetic analysis and sequence comparisons suggest the gene duplications that formed the vertebrate Zic family were specific to the vertebrate lineage. In Ciona maternal CiZic/Ci-macho1 transcripts are localized during cleavage stages by asymmetric cell division, whereas zygotic expression by neural plate cells commences during neurulation. The amphioxus Zic ortholog AmphiZic is expressed in dorsal mesoderm and ectoderm during gastrulation, before being eliminated first from midline cells and then from all neurectoderm during neurulation. After neurulation, expression is reactivated in the dorsal neural tube and dorsolateral somite. Comparison of CiZic and AmphiZic expression with vertebrate Zic expression leads to two main conclusions. First, Zic expression allows us to define homologous compartments between vertebrate and amphioxus somites, showing primitive subdivision of vertebrate segmented mesoderm. Second, we show that neural Zic expression is a chordate synapomorphy, whereas the precise pattern of neural expression has evolved differently on the different chordate lineages. Based on these observations we suggest that a change in Zic regulation, specifically the evolution of a dorsal neural expression domain in vertebrate neurulae, was an important step in the evolution of the neural crest.
Abstract:
This article presents a statistical method for detecting recombination in DNA sequence alignments, which is based on combining two probabilistic graphical models: (1) a taxon graph (phylogenetic tree) representing the relationship between the taxa, and (2) a site graph (hidden Markov model) representing interactions between different sites in the DNA sequence alignments. We adopt a Bayesian approach and sample the parameters of the model from the posterior distribution with Markov chain Monte Carlo, using a Metropolis-Hastings and Gibbs-within-Gibbs scheme. The proposed method is tested on various synthetic and real-world DNA sequence alignments, and we compare its performance with the established detection methods RECPARS, PLATO, and TOPAL, as well as with two alternative parameter estimation schemes.
Abstract:
A recently emerging bleeding canker disease, caused by Pseudomonas syringae pathovar aesculi (Pae), is threatening European horse chestnut in northwest Europe. Very little is known about the origin and biology of this new disease. We used the nucleotide sequences of seven commonly used marker genes to investigate the phylogeny of three strains isolated recently from bleeding stem cankers on European horse chestnut in Britain (E-Pae). On the basis of these sequences alone, the E-Pae strains were identical to the Pae type-strain (I-Pae), isolated from leaf spots on Indian horse chestnut in India in 1969. The phylogenetic analyses also showed that Pae belongs to a distinct clade of P. syringae pathovars adapted to woody hosts. We generated genome-wide Illumina sequence data from the three E-Pae strains and one strain of I-Pae. Comparative genomic analyses revealed pathovar-specific genomic regions in Pae potentially implicated in virulence on a tree host, including genes for the catabolism of plant-derived aromatic compounds and enterobactin synthesis. Several gene clusters displayed intra-pathovar variation, including those encoding type IV secretion, a novel fatty acid biosynthesis pathway and a sucrose uptake pathway. Rates of single nucleotide polymorphisms in the four Pae genomes indicate that the three E-Pae strains diverged from each other much more recently than they diverged from I-Pae. The very low genetic diversity among the three geographically distinct E-Pae strains suggests that they originate from a single, recent introduction into Britain, thus highlighting the serious environmental risks posed by the spread of an exotic plant pathogenic bacterium to a new geographic location. The genomic regions in Pae that are absent from other P. syringae pathovars that infect herbaceous hosts may represent candidate genetic adaptations to infection of the woody parts of the tree.
Abstract:
The structure and evolution of the Arctic stratospheric polar vortex is assessed during opposing phases of, primarily, the El Niño–Southern Oscillation (ENSO) and the Quasi-Biennial Oscillation (QBO), but the 11 year solar cycle and winters following large volcanic eruptions are also examined. The analysis is performed by taking 2-D moments of vortex potential vorticity (PV) fields which allow the area and centroid of the vortex to be calculated throughout the ERA-40 reanalysis data set (1958–2002). Composites of these diagnostics for the different phases of the natural forcings are then considered. Statistically significant results are found regarding the structure and evolution of the vortex during, in particular, the ENSO and QBO phases. When compared with the more traditional zonal mean zonal wind diagnostic at 60°N, the moment-based diagnostics are far more robust and contain more information regarding the state of the vortex. The study details, for the first time, a comprehensive sequence of events which map the evolution of the vortex during each of the forcings throughout an extended winter period.
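Note: the moment-based diagnostics mentioned in this abstract can be illustrated with a minimal sketch. It makes several simplifying assumptions that are not taken from the study: the PV field sits on a uniform Cartesian grid, the vortex is defined as every cell whose PV is at or above a chosen edge value, and each cell has the same area. The function names and the `edge` and `cell_area` parameters are hypothetical.

```python
def moment(field, xs, ys, a, b, cell_area=1.0):
    """Discrete 2-D moment M_ab = sum of field * x**a * y**b * cell_area."""
    return sum(
        field[j][i] * xs[i] ** a * ys[j] ** b * cell_area
        for j in range(len(ys))
        for i in range(len(xs))
    )


def vortex_area_and_centroid(pv, xs, ys, edge, cell_area=1.0):
    """Area and centroid of the region where pv >= edge.

    pv[j][i] is the PV value at x = xs[i], y = ys[j].
    The threshold definition of the vortex is an assumption of this sketch.
    """
    # Indicator field: 1 inside the assumed vortex, 0 outside
    mask = [[1.0 if value >= edge else 0.0 for value in row] for row in pv]

    m00 = moment(mask, xs, ys, 0, 0, cell_area)  # zeroth moment = area
    if m00 == 0.0:
        raise ValueError("no grid cell reaches the edge value")

    # First moments divided by the zeroth moment give the centroid
    cx = moment(mask, xs, ys, 1, 0, cell_area) / m00
    cy = moment(mask, xs, ys, 0, 1, cell_area) / m00
    return m00, (cx, cy)


# Tiny made-up field: two high-PV cells side by side in the middle row.
pv_field = [
    [0.0, 0.0, 0.0],
    [0.0, 5.0, 5.0],
    [0.0, 0.0, 0.0],
]
area, centroid = vortex_area_and_centroid(pv_field, [0, 1, 2], [0, 1, 2], edge=1.0)
```

The zeroth moment measures the size of the vortex and the first moments locate its centre, which is why a displaced or shrinking vortex shows up directly in these numbers. A real analysis on a sphere would also need an appropriate map projection and area weighting, which this sketch leaves out.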