3 results for Alignments.

at Duke University


Relevance:

10.00%

Abstract:

The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.

Relevance:

10.00%

Abstract:

BACKGROUND: Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomic analyses. FINDINGS: Here we present the datasets associated with the phylogenomic analyses, which include sequence alignment files consisting of nucleotides, amino acids, indels, and transposable elements, as well as tree files containing gene trees and species trees. Inferring an accurate phylogeny required generating: 1) a well-annotated data set across species based on genome synteny; 2) alignments with unaligned or incorrectly overaligned sequences filtered out; and 3) diverse data sets, including genes and their inferred trees, indels, and transposable elements. Our total evidence nucleotide tree (TENT) data set (consisting of exons, introns, and UCEs) gave what we consider our most reliable species tree when using the concatenation-based ExaML algorithm or when using statistical binning with the coalescence-based MP-EST algorithm (which we refer to as MP-EST*). Other data sets, such as the coding sequence of some exons, revealed other properties of genome evolution, namely convergence. CONCLUSIONS: To our knowledge, the Avian Phylogenomics Project is the largest vertebrate phylogenomics project to date. The sequence, alignment, and tree data are expected to accelerate analyses in phylogenomics and other related areas.

Relevance:

10.00%

Abstract:

Biological macromolecules can rearrange interdomain orientations when binding to various partners. Interdomain dynamics serve as a molecular mechanism to guide the transitions between orientations. However, our understanding of interdomain dynamics is limited because a useful description of interdomain motions requires an estimate of the probabilities of interdomain conformations, which increases the complexity of the problem.

Staphylococcal protein A (SpA) has five tandem protein-binding domains and four interdomain linkers. The domains enable Staphylococcus aureus to evade the host immune system by binding to multiple host proteins including antibodies. Here, I present a study of the interdomain motions of two adjacent domains in SpA. NMR spin relaxation experiments identified a 6-residue flexible interdomain linker and interdomain motions. To quantify the anisotropy of the distribution of interdomain orientations, we measured residual dipolar couplings (RDCs) from the two domains with multiple alignments. The N-terminal domain was directly aligned by a lanthanide ion and not influenced by interdomain motions, so it acted as a reference frame to achieve motional decoupling. We also applied de novo methods to extract spatial dynamic information from RDCs and represent interdomain motions as a continuous distribution on the 3D rotational space. Significant anisotropy was observed in the distribution, indicating the motion populates some interdomain orientations more than others. Statistical thermodynamic analysis of the observed orientational distribution suggests that it is among the energetically most favorable orientational distributions for binding to antibodies. Thus, the affinity is enhanced by a pre-posed distribution of interdomain orientations while maintaining the flexibility required for function.

The protocol described above can be applied to other biological systems in general. The protein calmodulin and the RNA trans-activation response element (TAR) also have intensive interdomain motions with relatively small intradomain dynamics. Their interdomain motions were studied using our method based on published RDC data. Our results were generally consistent with those in the literature. The differences could be due to previous studies' use of physical models, which contain assumptions about potential energy and thus introduce non-experimental information into the interpretations.