13 results for Competing risks, Estimation of predator mortality, Over dispersion, Stochastic modeling
in National Center for Biotechnology Information - NCBI
Abstract:
Estimation of evolutionary distances has always been a major issue in the study of molecular evolution because evolutionary distances are required for estimating the rate of evolution in a gene, the divergence dates between genes or organisms, and the relationships among genes or organisms. Other closely related issues are the estimation of the pattern of nucleotide substitution, the estimation of the degree of rate variation among sites in a DNA sequence, and statistical testing of the molecular clock hypothesis. Mathematical treatments of these problems are considerably simplified by the assumption of a stationary process in which the nucleotide compositions of the sequences under study have remained approximately constant over time, and there now exist fairly extensive studies of stationary models of nucleotide substitution, although some problems remain to be solved. Nonstationary models are much more complex, but significant progress has recently been made with the development of the paralinear and LogDet distances. This paper reviews recent studies on the above issues and reports results on correcting the estimation bias of evolutionary distances, the estimation of the pattern of nucleotide substitution, and the estimation of rate variation among the sites in a sequence.
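As a rough illustration of what a distance estimate under a stationary substitution model looks like, the sketch below computes the Jukes-Cantor (JC69) distance from two aligned DNA sequences. The abstract does not name a specific model; JC69 is chosen here only because it is the simplest stationary case, and the function name `jc69_distance` is hypothetical.

```python
import math


def jc69_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance (substitutions per site) for two aligned sequences.

    Assumption: equal base frequencies and equal substitution rates,
    which is the simplest stationary model, not the paper's method.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences carry an unambiguous base.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    # p is the observed proportion of differing sites.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        raise ValueError("sequences are saturated; JC69 distance is undefined")
    # The logarithmic correction accounts for multiple hits at the same site.
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance is always at least as large as the observed proportion of differences, because the correction adds back substitutions hidden by repeated changes at one site.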
Abstract:
Sequence divergence acts as a potent barrier to homologous recombination; much of this barrier derives from an antirecombination activity exerted by mismatch repair proteins. An inverted repeat assay system with recombination substrates ranging in identity from 74% to 100% has been used to define the relationship between sequence divergence and the rate of mitotic crossing-over in yeast. To elucidate the role of the mismatch repair machinery in regulating recombination between mismatched substrates, we performed experiments in both wild-type and mismatch repair defective strains. We find that a single mismatch is sufficient to inhibit recombination between otherwise identical sequences, and that this inhibition is dependent on the mismatch repair system. Additional mismatches have a cumulative negative effect on the recombination rate. With sequence divergence of up to approximately 10%, the inhibitory effect of mismatches results mainly from antirecombination activity of the mismatch repair system. With greater levels of divergence, recombination is inefficient even in the absence of mismatch repair activity. In both wild-type and mismatch repair defective strains, an approximate log-linear relationship is observed between the recombination rate and the level of sequence divergence.
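The approximate log-linear relationship described above can be sketched as an ordinary least-squares fit of log10(rate) against percent divergence. The function name `fit_log_linear` is hypothetical, and the numbers in the usage lines are synthetic placeholders generated from an arbitrary line, not measurements from the study.

```python
import math


def fit_log_linear(divergence, rates):
    """Fit log10(rate) = intercept + slope * divergence by least squares.

    divergence: percent sequence divergence for each substrate
    rates: recombination rate for each substrate (must be positive)
    """
    ys = [math.log10(r) for r in rates]
    n = len(divergence)
    mean_x = sum(divergence) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in divergence)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(divergence, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic placeholder data lying exactly on log10(rate) = -6 - 0.1 * divergence.
example_divergence = [0.0, 5.0, 10.0, 20.0]
example_rates = [10 ** (-6 - 0.1 * x) for x in example_divergence]
example_fit = fit_log_linear(example_divergence, example_rates)
```

A negative slope corresponds to the cumulative inhibitory effect of additional mismatches; fitting wild-type and mismatch repair defective strains separately would give one line per strain.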
Abstract:
The cluA gene of Dictyostelium discoideum encodes a novel 150-kDa protein. Disruption of cluA results in clustering of mitochondria near the cell center. This is a striking difference from normal cells, whose mitochondria are dispersed uniformly throughout the cytoplasm. The mutant cell populations also exhibit an increased frequency of multinucleated cells, suggesting an impairment in cytokinesis. Both phenotypes are reversed by transformation of cluA− cells with a plasmid carrying a constitutively expressed cluA gene. The predicted sequence of the cluA gene product is homologous to sequences encoded by open reading frames in the genomes of Saccharomyces cerevisiae and Caenorhabditis elegans, but not to any known protein. The only exception is a short region with some homology to the 42-residue imperfect repeats present in the kinesin light chain, which probably function in protein–protein interaction. These studies identify a new class of proteins that appear to be required for the proper distribution of mitochondria.
Abstract:
The distribution of optimal local alignment scores of random sequences plays a vital role in evaluating the statistical significance of sequence alignments. These scores can be well described by an extreme-value distribution. The distribution’s parameters depend upon the scoring system employed and the random letter frequencies; in general they cannot be derived analytically, but must be estimated by curve fitting. For obtaining accurate parameter estimates, a form of the recently described ‘island’ method has several advantages. We describe this method in detail, and use it to investigate the functional dependence of these parameters on finite-length edge effects.
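To make the role of the extreme-value parameters concrete, the sketch below uses the standard form in which the expected number of chance alignments scoring at least S is E = K·m·n·exp(−λS) and the p-value is 1 − exp(−E). The function names are hypothetical, and λ and K are inputs that would have to be estimated for a given scoring system (for example by the island method discussed above); nothing here implements that estimation or any edge-effect correction.

```python
import math


def expected_hits(score: float, m: int, n: int, lam: float, k: float) -> float:
    """Expected number of chance local alignments with score >= `score`.

    m, n: lengths of the two sequences being compared
    lam, k: extreme-value parameters (assumed to be already estimated)
    """
    return k * m * n * math.exp(-lam * score)


def p_value(score: float, m: int, n: int, lam: float, k: float) -> float:
    """Probability of at least one chance alignment with score >= `score`."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate when E is very small.
    return -math.expm1(-expected_hits(score, m, n, lam, k))
```

For high scores E is small and the p-value is nearly equal to E, which is why the two are often used interchangeably when reporting significant alignments.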
Abstract:
When many protein sequences are available for estimating the time of divergence between two species, it is customary to estimate the time for each protein separately and then use the average for all proteins as the final estimate. However, it can be shown that this estimate generally has an upward bias, and that an unbiased estimate is obtained by using distances based on concatenated sequences. We have shown that two concatenation-based distances, i.e., average gamma distance weighted with sequence length (d2) and multiprotein gamma distance (d3), generally give more satisfactory results than other concatenation-based distances. Using these two distance measures for 104 protein sequences, we estimated the time of divergence between mice and rats to be approximately 33 million years ago. Similarly, the time of divergence between humans and rodents was estimated to be approximately 96 million years ago. We also investigated the dependency of time estimates on statistical methods and various assumptions made by using sequence data from eubacteria, protists, plants, fungi, and animals. Our best estimates of the times of divergence between eubacteria and eukaryotes, between protists and other eukaryotes, and between plants, fungi, and animals were 3, 1.7, and 1.3 billion years ago, respectively. However, estimates of ancient divergence times are subject to a substantial amount of error caused by uncertainty of the molecular clock, horizontal gene transfer, errors in sequence alignments, etc.
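A minimal sketch of a length-weighted average gamma distance follows, assuming the commonly used gamma distance for amino acid sequences, d = a[(1 − p)^(−1/a) − 1], where p is the proportion of differing sites and a is the gamma shape parameter. The function names and the weighting formula are an interpretation of the description of d2 above, not code from the paper, and the shape parameter is left as an input because the abstract does not state its value.

```python
def gamma_distance(p: float, shape: float) -> float:
    """Gamma distance for one protein.

    p: proportion of amino acid sites that differ (0 <= p < 1)
    shape: gamma shape parameter a (assumed known; not given in the abstract)
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return shape * ((1.0 - p) ** (-1.0 / shape) - 1.0)


def length_weighted_distance(p_values, lengths, shape: float) -> float:
    """Average of per-protein gamma distances, weighted by sequence length."""
    total_length = sum(lengths)
    weighted_sum = sum(
        length * gamma_distance(p, shape) for p, length in zip(p_values, lengths)
    )
    return weighted_sum / total_length
```

Weighting by length keeps short proteins, whose individual distance estimates are noisy, from contributing as much as long ones to the combined estimate.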
Abstract:
A maximum likelihood estimator based on the coalescent for unequal migration rates and different subpopulation sizes is developed. The method uses a Markov chain Monte Carlo approach to investigate possible genealogies with branch lengths and with migration events. Properties of the new method are shown by using simulated data from a four-population n-island model and a source–sink population model. Our estimation method as coded in migrate is tested against genetree; both programs deliver a very similar likelihood surface. The algorithm converges to the estimates fairly quickly, even when the Markov chain is started from unfavorable parameters. The method was used to estimate gene flow in the Nile valley by using mtDNA data from three human populations.
Abstract:
The extent to which new technological knowledge flows across institutional and national boundaries is a question of great importance for public policy and the modeling of economic growth. In this paper we develop a model of the process generating subsequent citations to patents as a lens for viewing knowledge diffusion. We find that the probability of patent citation over time after a patent is granted is well fit by a double-exponential function that can be interpreted as the mixture of diffusion and obsolescence functions. The results indicate that diffusion is geographically localized. Controlling for other factors, within-country citations are more numerous and come more quickly than those that cross country boundaries.
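The abstract does not give the functional form, so the sketch below assumes one common double-exponential parameterization: citation frequency at lag t is α·exp(−β₁t)·(1 − exp(−β₂t)), where the first exponential models obsolescence and the second models diffusion. All names and parameters are hypothetical illustrations.

```python
import math


def citation_frequency(
    lag: float, alpha: float, beta_obsolescence: float, beta_diffusion: float
) -> float:
    """Assumed double-exponential citation curve at a given lag (in years).

    exp(-beta_obsolescence * lag) decays as the cited knowledge ages;
    1 - exp(-beta_diffusion * lag) rises as the knowledge spreads.
    """
    obsolescence = math.exp(-beta_obsolescence * lag)
    diffusion = 1.0 - math.exp(-beta_diffusion * lag)
    return alpha * obsolescence * diffusion


def peak_lag(beta_obsolescence: float, beta_diffusion: float) -> float:
    """Lag at which the assumed curve is highest (derivative set to zero)."""
    return math.log(1.0 + beta_diffusion / beta_obsolescence) / beta_diffusion
```

Under this form, a larger diffusion rate moves the peak earlier, which is one way to express the finding that within-country citations come more quickly than cross-border ones.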