3 results for substitution rate

in DigitalCommons@The Texas Medical Center


Relevance:

60.00%

Publisher:

Abstract:

Models of DNA sequence evolution and methods for estimating evolutionary distances are needed for studying the rate and pattern of molecular evolution and for inferring the evolutionary relationships of organisms or genes. In this dissertation, several new models and methods are developed.

The rate variation among nucleotide sites: To obtain unbiased estimates of evolutionary distances, the rate heterogeneity among nucleotide sites of a gene should be considered. Commonly, it is assumed that the substitution rate varies among sites according to a gamma distribution (gamma model) or, more generally, an invariant+gamma model which includes some invariable sites. A maximum likelihood (ML) approach was developed for estimating the shape parameter of the gamma distribution $(\alpha)$ and/or the proportion of invariable sites $(\theta)$. Computer simulation showed that (1) under the gamma model, $\alpha$ can be well estimated from 3 or 4 sequences if the sequences are long; and (2) the distance estimate is unbiased and robust against violations of the assumptions of the invariant+gamma model.

However, this ML method requires a huge amount of computational time and is useful only for fewer than 6 sequences. Therefore, I developed a fast method for estimating $\alpha$, which is easy to implement and requires no knowledge of the tree. A computer program was developed for estimating $\alpha$ and evolutionary distances, which can handle as many as 30 sequences.

Evolutionary distances under the stationary, time-reversible (SR) model: The SR model is a general model of nucleotide substitution, which assumes (i) stationary nucleotide frequencies and (ii) time-reversibility. It can be extended to the SRV model, which allows rate variation among sites. I developed a method for estimating the distance under the SR or SRV model, as well as the variance-covariance matrix of distances. Computer simulation showed that the SR method is better than a simpler method when the sequence length $L>1,000$ bp and is robust against deviations from time-reversibility. As expected, when the rate varies among sites, the SRV method is much better than the SR method.

The evolutionary distances under nonstationary nucleotide frequencies: The statistical properties of the paralinear and LogDet distances under nonstationary nucleotide frequencies were studied. First, I developed formulas for correcting the estimation biases of the paralinear and LogDet distances. The performances of these formulas and the formulas for sampling variances were examined by computer simulation. Second, I developed a method for estimating the variance-covariance matrix of the paralinear distance, so that statistical tests of phylogenies can be conducted when the nucleotide frequencies are nonstationary. Third, a new method for testing the molecular clock hypothesis was developed in the nonstationary case.
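
As a rough illustration of why rate heterogeneity among sites matters for distance estimation, the sketch below compares the standard Jukes-Cantor distance with its gamma-corrected counterpart. It is not the ML method or the fast method described in the abstract. The choice of the Jukes-Cantor model, the function names, and the example values of p and alpha are assumptions made only for illustration.

```python
import math

def jc_distance(p):
    """Jukes-Cantor distance from the proportion p of differing sites,
    assuming the same substitution rate at every site."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)

def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance when rates vary among sites according to a
    gamma distribution with shape parameter alpha."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)

# Illustrative values only (not taken from the dissertation).
p = 0.3
for alpha in (0.5, 1.0, 5.0):
    print(alpha, jc_distance(p), jc_gamma_distance(p, alpha))
```

For a fixed p, the gamma-corrected distance exceeds the uniform-rate distance, and the gap widens as alpha decreases. Ignoring rate variation therefore tends to underestimate the distance.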

Relevance:

60.00%

Publisher:

Abstract:

Primate immunodeficiency viruses, or lentiviruses (HIV-1, HIV-2, and SIV), and hepatitis delta virus (HDV) are RNA viruses characterized by rapid evolution. Infection by primate immunodeficiency viruses usually results in the development of acquired immunodeficiency syndrome (AIDS) in humans and AIDS-like illnesses in Asian macaques. Similarly, hepatitis delta virus infection causes hepatitis and liver cancer in humans. These viruses are heterogeneous within an infected patient and among individuals. Substitution rates in the virus genomes are high and vary in different lineages and among sites. Methods of phylogenetic analysis were applied to study the evolution of primate lentiviruses and the hepatitis delta virus. The following results have been obtained: (1) The substitution rate varies among sites of primate lentivirus genes according to the two-parameter gamma distribution, with the shape parameter $\alpha$ being close to 1. (2) Primate immunodeficiency viruses fall into species-specific lineages. Therefore, viral transmissions across primate species are not as frequent as suggested by previous authors. (3) Primate lentiviruses have acquired or lost their pathogenicity several times in the course of evolution. (4) Evidence was provided for multiple infections of a North American patient by distinct HIV-1 strains of the B subtype. (5) Computer simulations indicate that the probability of committing an error in testing HIV transmission depends on the number of virus sequences and their length, the divergence times among sequences, and the model of nucleotide substitution. (6) For future investigations of HIV-1 transmissions, using longer virus sequences and avoiding the use of distant outgroups is recommended. (7) Hepatitis delta virus strains are usually related according to the geographic region of isolation. (8) Evolution of HDV is characterized by the rate of synonymous substitution being lower than the nonsynonymous substitution rate and the rate of evolution of the noncoding region. (9) There is a strong preference for G and C nucleotides at the third codon positions of the HDV coding region.
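
Result (9) concerns base composition at third codon positions. The sketch below shows one way such a G+C fraction might be computed for an in-frame coding sequence. The function name and the toy sequence are hypothetical and are not HDV data or the author's procedure.

```python
def gc3_fraction(cds):
    """Fraction of G or C among the third codon positions of an
    in-frame coding sequence (a trailing partial codon is ignored)."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop any incomplete final codon
    third_positions = seq[2:usable:3]     # every third base, starting at index 2
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)

# Toy sequence for illustration only (not HDV data).
print(gc3_fraction("ATGGCCAAGTTC"))
```

A value well above the overall G+C fraction of the same sequence would indicate the kind of third-position preference the abstract reports.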

Relevance:

40.00%

Publisher:

Abstract:

Theoretical and empirical studies were conducted on the pattern of nucleotide and amino acid substitution in evolution, taking into account the effects of mutation at the nucleotide level and purifying selection at the amino acid level. A theoretical model for predicting the evolutionary change in electrophoretic mobility of a protein was also developed by using information on the pattern of amino acid substitution. The specific problems studied and the main results obtained are as follows: (1) Estimation of the pattern of nucleotide substitution in DNA nuclear genomes. The patterns of point mutation and nucleotide substitution among the four nucleotides are inferred from the evolutionary changes of pseudogenes and functional genes, respectively. Both patterns are non-random, the rate of change varying considerably with nucleotide pair, and in both cases transitions occur somewhat more frequently than transversions. In protein evolution, substitution occurs more often between amino acids with similar physico-chemical properties than between dissimilar amino acids. (2) Estimation of the pattern of nucleotide substitution in RNA genomes. The majority of mutations in retroviruses accumulate at the reverse transcription stage. Selection at the amino acid level is very weak, and almost non-existent between synonymous codons. The pattern of mutation is very different from that in DNA genomes. Nevertheless, the pattern of purifying selection at the amino acid level is similar to that in DNA genomes, although selection intensity is much weaker. (3) Evaluation of the determinants of molecular evolutionary rates in protein-coding genes. Based on rates of nucleotide substitution for mammalian genes, the rate of amino acid substitution of a protein is determined by its amino acid composition. The content of glycine is shown to correlate strongly and negatively with the rate of substitution. Empirical formulae, called indices of mutability, are developed in order to predict the rate of molecular evolution of a protein from data on its amino acid sequence. (4) Studies on the evolutionary patterns of electrophoretic mobility of proteins. A theoretical model was constructed that predicts the electric charge of a protein at any given pH and its isoelectric point from data on its primary and quaternary structures. Using this model, the evolutionary change in electrophoretic mobilities of different proteins and the expected amount of electrophoretically hidden genetic variation were studied. In the absence of selection for the pI value, proteins will on the average evolve toward a mildly basic pI. (Abstract shortened with permission of author.)
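
The transition/transversion comparison in result (1) of this abstract can be made concrete with a small counting sketch. It only classifies observed differences between two aligned sequences and is not the inference procedure used in the dissertation. The function name and the toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned DNA
    sequences of equal length; gaps and ambiguity codes are skipped."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguous base
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    return transitions, transversions

# Toy aligned pair for illustration only.
ts, tv = count_ts_tv("AACCGGTT", "GATCGCTA")
print(ts, tv)
```

Because each nucleotide has one possible transition and two possible transversions, equal raw counts already imply a bias toward transitions relative to random change.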