Digital Library

17 results for molecular evolution

in DigitalCommons@The Texas Medical Center

Molecular evolution of primate immunodeficiency viruses and hepatitis delta virus

Relevance:

100.00%

Publisher:

Abstract:

Primate immunodeficiency viruses, or lentiviruses (HIV-1, HIV-2, and SIV), and hepatitis delta virus (HDV) are RNA viruses characterized by rapid evolution. Infection by primate immunodeficiency viruses usually results in the development of acquired immunodeficiency syndrome (AIDS) in humans and AIDS-like illnesses in Asian macaques. Similarly, hepatitis delta virus infection causes hepatitis and liver cancer in humans. These viruses are heterogeneous within an infected patient and among individuals. Substitution rates in the virus genomes are high and vary in different lineages and among sites. Methods of phylogenetic analysis were applied to study the evolution of primate lentiviruses and the hepatitis delta virus. The following results have been obtained: (1) The substitution rate varies among sites of primate lentivirus genes according to the two-parameter gamma distribution, with the shape parameter $\alpha$ being close to 1. (2) Primate immunodeficiency viruses fall into species-specific lineages. Therefore, viral transmissions across primate species are not as frequent as suggested by previous authors. (3) Primate lentiviruses have acquired or lost their pathogenicity several times in the course of evolution. (4) Evidence was provided for multiple infections of a North American patient by distinct HIV-1 strains of the B subtype. (5) Computer simulations indicate that the probability of committing an error in testing HIV transmission depends on the number of virus sequences and their length, the divergence times among sequences, and the model of nucleotide substitution. (6) For future investigations of HIV-1 transmissions, using longer virus sequences and avoiding the use of distant outgroups is recommended. (7) Hepatitis delta virus strains are usually related according to the geographic region of isolation. (8) Evolution of HDV is characterized by the rate of synonymous substitution being lower than the nonsynonymous substitution rate and the rate of evolution of the noncoding region. (9) There is a strong preference for G and C nucleotides at the third codon positions of the HDV coding region.

Population and molecular evolution studies on the Melanocortin 1 receptor locus implicate its role in human pigmentation variation

Relevance:

100.00%

Publisher:

Abstract:

Human pigmentation is a complex trait with the observed variation caused by the varied production of eumelanin (brown/black melanins) and phaeomelanin (red/yellow melanins) by the melanocytes. The melanocortin 1 receptor (MC1R), a G protein-coupled receptor expressed in the melanocytes, is a regulator of eu- and phaeomelanin synthesis, and MC1R mutations causing skin and coat color changes are known in many mammals. To understand the role of MC1R in human pigmentation variation, I have sequenced the MC1R gene in 121 individuals sampled from world populations. In addition, I have sequenced the MC1R gene in common and pygmy chimpanzees, gorilla, orangutan, and baboon to study the evolution of MC1R and to infer the ancestral human MC1R sequence. The ancestral MC1R sequence is observed in all 25 African individuals studied, but at lower frequencies in the other populations examined, especially in East and Southeast Asians. The Arg163Gln variant is absent in the Africans studied, almost absent in Europeans, and at a low frequency in Indians, but is at an exceptionally high frequency (70%) in East and Southeast Asians. To further evaluate the role of MC1R variants in human pigmentation variation, I have combined these molecular evolution and population studies with functional assays on MC1R variants and primate MC1Rs.

THEORETICAL STUDIES ON THE METHODS OF RECONSTRUCTING PHYLOGENETIC TREES FROM DNA SEQUENCE DATA (MOLECULAR EVOLUTION, HOMINOID EVOLUTION, COMPUTER SIMULATION)

Relevance:

100.00%

Publisher:

Abstract:

(1) A mathematical theory for computing the probabilities of various nucleotide configurations is developed, and the probability of obtaining the correct phylogenetic tree (model tree) from sequence data is evaluated for six phylogenetic tree-making methods (UPGMA, distance Wagner method, transformed distance method, Fitch-Margoliash's method, maximum parsimony method, and compatibility method). The number of nucleotides (m*) necessary to obtain the correct tree with a probability of 95% is estimated with special reference to the human, chimpanzee, and gorilla divergence. m* is at least 4,200, but the availability of outgroup species greatly reduces m* for all methods except UPGMA. m* increases if transitions occur more frequently than transversions, as in the case of mitochondrial DNA. (2) A new tree-making method called the neighbor-joining method is proposed. This method is applicable to either distance data or character state data. Computer simulation has shown that the neighbor-joining method is generally better than UPGMA, Farris' method, Li's method, and the modified Farris method in recovering the true topology when distance data are used. A related method, the simultaneous partitioning method, is also discussed. (3) The maximum likelihood (ML) method for phylogeny reconstruction under the assumption of both constant and varying evolutionary rates is studied, and a new algorithm for obtaining the ML tree is presented. This method gives a tree similar to that obtained by UPGMA when a constant evolutionary rate is assumed, whereas it gives a tree similar to that obtained by the maximum parsimony method and the neighbor-joining method when a varying evolutionary rate is assumed.
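
As an illustration of the neighbor-joining idea mentioned in the abstract above, here is a minimal Python sketch of the widely used Q-criterion formulation for distance data. It is an assumption that this formulation matches the procedure in the dissertation in every detail; the function name `neighbor_joining` and the nested-tuple tree representation are hypothetical, and branch lengths are deliberately omitted.

```python
def neighbor_joining(names, dist):
    """Return an unrooted topology as nested tuples (branch lengths omitted).

    names: list of taxon labels.
    dist:  dict of dicts with symmetric pairwise distances, dist[a][b].
    Illustrative sketch only; uses the common Q-criterion formulation.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}  # working copy of the distances
    while len(nodes) > 3:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Choose the pair (a, b) that minimizes Q = (n - 2) d(a,b) - r(a) - r(b).
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = (a, b)  # the joined pair becomes a new internal node
        d[new] = {}
        for c in nodes:
            if c not in (a, b):
                # Distance from the new node to every remaining node.
                d[new][c] = d[c][new] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [new]
    return tuple(nodes)
```

The function stops when three nodes remain, because an unrooted tree is fully resolved at that point; ties between equally good pairs are broken by input order.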

Models of DNA sequence evolution

Relevance:

70.00%

Publisher:

Abstract:

Models of DNA sequence evolution and methods for estimating evolutionary distances are needed for studying the rate and pattern of molecular evolution and for inferring the evolutionary relationships of organisms or genes. In this dissertation, several new models and methods are developed. The rate variation among nucleotide sites: To obtain unbiased estimates of evolutionary distances, the rate heterogeneity among nucleotide sites of a gene should be considered. Commonly, it is assumed that the substitution rate varies among sites according to a gamma distribution (gamma model) or, more generally, an invariant+gamma model which includes some invariable sites. A maximum likelihood (ML) approach was developed for estimating the shape parameter of the gamma distribution $(\alpha)$ and/or the proportion of invariable sites $(\theta).$ Computer simulation showed that (1) under the gamma model, $\alpha$ can be well estimated from 3 or 4 sequences if the sequences are long; and (2) the distance estimate is unbiased and robust against violations of the assumptions of the invariant+gamma model. However, this ML method requires a huge amount of computational time and is useful only for fewer than 6 sequences. Therefore, I developed a fast method for estimating $\alpha,$ which is easy to implement and requires no knowledge of the tree. A computer program was developed for estimating $\alpha$ and evolutionary distances, which can handle as many as 30 sequences. Evolutionary distances under the stationary, time-reversible (SR) model: The SR model is a general model of nucleotide substitution, which assumes (i) stationary nucleotide frequencies and (ii) time-reversibility. It can be extended to the SRV model, which allows rate variation among sites. I developed a method for estimating the distance under the SR or SRV model, as well as the variance-covariance matrix of distances. Computer simulation showed that the SR method is better than a simpler method when the sequence length $L>1,000$ bp and is robust against deviations from time-reversibility. As expected, when the rate varies among sites, the SRV method is much better than the SR method. The evolutionary distances under nonstationary nucleotide frequencies: The statistical properties of the paralinear and LogDet distances under nonstationary nucleotide frequencies were studied. First, I developed formulas for correcting the estimation biases of the paralinear and LogDet distances. The performances of these formulas and the formulas for sampling variances were examined by computer simulation. Second, I developed a method for estimating the variance-covariance matrix of the paralinear distance, so that statistical tests of phylogenies can be conducted when the nucleotide frequencies are nonstationary. Third, a new method for testing the molecular clock hypothesis was developed in the nonstationary case.
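
To make the effect of among-site rate variation on distance estimates concrete, the following Python sketch compares the textbook Jukes-Cantor distance with its gamma-corrected counterpart. This is a much simpler stand-in than the ML, SR, or SRV methods described in the abstract, and it is not the dissertation's estimator; the function names are hypothetical, and the shape parameter `alpha` is assumed to be known rather than estimated.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring positions with non-ACGT symbols."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc_distance(p):
    """Jukes-Cantor distance, assuming equal rates at all sites (requires p < 0.75)."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates among sites.

    alpha is the gamma shape parameter; small alpha means strong rate
    heterogeneity, and the result approaches jc_distance(p) as alpha grows.
    """
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)
```

For the same observed proportion of differences, the gamma-corrected distance is larger than the equal-rates distance, which is why ignoring rate heterogeneity tends to underestimate divergence.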

Evolution of cis-regulatory elements in a repetitive sequence adjacent to a sea urchin aboral ectoderm-specific gene

Relevance:

70.00%

Publisher:

Abstract:

The creation, preservation, and degeneration of cis-regulatory elements controlling developmental gene expression are fundamental genome-level evolutionary processes about which little is known. In this study, critical differences in cis-regulatory elements controlling the expression of the sea urchin aboral ectoderm-specific spec genes were identified and explored. In genomes of species within the Strongylocentrotidae family, multiple copies of a repetitive sequence element termed RSR were present, but RSRs were not detected in genomes of species outside Strongylocentrotidae. RSRs are invariably associated with spec genes, and in Strongylocentrotus purpuratus, the spec2a RSR functioned as a transcriptional enhancer displaying greater activity than RSRs from the spec1 or spec2c paralogs. Single base-pair differences at two cis-regulatory elements within the spec2a RSR greatly increased the binding affinities of four transcription factors: SpCCAAT-binding factor at one element and SpOtx, SpGoosecoid, and SpGATA-E at another. The cis-regulatory elements to which SpCCAAT-binding factor, SpOtx, SpGoosecoid, and SpGATA-E bound were recent evolutionary acquisitions that could act either to activate or repress transcription, depending on the cell type. These elements were found in the spec2a RSR ortholog in Strongylocentrotus pallidus but not in the RSR orthologs of Strongylocentrotus droebachiensis or Hemicentrotus pulcherrimus. These results indicate that spec genes exhibit a dynamic pattern of cis-regulatory element evolution while stabilizing selection preserves their aboral ectoderm expression domain.

MATHEMATICAL STUDIES ON THE EVOLUTIONARY CHANGE OF DNA SEQUENCES

Relevance:

60.00%

Publisher:

Abstract:

With the aim of understanding the mechanism of molecular evolution, mathematical problems on the evolutionary change of DNA sequences are studied. The problems studied and the results obtained are as follows: (1) Estimation of evolutionary distance between nucleotide sequences. Studying the pattern of nucleotide substitution for the case of unequal substitution rates, a new mathematical formula for estimating the average number of nucleotide substitutions per site between two homologous DNA sequences is developed. It is shown that this formula has a wider applicability than currently available formulae. A statistical method for estimating the number of nucleotide changes due to deletion and insertion is also developed. (2) Biases of the estimates of nucleotide substitutions obtained by the restriction enzyme method. The deviation of the estimate of nucleotide substitutions obtained by the restriction enzyme method from the true value is investigated theoretically. It is shown that the amount of the deviation depends on the nucleotides in the recognition sequence of the restriction enzyme used, unequal rates of substitution among different nucleotides, and nucleotide frequencies, but the primary factor is the unequal rates of nucleotide substitution. When many different kinds of enzymes are used, however, the amount of average deviation is generally small. (3) Distribution of restriction fragment lengths. To see the effect of undetectable restriction fragments and fragment differences on the estimate of nucleotide differences, the theoretical distribution of fragment lengths is studied. This distribution depends on the type of restriction enzymes used as well as on the relative frequencies of the four nucleotides. It is shown that undetectability of small fragments or fragment differences gives a serious underestimate of nucleotide substitutions when the length-difference method of estimation is used, but the extent of underestimation is small when the site-difference method is used. (4) Evolutionary relationships of DNA sequences in finite populations. A mathematical theory on the expected evolutionary relationships among DNA sequences (nucleons) randomly chosen from the same or different populations is developed under the assumption that the evolutionary change of nucleons is determined solely by mutation and random genetic drift. . . . (Author's abstract exceeds stipulated maximum length. Discontinued here with permission of author). UMI
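
The dependence of fragment lengths on the enzyme and on nucleotide frequencies, noted in point (3) above, can be sketched with a common simplification: treat each position as an independent trial that starts a recognition site with probability equal to the product of the frequencies of the bases in the recognition sequence, so that fragment lengths are approximately geometric. This is an assumption made for illustration and not necessarily the distribution derived in the dissertation; the function names and the example enzyme site are hypothetical choices.

```python
def site_probability(recognition_seq, freqs):
    """Probability that a random position starts the recognition sequence.

    freqs maps each nucleotide to its frequency; positions are assumed
    independent, and overlaps between adjacent sites are ignored.
    """
    prob = 1.0
    for base in recognition_seq:
        prob *= freqs[base]
    return prob


def mean_fragment_length(site_prob):
    """Mean of the geometric fragment-length distribution, in base pairs."""
    return 1.0 / site_prob


def fraction_at_most(site_prob, max_len):
    """Fraction of fragments with length <= max_len under the geometric model.

    With max_len set to a detection limit, this approximates the share of
    fragments too small to be observed.
    """
    return 1.0 - (1.0 - site_prob) ** max_len


# Example: a six-base recognition site with equal nucleotide frequencies.
equal = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
a = site_probability("GAATTC", equal)
```

Under these assumptions a six-base recognition site cuts on average once every 4,096 bp, and changing the nucleotide frequencies or the length of the recognition site shifts the whole distribution.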

STUDIES ON THE PATTERNS OF NUCLEOTIDE AND AMINO ACID SUBSTITUTION (ELECTROPHORESIS, MUTATION RATE, SELECTION, BASE, COMPOSITION)

Relevance:

60.00%

Publisher:

Abstract:

Theoretical and empirical studies were conducted on the pattern of nucleotide and amino acid substitution in evolution, taking into account the effects of mutation at the nucleotide level and purifying selection at the amino acid level. A theoretical model for predicting the evolutionary change in electrophoretic mobility of a protein was also developed by using information on the pattern of amino acid substitution. The specific problems studied and the main results obtained are as follows: (1) Estimation of the pattern of nucleotide substitution in DNA nuclear genomes. The patterns of point mutations and nucleotide substitutions among the four different nucleotides are inferred from the evolutionary changes of pseudogenes and functional genes, respectively. Both patterns are non-random, with the rate of change varying considerably with nucleotide pair, and in both cases transitions occur somewhat more frequently than transversions. In protein evolution, substitution occurs more often between amino acids with similar physico-chemical properties than between dissimilar amino acids. (2) Estimation of the pattern of nucleotide substitution in RNA genomes. The majority of mutations in retroviruses accumulate at the reverse transcription stage. Selection at the amino acid level is very weak, and almost non-existent between synonymous codons. The pattern of mutation is very different from that in DNA genomes. Nevertheless, the pattern of purifying selection at the amino acid level is similar to that in DNA genomes, although selection intensity is much weaker. (3) Evaluation of the determinants of molecular evolutionary rates in protein-coding genes. Based on rates of nucleotide substitution for mammalian genes, the rate of amino acid substitution of a protein is determined by its amino acid composition. The content of glycine is shown to correlate strongly and negatively with the rate of substitution. Empirical formulae, called indices of mutability, are developed in order to predict the rate of molecular evolution of a protein from data on its amino acid sequence. (4) Studies on the evolutionary patterns of electrophoretic mobility of proteins. A theoretical model was constructed that predicts the electric charge of a protein at any given pH and its isoelectric point from data on its primary and quaternary structures. Using this model, the evolutionary change in electrophoretic mobilities of different proteins and the expected amount of electrophoretically hidden genetic variation were studied. In the absence of selection for the pI value, proteins will on the average evolve toward a mildly basic pI. (Abstract shortened with permission of author.)

Detecting the signature of natural selection with microsatellites

Relevance:

60.00%

Publisher:

Abstract:

Natural selection is one of the major factors in the evolution of all organisms. Detecting the signature of natural selection has been a central theme in evolutionary genetics. With the availability of microsatellite data, it is of interest to study how natural selection can be detected with microsatellites. The overall aim of this research is to detect signatures of natural selection with data on genetic variation at microsatellite loci. The null hypothesis to be tested is the neutral mutation theory of molecular evolution, which states that different alleles at a locus have equivalent effects on fitness. Currently used tests of this hypothesis based on data on genetic polymorphism in natural populations presume that mutations at the loci follow the infinite allele/site models (IAM, ISM), in the sense that at each site at most one mutation event is recorded, and each mutation leads to an allele not seen before in the population. Microsatellite loci, which are abundant in the genome, do not obey these mutation models, since new alleles at such loci can be created by either contraction or expansion of the tandem repeat sizes of core motifs. Since the current genome map is mainly composed of microsatellite loci and this class of loci is still most commonly studied in the context of human genome diversity, this research explores how the current procedures for testing the neutral mutation hypothesis should be modified to take into account a generalized model of forward-backward stepwise mutations. In addition, recent literature also suggests that the past demographic history of populations, the presence of population substructure, and varying rates of mutation across loci all have confounding effects on detecting signatures of natural selection. The effects of the stepwise mutation model and other confounding factors on detecting signatures of natural selection are the main results of the research.

A coalescent analysis for modeling the mutation process in colorectal cancer

Relevance:

60.00%

Publisher:

Abstract:

Colorectal cancer is the fourth most commonly diagnosed cancer in the United States. Every year about a hundred forty-seven thousand people are diagnosed with colorectal cancer and fifty-six thousand people lose their lives due to this disease. Most hereditary nonpolyposis colorectal cancers (HNPCC) and 12% of sporadic colorectal cancers show microsatellite instability. Colorectal cancer is a multistep progressive disease. It starts from a mutation in a normal colorectal cell and grows into a clone of cells that further accumulates mutations and finally develops into a malignant tumor. In terms of molecular evolution, the process of colorectal tumor progression represents the acquisition of sequential mutations. Clinical studies use biomarkers such as microsatellites or single nucleotide polymorphisms (SNPs) to study mutation frequencies in colorectal cancer. Microsatellite data obtained from single genome equivalent PCR or small pool PCR can be used to infer tumor progression. Since tumor progression is similar to population evolution, we used an approach known as the coalescent, which is well established in population genetics, to analyze this type of data. Coalescent theory has been used to infer a sample's evolutionary path through the analysis of microsatellite data. The simulation results indicate that the constant population size pattern and the rapid tumor growth pattern have different genetic polymorphic patterns. The simulation results were compared with experimental data collected from HNPCC patients. The preliminary results show that the mutation rate in 6 HNPCC patients ranges from 0.001 to 0.01. The patients' polymorphic patterns are similar to the constant population size pattern, which implies that tumor progression occurs through multilineage persistence instead of clonal sequential evolution. The results should be further verified using a larger dataset.

Molecular Studies in Treponema pallidum Evolution: Toward Clarity?

Relevance:

40.00%

Publisher:

MOLECULAR AND GENOMIC BASED INSIGHTS INTO THE EVOLUTION OF ENTEROCOCCUS FAECIUM FROM COMMENSAL TO HOSPITAL-ADAPTED PATHOGEN

Relevance:

40.00%

Publisher:

Abstract:

The basis for the recent transition of Enterococcus faecium from a primarily commensal organism to one of the leading causes of hospital-acquired infections in the United States is not yet understood. To address this, the first part of my project assessed isolates from early outbreaks in the USA and South America using sequence analysis, colony hybridizations, and minimal inhibitory concentrations (MICs), which showed that clinical isolates possess virulence and antibiotic resistance determinants that are less abundant or lacking in community isolates. I also revealed that the level of ampicillin resistance increased over time in clinical strains. By sequencing the pbp5 gene, I demonstrated an ~5% difference in the pbp5 gene between strains with MICs <4 µg/ml and those with MICs >4 µg/ml, but no specific sequence changes correlated with increases in MICs within the latter group. A 3-10% nucleotide difference was also seen in three other genes analyzed, which suggested the existence of two distinct subpopulations of E. faecium. This led to the second part of my project, analyzing concatenated core gene sequences, SNPs, the 16S rRNA, and phylogenetics of 21 E. faecium genomes, which confirmed two distinct clades: a community-associated (CA) clade and a hospital-associated (HA) clade. Molecular clock calculations indicate that these two clades likely diverged ~300,000 to >1 million years ago, long before the modern antibiotic era. Genomic analysis also showed that, in addition to core genomic differences, HA E. faecium harbor specific accessory genetic elements that may confer selection advantages over CA E. faecium. The third part of my project discovered 6 E. faecium genes with the newly identified “WxL” domain. My analyses, using RT-PCR, western blots, patient sera, whole-cell ELISA, and immunogold electron microscopy, indicated that E. faecium WxL genes exist in operons and encode proteins localized to the bacterial cell surface, and that WxL proteins are antigenic in humans and are more exposed on the surface of clinical isolates than of community isolates (even though they are ubiquitous in both clades). ELISAs and BIAcore analyses also showed that proteins encoded by these operons bind several different host extracellular matrix proteins, as well as each other, suggesting a novel cell-surface complex. In summary, my studies provide new insights into the evolution of E. faecium by showing that there are two distantly related clades, one being more successful in the hospital setting. My studies also identified operons encoding WxL proteins whose characteristics could also contribute to colonization and virulence within this species.

Evolution of sensory complexity recorded in a myxobacterial genome.

Relevance:

30.00%

Publisher:

Abstract:

Myxobacteria are single-celled, but social, eubacterial predators. Upon starvation they build multicellular fruiting bodies using a developmental program that progressively changes the pattern of cell movement and the repertoire of genes expressed. Development terminates with spore differentiation and is coordinated by both diffusible and cell-bound signals. The growth and development of Myxococcus xanthus is regulated by the integration of multiple signals from outside the cells with physiological signals from within. A collection of M. xanthus cells behaves, in many respects, like a multicellular organism. For these reasons M. xanthus offers unparalleled access to a regulatory network that controls development and that organizes cell movement on surfaces. The genome of M. xanthus is large (9.14 Mb), considerably larger than the other sequenced delta-proteobacteria. We suggest that gene duplication and divergence were major contributors to genomic expansion from its progenitor. More than 1,500 duplications specific to the myxobacterial lineage were identified, representing >15% of the total genes. Genes were not duplicated at random; rather, genes for cell-cell signaling, small molecule sensing, and integrative transcription control were amplified selectively. Families of genes encoding the production of secondary metabolites are overrepresented in the genome but may have been received by horizontal gene transfer and are likely to be important for predation.

Selective elevation of c-myc RNA transcripts during clonal evolution of N-methyl-N-nitrosourea induced thymic lymphomas in AKR/J mice

Relevance:

30.00%

Publisher:

Abstract:

Untreated AKR mice develop spontaneous thymic lymphomas by 6-12 months of age. Lymphoma development is accelerated when young mice are injected with the carcinogen N-methyl-N-nitrosourea (MNU). Selected molecular and cellular events were compared during the latent period preceding "spontaneous" (retrovirally-induced) and MNU-induced thymic lymphoma development in AKR mice. These studies were undertaken to test the hypothesis that thymic lymphomas induced in the same inbred mouse strain by endogenous retroviruses and by a chemical carcinogen develop by different mechanisms. Immunofluorescence analysis of differentiation antigens showed that most MNU-induced lymphomas express an immature CD4-8+ profile. In contrast, spontaneous lymphomas represent each of the major lymphocyte subsets. These data suggest involvement of different target populations in MNU-induced and spontaneous lymphomas. Analyses at intervals after MNU treatment revealed selective expansion of the CD4-8+ J11d+ thymocyte subset at 8-10 weeks post-MNU in 68% of the animals examined, suggesting that these cells are targets for MNU-induced lymphomagenesis. Untreated age-matched animals showed no selective expansion of thymocyte subsets. Previous data have shown that both spontaneous and MNU-induced lymphomas are monoclonal or oligoclonal. Distinct rearrangement patterns of the J$_2$ region of the T-cell receptor $\beta$-chain showed emergence of clonal thymocyte populations beginning at 6-7 weeks after MNU treatment. However, lymphocytes from untreated animals showed no evidence of clonal expansion at the time intervals investigated. Activation of c-myc frequently occurs during development of B- and T-cell lymphomas. Both spontaneous and MNU-induced lymphomas showed increased c-myc transcript levels. Increased c-myc transcription was first detected at 6 weeks post-MNU, and persisted throughout the latent period. However, untreated animals showed no increases in c-myc transcripts at the time intervals examined. Another nuclear oncogene, c-fos, did not display a similar change in RNA transcription during the latent period. These results support the hypothesis that MNU-induced and spontaneous tumors develop by multi-step pathways which are distinct with respect to the target cell population affected. Clonal emergence and c-myc deregulation are important steps in the development of both MNU-induced and spontaneous tumors, but the onset of these events is later in spontaneous tumor development.

Origins and evolution of VNTR loci: The apolipoprotein B 3$^\prime$ VNTR

Relevance:

30.00%

Publisher:

Abstract:

I studied the apolipoprotein (apo) B 3$^\prime$ variable number tandem repeat (VNTR) and did computer simulations of the stepwise mutation model to address four questions: (1) How did the apo B VNTR originate? (2) What is the mutational mechanism of repeat number change at the apo B VNTR? (3) To what extent are population and molecular level events responsible for the determination of the contemporary apo B allele frequency distribution? (4) Can VNTR allele frequency distributions be explained by a simple and conservative mutation-drift model? I used three general approaches to address these questions: (1) I characterized the apo B VNTR region in non-human primate species; (2) I constructed haplotypes of polymorphic markers flanking the apo B VNTR in a sample of individuals from Lorraine, France and studied the associations between the flanking-marker haplotypes and apo B VNTR size; (3) I did computer simulations of the one-step stepwise mutation model and compared the results to real data in terms of four allele frequency distribution characteristics. The results of this work have allowed me to conclude that the apo B VNTR originated after an initial duplication of a sequence which is still present as a single copy sequence in New World monkey species. I conclude that this locus did not originate by the transposition of an array of repeats from somewhere else in the genome. It is unlikely that recombination is the primary mutational mechanism. Furthermore, the clustered nature of these associations implicates a stepwise mutational mechanism. From the high frequencies of certain haplotype-allele size combinations, it is evident that population level events have also been important in the determination of the apo B VNTR allele frequency distribution. Results from computer simulations of the one-step stepwise mutation model have allowed me to conclude that bimodal and multimodal allele frequency distributions are not unexpected at loci evolving via stepwise mutation mechanisms. Short tandem repeat loci fit the stepwise mutation model best, followed by microsatellite loci. I therefore conclude that there are differences in the mutational mechanisms of VNTR loci as classed by repeat unit size. (Abstract shortened by UMI.)
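
A rough idea of what a one-step stepwise mutation simulation involves can be given with the Python sketch below. It assumes a haploid Wright-Fisher population of constant size, symmetric gain or loss of one repeat per mutation, and a lower bound of one repeat; these choices, the function name `simulate_stepwise`, and all parameter values are illustrative assumptions rather than the design of the simulations reported in the abstract.

```python
import random
from collections import Counter


def simulate_stepwise(pop_size, generations, mu, start_repeats=20, seed=0):
    """Simulate repeat-number alleles under drift and one-step mutation.

    pop_size:      number of gene copies (haploid Wright-Fisher model).
    generations:   number of discrete generations to simulate.
    mu:            per-copy, per-generation mutation probability.
    start_repeats: repeat number shared by all copies at the start.
    Returns a Counter mapping repeat number to its count in the final generation.
    """
    rng = random.Random(seed)
    population = [start_repeats] * pop_size
    for _ in range(generations):
        # Genetic drift: each copy picks a parent at random, with replacement.
        population = [rng.choice(population) for _ in range(pop_size)]
        # Mutation: gain or lose exactly one repeat, never dropping below one.
        for i in range(pop_size):
            if rng.random() < mu:
                step = rng.choice((-1, 1))
                population[i] = max(1, population[i] + step)
    return Counter(population)
```

Running the function with different seeds and tabulating the resulting allele-size distributions is one way to see how often distributions with more than one mode arise by chance under mutation and drift alone.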

The evolution of paired domains

Relevance:

30.00%

Publisher:

Abstract:

Pax genes are important developmental control genes. They are involved in nervous system development, organogenesis and oncogenesis. A specific DNA-binding domain called the paired domain, which is well conserved during evolution, defines Pax genes. Furthermore, Pax genes are also conserved in terms of their functions. For example, the Pax-6 gene has been shown to be one of the master control genes for eye development in both Drosophila and vertebrates. All of these properties of Pax genes make them an excellent model for studying the evolution of gene function. Molecular evolutionary studies of the paired domain are carried out in this study. Five Pax genes from cnidarians, which are the most primitive organisms possessing a nervous system, were isolated and characterized for their DNA binding properties. By combining data obtained from GenBank and this study, the phylogenetic relationship between Pax genes was studied. It was found that Pax genes could be divided into five groups: Pax-1/9, Pax-3/7, Pax-A, Pax-2/5/8/B, and Pax-4/6. Furthermore, Pax-2/5/8/B, Pax-A and Pax-4/6 could be clustered into supergroup I, while Pax-1/9 and Pax-3/7 could be clustered into supergroup II. The phylogeny was also supported by studies on DNA binding properties of paired domains from different groups. A statistical method was applied to infer the critical amino acid residue substitutions between the two supergroups and within supergroup I. It was found that two amino acid residues were mainly responsible for the difference in DNA binding between the two supergroups, while only one amino acid was critical for the evolution of novel DNA binding properties of the Pax-4/6 group from its ancestor. Evolutionary implications of these data are also discussed.
