984 results for Gene Duplication
Abstract:
Knowledge of the origin and evolution of gene families is critical to our understanding of the evolution of protein function. To gain a detailed understanding of the evolution of the small heat shock proteins (sHSPs) in plants, we have examined the evolutionary history of the chloroplast (CP)-localized sHSPs. Previously, these nuclear-encoded CP proteins had been identified only from angiosperms. This study reveals the presence of the CP sHSPs in a moss, Funaria hygrometrica. Two clones representing two distinct CP sHSP genes were isolated from an F. hygrometrica heat shock cDNA library. Our analysis of the CP sHSPs reveals unexpected evolutionary relationships and patterns of sequence conservation. Phylogenetic analysis of the CP sHSPs with other plant CP sHSPs and eukaryotic, archaeal, and bacterial sHSPs shows that the CP sHSPs are not closely related to the cyanobacterial sHSPs. Thus, they most likely evolved via gene duplication from a nuclear-encoded cytosolic sHSP and not via gene transfer from the CP endosymbiont. Previous sequence analysis had shown that all angiosperm CP sHSPs possess a methionine-rich region in the N-terminal domain. The primary sequence of this region is not highly conserved in the F. hygrometrica CP sHSPs. This lack of sequence conservation indicates that sometime in land plant evolution, after the divergence of mosses from the common ancestor of angiosperms but before the monocot–dicot divergence, there was a change in the selective constraints acting on the CP sHSPs.
Abstract:
Cryptocyanin, a copper-free hexameric protein in crab (Cancer magister) hemolymph, has been characterized and the amino acid sequence has been deduced from its cDNA. It is markedly similar in sequence, size, and structure to hemocyanin, the copper-containing oxygen-transport protein found in many arthropods. Cryptocyanin does not bind oxygen, however, and lacks three of the six highly conserved copper-binding histidine residues of hemocyanin. Cryptocyanin has no phenoloxidase activity, although a phenoloxidase is present in the hemolymph. The concentration of cryptocyanin in the hemolymph is closely coordinated with the molt cycle and reaches levels higher than hemocyanin during premolt. Cryptocyanin resembles insect hexamerins in the lack of copper, molt cycle patterns of biosynthesis, and potential contributions to the new exoskeleton. Phylogenetic analysis of sequence similarities between cryptocyanin and other members of the hemocyanin gene family shows that cryptocyanin is closely associated with crustacean hemocyanins and suggests that cryptocyanin arose as a result of a hemocyanin gene duplication. The presence of both hemocyanin and cryptocyanin in one animal provides an example of how insect hexamerins might have evolved from hemocyanin. Our results suggest that multiple members of the hemocyanin gene family—hemocyanin, cryptocyanin, phenoloxidase, and hexamerins—may participate in two vital functions of molting animals, oxygen binding and molting. Cryptocyanin may provide important molecular data to further investigate evolutionary relationships among all molting animals.
Abstract:
This paper describes three distinct estrogen receptor (ER) subtypes: ERα, ERβ, and a unique type, ERγ, cloned from a teleost fish, the Atlantic croaker Micropogonias undulatus. This is the first identification of a third type of classical ER in a vertebrate species. Phylogenetic analysis shows that ERγ arose through gene duplication from ERβ early in the teleost lineage and indicates that ERγ is present in other teleosts, although it has not been recognized as such. The Atlantic croaker ERγ shows amino acid differences in regions important for ligand binding and receptor activation that are conserved in all other ERγs. The three ER subtypes are genetically distinct and have different distribution patterns in Atlantic croaker tissues. In addition, ERβ and ERγ fusion proteins can each bind estradiol-17β with high affinity. The presence of three functional ERs in one species expands the role of ER multiplicity in estrogen signaling systems and provides a unique opportunity to investigate the dynamics and mechanisms of ER evolution.
Abstract:
The aryl hydrocarbon receptor (AHR) is a ligand-activated transcription factor through which halogenated aromatic hydrocarbons such as 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) cause altered gene expression and toxicity. The AHR belongs to the basic helix–loop–helix/Per-ARNT-Sim (bHLH-PAS) family of transcriptional regulatory proteins, whose members play key roles in development, circadian rhythmicity, and environmental homeostasis; however, the normal cellular function of the AHR is not yet known. As part of a phylogenetic approach to understanding the function and evolutionary origin of the AHR, we sequenced the PAS homology domain of AHRs from several species of early vertebrates and performed phylogenetic analyses of these AHR amino acid sequences in relation to mammalian AHRs and 24 other members of the PAS family. AHR sequences were identified in a teleost (the killifish Fundulus heteroclitus), two elasmobranch species (the skate Raja erinacea and the dogfish Mustelus canis), and a jawless fish (the lamprey Petromyzon marinus). Two putative AHR genes, designated AHR1 and AHR2, were found both in Fundulus and Mustelus. Phylogenetic analyses indicate that the AHR2 genes in these two species are orthologous, suggesting that an AHR gene duplication occurred early in vertebrate evolution and that multiple AHR genes may be present in other vertebrates. Database searches and phylogenetic analyses identified four putative PAS proteins in the nematode Caenorhabditis elegans, including possible AHR and ARNT homologs. Phylogenetic analysis of the PAS gene family reveals distinct clades containing both invertebrate and vertebrate PAS family members; the latter include paralogous sequences that we propose have arisen by gene duplication early in vertebrate evolution. 
Overall, our analyses indicate that the AHR is a phylogenetically ancient protein present in all living vertebrate groups (with a possible invertebrate homolog), thus providing an evolutionary perspective to the study of dioxin toxicity and AHR function.
Abstract:
Gnathostome vertebrates have multiple members of the Dlx family of transcription factors that are expressed during the development of several tissues considered to be vertebrate synapomorphies, including the forebrain, cranial neural crest, placodes, and pharyngeal arches. The Dlx gene family thus presents an ideal system in which to examine the relationship between gene duplication and morphological innovation during vertebrate evolution. Toward this end, we have cloned Dlx genes from the lamprey Petromyzon marinus, an agnathan vertebrate that occupies a critical phylogenetic position between cephalochordates and gnathostomes. We have identified four Dlx genes in P. marinus, whose orthology with gnathostome Dlx genes provides a model for how this gene family evolved in the vertebrate lineage. Differential expression of these lamprey Dlx genes in the forebrain, cranial neural crest, pharyngeal arches, and sensory placodes of lamprey embryos provides insight into the developmental evolution of these structures as well as a model of regulatory evolution after Dlx gene duplication events.
Abstract:
Pseudogenes are non-functioning copies of genes in genomic DNA, which may either result from reverse transcription from an mRNA transcript (processed pseudogenes) or from gene duplication and subsequent disablement (non-processed pseudogenes). As pseudogenes are apparently ‘dead’, they usually have a variety of obvious disablements (e.g., insertions, deletions, frameshifts and truncations) relative to their functioning homologs. We have derived an initial estimate of the size, distribution and characteristics of the pseudogene population in the Caenorhabditis elegans genome, performing a survey in ‘molecular archaeology’. Corresponding to the 18 576 annotated proteins in the worm (i.e., in Wormpep18), we have found an estimated total of 2168 pseudogenes, about one for every eight genes. Few of these appear to be processed. Details of our pseudogene assignments are available from http://bioinfo.mbb.yale.edu/genome/worm/pseudogene. The population of pseudogenes differs significantly from that of genes in a number of respects: (i) pseudogenes are distributed unevenly across the genome relative to genes, with a disproportionate number on chromosome IV; (ii) the density of pseudogenes is higher on the arms of the chromosomes; (iii) the amino acid composition of pseudogenes is midway between that of genes and (translations of) random intergenic DNA, with enrichment of Phe, Ile, Leu and Lys, and depletion of Asp, Ala, Glu and Gly relative to the worm proteome; and (iv) the most common protein folds and families differ somewhat between genes and pseudogenes: whereas the most common fold found in the worm proteome is the immunoglobulin fold, the most common ‘pseudofold’ is the C-type lectin. In addition, the size of a gene family bears little overall relationship to the size of its corresponding pseudogene complement, indicating a highly dynamic genome. There are in fact a number of families associated with large populations of pseudogenes.
For example, one family of seven-transmembrane receptors (represented by gene B0334.7) has one pseudogene for every four genes, and another uncharacterized family (represented by gene B0403.1) is approximately two-thirds pseudogenic. Furthermore, over a hundred apparent pseudogenic fragments do not have any obvious homologs in the worm.
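The headline ratio in the abstract above can be checked directly from the two reported counts. The short Python sketch below is only an illustration: the variable names are ours, not the study's, and the family-level fraction assumes that "one pseudogene for every four genes" means one pseudogene per four functional genes.

```python
# Counts reported in the abstract (Wormpep18 annotation).
annotated_proteins = 18576
estimated_pseudogenes = 2168

# Genes per pseudogene; the abstract rounds this to
# "about one for every eight genes".
genes_per_pseudogene = annotated_proteins / estimated_pseudogenes

# Assumption: "one pseudogene for every four genes" is read as 1 pseudogene
# per 4 functional genes, so 1 of every 5 family members is pseudogenic.
b0334_7_pseudogenic_fraction = 1 / (1 + 4)

print(f"{genes_per_pseudogene:.2f} genes per pseudogene")
print(f"B0334.7 family pseudogenic fraction: {b0334_7_pseudogenic_fraction:.2f}")
```

Under that reading, the B0334.7 family is far less pseudogenic than the B0403.1 family, which the abstract describes as approximately two-thirds pseudogenic.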
Abstract:
Plant chloroplasts originated from an endosymbiotic event by which an ancestor of contemporary cyanobacteria was engulfed by an early eukaryotic cell and then transformed into an organelle. Oxygenic photosynthesis is the specific feature of cyanobacteria and chloroplasts, and the photosynthetic machinery resides in an internal membrane system, the thylakoids. The origin and genesis of thylakoid membranes, which are essential for oxygenic photosynthesis, are still an enigma. Vipp1 (vesicle-inducing protein in plastids 1) is a protein located in both the inner envelope and the thylakoids of Pisum sativum and Arabidopsis thaliana. In Arabidopsis disruption of the VIPP1 gene severely affects the plant's ability to form properly structured thylakoids and as a consequence to carry out photosynthesis. In contrast, Vipp1 in Synechocystis appears to be located exclusively in the plasma membrane. Yet, as in higher plants, disruption of the VIPP1 gene locus leads to the complete loss of thylakoid formation. So far VIPP1 genes are found only in organisms carrying out oxygenic photosynthesis. They share sequence homology with a subunit encoded by the bacterial phage shock operon (PspA) but differ from PspA by a C-terminal extension of about 30 amino acids. In two cyanobacteria, Synechocystis and Anabaena, both a VIPP1 and a pspA gene are present, and phylogenetic analysis indicates that VIPP1 originated from a gene duplication of the latter and thereafter acquired its new function. It also appears that the C-terminal extension that discriminates VIPP1 proteins from PspA is important for its function in thylakoid formation.
Abstract:
Sm proteins form the core of small nuclear ribonucleoprotein particles (snRNPs), making them key components of several mRNA-processing assemblies, including the spliceosome. We report the 1.75-Å crystal structure of SmAP, an Sm-like archaeal protein that forms a heptameric ring perforated by a cationic pore. In addition to providing direct evidence for such an assembly in eukaryotic snRNPs, this structure (i) shows that SmAP homodimers are structurally similar to human Sm heterodimers, (ii) supports a gene duplication model of Sm protein evolution, and (iii) offers a model of SmAP bound to single-stranded RNA (ssRNA) that explains Sm binding-site specificity. The pronounced electrostatic asymmetry of the SmAP surface imparts directionality to putative SmAP–RNA interactions.
Abstract:
We analyze the evolutionary dynamics of three of the best-studied plant nuclear multigene families. The data analyzed derive from the genes that encode the small subunit of ribulose-1,5-bisphosphate carboxylase (rbcS), the gene family that encodes the enzyme chalcone synthase (Chs), and the gene family that encodes alcohol dehydrogenases (Adh). In addition, we consider the limited evolutionary data available on plant transposable elements. New Chs and rbcS genes appear to be recruited at about 10 times the rate estimated for Adh genes, and this is correlated with a much smaller average gene family size for Adh genes. In addition, duplication and divergence in function appear to be relatively common for Chs genes in flowering plant evolution. Analyses of synonymous nucleotide substitution rates for Adh genes in monocots reject a linear relationship with clock time. Replacement substitution rates vary with time in a complex fashion, which suggests that adaptive evolution has played an important role in driving divergence following gene duplication events. Molecular population genetic studies of Adh and Chs genes reveal high levels of molecular diversity within species. These studies also reveal that inter- and intralocus recombination are important forces in the generation of allelic novelties. Moreover, illegitimate recombination events appear to be an important factor in transposable element loss in plants. When we consider the recruitment and loss of new gene copies, the generation of allelic diversity within plant species, and ectopic exchange among transposable elements, we conclude that recombination is a pervasive force at all levels of plant evolution.
Abstract:
Concerted evolution is often invoked to explain the diversity and evolution of the multigene families of major histocompatibility complex (MHC) genes and immunoglobulin (Ig) genes. However, this hypothesis has been controversial because the member genes of these families from the same species are not necessarily more closely related to one another than to the genes from different species. To resolve this controversy, we conducted phylogenetic analyses of several multigene families of the MHC and Ig systems. The results show that the evolutionary pattern of these families is quite different from that of concerted evolution but is in agreement with the birth-and-death model of evolution in which new genes are created by repeated gene duplication and some duplicate genes are maintained in the genome for a long time but others are deleted or become nonfunctional by deleterious mutations. We found little evidence that interlocus gene conversion plays an important role in the evolution of MHC and Ig multigene families.
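The birth-and-death model described above can be illustrated with a toy simulation. The Python sketch below is not the authors' method: the function name, the per-generation probabilities, and the one-event-per-gene rule are all assumptions chosen only to show how repeated duplication, inactivation, and deletion change the size of a gene family over time.

```python
import random


def simulate_birth_death(n_genes=10, generations=200, p_dup=0.01,
                         p_inactivate=0.005, p_delete=0.005, seed=0):
    """Toy birth-and-death process for one gene family (illustrative only).

    In each generation every functional gene experiences at most one event:
    duplication (birth), inactivation into a pseudogene, or deletion.
    All rates are hypothetical values, not estimates from the study.
    """
    rng = random.Random(seed)
    functional, pseudogenes, deleted, births = n_genes, 0, 0, 0
    for _ in range(generations):
        new_copies = inactivated = removed = 0
        for _ in range(functional):
            r = rng.random()
            if r < p_dup:
                new_copies += 1      # birth: a new duplicate gene
            elif r < p_dup + p_inactivate:
                inactivated += 1     # death: gene becomes nonfunctional
            elif r < p_dup + p_inactivate + p_delete:
                removed += 1         # death: gene is deleted from the genome
        functional += new_copies - inactivated - removed
        pseudogenes += inactivated
        deleted += removed
        births += new_copies
    return {"functional": functional, "pseudogenes": pseudogenes,
            "deleted": deleted, "births": births}


if __name__ == "__main__":
    print(simulate_birth_death())
```

Because every gene is tracked from birth to one of three fates, the functional, pseudogene, and deleted counts always sum to the starting family size plus the number of duplications, which gives a simple sanity check on any run.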
Abstract:
Genetic mapping of wheat, maize, and rice and other grass species with common DNA probes has revealed remarkable conservation of gene content and gene order over the 60 million years of radiation of Poaceae. The linear organization of genes in some nine different genomes differing in basic chromosome number from 5 to 12 and nuclear DNA amount from 400 to 6,000 Mb, can be described in terms of only 25 “rice linkage blocks.” The extent to which this intergenomic colinearity is confounded at the micro level by gene duplication and micro-rearrangements is still an open question. Nevertheless, it is clear that the elucidation of the organization of the economically important grasses with larger genomes, such as maize (2n = 10, 4,500 Mb DNA), will, to a greater or lesser extent, be predicted from sequence analysis of smaller genomes such as rice, with only 400 Mb, which in turn may be greatly aided by knowledge of the entire sequence of Arabidopsis, which may be available as soon as the turn of the century. Comparative genetics will provide the key to unlock the genomic secrets of crop plants with bigger genomes than Homo sapiens.
Abstract:
The alcohol dehydrogenase (Adh; alcohol:NAD+ oxidoreductase, EC 1.1.1.1) gene family has two or three loci in a broad array of angiosperm species. The relative stability in the number of Adh loci led Gottlieb [Gottlieb, L. D. (1982) Science 216, 373-380] to propose that the Adh gene family arose from an ancient gene duplication. In this study, the isolation of three loci from the California fan palm (Washingtonia robusta) is reported. The three loci from palm are highly diverged. One palm Adh gene, referred to here as adhB, has been completely sequenced, including 950 nucleotides of the upstream regulatory region. For the second locus, adhA, 81% of the exon sequence is complete. Both show the same basic structure as grass Adh genes in terms of intron number and intron location. The third locus, adhC, for which only a small amount of sequence is available (12% of exon sequence) appears to be more highly diverged. Comparison of the Adh gene families from palms and grasses shows that the adh1 and adh2 genes of grasses, and the adhA and adhB genes of palms, arose by duplication following the divergence of the two families. This finding suggests that the multiple Adh loci in different monocot lineages are not the result of a single ancestral duplication but, rather, of multiple duplication events.
Abstract:
The genes for the protein synthesis elongation factors Tu (EF-Tu) and G (EF-G) are the products of an ancient gene duplication, which appears to predate the divergence of all extant organismal lineages. Thus, it should be possible to root a universal phylogeny based on either protein using the second protein as an outgroup. This approach was originally taken independently with two separate gene duplication pairs, (i) the regulatory and catalytic subunits of the proton ATPases and (ii) the protein synthesis elongation factors EF-Tu and EF-G. Questions about the orthology of the ATPase genes have obscured the former results, and the elongation factor data have been criticized for inadequate taxonomic representation and alignment errors. We have expanded the latter analysis using a broad representation of taxa from all three domains of life. All phylogenetic methods used strongly place the root of the universal tree between two highly distinct groups, the archaeons/eukaryotes and the eubacteria. We also find that a combined data set of EF-Tu and EF-G sequences favors placement of the eukaryotes within the Archaea, as the sister group to the Crenarchaeota. This relationship is supported by bootstrap values of 60-89% with various distance and maximum likelihood methods, while unweighted parsimony gives 58% support for archaeal monophyly.
Abstract:
Odorant receptors (ORs) on nasal olfactory sensory neurons are encoded by a large multigene family. Each member of the family is expressed in a small percentage of neurons that are confined to one of several spatial zones in the nose but are randomly distributed throughout that zone. This pattern of expression suggests that when the sensory neuron selects which OR gene to express it may be confined to a particular zonal gene set of several hundred OR genes but select from among the members of that set via a stochastic mechanism. Both locus-dependent and locus-independent models of OR gene choice have been proposed. To investigate the feasibility of these models, we determined the chromosomal locations of 21 OR genes expressed in four different spatial zones. We found that OR genes are clustered within multiple loci that are broadly distributed in the genome. These loci lie within paralogous chromosomal regions that appear to have arisen by duplications of large chromosomal domains followed by extensive gene duplication and divergence. Our studies show that OR genes expressed in the same zone map to numerous loci; moreover, a single locus can contain genes expressed in different zones. These findings raise the possibility that OR gene choice may be locus-independent or involve consecutive stochastic choices.
Abstract:
The myc gene family encodes a group of transcription factors that regulate cell proliferation and differentiation. These genes are widely studied because of their importance as proto-oncogenes. Phylogenetic analyses are described here for 45 Myc protein sequences representing c-, N-, L-, S-, and B-myc genes. A gene duplication early in vertebrate evolution produced the c-myc lineage and another lineage that later gave rise to the N- and L-myc lineages by another gene duplication. Evolutionary divergence in the myc gene family corresponds closely to the known branching order of the major vertebrate groups. The patterns of sequence evolution are described for five separate highly conserved regions, and these analyses show that differential rates of sequence divergence (= mosaic evolution) have occurred among conserved motifs. Further, the closely related dimerization partner protein Max exhibits significantly less sequence variability than Myc. It is suggested that the reduced variability in max stems from natural selection acting to preserve dimerization capability with products of myc and related genes.