47 results for protein evolution


Relevance: 30.00%

Abstract:

With the advancement of high-throughput sequencing and the dramatic increase in available genetic data, statistical modeling has become an essential part of the field of molecular evolution. Statistical modeling has led to many interesting discoveries in the field, from the detection of highly conserved or diverse regions in a genome to the phylogenetic inference of species' evolutionary history. Among the different types of genome sequences, protein-coding regions are particularly interesting due to their impact on proteins. The building blocks of proteins, i.e. amino acids, are coded by triplets of nucleotides, known as codons. Accordingly, studying the evolution of codons leads to a fundamental understanding of how proteins function and evolve. Current codon models can be classified into three principal groups: mechanistic codon models, empirical codon models and hybrid ones. The mechanistic models attract particular attention due to the clarity of their underlying biological assumptions and parameters. However, they suffer from simplifying assumptions that are required to overcome the burden of computational complexity. The main assumptions applied to the current mechanistic codon models are that (a) double and triple substitutions of nucleotides within codons are negligible, (b) there is no mutation variation among the nucleotides of a single codon and (c) the HKY nucleotide model is sufficient to capture the essence of transition-transversion rates at the nucleotide level. In this thesis, I develop a framework of mechanistic codon models, named the KCM-based model family framework, based on holding or relaxing the mentioned assumptions. Accordingly, eight different models are proposed from the eight combinations of holding or relaxing the assumptions, from the simplest one that holds all the assumptions to the most general one that relaxes all of them.
The models derived from the proposed framework allow me to investigate the biological plausibility of the three simplifying assumptions on real data sets, as well as to find the model that best matches the underlying characteristics of the data sets. Our experiments show that holding all three assumptions is not realistic for any of the real data sets. This means that using simple models that hold these assumptions can be misleading and can result in inaccurate parameter estimates. A second objective of the thesis is to develop a generalized mechanistic codon model that relaxes all three simplifying assumptions while remaining computationally efficient, using a matrix operation called the Kronecker product. Our experiments show that, on randomly chosen data sets, the proposed generalized mechanistic codon model outperforms other codon models with respect to the AICc metric in about half of the data sets. In addition, I show through several experiments that the proposed general model is biologically plausible.
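
To make the Kronecker product mentioned in this abstract concrete, here is a minimal sketch. It is not the thesis's KCM parameterization, which the abstract does not specify. It only shows how a 64x64 codon-level matrix can be assembled from three 4x4 nucleotide-level matrices, one per codon position. The function names and the toy probabilities are assumptions made for this example. Because the product multiplies per-position probabilities, double and triple changes receive non-zero probability, and each position can carry its own transition/transversion values.

```python
from itertools import product

NUCS = "TCAG"
PURINES = {"A", "G"}

def kron(a, b):
    """Kronecker product of two square matrices stored as lists of lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def toy_nucleotide_matrix(p_transition, p_transversion):
    """Row-stochastic 4x4 matrix with separate transition and transversion
    probabilities. The values passed in are illustrative, not estimates."""
    rows = []
    for x in NUCS:
        row = []
        for y in NUCS:
            if x == y:
                # one transition partner and two transversion partners
                row.append(1.0 - p_transition - 2.0 * p_transversion)
            elif (x in PURINES) == (y in PURINES):
                row.append(p_transition)      # purine<->purine or pyrimidine<->pyrimidine
            else:
                row.append(p_transversion)
        rows.append(row)
    return rows

def codon_matrix(pos1, pos2, pos3):
    """64x64 matrix over codons, assuming the three positions change independently."""
    return kron(kron(pos1, pos2), pos3)

# Codon labels in the same order as the rows/columns of codon_matrix().
CODONS = ["".join(c) for c in product(NUCS, repeat=3)]

if __name__ == "__main__":
    P = codon_matrix(toy_nucleotide_matrix(0.02, 0.01),
                     toy_nucleotide_matrix(0.01, 0.005),
                     toy_nucleotide_matrix(0.05, 0.02))
    i, j = CODONS.index("TTT"), CODONS.index("CCC")
    print("P(TTT -> CCC) =", P[i][j])  # a triple change, non-zero by construction
```

The sketch needs only 3 x 16 nucleotide-level entries to define all 4096 codon-level entries, which is one way a Kronecker construction can keep a more general model computationally tractable.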

Relevance: 30.00%

Abstract:

In addition to differences in protein-coding gene sequences, changes in expression resulting from mutations in regulatory sequences have long been hypothesized to be responsible for phenotypic differences between species. However, unlike comparison of genome sequences, few studies, generally restricted to pairwise comparisons of closely related mammalian species, have assessed between-species differences at the transcriptome level. They reported that gene expression evolves at different rates in various organs and in a pattern that is overall consistent with neutral models of evolution. In the first part of my thesis, I investigated the evolution of gene expression in therian mammals (i.e., placentals and marsupials), based on microarray data from human, mouse and the gray short-tailed opossum (Monodelphis domestica). In addition to autosomal genes, a special focus was given to the evolution of X-linked genes. The therian X chromosome was recently shown to be younger than previously thought and to harbor a specific gene content (e.g., genes involved in brain or reproductive functions) that is thought to have been shaped by specific sex-related evolutionary forces. Sex chromosomes derive from ordinary autosomes and their differentiation led to the degeneration of the Y chromosome (in mammals) or W chromosome (in birds). Consequently, X- or Z-linked genes differ in gene dose between males and females such that the heterogametic sex has half the X/Z gene dose compared to the ancestral state. To cope with this dosage imbalance, mammals have been reported to have evolved mechanisms of dosage compensation. In the first project, I could first show that transcriptomes evolve at different rates in different organs. Out of the five tissues I investigated, the testis is the most rapidly evolving organ at the gene expression level while the brain has the most conserved transcriptome.
Second, my analyses revealed that mammalian gene expression evolution is compatible with a neutral model, where the rate of change in gene expression levels is linked to the efficiency of purifying selection in a given lineage, which, in turn, is determined by the long-term effective population size in that lineage. Thus, the rate of DNA sequence evolution, which could be expected to determine the rate of regulatory sequence change, does not seem to be a major determinant of the rate of gene expression evolution. Thus, most gene expression changes seem to be (slightly) deleterious. Finally, X-linked genes seem to have experienced elevated rates of gene expression change during the early stage of X evolution. To further investigate the evolution of mammalian gene expression, we generated an extensive RNA-Seq gene expression dataset for nine mammalian species and a bird. The analyses of this dataset confirmed the patterns previously observed with microarrays and helped to significantly deepen our view on gene expression evolution. In a specific project based on these data, I sought to assess in detail patterns of evolution of dosage compensation in amniotes. My analyses revealed the absence of male to female dosage compensation in monotremes and its presence in marsupials and, in addition, confirmed patterns previously described for placental mammals and birds. I then assessed the global level of expression of X/Z chromosomes and contrasted this with its ancestral gene expression levels estimated from orthologous autosomal genes in species with non-homologous sex chromosomes. This analysis revealed a lack of up-regulation for placental mammals, the level of expression of X-linked genes being proportional to gene dose. Interestingly, the ancestral gene expression level was at least partially restored in marsupials as well as in the heterogametic sex of monotremes and birds.
Finally, I investigated alternative mechanisms of dosage compensation and found that gene duplication did not seem to be a widespread mechanism to restore the ancestral gene dose. However, I could show that placental mammals have preferentially down-regulated autosomal genes interacting with X-linked genes which underwent gene expression decrease, and thus identified a novel alternative mechanism of dosage compensation.

Relevance: 30.00%

Abstract:

BACKGROUND: The expansion of amino acid repeats is determined by a high mutation rate and can be increased or limited by selection. It has been suggested that recent expansions could be associated with the potential of adaptation to new environments. In this work, we quantify the strength of this association, as well as the contribution of potential confounding factors. RESULTS: Mammalian positively selected genes have accumulated more recent amino acid repeats than other mammalian genes. However, we found little support for an accelerated evolutionary rate as the main driver for the expansion of amino acid repeats. The most significant predictors of amino acid repeats are gene function and GC content. There is no correlation with expression level. CONCLUSIONS: Our analyses show that amino acid repeat expansions are causally independent from protein adaptive evolution in mammalian genomes. Relaxed purifying selection or positive selection do not associate with more or more recent amino acid repeats. Their occurrence is slightly favoured by the sequence context but mainly determined by the molecular function of the gene.

Relevance: 30.00%

Abstract:

Ever since the pre-molecular era, the birth of new genes with novel functions has been considered to be a major contributor to adaptive evolutionary innovation. Here, I review the origin and evolution of new genes and their functions in eukaryotes, an area of research that has made rapid progress in the past decade thanks to the genomics revolution. Indeed, recent work has provided initial whole-genome views of the different types of new genes for a large number of different organisms. The array of mechanisms underlying the origin of new genes is compelling, extending way beyond the traditionally well-studied source of gene duplication. Thus, it was shown that novel genes also regularly arose from messenger RNAs of ancestral genes, protein-coding genes metamorphosed into new RNA genes, genomic parasites were co-opted as new genes, and that both protein and RNA genes were composed from scratch (i.e., from previously nonfunctional sequences). These mechanisms then also contributed to the formation of numerous novel chimeric gene structures. Detailed functional investigations uncovered different evolutionary pathways that led to the emergence of novel functions from these newly minted sequences and, with respect to animals, attributed a potentially important role to one specific tissue--the testis--in the process of gene birth. Remarkably, these studies also demonstrated that novel genes of the various types significantly impacted the evolution of cellular, physiological, morphological, behavioral, and reproductive phenotypic traits. Consequently, it is now firmly established that new genes have indeed been major contributors to the origin of adaptive evolutionary novelties.

Relevance: 30.00%

Abstract:

In recent years, protein-ligand docking has become a powerful tool for drug development. Although several approaches suitable for high-throughput screening are available, there is a need for methods able to identify binding modes with high accuracy. This accuracy is essential to reliably compute the binding free energy of the ligand. Such methods are needed when the binding mode of lead compounds is not determined experimentally but is needed for structure-based lead optimization. We present here a new docking software, called EADock, that aims at this goal. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure, and in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure, excluding the latter. This validation illustrates the efficiency of our sampling strategy, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures could be explained by the presence of crystal contacts in the experimental structure. Finally, the ability of EADock to accurately predict binding modes on a real application was illustrated by the successful docking of the RGD cyclic pentapeptide on the alphaVbeta3 integrin, starting far away from the binding pocket.
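
The success criterion described in this abstract (a predicted binding mode counts as correct when its RMSD to the crystal structure is below 2 Å) can be sketched as follows. This is an illustration only, not EADock code: the function names and toy coordinates are assumptions, and the sketch assumes both poses list the same atoms in the same order and already share the receptor's coordinate frame, so no superposition step is applied.

```python
import math

def rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses.

    Each pose is a sequence of (x, y, z) atom coordinates in angstroms,
    with atoms listed in the same order in both poses.
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))

def is_correct_binding_mode(pose, crystal_pose, threshold=2.0):
    """True when the pose lies within `threshold` angstroms RMSD of the crystal pose."""
    return rmsd(pose, crystal_pose) < threshold

if __name__ == "__main__":
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    near = [(x + 1.0, y, z) for x, y, z in crystal]   # shifted by 1 angstrom
    far = [(x + 3.0, y, z) for x, y, z in crystal]    # shifted by 3 angstroms
    print(is_correct_binding_mode(near, crystal))
    print(is_correct_binding_mode(far, crystal))
```

A rigid shift of every atom by d angstroms gives an RMSD of exactly d, which makes the two toy poses easy to check by hand against the 2 Å cutoff.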

Relevance: 30.00%

Abstract:

Post-translational modifications such as proteolytic processing, phosphorylation, and glycosylation add extra layers of complexity to proteomes and allow a finely tuned regulation of the activity of many proteins. The evolutionarily conserved cell-cycle and transcriptional regulator HCF-1 is regulated by proteolytic maturation via which a stable heterodimeric complex of two cleaved subunits is formed from a single precursor protein. The human HCF-1 precursor is cleaved at six nearly identical 26 amino acid sequence repeats, called HCF-1pro repeats, which represent uncommon protease recognition sites dedicated to human HCF-1 proteolysis. This proteolytic maturation process is conserved in vertebrate HCF-1 homologues and is essential for the functions of the human protein in cell-cycle regulation; the mechanisms that execute and control HCF-1 proteolysis, however, remain poorly understood. In this dissertation I investigate the mechanisms of proteolytic maturation of HCF-1 proteins in different species. I show that the Drosophila homolog of human HCF-1, called dHCF, is proteolytically cleaved via a different mechanism than human HCF-1. dHCF is processed by the same protease, called Taspase1, which cleaves one of the key developmental regulators in flies, the Trithorax protein. Maturation of HCF proteins via Taspase1 cleavage is probably not particular to dHCF as many invertebrate HCF proteins, particularly in insects and flatworms, possess Taspase1 recognition sites. In contrast, the vertebrate HCF-1 proteins lack Taspase1 recognition sites and the HCF-1pro repeats are not Taspase1 substrates, suggesting that multiple mechanisms for HCF-1 proteolytic maturation have appeared during evolution. I also show that the proteolytic activity responsible for the cleavage of the HCF-1pro repeats is very difficult to characterize, being resistant to most protease inhibitors and very sensitive to biochemical fractionation.
Moreover, the HCF-1pro repeats represent complex protease recognition sites and I demonstrate that, in addition to being the HCF-1 cleavage sites, these repeated sequences also recruit the O-GlcNAc transferase OGT. The OGT protein and the O-GlcNAc modification of HCF-1 are both important for HCF-1pro repeat proteolysis. Interestingly, a human recombinant OGT purified from insect cells is able to induce cleavage of an HCF-1pro-repeat precursor in vitro, indicating that OGT either (i) induces HCF-1 autoproteolysis, (ii) is the HCF-1pro-repeat proteolytic activity itself, or (iii) physically associates with a proteolytic activity that is conserved in insect cells and was co-purified with OGT. In any case, OGT plays an important role in HCF-1 proteolytic maturation and perhaps a broader role in HCF-1 biological function.

Relevance: 30.00%

Abstract:

We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

Relevance: 30.00%

Abstract:

Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein-coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high-throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.

Relevance: 30.00%

Abstract:

Recent years have seen a significant increase in understanding of the host genetic and genomic determinants of susceptibility to HIV-1 infection and disease progression, driven in large part by candidate gene studies, genome-wide association studies, genome-wide transcriptome analyses, and large-scale in vitro genome screens. These studies have identified common variants in some host loci that clearly influence disease progression, characterized the scale and dynamics of gene and protein expression changes in response to infection, and provided the first comprehensive catalogs of genes and pathways involved in viral replication. Experimental models of AIDS and studies in natural hosts of primate lentiviruses have complemented and in some cases extended these findings. As the relevant technology continues to progress, the expectation is that such studies will increase in depth (e.g., to include host whole exome and whole genome sequencing) and in breadth (in particular, by integrating multiple data types).

Relevance: 30.00%

Abstract:

Only a very small fraction of long noncoding RNAs (lncRNAs) are well characterized. The evolutionary history of lncRNAs can provide insights into their functionality, but the absence of lncRNA annotations in non-model organisms has precluded comparative analyses. Here we present a large-scale evolutionary study of lncRNA repertoires and expression patterns in 11 tetrapod species. We identify approximately 11,000 primate-specific lncRNAs and 2,500 highly conserved lncRNAs, including approximately 400 genes that are likely to have originated more than 300 million years ago. We find that lncRNAs, in particular ancient ones, are in general actively regulated and may function predominantly in embryonic development. Most lncRNAs evolve rapidly in terms of sequence and expression levels, but tissue specificities are often conserved. We compared expression patterns of homologous lncRNA and protein-coding families across tetrapods to reconstruct an evolutionarily conserved co-expression network. This network suggests potential functions for lncRNAs in fundamental processes such as spermatogenesis and synaptic transmission, but also in more specific mechanisms such as placenta development through microRNA production.

Relevance: 30.00%

Abstract:

A heme-containing transmembrane ferric reductase domain (FRD) is found in bacterial and eukaryotic protein families, including ferric reductases (FRE) and NADPH oxidases (NOX). The aim of this study was to understand the phylogeny of the FRD superfamily. Bacteria contain FRD proteins consisting only of the ferric reductase domain, such as YedZ and short bFRE proteins. Full-length FRE and NOX enzymes are mostly found in eukaryotic cells and all possess a dehydrogenase domain, allowing them to catalyze electron transfer from cytosolic NADPH to extracellular metal ions (FRE) or oxygen (NOX). Metazoa possess YedZ-related STEAP proteins, possibly derived from bacteria through horizontal gene transfer. Phylogenetic analyses suggest that FRE enzymes appeared early in evolution, followed by a transition towards EF-hand-containing NOX enzymes (NOX5- and DUOX-like). An ancestral gene of the NOX(1-4) family probably lost the EF-hands and new regulatory mechanisms of increasing complexity evolved in this clade. Two signature motifs were identified: NOX enzymes are distinguished from FRE enzymes through a four amino acid motif spanning from transmembrane domain 3 (TM3) to TM4, and YedZ/STEAP proteins are identified by the replacement of the first canonical heme-spanning histidine by a highly conserved arginine. The FRD superfamily most likely originated in bacteria.

Relevance: 30.00%

Abstract:

BACKGROUND: The increasing number of completely sequenced bacterial genomes allows comparing their architecture and genetic makeup. Such new information highlights the crucial role of lateral genetic exchanges in bacterial evolution and speciation. RESULTS: Here we analyzed the twelve sequenced genomes of Streptococcus pyogenes by a naïve approach that examines the preferential nucleotide usage along the chromosome, namely the usage of G versus C (GC-skew) and T versus A (TA-skew). The cumulative GC-skew plot presented an inverted V-shape composed of two symmetrical linear segments, where the minimum and maximum corresponded to the origin and terminus of DNA replication. In contrast, the cumulative TA-skew presented a V-shape, whose segments were interrupted by several steep-slope regions (SSRs), indicative of a different nucleotide composition bias. Each S. pyogenes genome contained up to nine individual SSRs, encompassing all described strain-specific prophages. In addition, each genome contained a similar unique non-phage SSR, the core of which consisted of 31 highly homologous genes. This core includes the M-protein, other mga-related factors and other virulence genes, totaling ten intrinsic virulence genes. In addition to a high content of virulence-related genes and a peculiar nucleotide bias, this SSR, which is 47 kb long in an M1GAS strain, harbors direct repeats and a tRNA gene, suggesting a mobile element. Moreover, its complete absence in an M-protein-negative group A Streptococcus natural isolate demonstrates that it could be spontaneously lost, but in vitro deletion experiments indicate that its excision occurred at a very low rate. The stability of this SSR, combined with its presence in all sequenced S. pyogenes genomes, suggests that it results from an ancient acquisition. CONCLUSION: Thus, this non-phage SSR is compatible with a pathogenicity island, acquired before S. pyogenes speciation.
Its potential excision might bear relevance for vaccine development, because vaccines targeting M-protein might select for M-protein-negative variants that still carry other virulence determinants.
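
The cumulative skew analysis described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the abstract does not give the exact formula or window size, so the sketch assumes the common windowed definition (first - second) / (first + second), summed cumulatively along the sequence, and the function name and window length are hypothetical.

```python
def cumulative_skew(sequence, first, second, window=1000):
    """Cumulative nucleotide skew along a sequence.

    For each non-overlapping window the skew is (n_first - n_second) /
    (n_first + n_second); the returned list holds the running sum after
    each window. Use first="G", second="C" for GC-skew and first="T",
    second="A" for TA-skew. Windows containing neither base add zero.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = sequence.upper()
    running_total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        n_first = chunk.count(first)
        n_second = chunk.count(second)
        if n_first + n_second:
            running_total += (n_first - n_second) / (n_first + n_second)
        curve.append(running_total)
    return curve

def extremum_windows(curve):
    """Indices of the windows where the cumulative curve is lowest and highest."""
    lowest = min(range(len(curve)), key=curve.__getitem__)
    highest = max(range(len(curve)), key=curve.__getitem__)
    return lowest, highest

if __name__ == "__main__":
    # Toy sequence: a G-rich stretch followed by a C-rich stretch.
    toy = "GGGA" * 5 + "CCCA" * 5
    gc_curve = cumulative_skew(toy, "G", "C", window=4)
    print(gc_curve)
    print(extremum_windows(gc_curve))
```

On a real chromosome, abrupt changes in the slope of the TA-skew curve would be the candidates for the steep-slope regions the abstract describes; detecting them would need a slope threshold, which the abstract does not specify.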

Relevance: 30.00%

Abstract:

The Xenopus laevis 68-kd and 74-kd albumin amino acid sequences are examined with respect to their relationship to the other known members of the albumin/alpha-fetoprotein/vitamin D-binding protein gene family. Each of the three members of this family presents a unique pattern of conserved regions indicating a differential selective pressure related to specific functional characteristics. Furthermore, an evolutionary tree of these genes was deduced from the divergence times calculated from direct nucleotide sequence comparisons of individual gene pairs. These calculations indicate that the vitamin D-binding protein/albumin separation occurred 560-600 million years (Myr) ago and the albumin/alpha-fetoprotein divergence 280 Myr ago. This observation leads to the hypothesis that the albumin/alpha-fetoprotein gene duplication occurred shortly after the amphibian/reptile separation. Consequently, and unlike mammals, amphibians and fishes should lack an alpha-fetoprotein in their serum at larval stages, which is consistent with a recent analysis of serum proteins in Xenopus laevis larvae. This hypothesis will now have to be tested further in additional lower vertebrates.

Relevance: 30.00%

Abstract:

TWEAK (TNF homologue with weak apoptosis-inducing activity) and Fn14 (fibroblast growth factor-inducible protein 14) are members of the tumor necrosis factor (TNF) ligand and receptor super-families. Having observed that Xenopus Fn14 cross-reacts with human TWEAK, despite its relatively low sequence homology to human Fn14, we examined the conservation in tertiary fold and binding interfaces between the two species. Our results, combining NMR solution structure determination, binding assays, extensive site-directed mutagenesis and molecular modeling, reveal that, in addition to the known and previously characterized β-hairpin motif, the helix-loop-helix motif makes an essential contribution to the receptor/ligand binding interface. We further discuss the insight provided by the structural analyses regarding how the cysteine-rich domains of the TNF receptor super-family may have evolved over time. DATABASE: Structural data are available in the Protein Data Bank/BioMagResBank databases under the accession codes 2KMZ, 2KN0 and 2KN1 and 17237, 17247 and 17252. STRUCTURED DIGITAL ABSTRACT: TWEAK binds to hFn14 by surface plasmon resonance; xeFn14 binds to TWEAK by enzyme-linked immunosorbent assay; TWEAK binds to xeFn14 by surface plasmon resonance; hFn14 binds to TWEAK by enzyme-linked immunosorbent assay.

Relevance: 30.00%

Abstract:

Alternative splicing (AS) has the potential to greatly expand the functional repertoire of mammalian transcriptomes. However, few variant transcripts have been characterized functionally, making it difficult to assess the contribution of AS to the generation of phenotypic complexity and to study the evolution of splicing patterns. We have compared the AS of 309 protein-coding genes in the human ENCODE pilot regions against their mouse orthologs in unprecedented detail, utilizing traditional transcriptomic and RNAseq data. The conservation status of every transcript has been investigated, and each functionally categorized as coding (separated into coding sequence [CDS] or nonsense-mediated decay [NMD] linked) or noncoding. In total, 36.7% of human and 19.3% of mouse coding transcripts are species specific, and we observe a 3.6-fold excess of human NMD transcripts compared with mouse; in contrast to previous studies, the majority of species-specific AS is unlinked to transposable elements. We observe one conserved CDS variant and one conserved NMD variant per 2.3 and 11.4 genes, respectively. Subsequently, we identify and characterize equivalent AS patterns for 22.9% of these CDS or NMD-linked events in nonmammalian vertebrate genomes, and our data indicate that functional NMD-linked AS is more widespread and ancient than previously thought. Furthermore, although we observe an association between conserved AS and elevated sequence conservation, as previously reported, we emphasize that 30% of conserved AS exons display sequence conservation below the average score for constitutive exons. In conclusion, we demonstrate the value of detailed comparative annotation in generating a comprehensive set of AS transcripts, increasing our understanding of AS evolution in vertebrates. Our data support a model whereby the acquisition of functional AS has occurred throughout vertebrate evolution and is considered alongside amino acid change as a key mechanism in gene evolution.