48 results for PROTEIN-CODING GENES
Abstract:
Assessing the contribution of promoters and coding sequences to gene evolution is an important step toward discovering the major genetic determinants of human evolution. Many specific examples have revealed the evolutionary importance of cis-regulatory regions. However, the relative contribution of regulatory and coding regions to the evolutionary process, and whether systemic factors differentially influence their evolution, remain unclear. To address these questions, we carried out an analysis at the genome scale to identify signatures of positive selection in human proximal promoters. Next, we examined whether genes with positively selected promoters (Prom+ genes) show systemic differences with respect to a set of genes with positively selected protein-coding regions (Cod+ genes). We found that the number of genes in each set was not significantly different (8.1% and 8.5%, respectively). Furthermore, a functional analysis showed that, in both cases, positive selection affects almost all biological processes and only a few genes of each group are located in enriched categories, indicating that promoters and coding regions are not evolutionarily specialized with respect to gene function. On the other hand, we show that the topology of the human protein network has a different influence on the molecular evolution of proximal promoters and coding regions. Notably, Prom+ genes have an unexpectedly high centrality when compared with a reference distribution (P = 0.008, for Eigenvalue centrality). Moreover, the frequency of Prom+ genes increases from the periphery to the center of the protein network (P = 0.02, for the logistic regression coefficient). This means that gene centrality does not constrain the evolution of proximal promoters, unlike the case with coding regions, and further indicates that the evolution of proximal promoters is more efficient in the center of the protein network than in the periphery.
These results show that proximal promoters have had a systemic contribution to human evolution by increasing the participation of central genes in the evolutionary process.
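As a rough illustration of the kind of centrality measure mentioned above, the following sketch computes eigenvector centrality on a small undirected network by power iteration. It is not the authors' pipeline: the function name, the toy edge list, and the use of a shifted iteration (adding each node's own score at every step so that bipartite graphs also converge) are assumptions made for this example.

```python
import math
from typing import Dict, Hashable, Iterable, Set, Tuple


def eigenvector_centrality(
    edges: Iterable[Tuple[Hashable, Hashable]],
    iterations: int = 200,
    tol: float = 1e-12,
) -> Dict[Hashable, float]:
    """Eigenvector centrality of an undirected graph (hypothetical helper)."""
    # Build an adjacency map from the edge list.
    neighbours: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    if not neighbours:
        return {}

    score = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        # Power iteration on (A + I): same eigenvectors as A, but the shift
        # avoids oscillation on bipartite graphs.
        new = {
            node: score[node] + sum(score[m] for m in neighbours[node])
            for node in neighbours
        }
        norm = math.sqrt(sum(x * x for x in new.values()))
        new = {node: x / norm for node, x in new.items()}
        delta = max(abs(new[node] - score[node]) for node in neighbours)
        score = new
        if delta < tol:
            break
    return score
```

On a three-node path a-b-c, for instance, the middle node b receives the highest score, which matches the intuition that central proteins are those connected to other well-connected proteins.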
Abstract:
Background: Alternatively spliced exons play an important role in the diversification of gene function in most metazoans and are highly regulated by conserved motifs in exons and introns. Two contradicting properties have been associated with evolutionarily conserved alternative exons: higher sequence conservation and higher rate of non-synonymous substitutions, relative to constitutive exons. In order to clarify this issue, we have performed an analysis of the evolution of alternative and constitutive exons, using a large set of protein coding exons conserved between human and mouse and taking into account the conservation of the transcript exonic structure. Further, we have also defined a measure of the variation of the arrangement of exonic splicing enhancers (ESE-conservation score) to study the evolution of splicing regulatory sequences. We have used this measure to correlate the changes in the arrangement of ESEs with the divergence of exon and intron sequences. Results: We find evidence for a relation between the lack of conservation of the exonic structure and the weakening of the sequence evolutionary constraints in alternative and constitutive exons. Exons in transcripts with non-conserved exonic structures have higher synonymous (dS) and non-synonymous (dN) substitution rates than exons in conserved structures. Moreover, alternative exons in transcripts with non-conserved exonic structure are the least constrained in sequence evolution, and at high EST-inclusion levels they are found to be very similar to constitutive exons, whereas alternative exons in transcripts with conserved exonic structure have a dS significantly lower than average at all EST-inclusion levels. We also find higher conservation in the arrangement of ESEs in constitutive exons compared to alternative ones.
Additionally, the sequence conservation at flanking introns remains constant for constitutive exons at all ESE-conservation values, but increases for alternative exons at high ESE-conservation values. Conclusion: We conclude that most of the differences in dN observed between alternative and constitutive exons can be explained by the conservation of the transcript exonic structure. Low dS values are more characteristic of alternative exons with conserved exonic structure, but not of those with non-conserved exonic structure. Additionally, constitutive exons are characterized by a higher conservation in the arrangement of ESEs, and alternative exons with an ESE-conservation similar to that of constitutive exons are characterized by a conservation of the flanking intron sequences higher than average, indicating the presence of more intronic regulatory signals.
Abstract:
Animal olfactory systems play a critical role in the survival and reproduction of individuals. In insects, the odorant-binding proteins (OBPs) are encoded by a moderately sized gene family and mediate the first steps of olfactory processing. Most OBPs are organized in clusters of a few paralogs, which are conserved over time. The biological mechanism explaining the close physical proximity among OBPs has not yet been established. Here, we conducted a comprehensive study aiming to gain insights into the mechanisms underlying the OBP genomic organization. We found that the OBP clusters are embedded within large conserved arrangements. These organizations also include other non-OBP genes, which often encode proteins integral to the plasma membrane. Moreover, the degree of conservation of such large clusters is related to the following: 1) the promoter architecture of the confined genes, 2) a characteristic transcriptional environment, and 3) the chromatin conformation of the chromosomal region. Our results suggest that chromatin domains may restrict the location of OBP genes to regions having the appropriate transcriptional environment, leading to the OBP cluster structure. However, the appropriate transcriptional environment for OBP and the other neighboring genes is not dominated by reduced levels of expression noise. Indeed, the stochastic fluctuations in the OBP transcript abundance may have a critical role in the combinatorial nature of the olfactory coding process.
Abstract:
Selenoproteins are a diverse group of proteins usually misidentified and misannotated in sequence databases. The presence of an in-frame UGA (stop) codon in the coding sequence of selenoprotein genes precludes their identification and correct annotation. The in-frame UGA codons are recoded to cotranslationally incorporate selenocysteine, a rare selenium-containing amino acid. The development of ad hoc experimental and, more recently, computational approaches has allowed the efficient identification and characterization of the selenoproteomes of a growing number of species. Today, dozens of selenoprotein families have been described and more are being discovered in recently sequenced species, but the correct genomic annotation is not available for the majority of these genes. SelenoDB is a long-term project that aims to provide, through the collaborative effort of experimental and computational researchers, automatic and manually curated annotations of selenoprotein genes, proteins and SECIS elements. Version 1.0 of the database includes an initial set of eukaryotic genomic annotations, with special emphasis on the human selenoproteome, for immediate inspection by selenium researchers or incorporation into more general databases. SelenoDB is freely available at http://www.selenodb.org.
Abstract:
Background: It has been shown in a variety of organisms, including mammals, that genes that appeared recently in evolution, for example orphan genes, evolve faster than older genes. Low functional constraints at the time of origin of novel genes may explain these results. However, this observation has recently been attributed to an artifact caused by the inability of Blast to detect the fastest-evolving genes in different eukaryotic genomes. Distinguishing between these two possible explanations would be of great importance for any studies dealing with the taxon distribution of proteins and the origin of novel genes. Results: Here we used simulations of protein sequences to examine the capacity of Blast to detect proteins of diverse evolutionary rates in the different species of a eukaryotic phylogenetic tree that included metazoans, fungi and plants. We simulated the evolution of protein genes with the same evolutionary rates as those observed in functional mammalian genes and with among-site rate heterogeneity. Under these conditions, we found that only a very small percentage of simulated ancestral eukaryotic proteins was affected by the Blast artifact. We show that the good detectability of Blast is due to the heterogeneity of protein evolutionary rates at different sites, since only a small conserved motif in a sequence suffices to detect its homologues. Our results indicate that Blast, at least when applied within eukaryotes, only misses homologues of extremely fast-evolving sequences, which are rare in the mammalian genome, as well as sequences evolving homogeneously or pseudogenes. Conclusion: Although great care should be exercised in the recognition of remote homologues, most functional mammalian genes can be detected in eukaryotic genomes by Blast. That is, most functional mammalian genes do not evolve so fast that they would go undetected in other metazoans, fungi or plants, had they been present in these organisms.
Thus, the correlation previously found between age and rate seems not to be due to a pure Blast artifact, at least for mammals. This may have important implications for understanding the mechanisms by which novel genes originate.
Differences in the evolutionary history of disease genes affected by dominant or recessive mutations
Abstract:
Background: Global analyses of human disease genes by computational methods have yielded important advances in the understanding of human diseases. Generally these studies have treated the group of disease genes uniformly, thus ignoring the type of disease-causing mutations (dominant or recessive). In this report we present a comprehensive study of the evolutionary history of autosomal disease genes separated by mode of inheritance. Results: We examine differences in protein and coding sequence conservation between dominant and recessive human disease genes. Our analysis shows that disease genes affected by dominant mutations are more conserved than those affected by recessive mutations. This could be a consequence of the fact that recessive mutations remain hidden from selection while heterozygous. Furthermore, we employ functional annotation analysis and investigations into disease severity to support this hypothesis. Conclusion: This study elucidates important differences between dominantly- and recessively-acting disease genes in terms of protein and DNA sequence conservation, paralogy and essentiality. We propose that the division of disease genes by mode of inheritance will enhance both understanding of the disease process and prediction of candidate disease genes in the future.
Abstract:
Dietary fatty acid supply can affect stress response in fish during early development. Although knowledge on the mechanisms involved in fatty acid regulation of stress tolerance is scarce, it has often been hypothesised that eicosanoid profiles can influence cortisol production. Genomic cortisol actions are mediated by cytosolic receptors which may respond to cellular fatty acid signalling. An experiment was designed to test the effects of feeding gilthead sea-bream larvae with four microdiets, containing graded arachidonic acid (ARA) levels (0·4, 0·8, 1·5 and 3·0 %), on the expression of genes involved in stress response (steroidogenic acute regulatory protein, glucocorticoid receptor and phosphoenolpyruvate carboxykinase), lipid and, particularly, eicosanoid metabolism (hormone-sensitive lipase, PPARα, phospholipase A2, cyclo-oxygenase-2 and 5-lipoxygenase), as determined by real-time quantitative PCR. Fish fatty acid phenotypes reflected dietary fatty acid profiles. Growth performance, survival after acute stress and similar whole-body basal cortisol levels suggested that sea-bream larvae could tolerate a wide range of dietary ARA levels. Transcription of all genes analysed was significantly reduced at dietary ARA levels above 0·4 %. Nonetheless, despite practical suppression of phospholipase A2 transcription, higher leukotriene B4 levels were detected in larvae fed 3·0 % ARA, whereas a similar trend was observed regarding PGE2 production. The present study demonstrates that adaptation to a wide range of dietary ARA levels in gilthead sea-bream larvae involves the modulation of the expression of genes related to eicosanoid synthesis, lipid metabolism and stress response. The roles of ARA, other polyunsaturates and eicosanoids as signals in this process are discussed.
Abstract:
DNA methylation has an important impact on normal cell physiology, thus any defects in this mechanism may be related to the development of various diseases. In this project we are interested in identifying epigenetically modified genes, in general controlled by processes related to DNA methylation, by means of a new strategy combining proteomic and genomic analyses. First, the two-dimensional difference gel electrophoresis (2-DIGE) protein analyses of extracts obtained from HCT-116 wt and double knockout for DNMT1 and DNMT3b (DKO) cells revealed 34 proteins overexpressed in the condition of DNMTs depletion. From five genes with higher transcript levels in DKO cells, compared with HCT-116 wt, only AKR1B1, UCHL1 and VIM are methylated in HCT-116. As expected, the DNA methylation is lost in DKO cells. The methylation of VIM and UCHL1 promoters in some cancer samples has already been reported, thus further studies have been focused on AKR1B1. AKR1B1 expression due to DNA methylation of promoter region seems to occur specifically in the colon cancer cell lines, which was confirmed in the DNA methylation status and expression analyses performed on 32 different cancer cell lines (including colon, breast, lymphoma, leukemia, neuroblastoma, glioma and lung cancer cell lines) as well as normal colon and normal lymphocytes samples. AKR1B1 expression after treatments with DNA demethylating agent (AZA) was rescued in 5 colon cancer cell lines (including genetic regulation of the candidate gene. The methylation status of the rest of the genes identified in proteomic analysis was checked by methylation specific PCR (MSP) experiment and all appeared to be unmethylated. Similar research has also been done by means of a Mecp2-null mouse model. For 14 selected candidate genes the analyses of expression levels, methylation status and MeCP2 interaction with promoters are currently being performed.
Abstract:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such a highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is externally defined in the so-called Gene Model. The Gene Model simply specifies which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
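The core idea of the abstract above, keeping a single running best gene while scanning exons by acceptor and by donor position at the same time, can be sketched as follows. This is a simplified illustration, not the published algorithm: it ignores reading-frame compatibility and the Gene Model, treats every exon as compatible with any exon that ends before it starts, and assumes each exon satisfies acceptor <= donor. The names `Exon` and `best_gene` are hypothetical. The two sorts are done inside the function for convenience; the scan itself visits each exon once in each ordering.

```python
from typing import List, Tuple

# (acceptor position, donor position, score); hypothetical representation
Exon = Tuple[int, int, float]


def best_gene(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Highest-scoring chain of non-overlapping exons under additive scoring."""
    if not exons:
        return 0.0, []

    n = len(exons)
    by_acceptor = sorted(range(n), key=lambda i: exons[i][0])
    by_donor = sorted(range(n), key=lambda i: exons[i][1])

    best_score = [0.0] * n  # score of the best gene ending at exon i
    previous = [-1] * n     # back-pointer to the preceding exon in that gene
    running_best = 0.0      # best gene ending strictly before the current acceptor
    running_index = -1
    j = 0                   # cursor into the donor-sorted list

    for i in by_acceptor:
        acceptor = exons[i][0]
        # Absorb every exon whose donor lies upstream of this acceptor.
        # Its own acceptor is smaller still, so its score is already known.
        while j < n and exons[by_donor[j]][1] < acceptor:
            k = by_donor[j]
            if best_score[k] > running_best:
                running_best = best_score[k]
                running_index = k
            j += 1
        best_score[i] = exons[i][2] + running_best
        previous[i] = running_index

    # Trace back from the best final exon.
    end = max(range(n), key=lambda i: best_score[i])
    chain: List[Exon] = []
    i = end
    while i != -1:
        chain.append(exons[i])
        i = previous[i]
    chain.reverse()
    return best_score[end], chain
```

Because the donor cursor `j` only moves forward, the inner loop runs at most once per exon over the whole scan, which is where the linear behaviour comes from. A fuller version would keep one running best per frame and per feature class allowed by the Gene Model.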
Abstract:
The “one-gene, one-protein” rule, coined by Beadle and Tatum, has been fundamental to molecular biology. The rule implies that the genetic complexity of an organism depends essentially on its gene number. The discovery, however, that alternative gene splicing and transcription are widespread phenomena dramatically altered our understanding of the genetic complexity of higher eukaryotic organisms; in these, a limited number of genes may potentially encode a much larger number of proteins. Here we investigate yet another phenomenon that may contribute to generating additional protein diversity. Indeed, by relying on both computational and experimental analysis, we estimate that at least 4%–5% of the tandem gene pairs in the human genome can be eventually transcribed into a single RNA sequence encoding a putative chimeric protein. While the functional significance of most of these chimeric transcripts remains to be determined, we provide strong evidence that this phenomenon does not correspond to mere technical artifacts and that it is a common mechanism with the potential of generating hundreds of additional proteins in the human genome.
Abstract:
Either calorie restriction, loss of function of the nutrient-dependent PKA or TOR/SCH9 pathways, or activation of stress defences improves longevity in different eukaryotes. However, the molecular links between glucose depletion, nutrient-dependent pathways and stress responses are unknown. Here we show that either calorie restriction or inactivation of nutrient-dependent pathways induces life-span extension in fission yeast, and that such effect is dependent on the activation of the stress-dependent Sty1 MAP kinase. During transition to stationary phase in glucose-limiting conditions, Sty1 becomes activated and triggers a transcriptional stress program, whereas such activation does not occur under glucose-rich conditions. Deletion of the genes coding for the SCH9-homologue Sck2 or the Pka1 kinases, or mutations leading to constitutive activation of the Sty1 stress pathway increase life span under glucose-rich conditions, and importantly such beneficial effects depend ultimately on Sty1. Furthermore, cells lacking Pka1 display enhanced oxygen consumption and Sty1 activation under glucose-rich conditions. We conclude that calorie restriction favours oxidative metabolism, reactive oxygen species production and Sty1 MAP kinase activation, and this stress pathway favours life-span extension.
Abstract:
A large proportion of the death toll associated with malaria is a consequence of malaria infection during pregnancy, causing up to 200,000 infant deaths annually. We previously published the first extensive genetic association study of placental malaria infection, and here we extend this analysis considerably, investigating genetic variation in over 9,000 SNPs in more than 1,000 genes involved in immunity and inflammation for their involvement in susceptibility to placental malaria infection. We applied a new approach incorporating results from both single gene analysis as well as gene-gene interactions on a protein-protein interaction network. We found suggestive associations of variants in the gene KLRK1 in the single gene analysis, as well as evidence for associations of multiple members of the IL-7/IL-7R signalling cascade in the combined analysis. To our knowledge, this is the first large-scale genetic study on placental malaria infection to date, opening the door for follow-up studies trying to elucidate the genetic basis of this neglected form of malaria.
Abstract:
Background: Different regions in a genome evolve at different rates depending on structural and functional constraints. Some genomic regions are highly conserved during metazoan evolution, while other regions may evolve rapidly, either in all species or in a lineage-specific manner. A strong or even moderate change in constraints in functional regions, for example in coding regions, can have significant evolutionary consequences. Results: Here we discuss a novel framework, 'BaseDiver', to classify groups of genes in humans based on the patterns of evolutionary constraints on polymorphic positions in their coding regions. Comparing the nucleotide-level divergence among mammals with the extent of deviation from the ancestral base in the human lineage, we identify patterns of evolutionary pressure on nonsynonymous base-positions in groups of genes belonging to the same functional category. Focussing on groups of genes in functional categories, we find that transcription factors contain a significant excess of nonsynonymous base-positions that are conserved in other mammals but changed in human, while immunity related genes harbour mutations at base-positions that evolve rapidly in all mammals including humans due to strong preference for advantageous alleles. Genes involved in olfaction also evolve rapidly in all mammals, and in humans this appears to be due to weak negative selection. Conclusion: While recent studies have identified genes under positive selection in humans, our approach identifies evolutionary constraints on Gene Ontology groups identifying changes in humans relative to some of the other mammals.
Abstract:
Background: Systematic approaches for identifying proteins involved in different types of cancer are needed. Experimental techniques such as microarrays are being used to characterize cancer, but validating their results can be a laborious task. Computational approaches are used to prioritize between genes putatively involved in cancer, usually based on further analyzing experimental data. Results: We implemented a systematic method using the PIANA software that predicts cancer involvement of genes by integrating heterogeneous datasets. Specifically, we produced lists of genes likely to be involved in cancer by relying on: (i) protein-protein interactions; (ii) differential expression data; and (iii) structural and functional properties of cancer genes. The integrative approach that combines multiple sources of data obtained positive predictive values ranging from 23% (on a list of 811 genes) to 73% (on a list of 22 genes), outperforming the use of any of the data sources alone. We analyze a list of 20 cancer gene predictions, finding that most of them have been recently linked to cancer in literature. Conclusion: Our approach to identifying and prioritizing candidate cancer genes can be used to produce lists of genes likely to be involved in cancer. Our results suggest that differential expression studies yielding high numbers of candidate cancer genes can be filtered using protein interaction networks.
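To make the two quantities in the abstract above concrete, the sketch below filters a candidate list using a protein interaction map and computes a positive predictive value (the fraction of predicted genes that are true cancer genes). It does not reproduce PIANA or the authors' scoring: the rule "keep a candidate if it interacts with at least one known cancer gene", the function names, and the gene labels in the usage note are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Set


def filter_by_interactions(
    candidates: Iterable[str],
    interactions: Dict[str, Set[str]],
    known_cancer_genes: Set[str],
) -> Set[str]:
    """Keep candidates with at least one known cancer gene as interaction partner."""
    return {
        gene
        for gene in candidates
        if interactions.get(gene, set()) & known_cancer_genes
    }


def positive_predictive_value(predicted: Set[str], validated: Set[str]) -> float:
    """PPV = true positives / all predictions."""
    if not predicted:
        raise ValueError("PPV is undefined for an empty prediction list")
    return len(predicted & validated) / len(predicted)
```

With four made-up candidates, of which two interact with known cancer genes and the same two are later validated, the unfiltered list has a PPV of 0.5 and the filtered list a PPV of 1.0. Real lists would show the trade-off visible in the abstract: shorter lists tend to have higher PPV.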
Abstract:
Background. Microglia and astrocytes respond to homeostatic disturbances with profound changes of gene expression. This response, known as glial activation or neuroinflammation, can be detrimental to the surrounding tissue. The transcription factor CCAAT/enhancer binding protein β (C/EBPβ) is an important regulator of gene expression in inflammation but little is known about its involvement in glial activation. To explore the functional role of C/EBPβ in glial activation we have analyzed pro-inflammatory gene expression and neurotoxicity in murine wild type and C/EBPβ-null glial cultures. Methods. Due to fertility and mortality problems associated with the C/EBPβ-null genotype we developed a protocol to prepare mixed glial cultures from cerebral cortex of a single mouse embryo with high yield. Wild-type and C/EBPβ-null glial cultures were compared in terms of total cell density by Hoechst-33258 staining; microglial content by CD11b immunocytochemistry; astroglial content by GFAP western blot; gene expression by quantitative real-time PCR, western blot, immunocytochemistry and Griess reaction; and microglial neurotoxicity by estimating MAP2 content in neuronal/microglial cocultures. C/EBPβ DNA binding activity was evaluated by electrophoretic mobility shift assay and quantitative chromatin immunoprecipitation. Results. C/EBPβ mRNA and protein levels, as well as DNA binding, were increased in glial cultures by treatment with lipopolysaccharide (LPS) or LPS + interferon γ (IFNγ). Quantitative chromatin immunoprecipitation showed binding of C/EBPβ to pro-inflammatory gene promoters in glial activation in a stimulus- and gene-dependent manner. In agreement with these results, LPS and LPS+IFNγ induced different transcriptional patterns between pro-inflammatory cytokines and NO synthase-2 genes. Furthermore, the expressions of IL-1β and NO synthase-2, and consequent NO production, were reduced in the absence of C/EBPβ.
In addition, neurotoxicity elicited by LPS+IFNγ-treated microglia co-cultured with neurons was completely abolished by the absence of C/EBPβ in microglia.