1000 results for Genes Classification
Abstract:
Cancer genomes frequently contain somatic copy number alterations (SCNA) that can significantly perturb the expression level of affected genes and thus disrupt pathways controlling normal growth. In melanoma, many studies have focussed on the copy number and gene expression levels of the BRAF, PTEN and MITF genes, but little has been done to identify new genes using these parameters at the genome-wide scale. Using karyotyping, SNP and CGH arrays, and RNA-seq, we have identified SCNA affecting gene expression ('SCNA-genes') in seven human metastatic melanoma cell lines. We showed that the combination of these techniques is useful to identify candidate genes potentially involved in tumorigenesis. Since few of these alterations were recurrent across our samples, we used a protein network-guided approach to determine whether any pathways were enriched in SCNA-genes in one or more samples. From this unbiased genome-wide analysis, we identified 28 significantly enriched pathway modules. Comparison with two large, independent melanoma SCNA datasets showed less than 10% overlap at the individual gene level, but network-guided analysis revealed 66% shared pathways, including all but three of the pathways identified in our data. Frequently altered pathways included WNT, cadherin signalling, angiogenesis and melanogenesis. Additionally, our results emphasize the potential of the EPHA3 and FRS2 gene products, involved in angiogenesis and migration, as possible therapeutic targets in melanoma. Our study demonstrates the utility of network-guided approaches, for both large and small datasets, to identify pathways recurrently perturbed in cancer.
Abstract:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Abstract:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such a highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is specified externally in the so-called Gene Model. The Gene Model specifies simply which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular, it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
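The "store and update" idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Exon` class and the `best_gene` function are hypothetical names, frame compatibility and the Gene Model are ignored, and two exons are assumed compatible whenever the earlier one ends before the later one starts. Exons are visited by increasing acceptor position while a second pointer advances by increasing donor position, so the best gene ending upstream is kept in a single running variable instead of being searched for again at every exon.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Exon:
    acceptor: int   # first position of the exon
    donor: int      # last position of the exon (acceptor <= donor)
    score: float


def best_gene(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the highest additive score and the exon chain achieving it."""
    n = len(exons)
    if n == 0:
        return 0.0, []

    # Two views of the same pool: one ordered by acceptor, one by donor.
    # Sorting costs O(n log n) here; the scan below is linear.
    by_acceptor = sorted(range(n), key=lambda i: exons[i].acceptor)
    by_donor = sorted(range(n), key=lambda i: exons[i].donor)

    best_ending = [0.0] * n                      # best gene score ending at exon i
    previous: List[Optional[int]] = [None] * n   # back-pointers for traceback
    running_score = 0.0                          # best gene ending upstream so far
    running_exon: Optional[int] = None
    j = 0

    for i in by_acceptor:
        # Absorb every exon whose donor lies strictly upstream of this acceptor.
        while j < n and exons[by_donor[j]].donor < exons[i].acceptor:
            k = by_donor[j]
            if best_ending[k] > running_score:
                running_score, running_exon = best_ending[k], k
            j += 1
        best_ending[i] = exons[i].score + running_score
        previous[i] = running_exon

    # Trace back from the best-scoring final exon.
    last = max(range(n), key=lambda i: best_ending[i])
    chain: List[Exon] = []
    cur: Optional[int] = last
    while cur is not None:
        chain.append(exons[cur])
        cur = previous[cur]
    chain.reverse()
    return best_ending[last], chain


toy_exons = [Exon(1, 10, 5.0), Exon(5, 20, 6.0), Exon(15, 30, 4.0), Exon(35, 50, 3.0)]
total, chain = best_gene(toy_exons)
```

Each exon is touched once by the outer loop and once by the inner pointer, which is why the scan is linear once both orderings are available. Handling reading frames would presumably require one running best per compatibility class rather than a single variable.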
Abstract:
Selenoproteins are a diverse group of proteins usually misidentified and misannotated in sequence databases. The presence of an in-frame UGA (stop) codon in the coding sequence of selenoprotein genes precludes their identification and correct annotation. The in-frame UGA codons are recoded to cotranslationally incorporate selenocysteine, a rare selenium-containing amino acid. The development of ad hoc experimental and, more recently, computational approaches has allowed the efficient identification and characterization of the selenoproteomes of a growing number of species. Today, dozens of selenoprotein families have been described and more are being discovered in recently sequenced species, but the correct genomic annotation is not available for the majority of these genes. SelenoDB is a long-term project that aims to provide, through the collaborative effort of experimental and computational researchers, automatic and manually curated annotations of selenoprotein genes, proteins and SECIS elements. Version 1.0 of the database includes an initial set of eukaryotic genomic annotations, with special emphasis on the human selenoproteome, for immediate inspection by selenium researchers or incorporation into more general databases. SelenoDB is freely available at http://www.selenodb.org.
Abstract:
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Abstract:
The recent availability of the chicken genome sequence poses the question of whether there are human protein-coding genes conserved in chicken that are currently not included in the human gene catalog. Here, we show, using comparative gene finding followed by experimental verification of exon pairs by RT–PCR, that the addition to the multi-exonic subset of this catalog could be as little as 0.2%, suggesting that we may be closing in on the human gene set. Our protocol, however, has two shortcomings: (i) the bioinformatic screening of the predicted genes, applied to filter out false positives, cannot handle intronless genes; and (ii) the experimental verification could fail to identify expression at a specific developmental time. This highlights the importance of developing methods that could provide a reliable estimate of the number of these two types of genes.
Abstract:
BACKGROUND: The trithorax group (trxG) and Polycomb group (PcG) proteins are responsible for the maintenance of stable transcriptional patterns of many developmental regulators. They bind to specific regions of DNA and direct the post-translational modifications of histones, playing a role in the dynamics of chromatin structure. RESULTS: We have performed genome-wide expression studies of trx and ash2 mutants in Drosophila melanogaster. Using computational analysis of our microarray data, we have identified 25 clusters of genes potentially regulated by TRX. Most of these clusters consist of genes that encode structural proteins involved in cuticle formation. This organization appears to be a distinctive feature of the regulatory networks of TRX and other chromatin regulators, since we have observed the same arrangement in clusters after experiments performed with ASH2, as well as in experiments performed by others with NURF, dMyc, and ASH1. We have also found many of these clusters to be significantly conserved in D. simulans, D. yakuba, D. pseudoobscura and partially in Anopheles gambiae. CONCLUSION: The analysis of genes governed by chromatin regulators has led to the identification of clusters of functionally related genes conserved in other insect species, suggesting this chromosomal organization is biologically important. Moreover, our results indicate that TRX and other chromatin regulators may act globally on chromatin domains that contain transcriptionally co-regulated genes.
Abstract:
Confronting a recently mated female with a strange male can induce a pregnancy block ('Bruce effect'). The physiology of this effect is well studied, but its functional significance is still not fully understood. The 'anticipated infanticide hypothesis' suggests that the pregnancy block serves to avoid the cost of embryogenesis and giving birth to offspring that are likely to be killed by a new territory holder. Some 'compatible-genes sexual selection hypotheses' suggest that the likelihood of a pregnancy block is also dependent on the female's perception of the stud's and the stimulus male's genetic quality. We used two inbred strains of mice (C57BL/6 and BALB/c) to test all possible combinations of female strain, stud strain, and stimulus strain under experimental conditions (N(total) = 241 mated females). As predicted from previous studies, we found increased rates of pregnancy blocks if stud and stimulus strains differed, and we found evidence for hybrid vigour in offspring of between-strain mating. Despite the observed heterosis, pregnancies of within-strain matings were not more likely to be blocked than pregnancies of between-strain matings. A power analysis revealed that if we missed an existing effect (type-II error), the effect must be very small. If a female gave birth, the number and weight of newborns were not significantly influenced by the stimulus males. In conclusion, we found no support for the 'compatible-genes sexual selection hypotheses'.
Abstract:
Placental malaria is a special form of malaria that causes up to 200,000 maternal and infant deaths every year. Previous studies show that two receptor molecules, hyaluronic acid and chondroitin sulphate A, mediate the adhesion of parasite-infected erythrocytes in the placenta of patients, which is believed to be a key step in the pathogenesis of the disease. In this study, we aimed at identifying sites of malaria-induced adaptation by scanning for signatures of natural selection in 24 genes in the complete biosynthesis pathway of these two receptor molecules. We analyzed a total of 24 Mb of publicly available polymorphism data from the International HapMap project for three human populations with European, Asian and African ancestry, with the African population from a region of presently and historically high malaria prevalence. Using methods based on allele frequency distributions, on genetic differentiation between populations, and on long-range haplotype structure, we found only limited evidence for malaria-induced genetic adaptation in this set of genes in the African population; however, we identified one candidate gene with clear evidence of selection in the Asian population. Although historical exposure to malaria in this population cannot be ruled out, we speculate that it might be caused by other pathogens, as there is growing evidence that these molecules are important receptors in a variety of host-pathogen interactions. We propose to use the present methods in a systematic way to help identify candidate regions under positive selection as a consequence of malaria.
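The abstract above names three families of tests without giving formulas. As an illustration of the "genetic differentiation between populations" family only, the sketch below computes Wright's FST for a single biallelic SNP in two populations. Equal population weights, the heterozygosity-based definition and the function names are all assumptions made for this example; the estimator actually used in the study is not stated here.

```python
from typing import Optional


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) for allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_populations(p1: float, p2: float) -> Optional[float]:
    """Wright's FST = (H_T - H_S) / H_T for one SNP, equal population weights.

    p1 and p2 are frequencies of the same allele in the two populations.
    Returns None when the SNP is monomorphic overall (H_T == 0).
    """
    p_bar = (p1 + p2) / 2.0
    h_total = expected_heterozygosity(p_bar)
    if h_total == 0.0:
        return None
    h_within = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total
```

Values near 0 indicate similar allele frequencies in both populations, and values near 1 indicate near-fixation of different alleles. An unusually high value at a locus relative to the genome-wide distribution is what differentiation-based scans typically flag.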
Abstract:
A large proportion of the death toll associated with malaria is a consequence of malaria infection during pregnancy, causing up to 200,000 infant deaths annually. We previously published the first extensive genetic association study of placental malaria infection, and here we extend this analysis considerably, investigating genetic variation in over 9,000 SNPs in more than 1,000 genes involved in immunity and inflammation for their involvement in susceptibility to placental malaria infection. We applied a new approach incorporating results from both single gene analysis and gene-gene interactions on a protein-protein interaction network. We found suggestive associations of variants in the gene KLRK1 in the single gene analysis, as well as evidence for associations of multiple members of the IL-7/IL-7R signalling cascade in the combined analysis. To our knowledge, this is the first large-scale genetic study on placental malaria infection to date, opening the door for follow-up studies trying to elucidate the genetic basis of this neglected form of malaria.
Abstract:
Background: An excess of caffeine is cytotoxic to all eukaryotic cell types. We aim to study how cells become tolerant to a toxic dose of this drug, and the relationship between caffeine and oxidative stress pathways. Methodology/Principal Findings: We searched for Schizosaccharomyces pombe mutants with inhibited growth on caffeine-containing plates. We screened a collection of 2,700 haploid mutant cells, of which 98 were sensitive to caffeine. The genes mutated in these sensitive clones were involved in a number of cellular roles including the H2O2-induced Pap1 and Sty1 stress pathways, the integrity and calcineurin pathways, cell morphology and chromatin remodeling. We have investigated the role of the oxidative stress pathways in sensing and promoting survival to caffeine. The Pap1 and the Sty1 pathways are both required for normal tolerance to caffeine, but only the Sty1 pathway is activated by the drug. Cells lacking Pap1 are sensitive to caffeine due to the decreased expression of the efflux pump Hba2. Indeed, Δhba2 cells are sensitive to caffeine, and constitutive activation of the Pap1 pathway enhances resistance to caffeine in an Hba2-dependent manner. Conclusions/Significance: With our caffeine-sensitive, genome-wide screen of an S. pombe deletion collection, we have demonstrated the importance of some oxidative stress pathway components on wild-type tolerance to the drug.
Abstract:
Background: It is well known that the pattern of linkage disequilibrium varies between human populations, with remarkable geographical stratification. Indirect association studies routinely exploit linkage disequilibrium around genes, particularly in isolated populations where it is assumed to be higher. Here, we explore both the amount and the decay of linkage disequilibrium with physical distance along 211 gene regions, most of them related to complex diseases, across 39 HGDP-CEPH population samples, focusing particularly on the populations defined as isolates. Within each gene region and population we use r² between all possible single nucleotide polymorphism (SNP) pairs as a measure of linkage disequilibrium and focus on the proportion of SNP pairs with r² greater than 0.8. Results: Although the average r² was found to be significantly different both between and within continental regions, a much higher proportion of r² variance could be attributed to differences between continental regions (2.8% vs. 0.5%, respectively). Similarly, while the proportion of SNP pairs with r² > 0.8 was significantly different across continents for all distance classes, it was generally much more homogeneous within continents, except in the case of Africa and the Americas. The only isolated populations with consistently higher LD in all distance classes with respect to their continent are the Kalash (Central South Asia) and the Surui (America). Moreover, isolated populations showed only slightly higher proportions of SNP pairs with r² > 0.8 per gene region than non-isolated populations in the same continent. Thus, the number of SNPs in isolated populations that need to be genotyped may be only slightly less than in non-isolates. Conclusion: The "isolated population" label by itself does not guarantee a greater genotyping efficiency in association studies, and properties other than increased linkage disequilibrium may make these populations interesting in genetic epidemiology.
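The summary statistic used in the abstract above, the proportion of SNP pairs whose squared correlation exceeds 0.8, can be sketched as follows. The sketch assumes phased haplotypes coded 0/1 per SNP, which the abstract does not state, and the function names are hypothetical. For two biallelic SNPs with allele frequencies p_A and p_B and joint haplotype frequency p_AB, D = p_AB - p_A·p_B and the squared correlation is D² / (p_A(1-p_A)·p_B(1-p_B)).

```python
from itertools import combinations
from typing import List, Optional, Sequence


def r_squared(snp_a: Sequence[int], snp_b: Sequence[int]) -> Optional[float]:
    """Squared correlation between two SNPs coded 0/1 on the same haplotypes.

    Returns None if either SNP is monomorphic (the measure is undefined).
    """
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(a & b for a, b in zip(snp_a, snp_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b
    return d * d / denom


def proportion_high_ld(snps: List[Sequence[int]], threshold: float = 0.8) -> float:
    """Fraction of informative SNP pairs whose squared correlation exceeds threshold."""
    values = [r_squared(a, b) for a, b in combinations(snps, 2)]
    values = [v for v in values if v is not None]
    if not values:
        return 0.0
    return sum(v > threshold for v in values) / len(values)


# Toy region: three SNPs typed on four haplotypes (made-up data).
toy_region = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
share = proportion_high_ld(toy_region)
```

In the toy region the first two SNPs are perfectly correlated and the third is independent of both, so one of the three pairs passes the threshold. Binning pairs by physical distance before calling `proportion_high_ld` would give the per-distance-class view described in the abstract.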
Abstract:
Background: Different regions in a genome evolve at different rates depending on structural and functional constraints. Some genomic regions are highly conserved during metazoan evolution, while other regions may evolve rapidly, either in all species or in a lineage-specific manner. A strong or even moderate change in constraints in functional regions, for example in coding regions, can have significant evolutionary consequences. Results: Here we discuss a novel framework, 'BaseDiver', to classify groups of genes in humans based on the patterns of evolutionary constraints on polymorphic positions in their coding regions. Comparing the nucleotide-level divergence among mammals with the extent of deviation from the ancestral base in the human lineage, we identify patterns of evolutionary pressure on nonsynonymous base-positions in groups of genes belonging to the same functional category. Focussing on groups of genes in functional categories, we find that transcription factors contain a significant excess of nonsynonymous base-positions that are conserved in other mammals but changed in human, while immunity related genes harbour mutations at base-positions that evolve rapidly in all mammals including humans due to strong preference for advantageous alleles. Genes involved in olfaction also evolve rapidly in all mammals, and in humans this appears to be due to weak negative selection. Conclusion: While recent studies have identified genes under positive selection in humans, our approach identifies evolutionary constraints on Gene Ontology groups identifying changes in humans relative to some of the other mammals.
Abstract:
Background: Cancer is a major medical problem in modern societies. However, the incidence of this disease in non-human primates is very low. To study whether genetic differences between human and chimpanzee could contribute to their distinct cancer susceptibility, we have examined in the chimpanzee genome the orthologous genes of a set of 333 human cancer genes. Results: This analysis has revealed that all examined human cancer genes are present in chimpanzee, contain intact open reading frames and show a high degree of conservation between both species. However, detailed analysis of this set of genes has shown some differences in genes of special relevance for human cancer. Thus, the chimpanzee gene encoding p53 contains a Pro residue at codon 72, while this codon is polymorphic in humans and can code for Arg or Pro, generating isoforms with different ability to induce apoptosis or interact with p73. Moreover, sequencing of the BRCA1 gene has shown an 8 kb deletion in the chimpanzee sequence that prematurely truncates the co-regulated NBR2 gene. Conclusion: These data suggest that small differences in cancer genes, such as those found in tumor suppressor genes, might influence the differences in cancer susceptibility between human and chimpanzee. Nevertheless, further analysis will be required to determine the exact contribution of the genetic changes identified in this study to the different cancer incidence in non-human primates.
Abstract:
Background: One of the main goals of cancer genetics is to identify the causative elements at the molecular level leading to cancer. Results: We have conducted an analysis of a set of genes known to be involved in cancer in order to unveil their unique features that can assist towards the identification of new candidate cancer genes. Conclusion: We have detected key patterns in this group of genes in terms of the molecular function or the biological process in which they are involved as well as sequence properties. Based on these features we have developed an accurate Bayesian classification model with which human genes have been scored for their likelihood of involvement in cancer.
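The abstract above does not say which features or which form of Bayesian model were used. Purely as an illustration of how a Bayesian classifier can turn gene features into a score, the sketch below fits a naive Bayes model on binary features and returns a log-odds score per gene. The naive independence assumption, the Laplace smoothing, the function names and the toy data are all assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Model = Tuple[float, List[Tuple[float, float]]]


def fit_naive_bayes(features: Sequence[Sequence[int]],
                    labels: Sequence[int],
                    alpha: float = 1.0) -> Model:
    """Fit log-likelihood-ratio weights for binary features.

    labels: 1 for known cancer genes, 0 for other genes (both classes required).
    alpha: Laplace smoothing count, so no probability is exactly 0 or 1.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    log_prior_odds = math.log(n_pos / n_neg)
    weights = []
    for j in range(len(features[0])):
        pos_hits = sum(row[j] for row, y in zip(features, labels) if y == 1)
        neg_hits = sum(row[j] for row, y in zip(features, labels) if y == 0)
        p_pos = (pos_hits + alpha) / (n_pos + 2 * alpha)
        p_neg = (neg_hits + alpha) / (n_neg + 2 * alpha)
        # Weight when the feature is present, and when it is absent.
        weights.append((math.log(p_pos / p_neg),
                        math.log((1 - p_pos) / (1 - p_neg))))
    return log_prior_odds, weights


def log_odds(row: Sequence[int], model: Model) -> float:
    """Log-odds that a gene with these features belongs to the positive class."""
    log_prior_odds, weights = model
    return log_prior_odds + sum(w1 if x else w0
                                for x, (w1, w0) in zip(row, weights))


# Made-up training set: two binary features, two positives, two negatives.
toy_features = [[1, 1], [1, 0], [0, 0], [0, 1]]
toy_labels = [1, 1, 0, 0]
model = fit_naive_bayes(toy_features, toy_labels)
```

In the toy data the first feature separates the classes and the second carries no information, so only the first contributes to the score. Ranking all genes by `log_odds` would give the kind of likelihood ordering the abstract describes.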