952 results for Gene Copy Number
Abstract:
In a number of programs for gene structure prediction in higher eukaryotic genomic sequences, exon prediction is decoupled from gene assembly: a large pool of candidate exons is predicted and scored from features located in the query DNA sequence, and candidate genes are assembled from such a pool as sequences of nonoverlapping frame-compatible exons. Genes are scored as a function of the scores of the assembled exons, and the highest scoring candidate gene is assumed to be the most likely gene encoded by the query DNA sequence. Considering additive gene scoring functions, currently available algorithms to determine such a highest scoring candidate gene run in time proportional to the square of the number of predicted exons. Here, we present an algorithm whose running time grows only linearly with the size of the set of predicted exons. Polynomial algorithms rely on the fact that, while scanning the set of predicted exons, the highest scoring gene ending in a given exon can be obtained by appending the exon to the highest scoring among the highest scoring genes ending at each compatible preceding exon. The algorithm here relies on the simple fact that such a highest scoring gene can be stored and updated. This requires scanning the set of predicted exons simultaneously by increasing acceptor and donor position. On the other hand, the algorithm described here does not assume an underlying gene structure model. Indeed, the definition of valid gene structures is externally defined in the so-called Gene Model. The Gene Model specifies simply which gene features are allowed immediately upstream of which other gene features in valid gene structures. This allows for great flexibility in formulating the gene identification problem. In particular, it allows for multiple-gene two-strand predictions and for considering gene features other than coding exons (such as promoter elements) in valid gene structures.
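The linear-time idea in the abstract above can be illustrated with a minimal sketch. It is a simplification, not the authors' implementation: it ignores reading-frame compatibility and the Gene Model, treats any two nonoverlapping exons as compatible, and uses an additive score. The names `Exon` and `best_gene` are hypothetical. The two index lists are sorted here for convenience; the scan itself visits each exon once in each order.

```python
from typing import List, NamedTuple, Optional, Tuple


class Exon(NamedTuple):
    acceptor: int   # start coordinate of the exon
    donor: int      # end coordinate of the exon
    score: float    # additive exon score


def best_gene(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the highest additive score and the exon chain achieving it.

    Assumption: exons are compatible whenever the upstream donor lies
    strictly before the downstream acceptor (no frame check, no Gene Model).
    """
    n = len(exons)
    if n == 0:
        return 0.0, []
    by_acceptor = sorted(range(n), key=lambda i: exons[i].acceptor)
    by_donor = sorted(range(n), key=lambda i: exons[i].donor)

    gene_score = [0.0] * n                   # best gene ending at each exon
    prev: List[Optional[int]] = [None] * n   # predecessor for traceback
    best_score, best_end = 0.0, None         # best gene ending upstream so far
    j = 0                                    # pointer into the donor-sorted list

    for i in by_acceptor:
        # Advance over every exon that ends before exon i starts, updating
        # the stored best gene instead of rescanning all predecessors.
        while j < n and exons[by_donor[j]].donor < exons[i].acceptor:
            k = by_donor[j]
            if gene_score[k] > best_score:
                best_score, best_end = gene_score[k], k
            j += 1
        gene_score[i] = exons[i].score + best_score
        prev[i] = best_end

    last: Optional[int] = max(range(n), key=lambda i: gene_score[i])
    top = gene_score[last]
    chain = []
    while last is not None:
        chain.append(exons[last])
        last = prev[last]
    return top, chain[::-1]


# Toy candidate exons (made-up coordinates and scores).
candidates = [Exon(1, 10, 5), Exon(5, 20, 8), Exon(12, 25, 4),
              Exon(22, 30, 3), Exon(27, 40, 6)]
score, chain = best_gene(candidates)
print(score)  # 15.0
```

Any exon whose donor precedes the current acceptor necessarily has an earlier acceptor too, so its gene score is already final when the donor pointer reaches it. That is what allows the single stored maximum to replace the quadratic scan over all preceding exons.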
Abstract:
The completion of the sequencing of the mouse genome promises to help predict human genes with greater accuracy. While current ab initio gene prediction programs are remarkably sensitive (i.e., they predict at least a fragment of most genes), their specificity is often low, predicting a large number of false-positive genes in the human genome. Sequence conservation at the protein level with the mouse genome can help eliminate some of those false positives. Here we describe SGP2, a gene prediction program that combines ab initio gene prediction with TBLASTX searches between two genome sequences to provide both sensitive and specific gene predictions. The accuracy of SGP2 when used to predict genes by comparing the human and mouse genomes is assessed on a number of data sets, including single-gene data sets, the highly curated human chromosome 22 predictions, and entire genome predictions from ENSEMBL. Results indicate that SGP2 outperforms purely ab initio gene prediction methods. Results also indicate that SGP2 works about as well with 3x shotgun data as it does with fully assembled genomes. SGP2 provides a high enough specificity that its predictions can be experimentally verified at a reasonable cost. SGP2 was used to generate a complete set of gene predictions on both the human and mouse by comparing the genomes of these two species. Our results suggest that another few thousand human and mouse genes currently not in ENSEMBL are worth verifying experimentally.
Abstract:
The recent availability of the chicken genome sequence poses the question of whether there are human protein-coding genes conserved in chicken that are currently not included in the human gene catalog. Here, we show, using comparative gene finding followed by experimental verification of exon pairs by RT–PCR, that the addition to the multi-exonic subset of this catalog could be as little as 0.2%, suggesting that we may be closing in on the human gene set. Our protocol, however, has two shortcomings: (i) the bioinformatic screening of the predicted genes, applied to filter out false positives, cannot handle intronless genes; and (ii) the experimental verification could fail to identify expression at a specific developmental time. This highlights the importance of developing methods that could provide a reliable estimate of the number of these two types of genes.
Abstract:
Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost-effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
Abstract:
Malaria in pregnancy forms a substantial part of the worldwide burden of malaria, with an estimated annual death toll of up to 200,000 infants, as well as increased maternal morbidity and mortality. Studies of genetic susceptibility to malaria have so far focused on infant malaria, with only a few studies investigating the genetic basis of placental malaria, focusing only on a limited number of candidate genes. The aim of this study therefore was to identify novel host genetic factors involved in placental malaria infection. To this end we carried out a nested case-control study on 180 Mozambican pregnant women with placental malaria infection, and 180 controls within an intervention trial of malaria prevention. We genotyped 880 SNPs in a set of 64 functionally related genes involved in glycosylation and innate immunity. A SNP located in the gene FUT9, rs3811070, was significantly associated with placental malaria infection (OR = 2.31, permutation p-value = 0.028). Haplotypic analysis revealed a similarly strong association of a common haplotype of four SNPs including rs3811070. FUT9 codes for a fucosyl-transferase that is catalyzing the last step in the biosynthesis of the Lewis-x antigen, which forms part of the Lewis blood group-related antigens. These results therefore suggest an involvement of this antigen in the pathogenesis of placental malaria infection.
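To make the reported statistics concrete, the following sketch shows one way an odds ratio and a permutation p-value can be computed for a single SNP. The abstract does not describe the study's actual procedure, so everything here is an assumption: the carrier counts are invented toy numbers, the function names are hypothetical, the test is one-sided, and a 0.5 continuity correction is added to each cell to avoid division by zero.

```python
import random
from typing import List, Sequence, Tuple


def odds_ratio(carrier: Sequence[bool], is_case: Sequence[bool]) -> float:
    """Odds ratio of carrying the allele in cases versus controls."""
    a = sum(1 for c, y in zip(carrier, is_case) if c and y)          # carrier cases
    b = sum(1 for c, y in zip(carrier, is_case) if c and not y)      # carrier controls
    c_ = sum(1 for c, y in zip(carrier, is_case) if not c and y)     # non-carrier cases
    d = sum(1 for c, y in zip(carrier, is_case) if not c and not y)  # non-carrier controls
    # 0.5 added to every cell (an assumption) so empty cells do not break the ratio.
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c_ + 0.5))


def permutation_p_value(carrier: Sequence[bool], is_case: Sequence[bool],
                        n_perm: int = 2000, seed: int = 0) -> Tuple[float, float]:
    """Observed odds ratio and one-sided p-value from shuffled case labels."""
    rng = random.Random(seed)
    observed = odds_ratio(carrier, is_case)
    labels: List[bool] = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any genotype-phenotype link
        if odds_ratio(carrier, labels) >= observed:
            hits += 1
    # +1 in numerator and denominator keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Invented toy data: 180 cases with 60 carriers, 180 controls with 32 carriers.
carrier = [True] * 60 + [False] * 120 + [True] * 32 + [False] * 148
is_case = [True] * 180 + [False] * 180
observed_or, p_value = permutation_p_value(carrier, is_case)
print(round(observed_or, 2), p_value)
```

When many SNPs are tested at once, as in the study above, permutation is often applied to the maximum statistic across all SNPs rather than to one SNP at a time; this sketch does not attempt that.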
Abstract:
Of all Pacific salmonids, Chinook salmon Oncorhynchus tshawytscha display the greatest variability in return times to freshwater. The molecular mechanisms of these differential return times have not been well described. Current methods, such as long serial analysis of gene expression (LongSAGE) and microarrays, allow gene expression to be analyzed for thousands of genes simultaneously. To investigate whether differential gene expression is observed between fall- and spring-run Chinook salmon from California's Central Valley, LongSAGE libraries were constructed. Three libraries containing between 25,512 and 29,372 sequenced tags (21 base pairs/tag) were generated using messenger RNA from the brains of adult Chinook salmon returning in fall and spring and from one ocean-caught Chinook salmon. Tags were annotated to genes using complementary DNA libraries from Atlantic salmon Salmo salar and rainbow trout O. mykiss. Differentially expressed genes, as estimated by differences in the number of sequence tags, were found in all pairwise comparisons of libraries (freshwater versus saltwater = 40 genes; fall versus spring = 11 genes; and spawning versus nonspawning = 51 genes). The gene for ependymin, an extracellular glycoprotein involved in behavioral plasticity in fish, exhibited the most differential expression among the three groupings. Reverse transcription polymerase chain reaction analysis verified the differential expression of ependymin between the fall- and spring-run samples. These LongSAGE libraries, the first reported for Chinook salmon, provide a window on the transcriptional changes during Chinook salmon return migration to freshwater and spawning and increase the amount of expressed sequence data.
Abstract:
The breeding system of social organisms affects many important aspects of social life. Some species vary greatly in the number of breeders per group, but the mechanisms and selective pressures contributing to the maintenance of this polymorphism in social structure remain poorly understood. Here, we take advantage of a genetic dataset that spans 15 years to investigate the dynamics of colony queen number within a socially polymorphic ant species. Our study population of Formica selysi has single- and multiple-queen colonies. We found that the social structure of this species is somewhat flexible: on average, each year 3.2% of the single-queen colonies became polygynous, and conversely 1.4% of the multiple-queen colonies became monogynous. The annualized queen replacement rates were 10.3% and 11.9% for single- and multiple-queen colonies, respectively. New queens were often but not always related to previous colony members. At the population level, the social polymorphism appeared stable. There was no genetic differentiation between single- and multiple-queen colonies at eight microsatellite loci, suggesting ongoing gene flow between social forms. Overall, the regular and bidirectional changes in queen number indicate that social structure is a labile trait in F. selysi, with neither form being favored within a time-frame of 15 years.
Abstract:
Reliable and long-term expression of transgenes remains a significant challenge for gene therapy and biotechnology applications, especially when antibiotic selection procedures are not applicable. In this context, transposons represent attractive gene transfer vectors because of their ability to promote efficient genomic integration in a variety of mammalian cell types. However, expression from genome-integrating vectors may be inhibited by variable gene transcription and/or silencing events. In this study, we assessed whether inclusion of two epigenetic control elements, the human Matrix Attachment Region (MAR) 1-68 and X-29, in a piggyBac transposon vector, may lead to more reliable and efficient expression in CHO cells. We found that addition of the MAR 1-68 at the center of the transposon did not interfere with transposition frequency, and transgene-expressing cells could be readily detected from the total cell population without antibiotic selection. Inclusion of the MAR led to higher transgene expression per integrated copy, and reliable expression could be obtained from as few as 2-4 genomic copies of the MAR-containing transposon vector. The MAR X-29-containing transposon was found to mediate elevated expression of therapeutic proteins in polyclonal or monoclonal CHO cell populations using a transposable vector devoid of a selection gene. Overall, we conclude that MAR and transposable vectors can be used to improve transgene expression from few genomic transposition events, which may be useful when expression from a low number of integrated transgene copies must be obtained and/or when antibiotic selection cannot be applied.
Abstract:
Gene duplication is an essential source of material for the origin of genetic novelty and the evolution of lineage- or species-specific phenotypic traits. The reverse transcription of source gene mRNA followed by the genomic insertion of the resulting cDNA - retroposition - has provided the human genome with a significant number of gene copies during the last ~63 million years (MYA) of primate evolution. We estimated that at least 1 new functional gene (retrogene) per MYA emerged by retroposition in the primate lineage leading to humans. Using a combination of comparative sequencing and evolutionary simulations, we obtained strong evidence of functionality for 7 primate-specific retrogenes. Most of these genes are specifically expressed in testis, suggesting that retroposition has contributed genetic raw material necessary for the evolution of male-specific functions in primates. We characterized CDC14Bretro (identified in the previous survey) that originated from the retroposition of a cell cycle gene - CDC14B - in the common ancestor of humans and apes. We demonstrate that CDC14Bretro experienced a period of intense positive selection in the African ape ancestor. By virtue of the amino acid substitutions that occurred during this period, CDC14Bretro adapted to a new subcellular compartment in African apes. Further analyses indicate that this subcellular shift reflects the evolution of a new functional role of CDC14Bretro. Prompted by this result, we used yeast (Saccharomyces cerevisiae) to investigate on a global scale the extent of functional diversification of duplicate genes through the subcellular adaptation of their encoded proteins. We found that duplicate proteins frequently evolved new cellular localization patterns, either by partitioning of ancestral localizations ("sublocalization"), or more frequently by relocalization to previously unoccupied compartments ("neolocalization").
Interestingly, proteins involved in processes with a wider subcellular distribution more frequently evolved new localization patterns, suggesting that subcellular localization changes are dependent on progenitor gene functions. Relocated proteins adapted to their new subcellular environments and evolved new functional roles through changes of their physicochemical properties, expression levels, and interaction partners. Our work suggests an important role of subcellular adaptation for the emergence of new gene functions.
Abstract:
Poor outcome for glioblastoma patients is largely due to resistance to chemoradiation therapy. While epigenetic inactivation of MGMT-mediated DNA repair is highly predictive for benefit from the alkylating agent therapy Temozolomide, additional mechanisms for resistance associated with molecular alterations exist. Furthermore, new concepts in cancer suggest that resistance to treatment may be linked to cancer stem cells that escape therapy and act as a source for tumour recurrence. We determined gene expression signatures associated with outcome in glioblastoma patients enrolled in phase II and phase III clinical trials establishing the new combination therapy of radiation plus concomitant and adjuvant Temozolomide. Correlating stable gene clusters emerging from unsupervised analysis with survival of 42 treated patients identified a number of biological processes associated with outcome. Most prominently, a gene cluster dominated by HOX genes and comprising PROM1 was associated with resistance. PROM1 encodes CD133, a marker for a subpopulation of tumour cells enriched for glioblastoma stem-like cells. The core of this correlated HOX cluster was contained within the top genes of a "self-renewal signature" defined in a mouse model for MLL-AF9 initiated leukaemia. The association of the HOX gene cluster with tumour resistance was confirmed in two external data sets of 146 malignant gliomas. As an additional resistance factor we identified over-expression of the epidermal growth factor receptor gene, EGFR, while increased gene expression related to biological features of tumour host interaction, including markers for tumour vascular and cell adhesion, and innate immune response, was associated with better outcome. The "self-renewal" signature associated with resistance to the new combination chemoradiation therapy provides first clinical evidence that glioma stem-like cells may be implicated in resistance in a uniformly treated cohort of glioblastoma patients.
This study underlines the need to target the tumour stem cell compartment, and provides some testable hypotheses for biological mechanisms relevant for malignant behaviour of glioblastoma that may be targeted in new treatment approaches. Summary Glioblastoma, the most common malignant primary brain tumour, is known for its poor prognosis. Recent chemotherapeutic advances with alkylating agents such as temozolomide (TMZ) have notably improved survival in some patients. The patients who benefit share a common genetic feature, methylation of MGMT (methylguanine methyltransferase). Nevertheless, other resistance mechanisms related to molecular aberrations exist. We established the gene expression profiles of patients treated with irradiation and TMZ in phase II and III clinical studies. By combining unsupervised and supervised methods in the study of the cohort of treated patients, we discovered groups of genes associated with survival. A set of genes containing the Hox genes appears to be linked to the mechanism of resistance to treatment. Recently, the Hox genes were described as part of a self-renewal signature of leukaemia cancer stem cells. Self-renewal is the process by which stem cells maintain themselves throughout life. This association with resistance is confirmed in two other independent studies. Another factor of resistance to treatment is overexpression of the EGFR gene. On the other hand, two groups of genes associated with the host-tumour relationship, such as markers of tumour vessels and of the innate immune response, prove to have a positive effect on the survival of treated patients.
The discovery of the self-renewal signature as a factor of resistance to the new chemoradiotherapy offers clinical evidence that cancer stem cells are involved in resistance to treatment. It is therefore logical to think that treatment targeted against cancer stem cells will in the future allow more effective anticancer therapies.
Abstract:
Constitutive activation of the nuclear factor-κB (NF-κB) pathway is a hallmark of the activated B-cell-like (ABC) subtype of diffuse large B-cell lymphoma (DLBCL). Recurrent mutations of NF-κB regulators that cause constitutive activity of this oncogenic pathway have been identified. However, it remains unclear how specific target genes are regulated. We identified the atypical nuclear IκB protein IκB-ζ to be upregulated in ABC compared with germinal center B-cell-like (GCB) DLBCL primary patient samples. Knockdown of IκB-ζ by RNA interference was toxic to ABC but not to GCB DLBCL cell lines. Gene expression profiling after IκB-ζ knockdown demonstrated a significant downregulation of a large number of known NF-κB target genes, indicating an essential role of IκB-ζ in regulating a specific set of NF-κB target genes. To further investigate how IκB-ζ mediates NF-κB activity, we performed immunoprecipitations and detected a physical interaction of IκB-ζ with both p50 and p52 NF-κB subunits, indicating that IκB-ζ interacts with components of both the canonical and the noncanonical NF-κB pathway in ABC DLBCL. Collectively, our data demonstrate that IκB-ζ is essential for nuclear NF-κB activity in ABC DLBCL, and thus might represent a promising molecular target for future therapies.
Abstract:
Mammalian sex chromosomes have undergone profound changes since evolving from ancestral autosomes. By examining retroposed genes in the human and mouse genomes, we demonstrate that, during evolution, the mammalian X chromosome has generated and recruited a disproportionately high number of functional retroposed genes, whereas the autosomes experienced lower gene turnover. Most autosomal copies originating from X-linked genes exhibited testis-biased expression. Such export is incompatible with mutational bias and is likely driven by natural selection to attain male germline function. However, the excess recruitment is consistent with a combination of both natural selection and mutational bias.
Abstract:
Immunity-related GTPases (IRG) play an important role in defense against intracellular pathogens. One member of this gene family in humans, IRGM, has been recently implicated as a risk factor for Crohn's disease. We analyzed the detailed structure of this gene family among primates and showed that most of the IRG gene cluster was deleted early in primate evolution, after the divergence of the anthropoids from prosimians (about 50 million years ago). Comparative sequence analysis of New World and Old World monkey species shows that the single-copy IRGM gene became pseudogenized as a result of an Alu retrotransposition event in the anthropoid common ancestor that disrupted the open reading frame (ORF). We find that the ORF was reestablished as a part of a polymorphic stop codon in the common ancestor of humans and great apes. Expression analysis suggests that this change occurred in conjunction with the insertion of an endogenous retrovirus, which altered the transcription initiation, splicing, and expression profile of IRGM. These data argue that the gene became pseudogenized and was then resurrected through a series of complex structural events and suggest remarkable functional plasticity where alleles experience diverse evolutionary pressures over time. Such dynamism in structure and evolution may be critical for a gene family locked in an arms race with an ever-changing repertoire of intracellular parasites.
Abstract:
Chromatin remodeling at specific genomic loci controls lymphoid differentiation. Here, we investigated the role played in this process by Kruppel-associated box (KRAB)-associated protein 1 (KAP1), the universal cofactor of KRAB-zinc finger proteins (ZFPs), a tetrapod-restricted family of transcriptional repressors. T-cell-specific Kap1-deleted mice displayed a significant expansion of immature thymocytes, imbalances in CD4(+)/CD8(+) cell ratios, and altered responses to TCR and TGFβ stimulation when compared to littermate KAP1 control mice. Transcriptome and chromatin studies revealed that KAP1 binds T-cell-specific cis-acting regulatory elements marked by the H3K9me3 repressive mark and enriched in Ikaros/NuRD complexes. Also, KAP1 directly controls the expression of several genes involved in TCR and cytokine signaling. Among these, regulation of FoxO1 seems to play a major role in this system. Likely responsible for tethering KAP1 to at least part of its genomic targets, a small number of KRAB-ZFPs are selectively expressed in T-lymphoid cells. These results reveal the so far unsuspected yet important role of KAP1-mediated epigenetic regulation in T-lymphocyte differentiation and activation.
Abstract:
The Gene Ontology (GO) (http://www.geneontology.org) is a community bioinformatics resource that represents gene product function through the use of structured, controlled vocabularies. The number of GO annotations of gene products has increased due to curation efforts among GO Consortium (GOC) groups, including focused literature-based annotation and ortholog-based functional inference. The GO ontologies continue to expand and improve as a result of targeted ontology development, including the introduction of computable logical definitions and development of new tools for the streamlined addition of terms to the ontology. The GOC continues to support its user community through the use of e-mail lists, social media and web-based resources.