986 results for Genomic Regions


Relevance: 30.00%

Abstract:

Genomic islands, large potentially mobile regions of bacterial chromosomes, are a major contributor to bacterial evolution. Here, we investigated the fitness cost and phenotypic differences between the bacterium Pseudomonas aeruginosa PAO1 and a derivative carrying one integrated copy of the clc element, a 103-kb genomic island [and integrative and conjugative element (ICE)] originating in Pseudomonas sp. strain B13 and a close relative of genomic islands found in clinical and environmental isolates of P. aeruginosa. By using a combination of whole genome transcriptome profiling, phenotypic arrays, competition experiments, and biofilm formation studies, only a few differences became apparent, such as reduced biofilm growth and fourfold stationary phase repression of genes involved in acetoin metabolism in PAO1 containing the clc element. In contrast, PAO1 carrying the clc element acquired the capacity to grow on 3-chlorobenzoate and 2-aminophenol as sole carbon and energy substrates. No fitness loss >1% was detectable in competition experiments between PAO1 and PAO1 carrying the clc element. The genes from the clc element were not silent in PAO1, and excision was observed, although transfer of clc from PAO1 to other recipient bacteria was reduced by two orders of magnitude. Our results indicate that newly acquired mobile DNA does not necessarily impose an important fitness cost on its host. Absence of immediate detriment to the host may have contributed to the wide distribution of genomic islands like clc in bacterial genomes.

Relevance: 30.00%

Abstract:

The basic functions of sleep are still unclear; however, recent advances in genomics and proteomics have begun to contribute to our understanding of both normal and pathological sleep. In this review, we focus primarily on normal sleep and wake that have been studied in model organisms such as mice. Mice have been especially valuable since many different inbred strains exist that differ in sleep-related traits, and genes can be altered by either mutagenesis or targeted approaches. Advances in QTL (Quantitative Trait Loci) analysis have also helped to identify important sleep-related genes, and several other QTLs have been mapped as a first step toward finding the genes that underlie basic sleep traits. In addition to more traditional genetic approaches, the abundance of different mRNAs across sleep and wake can now be studied and compared in different brain regions much more thoroughly using microarray methods. Progress at the protein level has been more difficult, but a few studies have begun to investigate changes in proteins during sleep and wake, and we present some of our own preliminary data in this area. Knowledge of which genes and proteins control or respond to changes in sleep will not only help answer fundamental questions, but may also suggest novel drug targets for improving multiple aspects of sleep and wake.

Relevance: 30.00%

Abstract:

Little is known about the relation between genome organization and gene expression in Leishmania. Bioinformatic analysis can be used to predict genes and find homologies with known proteins. A model was proposed in which genes are organized into large clusters and transcribed from only one strand, in the form of large polycistronic primary transcripts. To verify the validity of this model, we studied gene expression at the transcriptional, post-transcriptional and translational levels in a unique locus of 34 kb located on chr27 and represented by cosmid L979. Sequence analysis revealed 115 ORFs on either DNA strand. Using computer programs developed for Leishmania genes, only nine of these ORFs, localized on the same strand, were predicted to code for proteins, some of which show homologies with known proteins. Additionally, one pseudogene was identified. We verified the biological relevance of these predictions. mRNAs from nine predicted genes and proteins from seven were detected. Nuclear run-on analyses confirmed that the top strand is transcribed by RNA polymerase II and suggested that there is no polymerase entry site. Low levels of transcription were detected in regions of the bottom strand, and stable transcripts were identified for four ORFs on this strand not predicted to be protein-coding. In conclusion, the transcriptional organization of the Leishmania genome is complex, raising the possibility that computer predictions may not be comprehensive.

Relevance: 30.00%

Abstract:

Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are “genomic fossils” valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome’s structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein-coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (∼80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high-throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.

Relevance: 30.00%

Abstract:

Functional RNA structures play an important role both in noncoding RNA transcripts and in regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic–stochastic context-free grammar (EvoFold) or energy-directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to ∼2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3′-UTRs. While we estimate a significant false discovery rate of ∼50%–70%, many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).

Relevance: 30.00%

Abstract:

This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.

Relevance: 30.00%

Abstract:

Background: Germline genetic variation is associated with the differential expression of many human genes. The phenotypic effects of this type of variation may be important when considering susceptibility to common genetic diseases. Three regions at 8q24 have recently been identified to independently confer risk of prostate cancer. Variation at 8q24 has also recently been associated with risk of breast and colorectal cancer. However, none of the risk variants map at or relatively close to known genes, with c-MYC mapping a few hundred kilobases distally. Results: This study identifies cis-regulators of germline c-MYC expression in immortalized lymphocytes of HapMap individuals. Quantitative analysis of c-MYC expression in normal prostate tissues suggests an association between overexpression and variants in Region 1 of prostate cancer risk. Somatic c-MYC overexpression correlates with prostate cancer progression and more aggressive tumor forms, which was also a pathological variable associated with Region 1. Expression profiling analysis and modeling of transcriptional regulatory networks predict a functional association between MYC and the prostate tumor suppressor KLF6. Analysis of MYC/Myc-driven cell transformation and tumorigenesis substantiates a model in which MYC overexpression promotes transformation by down-regulating KLF6. In this model, a feedback loop through E-cadherin down-regulation causes further transactivation of c-MYC. Conclusion: This study proposes that variation at putative 8q24 cis-regulator(s) of transcription can significantly alter germline c-MYC expression levels and, thus, contribute to prostate cancer susceptibility by down-regulating the prostate tumor suppressor KLF6 gene.

Relevance: 30.00%

Abstract:

Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation, but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10⁻⁸) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1-WDR25, MKRN3-MAGEL2 and KCNK9) demonstrating parent-of-origin-specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and γ-aminobutyric acid-B2 receptor signalling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition.

Relevance: 30.00%

Abstract:

Background and aim of the study: Genomic gains and losses play a crucial role in the development and progression of DLBCL and are closely related to gene expression profiles (GEP), including the germinal center B-cell like (GCB) and activated B-cell like (ABC) cell of origin (COO) molecular signatures. To identify new oncogenes or tumor suppressor genes (TSG) involved in DLBCL pathogenesis and to determine their prognostic values, an integrated analysis of high-resolution gene expression and copy number profiling was performed. Patients and methods: Two hundred and eight adult patients with de novo CD20+ DLBCL enrolled in the prospective multicentric randomized LNH-03 GELA trials (LNH03-1B, -2B, -3B, 39B, -5B, -6B, -7B) with available frozen tumour samples, centralized reviewing and adequate DNA/RNA quality were selected. 116 patients were treated by Rituximab(R)-CHOP/R-miniCHOP and 92 patients were treated by the high dose (R)-ACVBP regimen dedicated to patients younger than 60 years (y) in frontline. Tumour samples were simultaneously analysed by high resolution comparative genomic hybridization (CGH, Agilent, 144K) and gene expression arrays (Affymetrix, U133+2). Minimal common regions (MCR), as defined by segments that affect the same chromosomal region in different cases, were delineated. Gene expression and MCR data sets were merged using Gene expression and dosage integrator algorithm (GEDI, Lenz et al. PNAS 2008) to identify new potential driver genes. Results: A total of 1363 recurrent (defined by a penetrance > 5%) MCRs within the DLBCL data set, ranging in size from 386 bp, affecting a single gene, to more than 24 Mb were identified by CGH. Of these MCRs, 756 (55%) showed a significant association with gene expression: 396 (59%) gains, 354 (52%) single-copy deletions, and 6 (67%) homozygous deletions. 
By this integrated approach, in addition to previously reported genes (CDKN2A/2B, PTEN, DLEU2, TNFAIP3, B2M, CD58, TNFRSF14, FOXP1, REL...), several genes targeted by gene copy abnormalities with a dosage effect and potential physiopathological impact were identified, including genes with TSG activity involved in cell cycle (HACE1, CDKN2C), immune response (CD68, CD177, CD70, TNFSF9, IRAK2), DNA integrity (XRCC2, BRCA1, NCOR1, NF1, FHIT) or oncogenic functions (CD79b, PTPRT, MALT1, AUTS2, MCL1, PTTG1...) with distinct distribution according to COO signature. The CDKN2A/2B tumor suppressor locus (9p21) was deleted homozygously in 27% of cases and hemizygously in 9% of cases. Biallelic loss was observed in 49% of ABC DLBCL and in 10% of GCB DLBCL. This deletion was strongly correlated with age and associated with a limited number of additional genetic abnormalities including trisomy 3, 18 and short gains/losses of Chr. 1, 2, 19 regions (FDR < 0.01), allowing identification of genes that may have synergistic effects with CDKN2A/2B inactivation. With a median follow-up of 42.9 months, only CDKN2A/2B biallelic deletion strongly correlates (FDR p.value < 0.01) with a poor outcome in the entire cohort (4y PFS = 44% [32-61] vs. 74% [66-82] for patients in germline configuration; 4y OS = 53% [39-72] vs. 83% [76-90]). In a Cox proportional hazard prediction of the PFS, CDKN2A/2B deletion remains predictive (HR = 1.9 [1.1-3.2], p = 0.02) when combined with IPI (HR = 2.4 [1.4-4.1], p = 0.001) and GCB status (HR = 1.3 [0.8-2.3], p = 0.31). This difference remains predictive in the subgroup of patients treated by R-CHOP (4y PFS = 43% [29-63] vs. 66% [55-78], p = 0.02), in patients treated by R-ACVBP (4y PFS = 49% [28-84] vs. 83% [74-92], p = 0.003), and in GCB (4y PFS = 50% [27-93] vs. 81% [73-90], p = 0.02), or ABC/unclassified (5y PFS = 42% [28-61] vs. 67% [55-82], p = 0.009) molecular subtypes (Figure 1).
Conclusion: We report for the first time an integrated genetic analysis of a large cohort of DLBCL patients included in a prospective multicentric clinical trial program, allowing identification of new potential driver genes with pathogenic impact. However, CDKN2A/2B deletion constitutes the strongest and unique prognostic factor of chemoresistance to R-CHOP, regardless of the COO signature, which is not overcome by a more intensified immunochemotherapy. Patients displaying this frequent genomic abnormality warrant new and dedicated therapeutic approaches.

Relevance: 30.00%

Abstract:

Soft tissue sarcomas (STS) with complex genomic profiles (50% of all STS) are predominantly composed of spindle cell/pleomorphic sarcomas, including leiomyosarcoma, myxofibrosarcoma, pleomorphic liposarcoma, pleomorphic rhabdomyosarcoma, malignant peripheral nerve sheath tumor, angiosarcoma, extraskeletal osteosarcoma, and spindle cell/pleomorphic unclassified sarcoma (previously called spindle cell/pleomorphic malignant fibrous histiocytoma). These neoplasms show, characteristically, gains and losses of numerous chromosomes or chromosome regions, as well as amplifications. Many of them share recurrent aberrations (e.g., gain of 5p13-p15) that seem to play a significant role in tumor progression and/or metastatic dissemination. In this paper, we review the cytogenetic, molecular genetic, and clinicopathologic characteristics of the most common STS displaying complex genomic profiles. Features of diagnostic or prognostic relevance will be discussed when needed.

Relevance: 30.00%

Abstract:

Peroxisome proliferator-activated receptors are ligand-activated transcription factors belonging to the nuclear hormone receptor superfamily. Three cDNAs encoding such receptors have been isolated from Xenopus laevis (xPPAR alpha, beta, and gamma). Furthermore, the gene coding for xPPAR beta has been cloned, thus being the first member of this subfamily whose genomic organization has been solved. Functionally, xPPAR alpha as well as its mouse and rat homologs are thought to play an important role in lipid metabolism due to their ability to activate transcription of a reporter gene through the promoter of the acyl-CoA oxidase (ACO) gene. ACO catalyzes the rate-limiting step in the peroxisomal beta-oxidation of fatty acids. Activation is achieved by the binding of xPPAR alpha to a regulatory element (DR1) found in the promoter region of this gene. xPPAR beta and gamma are also able to recognize the same type of element and are, like PPAR alpha, able to form heterodimers with retinoid X receptor. All three xPPARs appear to be activated by synthetic peroxisome proliferators as well as by naturally occurring fatty acids, suggesting that a common mode of action exists for all the members of this subfamily of nuclear hormone receptors.

Relevance: 30.00%

Abstract:

Genomic islands (GEIs) are large DNA segments, present in most bacterial genomes, that are most likely acquired via horizontal gene transfer. Here, we study the self-transfer system of the integrative and conjugative element ICEclc of Pseudomonas knackmussii B13, which serves as a model for a larger group of ICE/GEI with syntenic core gene organization. Functional screening revealed that, unlike conjugative plasmids and other ICEs, ICEclc carries two separate origins of transfer, with different sequence context but containing a similar repeat motif. Conjugation experiments with GFP-labelled ICEclc variants showed that both oriTs are used for transfer with indistinguishable efficiencies, but that having two oriTs results in an estimated fourfold increase of ICEclc transfer rates in a population compared with having a single oriT. A gene for a relaxase essential for ICEclc transfer was also identified, but in vivo strand exchange assays suggested that the relaxase processes the two oriTs in different manners. This unique dual origin of transfer system might have provided an evolutionary advantage for distribution of ICE, a hypothesis that is supported by the fact that both oriT regions are conserved in several GEIs related to ICEclc.

Relevance: 30.00%

Abstract:

BACKGROUND: The P-type II ATPase gene family encodes proteins with an important role in adaptation of the cell to variation in external K+, Ca2+ and Na+ concentrations. The presence of P-type II gene subfamilies that are specific for certain kingdoms has been reported but was sometimes contradicted by discovery of previously unknown homologous sequences in newly sequenced genomes. Members of this gene family have been sampled in all of the fungal phyla except the arbuscular mycorrhizal fungi (AMF; phylum Glomeromycota), which are known to play a key role in terrestrial ecosystems and to be genetically highly variable within populations. Here we used highly degenerate primers on AMF genomic DNA to increase the sampling of fungal P-type II ATPases and to test previous predictions about their evolution. In parallel, homologous sequences of the P-type II ATPases have been used to determine the nature and amount of polymorphism that is present at these loci among isolates of Glomus intraradices harvested from the same field. RESULTS: In this study, four P-type II ATPase subfamilies have been isolated from three AMF species. We show that, contrary to previous predictions, P-type IIC ATPases are present in all basal fungal taxa. Additionally, P-type IIE ATPases should no longer be considered as exclusive to the Ascomycota and the Basidiomycota, since we also demonstrate their presence in the Zygomycota. Finally, a comparison of homologous sequences encoding P-type IID ATPases showed unexpectedly that indel mutations among coding regions, as well as specific gene duplications, occur among AMF individuals within the same field. CONCLUSION: On the basis of these results we suggest that the diversification of P-type IIC and E ATPases followed the diversification of the extant fungal phyla with independent events of gene gains and losses. 
Consistent with recent findings on the human genome, but at a much smaller geographic scale, we provided evidence that structural genomic changes, such as exonic indel mutations and gene duplications are less rare than previously thought and that these also occur within fungal populations.