994 results for Gene annotation


Relevance: 30.00%

Abstract:

BACKGROUND: Several studies have established Glioblastoma Multiforme (GBM) prognostic and predictive models based on age and Karnofsky Performance Status (KPS), while very few studies have evaluated the prognostic and predictive significance of preoperative MR-imaging. However, to date, there is no simple preoperative GBM classification that also correlates with a highly prognostic genomic signature. Thus, we present for the first time a biologically relevant and clinically applicable tumor Volume, patient Age, and KPS (VAK) GBM classification that can easily and non-invasively be determined upon patient admission. METHODS: We quantitatively analyzed the volumes of 78 GBM patient MRIs present in The Cancer Imaging Archive (TCIA) corresponding to patients in The Cancer Genome Atlas (TCGA) with VAK annotation. The variables were then combined using a simple 3-point scoring system to form the VAK classification. A validation set (N = 64) from both the TCGA and Rembrandt databases was used to confirm the classification. Transcription factor and genomic correlations were performed using the GenePattern suite and Ingenuity Pathway Analysis. RESULTS: VAK-A and VAK-B classes showed significant median survival differences in discovery (P = 0.007) and validation sets (P = 0.008). VAK-A is significantly associated with P53 activation, while VAK-B shows significant P53 inhibition. Furthermore, a molecular gene signature comprised of a total of 25 genes and microRNAs was significantly associated with the classes and predicted survival in an independent validation set (P = 0.001). A favorable MGMT promoter methylation status resulted in a 10.5-month additional survival benefit for VAK-A compared to VAK-B patients. CONCLUSIONS: The non-invasively determined VAK classification, with its implication of VAK-specific molecular regulatory networks, can serve as a very robust initial prognostic tool, a clinical trial selection criterion, and an important step toward the refinement of genomics-based personalized therapy for GBM patients.

Relevance: 30.00%

Abstract:

The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360,000 taxa, this resource has increased 2-fold over the last 2 years. It has benefited from a wealth of checks to improve annotation correctness and consistency, and now supplies greater information content enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.
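
As an illustration of how manual and electronic annotations can be told apart in a downloaded file, the following is a minimal sketch that assumes the tab-separated GAF 2.x layout (17 columns, comment lines starting with "!") and treats the evidence code IEA as the marker for electronic annotation. The abstract does not name a specific file format, so the choice of GAF is an assumption. The column names, the function names, and the two sample rows are hypothetical and made up for the example; they are not real UniProt-GOA records.

```python
# Column names for the 17 tab-separated GAF 2.x fields (names chosen here for readability).
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by", "annotation_extension",
    "gene_product_form_id",
]


def parse_gaf(lines):
    """Yield one dict per annotation row, skipping '!' comment lines and blanks."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        yield dict(zip(GAF_COLUMNS, fields))


def split_by_evidence(records):
    """Separate electronic (evidence code IEA) from all other annotations."""
    manual, electronic = [], []
    for record in records:
        if record["evidence_code"] == "IEA":
            electronic.append(record)
        else:
            manual.append(record)
    return manual, electronic


# Two made-up rows in GAF layout (illustrative only, not real data).
sample = [
    "!gaf-version: 2.1",
    "\t".join(["UniProtKB", "X00001", "GENE1", "", "GO:0005524", "PMID:0000000",
               "IDA", "", "F", "", "", "protein", "taxon:9606", "20100101",
               "UniProt", "", ""]),
    "\t".join(["UniProtKB", "X00002", "GENE2", "", "GO:0005524", "GO_REF:0000002",
               "IEA", "", "F", "", "", "protein", "taxon:9606", "20100101",
               "InterPro", "", ""]),
]

manual, electronic = split_by_evidence(parse_gaf(sample))
print(len(manual), "manual;", len(electronic), "electronic")
```

Streaming the file line by line, as the generator does, keeps memory use flat, which matters for a data set of the size the abstract describes.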

Relevance: 30.00%

Abstract:

BACKGROUND: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. RESULTS: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. CONCLUSION: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
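
The nucleotide-level accuracy figures quoted above can be made concrete with a small worked example. This is a sketch under stated assumptions, not the EGASP evaluation code: it assumes the convention common in gene-prediction benchmarks, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP), counted over coding nucleotides. The function names and the coordinates are hypothetical.

```python
def positions(intervals):
    """Expand inclusive (start, end) intervals into a set of nucleotide positions."""
    return {p for start, end in intervals for p in range(start, end + 1)}


def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) for two sets of coding positions.

    Sensitivity = TP / (TP + FN); specificity = TP / (TP + FP),
    following the gene-prediction convention assumed in the lead-in.
    """
    tp = len(annotated & predicted)   # coding in both annotation and prediction
    fn = len(annotated - predicted)   # annotated coding, missed by the prediction
    fp = len(predicted - annotated)   # predicted coding, absent from the annotation
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity


# Made-up coordinates: two annotated exons and two predicted exons.
annotated = positions([(100, 199), (300, 399)])   # 200 coding nucleotides
predicted = positions([(120, 199), (300, 419)])   # 200 predicted nucleotides

sn, sp = nucleotide_accuracy(annotated, predicted)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

In this toy case 180 nucleotides are shared, 20 are missed and 20 are over-predicted, so both measures come out at 0.90.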

Relevance: 30.00%

Abstract:

Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes. Results: Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families. Conclusions: The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.

Relevance: 30.00%

Abstract:

The Gene Ontology (GO) Consortium (http://www.geneontology.org) (GOC) continues to develop, maintain and use a set of structured, controlled vocabularies for the annotation of genes, gene products and sequences. The GO ontologies are expanding both in content and in structure. Several new relationship types have been introduced and used, along with existing relationships, to create links between and within the GO domains. These improve the representation of biology, facilitate querying, and allow GO developers to systematically check for and correct inconsistencies within the GO. Gene product annotation using GO continues to increase both in the number of total annotations and in species coverage. GO tools, such as OBO-Edit, an ontology-editing tool, and AmiGO, the GOC ontology browser, have seen major improvements in functionality, speed and ease of use.

Relevance: 30.00%

Abstract:

Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.

Relevance: 30.00%

Abstract:

BACKGROUND: Scientists have been trying to understand the molecular mechanisms of diseases to design preventive and therapeutic strategies for a long time. For some diseases, it has become evident that it is not enough to obtain a catalogue of the disease-related genes but to uncover how disruptions of molecular networks in the cell give rise to disease phenotypes. Moreover, with the unprecedented wealth of information available, even obtaining such a catalogue is extremely difficult. PRINCIPAL FINDINGS: We developed a comprehensive gene-disease association database by integrating associations from several sources that cover different biomedical aspects of diseases. In particular, we focus on the current knowledge of human genetic diseases including mendelian, complex and environmental diseases. To assess the concept of modularity of human diseases, we performed a systematic study of the emergent properties of human gene-disease networks by means of network topology and functional annotation analysis. The results indicate a highly shared genetic origin of human diseases and show that for most diseases, including mendelian, complex and environmental diseases, functional modules exist. Moreover, a core set of biological pathways is found to be associated with most human diseases. We obtained similar results when studying clusters of diseases, suggesting that related diseases might arise due to dysfunction of common biological processes in the cell. CONCLUSIONS: For the first time, we include mendelian, complex and environmental diseases in an integrated gene-disease association database and show that the concept of modularity applies for all of them. We furthermore provide a functional analysis of disease-related modules providing important new biological insights, which might not be discovered when considering each of the gene-disease association repositories independently. Hence, we present a suitable framework for the study of how genetic and environmental factors, such as drugs, contribute to diseases. AVAILABILITY: The gene-disease networks used in this study and part of the analysis are available at http://ibi.imim.es/DisGeNET/DisGeNETweb.html#Download

Relevance: 30.00%

Abstract:

BACKGROUND: The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium and a refinement of the annotation based on these experimental results. RESULTS: The GENCODE gene features are divided into eight different categories of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Out of 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions. CONCLUSION: In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of GENCODE annotation with RefSeq and ENSEMBL shows that only 40% of GENCODE exons are contained within the two sets, which is a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP and the GENCODE collaboration is continuing to refine its annotation of 1% of the human genome with the aid of experimental validation.

Relevance: 30.00%

Abstract:

Background: Bipolar disorder is a highly heritable polygenic disorder. Recent enrichment analyses suggest that there may be true risk variants for bipolar disorder in the expression quantitative trait loci (eQTL) in the brain. Aims: We sought to assess the impact of eQTL variants on bipolar disorder risk by combining data from both bipolar disorder genome-wide association studies (GWAS) and brain eQTL. Method: To detect single nucleotide polymorphisms (SNPs) that influence expression levels of genes associated with bipolar disorder, we jointly analysed data from a bipolar disorder GWAS (7481 cases and 9250 controls) and a genome-wide brain (cortical) eQTL (193 healthy controls) using a Bayesian statistical method, with independent follow-up replications. The identified risk SNP was then further tested for association with hippocampal volume (n = 5775) and cognitive performance (n = 342) among healthy individuals. Results: Integrative analysis revealed a significant association between a brain eQTL rs6088662 on chromosome 20q11.22 and bipolar disorder (log Bayes factor = 5.48; bipolar disorder P = 5.85×10^-5). Follow-up studies across multiple independent samples confirmed the association of the risk SNP (rs6088662) with gene expression and bipolar disorder susceptibility (P = 3.54×10^-8). Further exploratory analysis revealed that rs6088662 is also associated with hippocampal volume and cognitive performance in healthy individuals. Conclusions: Our findings suggest that 20q11.22 is likely a risk region for bipolar disorder; they also highlight the informative value of integrating functional annotation of genetic variants for gene expression in advancing our understanding of the biological basis underlying complex disorders, such as bipolar disorder.

Relevance: 30.00%

Abstract:

Background: The tight junction (TJ) is one of the most important structures established during merozoite invasion of host cells and a large number of proteins stored in Toxoplasma and Plasmodium parasites’ apical organelles are involved in forming the TJ. Plasmodium falciparum and Toxoplasma gondii apical membrane antigen 1 (AMA-1) and rhoptry neck proteins (RONs) are the two main TJ components. It has been shown that RON4 plays an essential role during merozoite and sporozoite invasion of target cells. This study has focused on characterizing a novel Plasmodium vivax rhoptry protein, RON4, which is homologous to PfRON4 and PkRON4. Methods: The ron4 gene was re-annotated in the P. vivax genome using various bioinformatics tools and taking PfRON4 and PkRON4 amino acid sequences as templates. Gene synteny, as well as identity and similarity values between open reading frames (ORFs) belonging to the three species were assessed. The gene transcription of pvron4, and the expression and localization of the encoded protein were also determined in the VCG-1 strain by molecular and immunological studies. Nucleotide and amino acid sequences obtained for pvron4 in VCG-1 were compared to those from strains coming from different geographical areas. Results: PvRON4 is a 733 amino acid long protein, which is encoded by three exons, having similar transcription and translation patterns to those reported for its homologue, PfRON4. Sequencing PvRON4 from the VCG-1 strain and comparing it to P. vivax strains from different geographical locations has shown two conserved regions separated by a low complexity variable region, possibly acting as a “smokescreen”. PvRON4 contains a predicted signal sequence, a coiled-coil α-helical motif, two tandem repeats and six conserved cysteines towards the carboxyterminus and is a soluble protein lacking predicted transmembrane domains or a GPI anchor. Indirect immunofluorescence assays have shown that PvRON4 is expressed at the apical end of schizonts and co-localizes at the rhoptry neck with PvRON2.

Relevance: 30.00%

Abstract:

Establishing the mechanisms by which microbes interact with their environment, including eukaryotic hosts, is a major challenge that is essential for the economic utilisation of microbes and their products. Techniques for determining global gene expression profiles of microbes, such as microarray analyses, are often hampered by methodological restraints, particularly the recovery of bacterial transcripts (RNA) from complex mixtures and rapid degradation of RNA. A pioneering technology that avoids this problem is In Vivo Expression Technology (IVET). IVET is a 'promoter-trapping' methodology that can be used to capture nearly all bacterial promoters (genes) upregulated during a microbe-environment interaction. IVET is especially useful because there is virtually no limit to the type of environment used (examples to date include soil, oomycete, a host plant or animal) to select for active microbial promoters. Furthermore, IVET provides a powerful method to identify genes that are often overlooked during genomic annotation, and has proven to be a flexible technology that can provide even more information than identification of gene expression profiles. A derivative of IVET, termed resolvase-IVET (RIVET), can be used to provide spatio-temporal information about environment-specific gene expression. More recently, niche-specific genes captured during an IVET screen have been exploited to identify the regulatory mechanisms controlling their expression. Overall, IVET and its various spin-offs have proven to be a valuable and robust set of tools for analysing microbial gene expression in complex environments and providing new targets for biotechnological development.

Relevance: 30.00%

Abstract:

Background: Autism Spectrum Conditions (ASC) are neurodevelopmental conditions characterized by difficulties in communication and social interaction, alongside unusually repetitive behaviours and narrow interests. Asperger Syndrome (AS) is one subgroup of ASC and differs from classic autism in that in AS there is no language or general cognitive delay. Genetic, epigenetic and environmental factors are implicated in ASC and genes involved in neural connectivity and neurodevelopment are good candidates for studying the susceptibility to ASC. The aryl-hydrocarbon receptor nuclear translocator 2 (ARNT2) gene encodes a transcription factor involved in neurodevelopmental processes, neuronal connectivity and cellular responses to hypoxia. A mutation in this gene has been identified in individuals with ASC and single nucleotide polymorphisms (SNPs) have been nominally associated with AS and autistic traits in previous studies. Methods: In this study, we tested 34 SNPs in ARNT2 for association with AS in 118 cases and 412 controls of Caucasian origin. P values were adjusted for multiple comparisons, and linkage disequilibrium (LD) among the SNPs analysed was calculated in our sample. Finally, SNP annotation allowed functional and structural analyses of the genetic variants in ARNT2. We tested the replicability of our result using the genome-wide association studies (GWAS) database of the Psychiatric Genomics Consortium (PGC). Results: We report statistically significant association of rs17225178 with AS. This SNP modifies transcription factor binding sites and regions that regulate the chromatin state in neural cell lines. It is also included in a LD block in our sample, alongside other genetic variants that alter chromatin regulatory regions in neural cells. Conclusions: These findings demonstrate that rs17225178 in the ARNT2 gene is associated with AS and support previous studies that pointed out an involvement of this gene in the predisposition to ASC.

Relevance: 30.00%

Abstract:

Laryngeal squamous cell carcinoma is a very common head and neck cancer, with high mortality rates and poor prognosis. In this study, we compared expression profiles of clinical samples from 13 larynx tumors and 10 non-neoplastic larynx tissues using a custom-built cDNA microarray containing 331 probes for 284 genes previously identified by informatics analysis of EST databases as markers of head and neck tumors. Thirty-five genes showed statistically significant differences (SNR >= 11.01, p <= 0.001) in the expression between tumor and non-tumor larynx tissue samples. Functional annotation indicated that these genes are involved in cellular processes relevant to the cancer phenotype, such as apoptosis, cell cycle, DNA repair, proteolysis, protease inhibition, signal transduction and transcriptional regulation. Six of the identified transcripts map to intronic regions of protein-coding genes and may comprise non-annotated exons or as yet uncharacterized long ncRNAs with a regulatory role in the gene expression program of larynx tissue. The differential expression of 10 of these genes (ADCY6, AES, AL2SCR3, CRR9, CSTB, DUSP1, MAP3K5, PLAT, UBL1 and ZNF706) was independently confirmed by quantitative real-time RT-PCR. Among these, the CSTB gene product has cysteine protease inhibitor activity that has been associated with an antimetastatic function. Interestingly, CSTB showed a low expression in the tumor samples analyzed (p<0.0001). The set of genes identified here contributes to a better understanding of the molecular basis of larynx cancer, and provides candidate markers for improving diagnosis, prognosis and treatment of this carcinoma.

Relevance: 30.00%

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance: 30.00%

Abstract:

Xanthomonas axonopodis pv. citri (Xac) causes citrus canker and the completion of the Xac genome sequence has opened up the possibility of investigating basic cellular mechanisms at the genomic level. Copper compounds have been extensively used in agriculture to control plant diseases. The copA and copB genes, identified by annotation of the Xac genome, encode homologues of proteins involved in copper resistance. A gene expression assay by Northern blotting revealed that copA and copB are expressed as a single transcript specifically induced by copper. Synthesis of the gene products was also induced by copper, reaching a maximum level at 4 h after addition of copper to the culture medium. CopA was a cytosolic protein and CopB was detected in the cytoplasmic membrane. The gene encoding CopA was disrupted by the insertion of a transposon, leading to mutant strains that were unable to grow in culture medium containing copper, even at the lowest CuSO4 concentration tested (0.25 mM), whereas the wild-type strain was able to grow in the presence of 1 mM copper. Cell suspensions of the wild-type and mutant strains in different copper concentrations were inoculated in lemon leaves to analyse their ability to induce citrus canker symptoms. Cells of mutant strains showed higher sensitivity than the wild-type strain in the presence of copper, i.e. they were not able to induce citrus canker symptoms at high copper concentrations and exhibited slower growth in planta.