114 results for Curation
Abstract:
Background: A major goal in the post-genomic era is to identify and characterise disease susceptibility genes and to apply this knowledge to disease prevention and treatment. Rodents and humans have remarkably similar genomes and share closely related biochemical, physiological and pathological pathways. In this work we used the latest information on the mouse transcriptome, as revealed by the RIKEN FANTOM2 project, to identify novel human disease-related candidate genes. We define a new term, "patholog", to mean a homolog of a human disease-related gene encoding a product (transcript, anti-sense or protein) potentially relevant to disease. Rather than focusing only on Mendelian inheritance, we applied the analysis to all potential pathologs regardless of their inheritance pattern. Results: Bioinformatic analysis and human curation of 60,770 RIKEN full-length mouse cDNA clones produced 2,578 sequences that showed similarity (70-85% identity) to known human disease genes. Using a newly developed biological information extraction and annotation tool (FACTS) in parallel with human expert analysis of 17,051 MEDLINE scientific abstracts, we identified 182 novel potential pathologs. Of these, 36 were identified by computational tools only, 49 by human expert analysis only and 97 by both methods. These pathologs were related to neoplastic (53%), hereditary (24%), immunological (5%), cardiovascular (4%) or other (14%) disorders. Conclusions: Large-scale genome projects continue to produce a vast amount of data with potential application to the study of human disease. For this potential to be realised we need intelligent strategies for data categorisation and the ability to link sequence data with relevant literature. This paper demonstrates the power of combining human expert annotation with FACTS, a newly developed bioinformatics tool, to identify novel pathologs from within large-scale mouse transcript datasets.
Abstract:
The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.
Abstract:
Expert curation and complete collection of mutations in genes that affect human health are essential for proper genetic healthcare and research. Expert curation is provided by the curators of gene-specific mutation databases, also called locus-specific databases (LSDBs). While there are over 700 such databases, they vary in their content, completeness, time available for curation, and the expertise of the curator. Curation and LSDBs have been discussed and written about, and protocols have been provided, for over 10 years, but there have been no formal recommendations for the ideal form of these entities. This work initiates a discussion on this topic to assist future efforts in human genetics. Further discussion is welcome.
Genetic Variations and Diseases in UniProtKB/Swiss-Prot: The Ins and Outs of Expert Manual Curation.
Abstract:
During the last few years, next-generation sequencing (NGS) technologies have accelerated the detection of genetic variants, resulting in the rapid discovery of new disease-associated genes. However, the wealth of variation data made available by NGS alone is not sufficient to understand the mechanisms underlying disease pathogenesis and manifestation. Multidisciplinary approaches combining sequence and clinical data with prior biological knowledge are needed to unravel the role of genetic variants in human health and disease. In this context, it is crucial that these data are linked, organized, and made readily available through reliable online resources. The Swiss-Prot section of the Universal Protein Knowledgebase (UniProtKB/Swiss-Prot) provides the scientific community with a collection of information on protein functions, interactions, and biological pathways, as well as human genetic diseases and variants, all manually reviewed by experts. In this article, we present an overview of the information content of UniProtKB/Swiss-Prot to show how this knowledgebase can support researchers in the elucidation of the mechanisms leading from a molecular defect to a disease phenotype.
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Practitioners of bioinformatics require increasing sophistication from their software tools to take into account the particular characteristics that make their domain complex. For example, researchers vary greatly in experience, from novices who would like expert guidance on the best resources to use, to experts who wish to take greater management control of the tools used in their experiments. Also, the range of available, and conflicting, data formats is growing, and there is a desire to automate the many trivial manual stages of in silico experiments. Agent-oriented software development is one approach to tackling the design of complex applications. In this paper, we argue that agent-oriented development is a particularly well-suited approach to developing bioinformatics tools that take into account the wider domain characteristics. To illustrate this, we design a data curation tool that manages the format of experimental data, extend it to better account for the extra requirements imposed by the domain characteristics, and show how these characteristics lead to a system well suited to an agent-oriented view.
Abstract:
Integration of experiential learning into library and information science (LIS) courses has been a theme in LIS education, but the topic deserves renewed attention given the increasing demand for professionals in the digital library field and the new initiative announced by the Library of Congress (LC) and the Institute of Museum and Library Services (IMLS) for a national residency program in digital curation. The balance between theory and practice in digital library curricula, the challenges of incorporating practical projects into LIS coursework, and the current practice of teaching with hands-on activities represent the primary areas of this panel discussion.