1000 results for Data curation
Abstract:
The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Practitioners of bioinformatics require increasing sophistication from their software tools to take into account the particular characteristics that make their domain complex. For example, researchers vary greatly in experience, from novices who would like expert guidance on the best resources to use, to experts who wish to take greater control over the tools used in their experiments. Also, the range of available, and conflicting, data formats is growing, and there is a desire to automate the many trivial manual stages of in-silico experiments. Agent-oriented software development is one approach to tackling the design of complex applications. In this paper, we argue that agent-oriented development is a particularly well-suited approach to developing bioinformatics tools that take into account the wider domain characteristics. To illustrate this, we design a data curation tool that manages the format of experimental data, extend it to better account for the extra requirements placed by the domain characteristics, and show how these characteristics lead to a system well suited to an agent-oriented view.
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Jahnke and Asher explore workflows and methodologies at a variety of academic data curation sites, and Keralis delves into the academic milieu of library and information schools that offer instruction in data curation. Their conclusions point to the urgent need for a reliable and increasingly sophisticated professional cohort to support data-intensive research in our colleges, universities, and research centers.
Abstract:
Purpose – Linked data is gaining great interest in the cultural heritage domain as a new way of publishing, sharing and consuming data. The paper aims to provide a detailed method, and a tool, MARiMbA, for publishing linked data from library catalogues in the MARC 21 format, along with their application to the catalogue of the National Library of Spain in the datos.bne.es project. Design/methodology/approach – First, the background of the case study is introduced. Second, the method and the process of its application are described. Third, each of the activities and tasks is defined and its application to the case study is discussed. Findings – The paper shows that the FRBR model can be applied to MARC 21 records following linked data best practices, that librarians can successfully participate in linked data generation by following a systematic method, and that the quality of data sources can be improved as a result of the process. Originality/value – The paper proposes a detailed method for publishing and linking linked data from MARC 21 records, provides practical examples, and discusses the main issues found in its application to a real case. It also proposes integrating a data curation activity, and the participation of librarians, into the linked data generation process.
Abstract:
Evolutionary developmental biology has grown historically from the capacity to relate patterns of evolution in anatomy to patterns of evolution of expression of specific genes, whether between very distantly related species, or very closely related species or populations. Scaling up such studies by taking advantage of modern transcriptomics brings promising improvements, allowing us to estimate the overall impact and molecular mechanisms of convergence, constraint or innovation in anatomy and development. But it also presents major challenges, including the computational definitions of anatomical homology and of organ function, the criteria for the comparison of developmental stages, the annotation of transcriptomics data to proper anatomical and developmental terms, and the statistical methods to compare transcriptomic data between species to highlight significant conservation or changes. In this article, we review these challenges, and the ongoing efforts to address them, which are emerging from bioinformatics work on ontologies, evolutionary statistics, and data curation, with a focus on their implementation in the context of the development of our database Bgee (http://bgee.org). J. Exp. Zool. (Mol. Dev. Evol.) 324B: 372-382, 2015. © 2015 Wiley Periodicals, Inc.
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
This work investigates solutions for automating the enrichment of sensor-network data sources with linguistic descriptions, in order to facilitate the subsequent generation of natural language texts. Natural language descriptions make the data accessible to a wider range of users and, as a consequence, allow better use of investments in sensor networks. The work considers the use of open databases to address the need for a large volume and diversity of geographic knowledge. It also analyzes data enrichment within methodological approaches to data curation and within natural language generation methods. As a result, it proposes a general method based on a generate-and-test strategy that includes a way of representing and using heuristic knowledge, with several reasoning stages, to construct linguistic descriptions for data enrichment. The proposal was evaluated in three scenarios: two for generating geographic references over complex, real-scale sensor networks, and one for generating temporal references. The evaluation results show the practical validity of the general proposal, with performance improvements over other approaches. In addition, the analysis of the results made it possible to identify and quantify the foreseeable impact of several lines of improvement in open databases.