998 results for data curation


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The International Molecular Exchange (IMEx) consortium is an international collaboration between major public interaction data providers to share literature-curation efforts and make a nonredundant set of protein interactions available in a single search interface on a common website (http://www.imexconsortium.org/). Common curation rules have been developed, and a central registry is used to manage the selection of articles to enter into the dataset. We discuss the advantages of such a service to the user, our quality-control measures and our data-distribution practices.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The practitioners of bioinformatics require increasing sophistication from their software tools to take into account the particular characteristics that make their domain complex. For example, researchers vary greatly in experience, from novices who would like guidance from experts on the best resources to use, to experts who wish to take greater management control of the tools used in their experiments. Also, the range of available, and conflicting, data formats is growing, and there is a desire to automate the many trivial manual stages of in-silico experiments. Agent-oriented software development is one approach to tackling the design of complex applications. In this paper, we argue that agent-oriented development is a particularly well-suited approach to developing bioinformatics tools that take into account the wider domain characteristics. To illustrate this, we design a data curation tool, which manages the format of experimental data, extend it to better account for the extra requirements placed by the domain characteristics, and show how the characteristics lead to a system well suited to an agent-oriented view.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Görzig, H., Engel, F., Brocks, H., Vogel, T. & Hemmje, M. (2015, August). Towards Data Management Planning Support for Research Data. Paper presented at the ASE International Conference on Data Science, Stanford, United States of America.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Jahnke and Asher explore workflows and methodologies at a variety of academic data curation sites, and Keralis delves into the academic milieu of library and information schools that offer instruction in data curation. Their conclusions point to the urgent need for a reliable and increasingly sophisticated professional cohort to support data-intensive research in our colleges, universities, and research centers.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Purpose – Linked data is gaining great interest in the cultural heritage domain as a new way of publishing, sharing and consuming data. The paper aims to provide a detailed method and a tool, MARiMbA, for publishing linked data out of library catalogues in the MARC 21 format, along with their application to the catalogue of the National Library of Spain in the datos.bne.es project. Design/methodology/approach – First, the background of the case study is introduced. Second, the method and the process of its application are described. Third, each of the activities and tasks is defined and a discussion of its application to the case study is provided. Findings – The paper shows that the FRBR model can be applied to MARC 21 records following linked data best practices, that librarians can successfully participate in the process of linked data generation following a systematic method, and that the quality of data sources can be improved as a result of the process. Originality/value – The paper proposes a detailed method for publishing and linking linked data from MARC 21 records, provides practical examples, and discusses the main issues found in the application to a real case. It also proposes the integration of a data curation activity and the participation of librarians in the linked data generation process.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

"Contemporary society is in the midst of the boundless generation and collection of data, data that is produced from almost any measurable act. Be it weather or transport data sets published by government agencies, or the individual and interpersonal data generated by our digital interactions; a server somewhere is collating. With the rise of this digital data phenomenon comes questions of comprehension, purpose, ownership and translation. Without mediation digital data is an immense abstract list of text and numbers and in this abstracted form data sets become detached from the circumstances of their creation. Artists and digital creatives are building works from these constantly evolving data sets to develop a discourse that investigates, appropriates, reveals and reflects upon the society and environment that generates this medium. Datascape presents a range of works that use data as building blocks to facilitate connections and understanding around a range of personal, social and worldly issues. The exhibition is concerned with creating an opportunity for experiential discovery through engaging with work from some of the world’s prominent creatives in this field of practice. Utilising three thematic lenses: Generative Currents, the Anti-Sublime and the Human Context, the works offer a variety of pathways to traverse the Datascape. Lubi Thomas and Rachael Parsons, QUT Creative Industries Precinct"

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This work investigates solutions for automating the enrichment of sensor-network data sources with linguistic descriptions, in order to facilitate the subsequent generation of natural language texts. Natural language descriptions make the data accessible to a wider range of users and, as a consequence, allow better use of investments in sensor networks. The work considers the use of open databases to address the need for a large volume and diversity of geographic knowledge. Data enrichment is also analyzed within methodological approaches to data curation and natural language generation methods. As a result, a general method is proposed, based on a generate-and-test strategy, that includes a way of representing and using heuristic knowledge with several stages of reasoning to construct linguistic descriptions for data enrichment. The general proposal was evaluated in three scenarios: two for generating geographic references over complex, real-scale sensor networks and one for generating temporal references. The evaluation results show the practical validity of the general proposal, with performance improvements over other approaches. In addition, the analysis of the results made it possible to identify and quantify the expected impact of various lines of improvement in open databases.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

WormBase (http://www.wormbase.org) is a web-based resource for the Caenorhabditis elegans genome and its biology. It builds upon the existing ACeDB database of the C. elegans genome by providing data curation services, a significantly expanded range of subject areas and a user-friendly front end.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We present the Hungarian National Scientific Bibliography project: the MTMT. We argue that presently available commercial systems cannot be used as a comprehensive national bibliometric tool. The new database was created from existing databases of the Hungarian Academy of Sciences, but is expected to be re-engineered in the future. The data curation model includes harvesting, the work of expert bibliographers and author feedback. MTMT will work together with the other services in the web of scientific information, using standard protocols and formats, and act as a hub. It will present the scientific output of Hungary together with the repositories containing the full text, wherever available. The database will be open, but not freely harvestable, and only for non-commercial use.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We present some recent trends in the field of digital cultural heritage management and applications including digital cultural data curation, interoperability, open linked data publishing, crowd sourcing, visualization, platforms for digital cultural heritage, and applications. We present some examples from research and development projects of MUSIC/TUC in those areas.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

We present the results of a study on the university training that information professionals receive in project management, based on an analysis of Information and Documentation degree programmes at the international level. To this end, sources and directories on international education in Library and Information Science were used, and a database was created with 106 records of project management courses included in Information and Documentation degree programmes. As a result of this process, the parameters of analysis in the research were geographic location, higher education institutions, academic degrees, and the project management courses themselves. Among the most notable conclusions are the compulsory nature of the courses, their face-to-face delivery, and the case of the public universities of the United States, Germany and France.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

QUT Library Research Support has simplified and streamlined the process of research data management planning, storage, discovery and reuse through collaboration, the use of integrated and tailored online tools, and a simplification of the metadata schema. This poster presents the integrated data management services at QUT, including QUT’s Data Management Planning Tool, Research Data Finder, Spatial Data Finder and Software Finder, and information on the simplified Registry Interchange Format – Collections and Services (RIF-CS) Schema. The QUT Data Management Planning (DMP) Tool was built using the Digital Curation Centre’s DMP Online Tool and modified to QUT’s needs and policies. The tool allows researchers and Higher Degree Research students to plan how to handle research data throughout the active phase of their research. The plan is promoted as a ‘live’ document and researchers are encouraged to update it as required. The information entered into the plan can be made private or shared with supervisors, project members and external examiners. A plan is mandatory when requesting storage space on the QUT Research Data Storage Service. QUT’s Research Data Finder is integrated with QUT’s Academic Profiles and the Data Management Planning Tool to create a seamless data management process. This process aims to encourage the creation of high-quality, rich records which facilitate discovery and reuse of quality data. The Registry Interchange Format – Collections and Services (RIF-CS) Schema that is used in the QUT Research Data Finder was simplified to “RIF-CS lite” to reflect mandatory and optional metadata requirements. RIF-CS lite removed schema fields that were underused or extra to the needs of the users and system. This has reduced the number of metadata fields required from users and made integration of systems a far simpler process, in which field content is easily shared across services, making the process of collecting metadata as transparent as possible.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study has investigated the medium- to long-term costs to Higher Education Institutions (HEIs) of the preservation of research data and developed guidance to HEFCE and institutions on these issues. It has provided an essential methodological foundation on research data costs for the forthcoming HEFCE-sponsored feasibility study for a UK Research Data Service. It will also assist HEIs and funding bodies wishing to establish strategies and TRAC costings for long-term data management and archiving. The rising tide of digital research data raises issues relating to access, curation and preservation for HEIs, and within the UK a growing number of research funders are now implementing policies requiring researchers to submit data management, preservation or data sharing plans with their funding applications.