114 resultados para curation


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Jahnke and Asher explore workflows and methodologies at a variety of academic data curation sites, and Keralis delves into the academic milieu of library and information schools that offer instruction in data curation. Their conclusions point to the urgent need for a reliable and increasingly sophisticated professional cohort to support data-intensive research in our colleges, universities, and research centers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background. Over the last years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for Medical Informatics (MI) field, so that locating and accessing them currently remains a hard and time-consuming task. Description. We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources? names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different taxonomies by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the taxonomies. The classification algorithm identifies the categories associated to resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. Conclusions. We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contained 282 resources at the time of writing. We are continuing to expand the number of available resources by taking into account further publications as well as suggestions from users and resource developers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the domain of eScience, investigations are increasingly collaborative. Most scientific and engineering domains benefit from building on top of the outputs of other research: By sharing information to reason over and data to incorporate in the modelling task at hand. This raises the need to provide means for preserving and sharing entire eScience workflows and processes for later reuse. It is required to define which information is to be collected, create means to preserve it and approaches to enable and validate the re-execution of a preserved process. This includes and goes beyond preserving the data used in the experiments, as the process underlying its creation and use is essential. This tutorial thus provides an introduction to the problem domain and discusses solutions for the curation of eScience processes.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Purpose – Linked data is gaining great interest in the cultural heritage domain as a new way for publishing, sharing and consuming data. The paper aims to provide a detailed method and MARiMbA a tool for publishing linked data out of library catalogues in the MARC 21 format, along with their application to the catalogue of the National Library of Spain in the datos.bne.es project. Design/methodology/approach – First, the background of the case study is introduced. Second, the method and process of its application are described. Third, each of the activities and tasks are defined and a discussion of their application to the case study is provided. Findings – The paper shows that the FRBR model can be applied to MARC 21 records following linked data best practices, librarians can successfully participate in the process of linked data generation following a systematic method, and data sources quality can be improved as a result of the process. Originality/value – The paper proposes a detailed method for publishing and linking linked data from MARC 21 records, provides practical examples, and discusses the main issues found in the application to a real case. Also, it proposes the integration of a data curation activity and the participation of librarians in the linked data generation process.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La rápida evolución experimentada en los últimos años por las tecnologías de Internet ha estimulado la proliferación de recursos software en varias disciplinas científicas, especialmente en bioinformática. En la mayoría de los casos, la tendencia actual es publicar dichos recursos como servicios accesibles libremente a través de Internet, utilizando tecnologías y patrones de diseño definidos para la implementación de Arquitecturas Orientadas a Servicios (SOA). La combinación simultánea de múltiples servicios dentro de un mismo flujo de trabajo abre la posibilidad de crear aplicaciones potencialmente más útiles y complejas. La integración de dichos servicios plantea grandes desafíos, tanto desde un punto de vista teórico como práctico, como por ejemplo, la localización y acceso a los recursos disponibles o la coordinación entre ellos. En esta tesis doctoral se aborda el problema de la identificación, localización, clasificación y acceso a los recursos informáticos disponibles en Internet. Con este fin, se ha definido un modelo genérico para la construcción de índices de recursos software con información extraída automáticamente de artículos de la literatura científica especializada en un área. Este modelo consta de seis fases que abarcan desde la selección de las fuentes de datos hasta el acceso a los índices creados, pasando por la identificación, extracción, clasificación y “curación” de la información relativa a los recursos. Para verificar la viabilidad, idoneidad y eficiencia del modelo propuesto, éste ha sido evaluado en dos dominios científicos diferentes—la BioInformática y la Informática Médica—dando lugar a dos índices de recursos denominados BioInformatics Resource Inventory (BIRI) y electronic-Medical Informatics Repository of Resources(e-MIR2) respectivamente. Los resultados obtenidos de estas aplicaciones son presentados a lo largo de la presente tesis doctoral y han dado lugar a varias publicaciones científicas en diferentes revistas JCR y congresos internacionales. El impacto potencial y la utilidad de esta tesis doctoral podrían resultar muy importantes teniendo en cuenta que, gracias a la generalidad del modelo propuesto, éste podría ser aplicado en cualquier disciplina científica. Algunas de las líneas de investigación futuras más relevantes derivadas de este trabajo son esbozadas al final en el último capítulo de este libro. ABSTRACT The rapid evolution experimented in the last years by the Internet technologies has stimulated the proliferation of heterogeneous software resources in most scientific disciplines, especially in the bioinformatics area. In most cases, current trends aim to publish those resources as services freely available over the Internet, using technologies and design patterns defined for the implementation of Service-Oriented Architectures (SOA). Simultaneous combination of various services into the same workflow opens the opportunity of creating more complex and useful applications. Integration of services raises great challenges, both from a theoretical to a practical point of view such as, for instance, the location and access to the available resources or the orchestration among them. This PhD thesis deals with the problem of identification, location, classification and access to informatics resources available over the Internet. On this regard, a general model has been defined for building indexes of software resources, with information extracted automatically from scientific articles from the literature specialized in the area. Such model consists of six phases ranging from the selection of data sources to the access to the indexes created, covering the identification, extraction, classification and curation of the information related to the software resources. To verify the viability, feasibility and efficiency of the proposed model, it has been evaluated in two different scientific domains—Bioinformatics and Medical Informatics—producing two resources indexes named BioInformatics Resources Inventory (BIRI) and electronic-Medical Informatics Repository of Resources (e-MIR2) respectively. The results and evaluation of those systems are presented along this PhD thesis, and they have produced different scientific publications in several JCR journals and international conferences. The potential impact and utility of this PhD thesis could be of great relevance considering that, thanks to the generality of the proposed model, it could be successfully extended to any scientific discipline. Some of the most relevant future research lines derived from this work are outlined at the end of this book.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El presente trabajo se ha centrado en la investigación de soluciones para automatizar la tarea del enriquecimiento de fuentes de datos sobre redes de sensores con descripciones lingüísticas, con el fin de facilitar la posterior generación de textos en lenguaje natural. El uso de descripciones en lenguaje natural facilita el acceso a los datos a una mayor diversidad de usuarios y, como consecuencia, permite aprovechar mejor las inversiones en redes de sensores. En el trabajo se ha considerado el uso de bases de datos abiertas para abordar la necesidad de disponer de un gran volumen y diversidad de conocimiento geográfico. Se ha analizado también el enriquecimiento de datos dentro de enfoques metodológicos de curación de datos y métodos de generación de lenguaje natural. Como resultado del trabajo, se ha planteado un método general basado en una estrategia de generación y prueba que incluye una forma de representación y uso del conocimiento heurístico con varias etapas de razonamiento para la construcción de descripciones lingüísticas de enriquecimiento de datos. En la evaluación de la propuesta general se han manejado tres escenarios, dos de ellos para generación de referencias geográficas sobre redes de sensores complejas de dimensión real y otro para la generación de referencias temporales. Los resultados de la evaluación han mostrado la validez práctica de la propuesta general exhibiendo mejoras de rendimiento respecto a otros enfoques. Además, el análisis de los resultados ha permitido identificar y cuantificar el impacto previsible de diversas líneas de mejora en bases de datos abiertas. ABSTRACT This work has focused on the search for solutions to automate the task of enrichment sensor-network-based data sources with textual descriptions, so as to facilitate the generation of natural language texts. Using natural language descriptions facilitates data access to a wider range of users and, therefore, allows better leveraging investments in sensor networks. In this work we have considered the use of open databases to address the need for a large volume and diversity of geographical knowledge. We have also analyzed data enrichment in methodological approaches and data curation methods of natural language generation. As a result, it has raised a general method based on a strategy of generating and testing that includes a representation using heuristic knowledge with several stages of reasoning for the construction of linguistic descriptions of data enrichment. In assessing the overall proposal three scenarios have been addressed, two of them in the environmental domain with complex sensor networks and another real dimension in the time domain. The evaluation results have shown the validity and practicality of our proposal, showing performance improvements over other approaches. Furthermore, the analysis of the results has allowed identifying and quantifying the expected impact of various lines of improvement in open databases.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The thesis investigates if with the free news production, people who post information on collaborative content sites, known as interacting, tend to reproduce information that was scheduled for Tv news. This study is a comparison of the collaborative content vehicles Vc reporter, Vc no G1 and Eu reporter with TV news SBT Brasil, Jornal Nacional, Jornal da Record and Jornal da Band. We sought to determine whether those newscasts guide the collaborative platforms. The hypothesis assumes that Brazilian TV news have been building over time a credible relationship with the viewer, so it is possible to think that the interacting use the same criteria for selecting the broadcasts and reproduce similar information in collaborative content sites. The method used was content analysis, based on the study of Laurence Bardin and the type of research used was quantitative. This research concluded that, within a small portion of the universe surveyed, there are schedules of television news across the collaborative content.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

WormBase (http://www.wormbase.org) is a web-based resource for the Caenorhabditis elegans genome and its biology. It builds upon the existing ACeDB database of the C.elegans genome by providing data curation services, a significantly expanded range of subject areas and a user-friendly front end.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The ARKdb genome databases provide comprehensive public repositories for genome mapping data from farmed species and other animals (http://www.thearkdb.org) providing a resource similar in function to that offered by GDB or MGD for human or mouse genome mapping data, respectively. Because we have attempted to build a generic mapping database, the system has wide utility, particularly for those species for which development of a specific resource would be prohibitive. The ARKdb genome database model has been implemented for 10 species to date. These are pig, chicken, sheep, cattle, horse, deer, tilapia, cat, turkey and salmon. Access to the ARKdb databases is effected via the World Wide Web using the ARKdb browser and Anubis map viewer. The information stored includes details of loci, maps, experimental methods and the source references. Links to other information sources such as PubMed and EMBL/GenBank are provided. Responsibility for data entry and curation is shared amongst scientists active in genome research in the species of interest. Mirror sites in the United States are maintained in addition to the central genome server at Roslin.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

GOBASE (http://megasun.bch.umontreal.ca/gobase/) is a network-accessible biological database, which is unique in bringing together diverse biological data on organelles with taxonomically broad coverage, and in furnishing data that have been exhaustively verified and completed by experts. So far, we have focused on mitochondrial data: GOBASE contains all published nucleotide and protein sequences encoded by mitochondrial genomes, selected RNA secondary structures of mitochondria-encoded molecules, genetic maps of completely sequenced genomes, taxonomic information for all species whose sequences are present in the database and organismal descriptions of key protistan eukaryotes. All of these data have been integrated and organized in a formal database structure to allow sophisticated biological queries using terms that are inherent in biological concepts. Most importantly, data have been validated, completed, corrected and standardized, a prerequisite of meaningful analysis. In addition, where critical data are lacking, such as genetic maps and RNA secondary structures, they are generated by the GOBASE team and collaborators, and added to the database. The database is implemented in a relational database management system, but features an object-oriented view of the biological data through a Web/Genera-generated World Wide Web interface. Finally, we have developed software for database curation (i.e. data updates, validation and correction), which will be described in some detail in this paper.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The amount of genomic and proteomic data that is entered each day into databases and the experimental literature is outstripping the ability of experimental scientists to keep pace. While generic databases derived from automated curation efforts are useful, most biological scientists tend to focus on a class or family of molecules and their biological impact. Consequently, there is a need for molecular class-specific or other specialized databases. Such databases collect and organize data around a single topic or class of molecules. If curated well, such systems are extremely useful as they allow experimental scientists to obtain a large portion of the available data most relevant to their needs from a single source. We are involved in the development of two such databases with substantial pharmacological relevance. These are the GPCRDB and NucleaRDB information systems, which collect and disseminate data related to G protein-coupled receptors and intra-nuclear hormone receptors, respectively. The GPCRDB was a pilot project aimed at building a generic molecular class-specific database capable of dealing with highly heterogeneous data. A first version of the GPCRDB project has been completed and it is routinely used by thousands of scientists. The NucleaRDB was started recently as an application of the concept for the generalization of this technology. The GPCRDB is available via the WWW at http://www.gpcr.org/7tm/ and the NucleaRDB at http://www.receptors.org/NR/.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Molecular Biology Database Collection is an online resource listing key databases of value to the biological community. This Collection is intended to bring fellow scientists’ attention to high-quality databases that are available throughout the world, rather than just be a lengthy listing of all available databases. As such, this up-to-date listing is intended to serve as the initial point from which to find specialized databases that may be of use in biological research. The databases included in this Collection provide new value to the underlying data by virtue of curation, new data connections or other innovative approaches. Short, searchable summaries of each of the databases included in the Collection are available through the Nucleic Acids Research Web site, at http://www.nar.oupjournals.org.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

La ricerca ha scelto di affrontare una serie di problemi connessi alla valorizzazione e alla conservazione del materiale pubblicitario rispetto a un caso studio selezionato, l’archivio dell’Art Directors Club Italiano conservato presso il Centro Studi e Archivio della Comunicazione dell’Università di Parma. Questo archivio è costituito principalmente da materiali della comunicazione pubblicitaria - suddivisi in categorie corrispondenti a media e tecniche - iscritti dal 1998 al 2003 agli ADCI Awards, il premio italiano di riferimento dedicato alla pubblicità e organizzato dall’Art Directors Club Italiano - ADCI a partire dall’anno della sua fondazione, il 1985. La sua storia è quindi connessa strettamente con quella dell’associazione, che rappresenta e riunisce professionisti della pubblicità che condividono obiettivi comuni, e in particolare il riconoscimento e la valorizzazione della creatività come elemento fondante della comunicazione d’impresa e istituzionale. Lo CSAC in parallelo, il contesto archivistico all’interno del quale questi fondi sono venuti a trovarsi in seguito alla donazione da parte dell’ADCI nel 2002-2003, è un centro di ricerca dell’Università di Parma dedicato alla conservazione e allo studio di archivi provenienti da diversi ambiti culturali. A partire da una mappatura dell’archivio e dalla ricostruzione di contesti e dibattiti fondamentali per studiare e organizzare i fondi, questa tesi si propone di individuare, in particolare attraverso gli strumenti digitali, possibili modalità di analisi, esposizione e accesso ai materiali, che possano aprire una rete di connessioni verso altri ambiti di ricerca e nuove prospettive in funzione delle storie della pubblicità.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A wealth of open educational resources (OER) focused on green topics is currently available through a variety of sources, including learning portals, digital repositories and web sites. However, in most cases these resources are not easily accessible and retrievable, while additional issues further complicate this issue. This paper presents an overview of a number of portals hosting OER, as well as a number of “green” thematic portals that provide access to green OER. It also discusses the case of a new collection that aims to support and populate existing green collections and learning portals respectively, providing information on aspects such as quality assurance/collection and curation policies, workflow and tools for both the content and metadata records that apply to the collection. Two case studies of the integration of this new collection to existing learning portals are also presented.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Performing Pedagogies was a week-long performance and exhibition series I organized that took place in Kingston, Ontario between March 15th - March 20th 2016. The motivation for this project came from a desire to explore performative modes of experiencing critical, embodied knowledge. The series featured five performances, a long distance collaboration between thirty-one Queen’s undergraduate students and a Vancouver artist-run free school (The School for Eventual Vacancy), a subsequent exhibition, a panel discussion, and a radical performance pedagogy workshop led by co-artistic director of the international performance art troupe, La Pocha Nostra. Artists featured included Golboo Amani, Basil AlZeri, Caitlin Chaisson, Justin Langlois, Saul Garcia-Lopez, Francisco-Fernando Granados, and Andrew Rabyniuk. By curating examples of performance art that variously incorporated embodied pedagogical interventions, I examined the processes of performance as pedagogy. Performing Pedagogies explored interventions into contemporary contours of neoliberal education paradigms through embodied encounters—fostering conversations about the meanings and limitations of knowledge dissemination and education today and posing questions about possibilities for radical pedagogies, embodied knowledge, and counter curricula.