965 results for Data Repository


Abstract:

For the sign languages used by deaf communities, linguistic corpora have until recently been unavailable, due to the lack of a writing system and a written culture in these communities, and to the very recent advent of digital video. Recent improvements in video and computer technology have now made larger sign language datasets possible; however, large sign language datasets that are fully machine-readable remain elusive. This is due to two challenges: (1) inconsistencies that arise when signs are annotated by means of spoken/written language, and (2) the fact that many parts of signed interaction are not necessarily composed of lexical signs (the equivalent of words), but instead consist of constructions that are less conventionalised. As sign language corpus building progresses, the potential for some annotation standards is beginning to emerge, but before this project there had been no attempt to standardise these practices across corpora, which is required for comparing data crosslinguistically. The project therefore had the following aims: (1) to develop annotation standards for glosses (the lexical/word level); (2) to test their reliability and validity; (3) to improve the current software tools that support a reliable annotation workflow. Overall, the project aimed not only to set a standard for the field of sign language studies worldwide but also to make significant advances toward two of the world’s largest machine-readable sign language datasets, specifically the BSL Corpus (British Sign Language, http://bslcorpusproject.org) and the Corpus NGT (Sign Language of the Netherlands, http://www.ru.nl/corpusngt).
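
One practical aspect of gloss-level annotation standards is ensuring that each sign is always labelled with the same gloss, typically drawn from a lexical database. As a rough illustration of the kind of consistency check a machine-readable corpus makes possible, the sketch below (in R) flags annotations whose gloss does not appear in such a lexicon; the file names, column names and data layout are hypothetical and are not taken from the BSL Corpus or the Corpus NGT.

    # Hypothetical sketch: flag annotation tokens whose gloss is not listed
    # in the gloss lexicon, so that one sign is always written one way.
    lexicon     <- read.delim("id_gloss_lexicon.tsv", stringsAsFactors = FALSE)   # column: id_gloss
    annotations <- read.delim("corpus_annotations.tsv", stringsAsFactors = FALSE) # columns: file, start_ms, end_ms, gloss

    # Glosses used in the corpus that are absent from the lexicon
    # (likely typos, variants, or less conventionalised forms that
    # belong on a different annotation tier).
    unknown  <- setdiff(unique(annotations$gloss), lexicon$id_gloss)
    problems <- annotations[annotations$gloss %in% unknown, ]

    print(problems)
    cat(sprintf("%d of %d gloss tokens use an unlisted gloss\n",
                nrow(problems), nrow(annotations)))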

Abstract:

On Wednesday 17th June, the UK projects funded under round 3 of the Digging into Data Challenge gathered at Paddington for the mid-term progress meeting. This workshop gave projects the opportunity to present not just their progress against plan but also their highlights, issues and challenges, and to share this information with the funders and the other projects.

Abstract:

This study was undertaken by UKOLN on behalf of the Joint Information Systems Committee (JISC) between April and September 2008. Application profiles are metadata schemata consisting of data elements drawn from one or more namespaces, optimized for a particular local application. They offer a way for particular communities to base the interoperability specifications they create and use for their digital material on established open standards, so that digital materials can be accessed, used and curated effectively both within and beyond the communities in which they were created. The JISC recognized the need for a scoping study to investigate metadata application profile requirements for scientific data in relation to digital repositories, specifically concerning descriptive metadata to support resource discovery and other functions such as preservation. This followed on from the development of the Scholarly Works Application Profile (SWAP), undertaken within the JISC Digital Repositories Programme and led by Andy Powell (Eduserv Foundation) and Julie Allinson (RRT UKOLN) on behalf of the JISC.

Aims and Objectives: 1. To assess whether a single metadata application profile (AP) for research data, or a small number of them, would improve resource discovery or discovery-to-delivery in any useful or significant way. 2. If so, to (a) assess whether the development of such AP(s) is practical and, if so, how much effort it would take, and (b) scope a community uptake strategy that is likely to be successful, identifying the main barriers and key stakeholders. 3. Otherwise, to investigate how best to improve cross-discipline, cross-community discovery-to-delivery for research data, and to make recommendations to the JISC and others as appropriate.

Approach: The Study used a broad conception of what constitutes scientific data, namely data gathered, collated, structured and analysed using a recognizably scientific method, with a bias towards quantitative methods. The approach taken was to map out the landscape of existing data centres, repositories and associated projects, and to survey the discovery-to-delivery metadata they use or have defined, alongside any insights they have gained from working with this metadata. This was followed by a series of unstructured interviews discussing use cases for a Scientific Data Application Profile and how widely a single profile might be applied; on the latter point, matters of granularity, the experimental/measurement contrast, the quantitative/qualitative contrast, the raw/derived data contrast, and the homogeneous/heterogeneous data collection contrast were discussed. The Study report was loosely structured according to the Singapore Framework for Dublin Core Application Profiles and in turn considered: the possible use cases for a Scientific Data Application Profile; existing domain models that could be used or adapted for use within such a profile; and a comparison of existing metadata profiles and standards to identify candidate elements for inclusion in the description set profile for scientific data. The report also considered how the application profile might be implemented, its relationship to other application profiles, the alternatives to constructing a Scientific Data Application Profile, the development effort required, and what could be done to encourage uptake in the community. The conclusions of the Study were validated through a reference group of stakeholders.
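
As a concrete, if simplified, illustration of what an application profile amounts to in practice, the sketch below (in R) represents a single dataset description that combines Dublin Core elements with a hypothetical domain-specific namespace and checks it against a small mandatory-element set. The element names, the "sci:" namespace and the mandatory set are illustrative assumptions, not the profile the Study recommends.

    # Hypothetical sketch of a description conforming to a scientific-data
    # application profile: elements from an established namespace (Dublin
    # Core, "dc:") combined with a domain-specific namespace (invented, "sci:").
    description <- list(
      "dc:title"       = "Sea surface temperature, North Atlantic, 2007",
      "dc:creator"     = "Example Oceanographic Centre",
      "dc:date"        = "2008-03-01",
      "dc:type"        = "Dataset",
      "sci:instrument" = "Moored buoy thermistor array",  # domain-specific element
      "sci:dataLevel"  = "derived"                        # e.g. raw vs derived data
    )

    # Elements the (assumed) profile marks as mandatory for resource discovery.
    mandatory <- c("dc:title", "dc:creator", "dc:date", "dc:type")
    missing   <- setdiff(mandatory, names(description))

    if (length(missing) == 0) {
      cat("Description satisfies the mandatory element set\n")
    } else {
      cat("Missing mandatory elements:", paste(missing, collapse = ", "), "\n")
    }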

Abstract:

A presentation on the journal research data policy registry, given at Repository Fringe 2015.

Abstract:

This study investigated the medium- to long-term costs to Higher Education Institutions (HEIs) of preserving research data and developed guidance for HEFCE and institutions on these issues. It provides an essential methodological foundation on research data costs for the forthcoming HEFCE-sponsored feasibility study for a UK Research Data Service. It will also assist HEIs and funding bodies wishing to establish strategies and TRAC costings for long-term data management and archiving. The rising tide of digital research data raises issues of access, curation and preservation for HEIs, and within the UK a growing number of research funders are now implementing policies that require researchers to submit data management, preservation or data sharing plans with their funding applications.

Abstract:

Data has always been fundamental to many areas of research, but in recent years it has become central to more disciplines and interdisciplinary projects and has grown substantially in scale and complexity. There is increasing awareness of its strategic importance as a resource for addressing modern global challenges, and of the possibilities being unlocked by rapid technological advances and their application in research (NAS 2009). The first Keeping Research Data Safe study, funded by JISC, made a major contribution to the understanding of long-term preservation costs for research data by developing a cost model and identifying cost variables for preserving research data in UK universities (Beagrie et al., 2008). The Keeping Research Data Safe 2 (KRDS2) project has built on this work.
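
The KRDS cost model itself is set out in the project reports; purely as a schematic illustration of how such a model combines cost variables into a long-term figure, the sketch below (in R) sums yearly staff effort and storage costs over a preservation period, with the assumption that the cost per terabyte falls by a fixed proportion each year. All figures and variable names are invented, not KRDS values.

    # Schematic cost-model sketch (invented figures, not KRDS values):
    # yearly cost = staff effort + storage, with the storage unit cost
    # assumed to decline by a fixed proportion each year.
    preservation_cost <- function(years, fte, fte_cost, tb, tb_cost_y1, tb_cost_decline) {
      yearly_storage <- tb * tb_cost_y1 * (1 - tb_cost_decline)^(0:(years - 1))
      yearly_staff   <- rep(fte * fte_cost, years)
      data.frame(year    = seq_len(years),
                 staff   = yearly_staff,
                 storage = yearly_storage,
                 total   = yearly_staff + yearly_storage)
    }

    costs <- preservation_cost(years = 10, fte = 0.5, fte_cost = 40000,
                               tb = 20, tb_cost_y1 = 300, tb_cost_decline = 0.2)
    print(costs)
    cat("Ten-year total:", round(sum(costs$total)), "\n")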

Abstract:

A DNA microarray, or DNA chip, is a technology that allows the expression levels of many genes to be measured in a single experiment. Because numerical expression values can be obtained easily, many statistical data-analysis techniques can be applied. In this project, microarray data is obtained from the Gene Expression Omnibus, the repository of the National Center for Biotechnology Information (NCBI). The noise is then removed and the data normalized; hypothesis tests are used to find the most relevant genes that may be involved in a disease, and machine learning methods such as KNN, Random Forest and k-means are applied. The analysis uses Bioconductor, a set of R packages for the analysis of biological data, and a case study is conducted on Alzheimer's disease. The complete code can be found at https://github.com/alberto-poncelas/bioc-alzheimer
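
A compressed sketch of that pipeline is shown below, using the GEOquery and limma packages from Bioconductor; the GEO accession, the way the sample groups are read from the metadata, and the cut-offs are placeholders rather than the project's actual choices (the real code is in the linked repository), and limma's moderated t-tests stand in here for the per-gene hypothesis tests.

    # Sketch of the pipeline: download a GEO series, normalise, test each
    # gene for differential expression, then cluster the samples.
    # Accession and group labels are placeholders; see the repository above
    # for the project's actual code.
    library(GEOquery)   # Bioconductor: access to the Gene Expression Omnibus
    library(limma)      # Bioconductor: linear models / moderated t-tests
    library(Biobase)

    gse  <- getGEO("GSE00000", GSEMatrix = TRUE)[[1]]         # placeholder accession
    expr <- exprs(gse)                                        # log-transform first if on a linear scale
    expr <- normalizeBetweenArrays(expr, method = "quantile")

    # Per-gene hypothesis tests: control vs disease samples
    # (assumes this phenoData column encodes the groups).
    group  <- factor(pData(gse)$characteristics_ch1)
    design <- model.matrix(~ group)
    fit    <- eBayes(lmFit(expr, design))
    top    <- topTable(fit, coef = 2, number = 50)            # most relevant genes
    print(head(top))

    # Unsupervised structure among the samples: k-means on the top genes.
    km <- kmeans(t(expr[rownames(top), ]), centers = 2)
    print(table(km$cluster, group))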