928 resultados para Research Data Management
Design, recruitment, logistics, and data management of the GEHA (Genetics of Healthy Ageing) project
Resumo:
In 2004, the integrated European project GEHA (Genetics of Healthy Ageing) was initiated with the aim of identifying genes involved in healthy ageing and longevity. The first step in the project was the recruitment of more than 2500 pairs of siblings aged 90 years or more together with one younger control person from 15 areas in 11 European countries through a coordinated and standardised effort. A biological sample, preferably a blood sample, was collected from each participant, and basic physical and cognitive measures were obtained together with information about health, life style, and family composition. From 2004 to 2008 a total of 2535 families comprising 5319 nonagenarian siblings were identified and included in the project. In addition, 2548 younger control persons aged 50-75 years were recruited. A total of 2249 complete trios with blood samples from at least two old siblings and the younger control were formed and are available for genetic analyses (e.g. linkage studies and genome-wide association studies). Mortality follow-up improves the possibility of identifying families with the most extreme longevity phenotypes. With a mean follow-up time of 3.7 years the number of families with all participating siblings aged 95 years or more has increased by a factor of 5 to 750 families compared to when interviews were conducted. Thus, the GEHA project represents a unique source in the search for genes related to healthy ageing and longevity.
Resumo:
Master data management (MDM) integrates data from multiple
structured data sources and builds a consolidated 360-
degree view of business entities such as customers and products.
Today’s MDM systems are not prepared to integrate
information from unstructured data sources, such as news
reports, emails, call-center transcripts, and chat logs. However,
those unstructured data sources may contain valuable
information about the same entities known to MDM from
the structured data sources. Integrating information from
unstructured data into MDM is challenging as textual references
to existing MDM entities are often incomplete and
imprecise and the additional entity information extracted
from text should not impact the trustworthiness of MDM
data.
In this paper, we present an architecture for making MDM
text-aware and showcase its implementation as IBM InfoSphere
MDM Extension for Unstructured Text Correlation,
an add-on to IBM InfoSphere Master Data Management
Standard Edition. We highlight how MDM benefits from
additional evidence found in documents when doing entity
resolution and relationship discovery. We experimentally
demonstrate the feasibility of integrating information from
unstructured data sources into MDM.
Resumo:
The telemetry data processing operation intended for a given mission are pre-defined by an onboard telemetry configuration, mission trajectory and overall telemetry methodology have stabilized lately for ISRO vehicles. The given problem on telemetry data processing is reduced through hierarchical problem reduction whereby the sequencing of operations evolves as the control task and operations on data as the function task. The function task Input, Output and execution criteria are captured into tables which are examined by the control task and then schedules when the function task when the criteria is being met.
Resumo:
Los programas de inmersión lingüística han constituido y constituyen dentro del Sistema Educativo catalán la principal forma para que el alumnado de lengua familiar no-catalana aprenda una nueva lengua, el catalán, sin que, en su proceso de aprendizaje, vea mermado ni el desarrollo de su propia lengua ni su rendimiento académico. El éxito de la inmersión lingüística en las décadas anteriores ha sido frecuentemente utilizado como uno de los argumentos orientativos para justificar la política lingüística que se sigue en la escolarización de la infancia extranjera. Sin embargo, los resultados obtenidos por investigaciones recientes parece que no avalan empíricamente dicho argumento. Este artículo analiza dichos resultados y expone, a partir del Plan para la Lengua y Cohesión Social puesto en marcha por el Departamento de Educación de la Generalitat de Cataluña, cuáles son los retos que se presentan a su Sistema Educativo dentro del nuevo marco que supone el aumento de la diversidad cultural y lingüística en la actual sociedad catalana
Resumo:
Speaker: Dr Kieron O'Hara Organiser: Time: 04/02/2015 11:00-11:45 Location: B32/3077 Abstract In order to reap the potential societal benefits of big and broad data, it is essential to share and link personal data. However, privacy and data protection considerations mean that, to be shared, personal data must be anonymised, so that the data subject cannot be identified from the data. Anonymisation is therefore a vital tool for data sharing, but deanonymisation, or reidentification, is always possible given sufficient auxiliary information (and as the amount of data grows, both in terms of creation, and in terms of availability in the public domain, the probability of finding such auxiliary information grows). This creates issues for the management of anonymisation, which are exacerbated not only by uncertainties about the future, but also by misunderstandings about the process(es) of anonymisation. This talk discusses these issues in relation to privacy, risk management and security, reports on recent theoretical tools created by the UKAN network of statistics professionals (on which the author is one of the leads), and asks how long anonymisation can remain a useful tool, and what might replace it.
Resumo:
data analysis table
Resumo:
The iRODS system, created by the San Diego Supercomputing Centre, is a rule oriented data management system that allows the user to create sets of rules to define how the data is to be managed. Each rule corresponds to a particular action or operation (such as checksumming a file) and the system is flexible enough to allow the user to create new rules for new types of operations. The iRODS system can interface to any storage system (provided an iRODS driver is built for that system) and relies on its’ metadata catalogue to provide a virtual file-system that can handle files of any size and type. However, some storage systems (such as tape systems) do not handle small files efficiently and prefer small files to be packaged up (or “bundled”) into larger units. We have developed a system that can bundle small data files of any type into larger units - mounted collections. The system can create collection families and contains its’ own extensible metadata, including metadata on which family the collection belongs to. The mounted collection system can work standalone and is being incorporated into the iRODS system to enhance the systems flexibility to handle small files. In this paper we describe the motivation for creating a mounted collection system, its’ architecture and how it has been incorporated into the iRODS system. We describe different technologies used to create the mounted collection system and provide some performance numbers.
Resumo:
Pervasive computing is a continually, and rapidly, growing field, although still remains in relative infancy. The possible applications for the technology are numerous, and stand to fundamentally change the way users interact with technology. However, alongside these are equally numerous potential undesirable effects and risks. The lack of empirical naturalistic data in the real world makes studying the true impacts of this technology difficult. This paper describes how two independent research projects shared such valuable empirical data on the relationship between pervasive technologies and users. Each project had different aims and adopted different methods, but successfully used the same data and arrived at the same conclusions. This paper demonstrates the benefit of sharing research data in multidisciplinary pervasive computing research where real world implementations are not widely available.
Resumo:
In geophysics and seismology, raw data need to be processed to generate useful information that can be turned into knowledge by researchers. The number of sensors that are acquiring raw data is increasing rapidly. Without good data management systems, more time can be spent in querying and preparing datasets for analyses than in acquiring raw data. Also, a lot of good quality data acquired at great effort can be lost forever if they are not correctly stored. Local and international cooperation will probably be reduced, and a lot of data will never become scientific knowledge. For this reason, the Seismological Laboratory of the Institute of Astronomy, Geophysics and Atmospheric Sciences at the University of São Paulo (IAG-USP) has concentrated fully on its data management system. This report describes the efforts of the IAG-USP to set up a seismology data management system to facilitate local and international cooperation. © 2011 by the Istituto Nazionale di Geofisica e Vulcanologia. All rights reserved.