5 results for Data anonymization and sanitization

in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Each plasma physics laboratory has its own proprietary control and data acquisition system, which usually differs from one laboratory to another. Each laboratory therefore has its own way of controlling the experiment and retrieving data from the database. Fusion research relies to a great extent on international collaboration, and these private systems make it difficult to follow the work remotely. The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. MDSplus (Model Driven System plus) was chosen because it is widely used, so scientists from different institutions can use the same system in different experiments on different tokamaks without needing to know how each laboratory handles its data acquisition and analysis. Another important point is that MDSplus has a library system that allows communication between different languages (Java, Fortran, C, C++, Python) and programs such as MATLAB, IDL, and Octave. For the TCABR tokamak (the subject of this paper), interfaces were developed between the system already in use and MDSplus, instead of using MDSplus at all stages from control and data acquisition to data analysis. This was done to preserve a complex system already in operation, which would otherwise take a long time to migrate. This implementation also allows new components to be added that use MDSplus fully at all stages. (c) 2012 Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: The CUPID (Cultural and Psychosocial Influences on Disability) study was established to explore the hypothesis that common musculoskeletal disorders (MSDs) and associated disability are importantly influenced by culturally determined health beliefs and expectations. This paper describes the methods of data collection and various characteristics of the study sample. Methods/Principal Findings: A standardised questionnaire covering musculoskeletal symptoms, disability and potential risk factors was used to collect information from 47 samples of nurses, office workers, and other (mostly manual) workers in 18 countries from six continents. In addition, local investigators provided data on economic aspects of employment for each occupational group. Participation exceeded 80% in 33 of the 47 occupational groups, and after pre-specified exclusions, analysis was based on 12,426 subjects (92 to 1018 per occupational group). As expected, there was high usage of computer keyboards by office workers, while nurses had the highest prevalence of heavy manual lifting in all but one country. There was substantial heterogeneity between occupational groups in economic and psychosocial aspects of work; three- to fivefold variation in awareness of someone outside work with musculoskeletal pain; and more than tenfold variation in the prevalence of adverse health beliefs about back and arm pain, and in awareness of terms such as "repetitive strain injury" (RSI). Conclusions/Significance: The large differences in psychosocial risk factors (including knowledge and beliefs about MSDs) between occupational groups should allow the study hypothesis to be addressed effectively.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Last Glacial Maximum sea surface temperatures simulated by the Paleo-Climate version of the National Center for Atmospheric Research Coupled Climate Model (NCAR-CCSM) are compared with available reconstructions and data-based products in the tropical and South Atlantic region. Model results are compared to data proxies based on the Multiproxy Approach for the Reconstruction of the Glacial Ocean surface product (MARGO). Results show that the model sea surface temperature is not consistent with the proxy data everywhere in the region of interest. Discrepancies are found in the eastern equatorial and in the high-latitude South Atlantic. The model overestimates the cooling in the southern South Atlantic (near 50 degrees S) shown by the proxy data. Near the equator, model and proxies are in better agreement. In the eastern part of the equatorial basin the model underestimates the cooling shown by all proxies. A northward shift in the position of the subtropical convergence zone in the simulation suggests a compression and/or an equatorward shift of the subtropical gyre at the surface, consistent with what is observed in the proxy reconstruction. (C) 2008 Elsevier B.V. All rights reserved.
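
The over- and underestimation language in this abstract can be read as a sign comparison of temperature anomalies. The following is a minimal sketch of that reading, not the authors' method: the function name `compare_cooling`, the anomaly convention (LGM minus modern, in degrees C) and the 0.5 degree tolerance are all assumptions made for illustration, and the numbers in the usage lines are made up rather than taken from the paper or from MARGO.

```python
def compare_cooling(model_anomaly: float, proxy_anomaly: float,
                    tolerance: float = 0.5) -> str:
    """Classify a model anomaly against a proxy anomaly at one location.

    Both anomalies are assumed to be LGM minus modern sea surface
    temperature in degrees C, so a negative value means cooling.
    The tolerance is an arbitrary illustrative threshold. The labels
    only make sense where both anomalies indicate cooling.
    """
    difference = model_anomaly - proxy_anomaly
    if abs(difference) <= tolerance:
        return "agrees"
    # A more negative model anomaly means the model is colder than the proxy.
    if difference < 0:
        return "overestimates cooling"
    return "underestimates cooling"


# Illustrative, made-up anomalies (not values from the paper).
southern_case = compare_cooling(-6.0, -3.0)
eastern_case = compare_cooling(-1.0, -4.0)
```

With these placeholder inputs, the first call corresponds to the situation the abstract describes near 50 degrees S and the second to the eastern equatorial basin.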

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose nonlinear elliptical models for correlated data with heteroscedastic and/or autoregressive structures. Our aim is to extend the models proposed by Russo et al. [22] by considering a more sophisticated scale structure to deal with variations in data dispersion and/or a possible autocorrelation among measurements taken throughout the same experimental unit. Moreover, to avoid the possible influence of outlying observations or to take into account the non-normal symmetric tails of the data, we assume elliptical contours for the joint distribution of random effects and errors, which allows us to attribute different weights to the observations. We propose an iterative algorithm to obtain the maximum-likelihood estimates for the parameters and derive the local influence curvatures for some specific perturbation schemes. The motivation for this work comes from a pharmacokinetic indomethacin data set, which was analysed previously by Bocheng and Xuping [1] under normality.
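
How an elliptical model "attributes different weights to the observations" can be illustrated with the Student-t distribution, one well-known member of the elliptical family. This is a sketch under that assumption only: the abstract does not say which elliptical distributions the authors use, and the function names below are hypothetical. For a Student-t model with `dof` degrees of freedom and observations of dimension `dim`, the standard weight is (dof + dim) / (dof + d²), where d² is the squared Mahalanobis distance of the observation from its fitted mean.

```python
def t_weight(sq_distance: float, dof: float, dim: int) -> float:
    """Weight of one observation under a Student-t model.

    sq_distance: squared Mahalanobis distance from the fitted mean
    dof: degrees of freedom of the t distribution
    dim: dimension of the observation vector
    """
    if dof <= 0 or dim <= 0 or sq_distance < 0:
        raise ValueError("dof and dim must be positive, sq_distance non-negative")
    return (dof + dim) / (dof + sq_distance)


def normal_weight(sq_distance: float) -> float:
    """Under normality every observation gets the same weight."""
    return 1.0


# Weights shrink as an observation moves away from the fitted mean.
weights = [t_weight(d2, dof=4.0, dim=1) for d2 in (0.0, 1.0, 25.0)]
```

The contrast with `normal_weight` is the point: under normality an outlying observation counts as much as any other, while under the t model its influence on the estimates is reduced.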

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results: We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data.
Conclusions: The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.
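
The idea of a connector that carries transformation rules on exchanged data can be sketched as follows. This is a generic illustration, not the authors' design: the `Connector` class, the concept names `GeneIdentifier` and `Log2Expression`, and the source field names are all hypothetical and do not come from the paper or its reference ontology.

```python
import math
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

# A rule pairs a source field with a function applied to its value.
Rule = Tuple[str, Callable[[Any], Any]]


@dataclass
class Connector:
    """Maps records from one tool's format onto shared concept names."""
    rules: Dict[str, Rule] = field(default_factory=dict)

    def add_rule(self, concept: str, source_field: str,
                 transform: Callable[[Any], Any] = lambda value: value) -> None:
        """Register how a shared concept is derived from a source field."""
        self.rules[concept] = (source_field, transform)

    def translate(self, record: Dict[str, Any]) -> Dict[str, Any]:
        """Apply every rule to one source record."""
        translated = {}
        for concept, (source_field, transform) in self.rules.items():
            if source_field not in record:
                raise KeyError(f"source record lacks field {source_field!r}")
            translated[concept] = transform(record[source_field])
        return translated


# Hypothetical rules: normalise identifiers and log-transform intensities.
connector = Connector()
connector.add_rule("GeneIdentifier", "probe_id", str.upper)
connector.add_rule("Log2Expression", "raw_intensity", math.log2)
result = connector.translate({"probe_id": "abc_1", "raw_intensity": 1024})
```

Keeping the rules as data rather than hard-coding them in each tool wrapper is one way to let the same connector serve both simple renaming and nontrivial processing.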