3 results for Peer-to-peer data-sharing

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo


Relevance:

100.00%

Publisher:

Abstract:

This paper provides a brief but comprehensive guide to creating, preparing and dissecting a 'virtual' fossil, using a worked example to demonstrate some standard data-processing techniques. Computed tomography (CT) is a 3D imaging modality for producing 'virtual' models of an object on a computer. In the last decade, CT technology has greatly improved, allowing bigger and denser objects to be scanned increasingly rapidly. The technique has now reached a stage where systems can facilitate large-scale, non-destructive comparative studies of extinct organisms and their living relatives. Consequently, the main limiting factor in CT-based analyses is no longer scanning, but the hurdles of data processing. The latter comprises the techniques required to convert a 3D CT volume (a stack of digital slices) into a virtual image of the fossil that can be prepared (separated) from the matrix and 'dissected' into its anatomical parts. This technique can be applied to specimens, or parts of specimens, embedded in the rock matrix that have until now been impossible to visualise. This paper presents a suggested workflow explaining the steps required, using as an example a fossil tooth of Sphenacanthus hybodoides (Egerton), a shark from the Late Carboniferous of England. The original NHMUK copyrighted CT slice stack can be downloaded for practice of the described techniques, which include segmentation, rendering, movie animation, stereo-anaglyphy, data storage and dissemination. Fragile, rare specimens and type materials in university and museum collections can therefore be virtually processed for a variety of purposes, including virtual loans, website illustrations, publications and digital collections. Micro-CT and other 3D imaging techniques are increasingly utilized to facilitate data sharing among scientists and in education and outreach projects. Hence there is the potential to usher in a new era of global scientific collaboration and public communication using specimens in museum collections.
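The segmentation step mentioned above can be illustrated with a minimal, hypothetical sketch: threshold-based segmentation separates dense voxels (the fossil) from the surrounding low-density matrix. This assumes NumPy, and a synthetic volume stands in for a real CT slice stack such as the NHMUK Sphenacanthus data; the function names are illustrative, not from the paper.

```python
import numpy as np

def make_synthetic_volume(shape=(32, 32, 32)):
    """Synthetic CT volume: a dense spherical 'tooth' inside a low-density 'matrix'."""
    z, y, x = np.indices(shape)
    center = np.array(shape) / 2
    r = np.sqrt((z - center[0]) ** 2 + (y - center[1]) ** 2 + (x - center[2]) ** 2)
    # Voxels inside the sphere get a high CT density value, the matrix a low one.
    return np.where(r < 8, 200.0, 50.0)

def segment(volume, threshold):
    """Binary segmentation: keep only voxels denser than the threshold."""
    return volume > threshold

vol = make_synthetic_volume()
mask = segment(vol, threshold=125.0)
print(int(mask.sum()))  # number of voxels assigned to the 'fossil'
```

In practice the threshold would be picked from the CT histogram, and the resulting binary mask would feed into surface rendering and virtual dissection of the anatomical parts.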

Relevance:

100.00%

Publisher:

Abstract:

Statistical methods have been widely employed to assess the capabilities of credit scoring classification models in order to reduce the risk of wrong decisions when granting credit facilities to clients. The predictive quality of a classification model can be evaluated based on measures such as sensitivity, specificity, predictive values, accuracy, correlation coefficients and information-theoretical measures, such as relative entropy and mutual information. In this paper we analyze the performance of a naive logistic regression model (Hosmer & Lemeshow, 1989) and a logistic regression with state-dependent sample selection model (Cramer, 2004) applied to simulated data. Also, as a case study, the methodology is illustrated on a data set extracted from a Brazilian bank portfolio. Our simulation results revealed that there is no statistically significant difference in terms of predictive capacity between the naive logistic regression models and the logistic regression with state-dependent sample selection models. However, there is a strong difference between the distributions of the estimated default probabilities from these two statistical modeling techniques, with the naive logistic regression models always underestimating such probabilities, particularly in the presence of balanced samples. (C) 2012 Elsevier Ltd. All rights reserved.
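As a rough sketch of the evaluation described above, the following fits a naive logistic regression to simulated credit data by gradient descent and computes the sensitivity, specificity and accuracy measures mentioned in the abstract. The simulation setup (one explanatory variable, hand-picked true coefficients) and all function names are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=2000):
    """One explanatory variable; true default probability rises with x."""
    x = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))   # assumed true model
    y = (rng.random(n) < p).astype(float)          # 1 = default, 0 = no default
    return x, y

def fit_logistic(x, y, lr=0.1, steps=2000):
    """Minimal logistic regression (intercept + slope) via gradient descent."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        b0 -= lr * np.mean(p - y)
        b1 -= lr * np.mean((p - y) * x)
    return b0, b1

def classification_measures(y, p, cutoff=0.5):
    """Sensitivity, specificity and accuracy at a probability cutoff."""
    pred = p >= cutoff
    tp = np.sum(pred & (y == 1)); tn = np.sum(~pred & (y == 0))
    fp = np.sum(pred & (y == 0)); fn = np.sum(~pred & (y == 1))
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / len(y)}

x, y = simulate()
b0, b1 = fit_logistic(x, y)
p_hat = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
print(classification_measures(y, p_hat))
```

Comparing the distribution of `p_hat` under different sampling schemes is where the underestimation effect reported for the naive model would show up.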

Relevance:

100.00%

Publisher:

Abstract:

Background: The use of the knowledge produced by the sciences to promote human health is the main goal of translational medicine. To make it feasible, we need computational methods to handle the large amount of information that arises from bench to bedside, and to deal with its heterogeneity. A computational challenge that must be faced is to promote the integration of clinical, socio-demographic and biological data. In this effort, ontologies play an essential role as a powerful artifact for knowledge representation. Chado is a modular, ontology-oriented database model that gained popularity due to its robustness and flexibility as a generic platform to store biological data; however, it lacks support for representing clinical and socio-demographic information.

Results: We have implemented an extension of Chado, the Clinical Module, to allow the representation of this kind of information. Our approach consists of a framework for data integration through the use of a common reference ontology. The design of this framework has four levels: the data level, to store the data; the semantic level, to integrate and standardize the data through the use of ontologies; the application level, to manage clinical databases, ontologies and the data integration process; and the web interface level, to allow interaction between the user and the system. The Clinical Module was built based on the Entity-Attribute-Value (EAV) model. We also proposed a methodology to migrate data from legacy clinical databases to the integrative framework. A Chado instance was initialized using a relational database management system. The Clinical Module was implemented and the framework was loaded using data from a factual clinical research database. Clinical and demographic data, as well as biomaterial data, were obtained from patients with tumors of the head and neck. We implemented the IPTrans tool, a complete environment for data migration which comprises: the construction of a model to describe the legacy clinical data, based on an ontology; an Extraction, Transformation and Load (ETL) process to extract the data from the source clinical database and load it into the Clinical Module of Chado; and the development of a web tool and a Bridge Layer to adapt the web tool to Chado, as well as to other applications.

Conclusions: Open-source computational solutions currently available for translational science do not have a model to represent biomolecular information and are not integrated with existing bioinformatics tools. On the other hand, existing genomic data models do not represent clinical patient data. A framework was developed to support translational research by integrating biomolecular information coming from different "omics" technologies with patients' clinical and socio-demographic data. This framework should present some features: flexibility, compression and robustness. The experiments accomplished on a use case demonstrated that the proposed system meets the requirements of flexibility and robustness, leading to the desired integration. The Clinical Module can be accessed at http://dcm.ffclrp.usp.br/caib/pg=iptrans.
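The Entity-Attribute-Value layout that the Clinical Module is based on can be sketched as follows. This is a generic, hypothetical EAV schema in SQLite, not Chado's actual tables: each clinical or socio-demographic fact is stored as a (entity, attribute, value) triple, so new attributes can be added without altering the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Generic EAV schema: entities, a controlled list of attributes, and triples.
cur.executescript("""
CREATE TABLE entity     (entity_id INTEGER PRIMARY KEY, kind TEXT);
CREATE TABLE attribute  (attribute_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE attr_value (entity_id INTEGER REFERENCES entity,
                         attribute_id INTEGER REFERENCES attribute,
                         value TEXT);
""")

# One patient entity with clinical and socio-demographic attributes
# (illustrative values only).
cur.execute("INSERT INTO entity (kind) VALUES ('patient')")
patient_id = cur.lastrowid
for name, value in [("diagnosis", "head and neck tumor"),
                    ("age", "57"),
                    ("city", "Ribeirao Preto")]:
    cur.execute("INSERT INTO attribute (name) VALUES (?)", (name,))
    cur.execute("INSERT INTO attr_value VALUES (?, ?, ?)",
                (patient_id, cur.lastrowid, value))

# Reassemble the patient record by joining the triples back to their names.
rows = cur.execute("""
    SELECT a.name, v.value
    FROM attr_value v JOIN attribute a ON a.attribute_id = v.attribute_id
    WHERE v.entity_id = ?""", (patient_id,)).fetchall()
print(dict(rows))
```

The trade-off is the one the abstract's flexibility requirement points at: EAV makes the schema open-ended, at the cost of reconstructing records through joins, which is why a semantic level with a reference ontology is needed to standardize the attribute names.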