Biblioteca Digital

22 resultados para Data anonymization and sanitization

em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)

Crystallization, data collection and data processing of maltose-binding protein (MalE) from the phytopathogen Xanthomonas axonopodis pv. citri

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Maltose-binding protein is the periplasmic component of the ABC transporter responsible for the uptake of maltose/maltodextrins. The Xanthomonas axonopodis pv. citri maltose-binding protein MalE has been crystallized at 293 Kusing the hanging-drop vapour-diffusion method. The crystal belonged to the primitive hexagonal space group P6(1)22, with unit-cell parameters a = 123.59, b = 123.59, c = 304.20 angstrom, and contained two molecules in the asymetric unit. It diffracted to 2.24 angstrom resolution.

Log-modified Weibull regression models with censored data: Sensitivity and residual analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a regression model considering the modified Weibull distribution. This distribution can be used to model bathtub-shaped failure rate functions. Assuming censored data, we consider maximum likelihood and Jackknife estimators for the parameters of the model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and we also present some ways to perform global influence. Besides, for different parameter settings, sample sizes and censoring percentages, various simulations are performed and the empirical distribution of the modified deviance residual is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended for a martingale-type residual in log-modified Weibull regression models with censored data. Finally, we analyze a real data set under log-modified Weibull regression models. A diagnostic analysis and a model checking based on the modified deviance residual are performed to select appropriate models. (c) 2008 Elsevier B.V. All rights reserved.

Regression models for grouped survival data: Estimation and sensitivity analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this study, regression models are evaluated for grouped survival data when the effect of censoring time is considered in the model and the regression structure is modeled through four link functions. The methodology for grouped survival data is based on life tables, and the times are grouped in k intervals so that ties are eliminated. Thus, the data modeling is performed by considering the discrete models of lifetime regression. The model parameters are estimated by using the maximum likelihood and jackknife methods. To detect influential observations in the proposed models, diagnostic measures based on case deletion, which are denominated global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to those measures, the local influence and the total influential estimate are also employed. Various simulation studies are performed and compared to the performance of the four link functions of the regression models for grouped survival data for different parameter settings, sample sizes and numbers of intervals. Finally, a data set is analyzed by using the proposed regression models. (C) 2010 Elsevier B.V. All rights reserved.

Brazilian Network of Food Data Systems and LATINFOODS Regional Technical Compilation Committee: Food composition activities (2006-2009)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Brazilian Network of Food Data Systems (BRASILFOODS) has been keeping the Brazilian Food Composition Database-USP (TBCA-USP) (http://www.fcf.usp.br/tabela) since 1998. Besides the constant compilation, analysis and update work in the database, the network tries to innovate through the introduction of food information that may contribute to decrease the risk for non-transmissible chronic diseases, such as the profile of carbohydrates and flavonoids in foods. In 2008, data on carbohydrates, individually analyzed, of 112 foods, and 41 data related to the glycemic response produced by foods widely consumed in the country were included in the TBCA-USP. Data (773) about the different flavonoid subclasses of 197 Brazilian foods were compiled and the quality of each data was evaluated according to the USDAs data quality evaluation system. In 2007, BRASILFOODS/USP and INFOODS/FAO organized the 7th International Food Data Conference ""Food Composition and Biodiversity"". This conference was a unique opportunity for interaction between renowned researchers and participants from several countries and it allowed the discussion of aspects that may improve the food composition area. During the period, the LATINFOODS Regional Technical Compilation Committee and BRASILFOODS disseminated to Latin America the Form and Manual for Data Compilation, version 2009, ministered a Food Composition Data Compilation course and developed many activities related to data production and compilation. (C) 2010 Elsevier Inc. All rights reserved.

Upgrading a TCABR data analysis and acquisition system for remote participation using Java, XML, RCP and modern client/server communication/authentication

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The architecture of the new system uses Java language as programming environment. Since application parameters and hardware in a joint experiment are complex with a large variability of components, requirements and specification solutions need to be flexible and modular, independent from operating system and computer architecture. To describe and organize the information on all the components and the connections among them, systems are developed using the extensible Markup Language (XML) technology. The communication between clients and servers uses remote procedure call (RPC) based on the XML (RPC-XML technology). The integration among Java language, XML and RPC-XML technologies allows to develop easily a standard data and communication access layer between users and laboratories using common software libraries and Web application. The libraries allow data retrieval using the same methods for all user laboratories in the joint collaboration, and the Web application allows a simple graphical user interface (GUI) access. The TCABR tokamak team in collaboration with the IPFN (Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Universidade Tecnica de Lisboa) is implementing this remote participation technologies. The first version was tested at the Joint Experiment on TCABR (TCABRJE), a Host Laboratory Experiment, organized in cooperation with the IAEA (International Atomic Energy Agency) in the framework of the IAEA Coordinated Research Project (CRP) on ""Joint Research Using Small Tokamaks"". (C) 2010 Elsevier B.V. All rights reserved.

The Schenberg data acquisition and analysis: results from its first commissioning run

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Mario Schenberg gravitational wave detector has started its commissioning phase at the Physics Institute of the University of Sao Paulo. We have collected almost 200 h of data from the instrument in order to check out its behavior and performance. We have also been developing a data acquisition system for it under a VXI System. Such a system is composed of an analog-to-digital converter and a GPS receiver for time synchronization. We have been building the software that controls and sets up the data acquisition. Here we present an overview of the Mario Schenberg detector and its data acquisition system, some results from the first commissioning run and solutions for some problems we have identified.

Missing data mechanisms and their implications on the analysis of categorical data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.

Qualitative Analysis of the Interdisciplinary Interaction between Data Analysis Specialists and Novice Clinical Researchers

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: The inherent complexity of statistical methods and clinical phenomena compel researchers with diverse domains of expertise to work in interdisciplinary teams, where none of them have a complete knowledge in their counterpart's field. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even clinical practice. Though communication has a central role in interdisciplinary collaboration and since miscommunication can have a negative impact on research processes, to the best of our knowledge, no study has yet explored how data analysis specialists and clinical researchers communicate over time. Methods/Principal Findings: We conducted qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of ""what if'' situations that helped clarify how the method or information from the other field would behave, if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal. Conclusion/Significance: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Integration and visualization of systems biology data in context of the genome

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.

Evaluation of FAO Penman-Monteith and alternative methods for estimating reference evapotranspiration with missing data in Southern Ontario, Canada

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Grass reference evapotranspiration (ETo) is an important agrometeorological parameter for climatological and hydrological studies, as well as for irrigation planning and management. There are several methods to estimate ETo, but their performance in different environments is diverse, since all of them have some empirical background. The FAO Penman-Monteith (FAD PM) method has been considered as a universal standard to estimate ETo for more than a decade. This method considers many parameters related to the evapotranspiration process: net radiation (Rn), air temperature (7), vapor pressure deficit (Delta e), and wind speed (U); and has presented very good results when compared to data from lysimeters Populated with short grass or alfalfa. In some conditions, the use of the FAO PM method is restricted by the lack of input variables. In these cases, when data are missing, the option is to calculate ETo by the FAD PM method using estimated input variables, as recommended by FAD Irrigation and Drainage Paper 56. Based on that, the objective of this study was to evaluate the performance of the FAO PM method to estimate ETo when Rn, Delta e, and U data are missing, in Southern Ontario, Canada. Other alternative methods were also tested for the region: Priestley-Taylor, Hargreaves, and Thornthwaite. Data from 12 locations across Southern Ontario, Canada, were used to compare ETo estimated by the FAD PM method with a complete data set and with missing data. The alternative ETo equations were also tested and calibrated for each location. When relative humidity (RH) and U data were missing, the FAD PM method was still a very good option for estimating ETo for Southern Ontario, with RMSE smaller than 0.53 mm day(-1). For these cases, U data were replaced by the normal values for the region and Delta e was estimated from temperature data. The Priestley-Taylor method was also a good option for estimating ETo when U and Delta e data were missing, mainly when calibrated locally (RMSE = 0.40 mm day(-1)). When Rn was missing, the FAD PM method was not good enough for estimating ETo, with RMSE increasing to 0.79 mm day(-1). When only T data were available, adjusted Hargreaves and modified Thornthwaite methods were better options to estimate ETo than the FAO) PM method, since RMSEs from these methods, respectively 0.79 and 0.83 mm day(-1), were significantly smaller than that obtained by FAO PM (RMSE = 1.12 mm day(-1). (C) 2009 Elsevier B.V. All rights reserved.

Mitochondrial sequence data and Dipsacales phylogeny: Mixed models, partitioned Bayesian analyses, and model selection

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm, clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad scale phylogeny and in an attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve we conducted a series of experiments using different data partitions and heterogeneous substitution models. Usually more complex modeling schemes are favored regardless of the partitions recognized but model choice had little effect on topology or support values. In contrast there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest incongruent signal has contributed to our inability to confidently resolve these problem areas. (c) 2007 Elsevier Inc. All rights reserved.

Association between neuromuscular tests and kumite performance on the Brazilian Karate National Team

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The aim of this study was to verify the relationship of strength and power with performance on an international level karate team during official kumite simulations. Fourteen male black belt karate athletes were submitted to anthropometric data collection and then performed the following tests on two different days: vertical jump test, bench press and squat maximum dynamic strength (1RM) tests. We also tested power production for both exercises at 30 and 60% 1RM and performed a kumite match simulation. Blood samples were obtained at rest and immediately after the kumite matches to measure blood lactate concentration. Karate players were separated by performance (winners vs. defeated) on the kumite matches. We found no significant differences between winners and defeated for strength, vertical jump height, anthropometric data and blood lactate concentration. Interestingly, winners were more powerful in the bench press and squat exercises at 30% 1RM. Maximum strength was correlated with absolute (30% 1RM r = 0.92; 60% 1RM r = 0.63) and relative power (30% 1RM r = 0.74; 60% 1RM r = 0.11, p > 0.05) for the bench press exercise. We concluded that international level karate players' kumite match performance are influenced by higher levels of upper and lower limbs power production.

NGC 7097: THE ACTIVE GALACTIC NUCLEUS AND ITS MIRROR, REVEALED BY PRINCIPAL COMPONENT ANALYSIS TOMOGRAPHY

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Three-dimensional spectroscopy techniques are becoming more and more popular, producing an increasing number of large data cubes. The challenge of extracting information from these cubes requires the development of new techniques for data processing and analysis. We apply the recently developed technique of principal component analysis (PCA) tomography to a data cube from the center of the elliptical galaxy NGC 7097 and show that this technique is effective in decomposing the data into physically interpretable information. We find that the first five principal components of our data are associated with distinct physical characteristics. In particular, we detect a low-ionization nuclear-emitting region (LINER) with a weak broad component in the Balmer lines. Two images of the LINER are present in our data, one seen through a disk of gas and dust, and the other after scattering by free electrons and/or dust particles in the ionization cone. Furthermore, we extract the spectrum of the LINER, decontaminated from stellar and extended nebular emission, using only the technique of PCA tomography. We anticipate that the scattered image has polarized light due to its scattered nature.

Cluster and cluster galaxy evolution history from IR to X-ray observations of the young cluster RX J1257.2+4738 at z=0.866

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Context. The cosmic time around the z similar to 1 redshift range appears crucial in the cluster and galaxy evolution, since it is probably the epoch of the first mature galaxy clusters. Our knowledge of the properties of the galaxy populations in these clusters is limited because only a handful of z similar to 1 clusters are presently known. Aims. In this framework, we report the discovery of a z similar to 0.87 cluster and study its properties at various wavelengths. Methods. We gathered X-ray and optical data (imaging and spectroscopy), and near and far infrared data (imaging) in order to confirm the cluster nature of our candidate, to determine its dynamical state, and to give insight on its galaxy population evolution. Results. Our candidate structure appears to be a massive z similar to 0.87 dynamically young cluster with an atypically high X-ray temperature as compared to its X-ray luminosity. It exhibits a significant percentage (similar to 90%) of galaxies that are also detected in the 24 mu m band. Conclusions. The cluster RXJ1257.2+4738 appears to be still in the process of collapsing. Its relatively high temperature is probably the consequence of significant energy input into the intracluster medium besides the regular gravitational infall contribution. A significant part of its galaxies are red objects that are probably dusty with on-going star formation.

Proposal for DICOM Multiframe Medical Image Integrity and Authenticity

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel algorithm to successfully achieve viable integrity and authenticity addition and verification of n-frame DICOM medical images using cryptographic mechanisms. The aim of this work is the enhancement of DICOM security measures, especially for multiframe images. Current approaches have limitations that should be properly addressed for improved security. The algorithm proposed in this work uses data encryption to provide integrity and authenticity, along with digital signature. Relevant header data and digital signature are used as inputs to cipher the image. Therefore, one can only retrieve the original data if and only if the images and the inputs are correct. The encryption process itself is a cascading scheme, where a frame is ciphered with data related to the previous frames, generating also additional data on image integrity and authenticity. Decryption is similar to encryption, featuring also the standard security verification of the image. The implementation was done in JAVA, and a performance evaluation was carried out comparing the speed of the algorithm with other existing approaches. The evaluation showed a good performance of the algorithm, which is an encouraging result to use it in a real environment.

«
1
2
»