879 resultados para Security of data
Resumo:
A study of how the machine learning technique, known as gentleboost, could improve different digital watermarking methods such as LSB, DWT, DCT2 and Histogram shifting.
Resumo:
Human T-cell lymphotropic virus type 1 (HTLV-1) is mainly associated with two diseases: tropical spastic paraparesis/HTLV-1-associated myelopathy (TSP/HAM) and adult T-cell leukaemia/lymphoma. This retrovirus infects five-10 million individuals throughout the world. Previously, we developed a database that annotates sequence data from GenBank and the present study aimed to describe the clinical, molecular and epidemiological scenarios of HTLV-1 infection through the stored sequences in this database. A total of 2,545 registered complete and partial sequences of HTLV-1 were collected and 1,967 (77.3%) of those sequences represented unique isolates. Among these isolates, 93% contained geographic origin information and only 39% were related to any clinical status. A total of 1,091 sequences contained information about the geographic origin and viral subtype and 93% of these sequences were identified as subtype “a”. Ethnicity data are very scarce. Regarding clinical status data, 29% of the sequences were generated from TSP/HAM and 67.8% from healthy carrier individuals. Although the data mining enabled some inferences about specific aspects of HTLV-1 infection to be made, due to the relative scarcity of data of available sequences, it was not possible to delineate a global scenario of HTLV-1 infection.
Resumo:
Objectives: We are interested in the numerical simulation of the anastomotic region comprised between outflow canula of LVAD and the aorta. Segmenta¬tion, geometry reconstruction and grid generation from patient-specific data remain an issue because of the variable quality of DICOM images, in particular CT-scan (e.g. metallic noise of the device, non-aortic contrast phase). We pro¬pose a general framework to overcome this problem and create suitable grids for numerical simulations.Methods: Preliminary treatment of images is performed by reducing the level window and enhancing the contrast of the greyscale image using contrast-limited adaptive histogram equalization. A gradient anisotropic diffusion filter is applied to reduce the noise. Then, watershed segmentation algorithms and mathematical morphology filters allow reconstructing the patient geometry. This is done using the InsightToolKit library (www.itk.org). Finally the Vascular Model¬ing ToolKit (www.vmtk.org) and gmsh (www.geuz.org/gmsh) are used to create the meshes for the fluid (blood) and structure (arterial wall, outflow canula) and to a priori identify the boundary layers. The method is tested on five different patients with left ventricular assistance and who underwent a CT-scan exam.Results: This method produced good results in four patients. The anastomosis area is recovered and the generated grids are suitable for numerical simulations. In one patient the method failed to produce a good segmentation because of the small dimension of the aortic arch with respect to the image resolution.Conclusions: The described framework allows the use of data that could not be otherwise segmented by standard automatic segmentation tools. In particular the computational grids that have been generated are suitable for simulations that take into account fluid-structure interactions. Finally the presented method features a good reproducibility and fast application.
Resumo:
Background: Systematic approaches for identifying proteins involved in different types of cancer are needed. Experimental techniques such as microarrays are being used to characterize cancer, but validating their results can be a laborious task. Computational approaches are used to prioritize between genes putatively involved in cancer, usually based on further analyzing experimental data. Results: We implemented a systematic method using the PIANA software that predicts cancer involvement of genes by integrating heterogeneous datasets. Specifically, we produced lists of genes likely to be involved in cancer by relying on: (i) protein-protein interactions; (ii) differential expression data; and (iii) structural and functional properties of cancer genes. The integrative approach that combines multiple sources of data obtained positive predictive values ranging from 23% (on a list of 811 genes) to 73% (on a list of 22 genes), outperforming the use of any of the data sources alone. We analyze a list of 20 cancer gene predictions, finding that most of them have been recently linked to cancer in literature. Conclusion: Our approach to identifying and prioritizing candidate cancer genes can be used to produce lists of genes likely to be involved in cancer. Our results suggest that differential expression studies yielding high numbers of candidate cancer genes can be filtered using protein interaction networks.
Resumo:
The increasing volume of data describing humandisease processes and the growing complexity of understanding, managing, and sharing such data presents a huge challenge for clinicians and medical researchers. This paper presents the@neurIST system, which provides an infrastructure for biomedical research while aiding clinical care, by bringing together heterogeneous data and complex processing and computing services. Although @neurIST targets the investigation and treatment of cerebral aneurysms, the system’s architecture is generic enough that it could be adapted to the treatment of other diseases.Innovations in @neurIST include confining the patient data pertaining to aneurysms inside a single environment that offers cliniciansthe tools to analyze and interpret patient data and make use of knowledge-based guidance in planning their treatment. Medicalresearchers gain access to a critical mass of aneurysm related data due to the system’s ability to federate distributed informationsources. A semantically mediated grid infrastructure ensures that both clinicians and researchers are able to seamlessly access andwork on data that is distributed across multiple sites in a secure way in addition to providing computing resources on demand forperforming computationally intensive simulations for treatment planning and research.
Resumo:
In this paper we use a variety of data sources, both micro and macro, time series, crosssection, and panel data to provide an empirical evaluation of the current level of economicwellbeing of the Spanish elderly, and of its determinants. We focus, in particular on the role played by the pension system and its generosity in terms of minimum pension supplements and non-contributive pensions. In an IV context, we find that actual Social Security benefits contribute substantially to explain income and consumption poverty levels and trends of low income and consumption percentiles. Thus we offer support to previous evidence for Spain emphasizing the role of minimum benefit policies.
Resumo:
BACKGROUND: Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS: We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS: Very small sample sizes, which are still common in RNA-seq experiments, impose problems for all evaluated methods and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
Resumo:
SUMMARY: We present a tool designed for visualization of large-scale genetic and genomic data exemplified by results from genome-wide association studies. This software provides an integrated framework to facilitate the interpretation of SNP association studies in genomic context. Gene annotations can be retrieved from Ensembl, linkage disequilibrium data downloaded from HapMap and custom data imported in BED or WIG format. AssociationViewer integrates functionalities that enable the aggregation or intersection of data tracks. It implements an efficient cache system and allows the display of several, very large-scale genomic datasets. AVAILABILITY: The Java code for AssociationViewer is distributed under the GNU General Public Licence and has been tested on Microsoft Windows XP, MacOSX and GNU/Linux operating systems. It is available from the SourceForge repository. This also includes Java webstart, documentation and example datafiles.
Resumo:
Contamination of weather radar echoes by anomalous propagation (anaprop) mechanisms remains a serious issue in quality control of radar precipitation estimates. Although significant progress has been made identifying clutter due to anaprop there is no unique method that solves the question of data reliability without removing genuine data. The work described here relates to the development of a software application that uses a numerical weather prediction (NWP) model to obtain the temperature, humidity and pressure fields to calculate the three dimensional structure of the atmospheric refractive index structure, from which a physically based prediction of the incidence of clutter can be made. This technique can be used in conjunction with existing methods for clutter removal by modifying parameters of detectors or filters according to the physical evidence for anomalous propagation conditions. The parabolic equation method (PEM) is a well established technique for solving the equations for beam propagation in a non-uniformly stratified atmosphere, but although intrinsically very efficient, is not sufficiently fast to be practicable for near real-time modelling of clutter over the entire area observed by a typical weather radar. We demonstrate a fast hybrid PEM technique that is capable of providing acceptable results in conjunction with a high-resolution terrain elevation model, using a standard desktop personal computer. We discuss the performance of the method and approaches for the improvement of the model profiles in the lowest levels of the troposphere.
Resumo:
Prediction of species' distributions is central to diverse applications in ecology, evolution and conservation science. There is increasing electronic access to vast sets of occurrence records in museums and herbaria, yet little effective guidance on how best to use this information in the context of numerous approaches for modelling distributions. To meet this need, we compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date. We used presence-only data to fit models, and independent presence-absence data to evaluate the predictions. Along with well-established modelling methods such as generalised additive models and GARP and BIOCLIM, we explored methods that either have been developed recently or have rarely been applied to modelling species' distributions. These include machine-learning methods and community models, both of which have features that may make them particularly well suited to noisy or sparse information, as is typical of species' occurrence data. Presence-only data were effective for modelling species' distributions for many species and regions. The novel methods consistently outperformed more established methods. The results of our analysis are promising for the use of data from museums and herbaria, especially as methods suited to the noise inherent in such data improve.
Resumo:
Trees are a great bank of data, named sometimes for this reason as the "silentwitnesses" of the past. Due to annual formation of rings, which is normally influenced directly by of climate parameters (generally changes in temperature and moisture or precipitation) and other environmental factors; these changes, occurred in the past, are"written" in the tree "archives" and can be "decoded" in order to interpret what hadhappened before, mainly applied for the past climate reconstruction.Using dendrochronological methods for obtaining samples of Pinus nigra fromthe Catalonian PrePirineous region, the cores of 15 trees with total time spine of about 100 - 250 years were analyzed for the tree ring width (TRW) patterns and had quite high correlation between them (0.71 ¿ 0.84), corresponding to a common behaviour for the environmental changes in their annual growth.After different trials with raw TRW data for standardization in order to take outthe negative exponential growth curve dependency, the best method of doubledetrending (power transformation and smoothing line of 32 years) were selected for obtaining the indexes for further analysis.Analyzing the cross-correlations between obtained tree ring width indexes andclimate data, significant correlations (p<0.05) were observed in some lags, as forexample, annual precipitation in lag -1 (previous year) had negative correlation with TRW growth in the Pallars region. Significant correlation coefficients are between 0.27- 0.51 (with positive or negative signs) for many cases; as for recent (but very short period) climate data of Seu d¿Urgell meteorological station, some significant correlation coefficients were observed, of the order of 0.9.These results confirm the hypothesis of using dendrochronological data as aclimate signal for further analysis, such as reconstruction of climate in the past orprediction in the future for the same locality.
Resumo:
With the trend in molecular epidemiology towards both genome-wide association studies and complex modelling, the need for large sample sizes to detect small effects and to allow for the estimation of many parameters within a model continues to increase. Unfortunately, most methods of association analysis have been restricted to either a family-based or a case-control design, resulting in the lack of synthesis of data from multiple studies. Transmission disequilibrium-type methods for detecting linkage disequilibrium from family data were developed as an effective way of preventing the detection of association due to population stratification. Because these methods condition on parental genotype, however, they have precluded the joint analysis of family and case-control data, although methods for case-control data may not protect against population stratification and do not allow for familial correlations. We present here an extension of a family-based association analysis method for continuous traits that will simultaneously test for, and if necessary control for, population stratification. We further extend this method to analyse binary traits (and therefore family and case-control data together) and accurately to estimate genetic effects in the population, even when using an ascertained family sample. Finally, we present the power of this binary extension for both family-only and joint family and case-control data, and demonstrate the accuracy of the association parameter and variance components in an ascertained family sample.
Resumo:
Introduction Health care professionals' perception of risk mayimpact on therapeutic management of women during pregnancy.Since the thalidomide tragedy, the use of drugs during pregnancygenerates fear. This concern might affect the estimation of the riskassociated with drug intake during pregnancy, leading to prematurediscontinuation of a required treatment, superfluous anxiety orpointless termination of a desired pregnancy. Although data regardingthe security of drugs during pregnancy are still scarce, a few specializedinformation sources exist providing reliable recommendationsfor daily practice. This study aimed at characterizing therisk perception associated with drugs during pregnancy in a sample ofSwiss health care professionals.Materials & Methods An online French and German survey was sentby email to the Swiss professional societies of Pharmacists, Gynecologists,Mid-wives and Pediatricians. The questionnaire wasconstructed to assess (a) the characteristics of the population and theopinion of the professionals regarding the medication use pattern intheir pregnant patients, (b) to evaluate the sources of information usedduring their practice and finally (c) to assess their risk perceptionassociated with drugs during pregnancy. Results were analyzed bydescriptive statistics.Results A total of 1,310 questionnaires were collected (18% responserate). Most health care professionals believe that 30-60% of theirpregnant patients are taking at least one treatment during their pregnancyand that 80% are adherent to it. A large majority think,however, that women are anxious when they must take their medication.More than 80% of health professionals commonly use theSwiss Drug Reference Book (Compendium) to assess the risk associatedwith drugs during pregnancy, despite the uniformly low levelof credibility and utility they express about this reference. Except forsome gynecologists, the majority of professionals are not aware of ordo not use specialized books. The majority of participants thinkwrongly that more than 30% of drugs are teratogenic. About 20% ofthem are not aware of the risk associated with paracetamol intakeduring pregnancy. More than 70% agree that phytotherapeutic mixturesare not safer than conventional drugs, with the exception of midwiveswho tend to overestimate the safety of such drugs. With thenotable exception of gynecologists, the risk related to drug intake wasoverall overestimated.Discussion & Conclusion Swiss professionals differ in their perceptionof the risk associated with drugs during pregnancy and tend tooverestimate it. The differences might be attributed to the level oftraining and awareness of specialized sources offering a realisticestimation of the risk. Further efforts are needed to expand thetraining and the tools for health care professionals to optimize druguse during pregnancy.
Resumo:
The graphical representation of spatial soil properties in a digital environment is complex because it requires a conversion of data collected in a discrete form onto a continuous surface. The objective of this study was to apply three-dimension techniques of interpolation and visualization on soil texture and fertility properties and establish relationships with pedogenetic factors and processes in a slope area. The GRASS Geographic Information System was used to generate three-dimensional models and ParaView software to visualize soil volumes. Samples of the A, AB, BA, and B horizons were collected in a regular 122-point grid in an area of 13 ha, in Pinhais, PR, in southern Brazil. Geoprocessing and graphic computing techniques were effective in identifying and delimiting soil volumes of distinct ranges of fertility properties confined within the soil matrix. Both three-dimensional interpolation and the visualization tool facilitated interpretation in a continuous space (volumes) of the cause-effect relationships between soil texture and fertility properties and pedological factors and processes, such as higher clay contents following the drainage lines of the area. The flattest part with more weathered soils (Oxisols) had the highest pH values and lower Al3+ concentrations. These techniques of data interpolation and visualization have great potential for use in diverse areas of soil science, such as identification of soil volumes occurring side-by-side but that exhibit different physical, chemical, and mineralogical conditions for plant root growth, and monitoring of plumes of organic and inorganic pollutants in soils and sediments, among other applications. The methodological details for interpolation and a three-dimensional view of soil data are presented here.
Resumo:
Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate these using simulation studies. Our comparative analysis involves using methods including generalized least squares, spatial filters, wavelet revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some which violate standard regression assumptions. We assess the performance of each method using two measures and using statistical error rates for model selection. Methods that performed well included generalized least squares family of models and a Bayesian implementation of the conditional auto-regressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection when using the above methods. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.