948 results for Cryptography Statistical methods
Abstract:
Human brain imaging techniques, such as Magnetic Resonance Imaging (MRI) or Diffusion Tensor Imaging (DTI), have been established as scientific and diagnostic tools and their adoption is growing. Statistical methods, machine learning and data mining algorithms have successfully been adopted to extract predictive and descriptive models from neuroimage data. However, the knowledge discovery process typically also requires pre-processing, post-processing and visualisation techniques in complex data workflows. Currently, a main problem for the integrated preprocessing and mining of MRI data is the lack of comprehensive platforms that avoid the manual invocation of preprocessing and mining tools, which leads to an error-prone and inefficient process. In this work we present K-Surfer, a novel plug-in of the Konstanz Information Miner (KNIME) workbench that automates the preprocessing of brain images and leverages the mining capabilities of KNIME in an integrated way. K-Surfer supports the importing, filtering, merging and pre-processing of neuroimage data from FreeSurfer, a tool for human brain MRI feature extraction and interpretation. K-Surfer automates the steps for importing FreeSurfer data, reducing time costs, eliminating human errors and enabling the design of complex analytics workflows for neuroimage data by leveraging the rich functionality available in the KNIME workbench.
Abstract:
This paper presents an approximate closed form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds-ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed form formula is reasonably accurate when non-inferiority margins are based on odds-ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed form formula increasingly overestimates the sample size irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed form formula is also reasonably accurate in the cases where the score test closed form formula works well. Outside these scenarios, the Wald test closed form formula can either underestimate or overestimate the sample size, depending on the magnitude of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate for all cases, both approaches lead to satisfactory sample size calculation for non-inferiority trials with binary data where the odds ratio is the parameter of interest.
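As a rough illustration of a Wald-type closed-form calculation, the sketch below uses the common textbook approximation based on the variance of the log odds ratio with equal allocation. It is an assumption that this matches the "well-known" Wald formula the abstract refers to; the score-test formula derived in the paper is not reproduced here. The function name `wald_ni_sample_size` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def wald_ni_sample_size(p_control, or_alt, or_margin, alpha=0.025, power=0.9):
    """Approximate per-group sample size for a non-inferiority trial.

    Assumes a one-sided Wald test on the log odds ratio with equal
    allocation (textbook approximation, not the paper's score formula).
    """
    # Treatment response rate implied by the control rate and the
    # odds ratio assumed under the alternative hypothesis.
    odds_control = p_control / (1.0 - p_control)
    odds_treat = or_alt * odds_control
    p_treat = odds_treat / (1.0 + odds_treat)

    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)

    # Variance of the log odds ratio, scaled by the per-group sample size.
    unit_variance = (1.0 / (p_control * (1.0 - p_control))
                     + 1.0 / (p_treat * (1.0 - p_treat)))

    # Distance between the alternative and the non-inferiority margin
    # on the log odds ratio scale.
    effect = math.log(or_alt) - math.log(or_margin)

    return math.ceil((z_alpha + z_beta) ** 2 * unit_variance / effect ** 2)


if __name__ == "__main__":
    print(wald_ni_sample_size(p_control=0.5, or_alt=1.0, or_margin=0.5))
```

The required sample size grows as the margin odds ratio moves towards the odds ratio under the alternative, because the distance on the log scale in the denominator shrinks.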
Abstract:
Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform as well as or better than non-linear models with or without the frequently used wind-to-power transformation.
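The inverse (power-to-wind) transformation described above can be sketched as a piecewise-linear inversion of a tabulated power curve. The curve values below are invented for illustration and do not come from any manufacturer; the function name `power_to_wind` and the censoring labels are likewise hypothetical. Observations at zero or nominal power are flagged so that a censored regression model could treat them accordingly.

```python
from bisect import bisect_right

# Hypothetical power curve between cut-in and rated wind speed
# (illustrative numbers only, strictly increasing in both columns).
CURVE_SPEED = [3.0, 5.0, 7.0, 9.0, 11.0, 13.0]              # m/s
CURVE_POWER = [0.0, 150.0, 500.0, 1100.0, 1800.0, 2000.0]   # kW


def power_to_wind(power_kw):
    """Map observed power to an equivalent wind speed.

    Returns (wind_speed, censoring), where censoring is "left" at or
    below zero power, "right" at or above nominal power, else "none".
    """
    if power_kw <= CURVE_POWER[0]:
        return CURVE_SPEED[0], "left"
    if power_kw >= CURVE_POWER[-1]:
        return CURVE_SPEED[-1], "right"

    # Locate the curve segment containing the observed power and
    # interpolate linearly within it.
    i = bisect_right(CURVE_POWER, power_kw) - 1
    p0, p1 = CURVE_POWER[i], CURVE_POWER[i + 1]
    s0, s1 = CURVE_SPEED[i], CURVE_SPEED[i + 1]
    speed = s0 + (power_kw - p0) * (s1 - s0) / (p1 - p0)
    return speed, "none"


if __name__ == "__main__":
    for p in (0.0, 325.0, 2000.0):
        print(p, power_to_wind(p))
```

After this transformation the response is on the wind-speed scale, so a linear model against forecast wind speed becomes plausible, with the flagged observations handled by the censored likelihood.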
Abstract:
Recruitment of patients to a clinical trial usually occurs over a period of time, resulting in the steady accumulation of data throughout the trial's duration. Yet, according to traditional statistical methods, the sample size of the trial should be determined in advance, and data collected on all subjects before analysis proceeds. For ethical and economic reasons, the technique of sequential testing has been developed to enable the examination of data at a series of interim analyses. The aim is to stop recruitment to the study as soon as there is sufficient evidence to reach a firm conclusion. In this paper we present the advantages and disadvantages of conducting interim analyses in phase III clinical trials, together with the key steps to enable the successful implementation of sequential methods in this setting. Examples are given of completed trials, which have been carried out sequentially, and references to relevant literature and software are provided.
Abstract:
The Natural History of Human Papillomavirus (HPV) Infection in Men: The HIM Study is a prospective multi-center cohort study that, among other factors, analyzes participants' diet. A parallel cross-sectional study was designed to evaluate the validity and reproducibility of the quantitative food frequency questionnaire (QFFQ) used in the Brazilian center of the HIM Study. For this, a convenience subsample of 98 men aged 18 to 70 years from the HIM Study in Brazil answered three 54-item QFFQs and three 24-hour recall interviews, with 6-month intervals between them (data collection January to September 2007). A Bland-Altman analysis indicated that the difference between instruments was dependent on the magnitude of the intake for energy and most nutrients included in the validity analysis, with the exception of carbohydrates, fiber, polyunsaturated fat, vitamin C, and vitamin E. The correlation between the QFFQ and the 24-hour recall for the deattenuated and energy-adjusted data ranged from 0.05 (total fat) to 0.57 (calcium). For the energy and nutrient intakes included in the validity analysis, 33.5% of participants on average were correctly classified into quartiles, and the average weighted kappa of 0.26 indicates reasonable agreement. The intraclass correlation coefficients for all nutrients were greater than 0.40 in the reproducibility analysis. The QFFQ demonstrated good reproducibility and acceptable validity. The results support the use of this instrument in the HIM Study. J Am Diet Assoc. 2011;111:1045-1051.
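The magnitude dependence reported by the Bland-Altman analysis can be checked by regressing the between-instrument differences on the pairwise means. The sketch below is a minimal illustration, not the study's code: the function name `bland_altman` and the toy intake values are hypothetical, and the 1.96 multiplier assumes approximately normal differences.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, limits_of_agreement, slope) for paired measurements.

    slope is the least-squares slope of the differences on the pairwise
    means; a slope far from zero suggests that the disagreement depends
    on the magnitude of the measurement.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2.0 for a, b in zip(method_a, method_b)]

    bias = mean(diffs)
    sd = stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)

    # Ordinary least-squares slope of difference versus mean.
    mean_of_means = mean(means)
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    sxy = sum((m - mean_of_means) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx

    return bias, limits, slope


if __name__ == "__main__":
    # Toy energy intakes (kcal) from two instruments, invented for illustration.
    qffq = [1100.0, 2200.0, 3300.0, 4400.0]
    recall = [1000.0, 2000.0, 3000.0, 4000.0]
    print(bland_altman(qffq, recall))
```

In the toy data the first instrument overestimates by a fixed proportion, so the differences grow with intake and the slope is clearly positive.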
Abstract:
The correlation between the breaks in the metallicity distribution and the corotation radius of spiral galaxies has already been advocated in the past and is predicted by a chemodynamical model of our Galaxy that effectively introduces the role of spiral arms in the star formation rate. In this work, we present photometric and spectroscopic observations made with the Gemini Telescope for three of the best candidates of spiral galaxies to have the corotation inside the optical disc: IC 0167, NGC 1042 and NGC 6907. We observed the most intense and well-distributed H II regions of these galaxies, deriving reliable galactocentric distances and oxygen abundances by applying different statistical methods. From these results, we confirm the presence of variations in the metallicity gradients of these galaxies that are possibly correlated with the corotation resonance.
Abstract:
This paper deals with the morphological features of the tracheary elements of the vegetative organs in four Portulaca species (Portulaca hirsutissima Camb., P. halimoides L., P. werdermannii Poelln. and P. mucronata Link.) occurring in Southeast and Northeast Brazil. The vessel elements are small (< 25 μm) and have a simple perforation plate. The pattern of wall thickening varied from bordered pitting (in roots) to scalariform and helicoidal (stem and leaves). Statistical methods show variation in vessel-element diameter in different vegetative organs; wider elements were observed in roots. Tracheids occurring in leaves of P. hirsutissima and P. werdermannii have morphological features that are similar to terminal tracheids or tracheoid idioblasts frequently associated with xerophytes. The paedomorphic features (juvenilism) observed here may be related, in part, to aspects of water transport and storage as described in Cactaceae.
Abstract:
We investigated the evolution of anuran locomotor performance and its morphological correlates as a function of habitat use and lifestyles. We reanalysed a subset of the data reported by Zug (Smithson. Contrib. Zool. 1978; 276: 1-31) employing phylogenetically explicit statistical methods (n = 56 species), and assembled morphological data on the ratio between hind-limb length and snout-vent length (SVL) from the literature and museum specimens for a large subgroup of the species from the original paper (n = 43 species). Analyses using independent contrasts revealed that classifying anurans into terrestrial, semi-aquatic, and arboreal categories cannot distinguish between the effects of phylogeny and ecological diversification in anuran locomotor performance. However, a more refined classification subdividing terrestrial species into 'fossorials' and 'non-fossorials', and arboreal species into 'open canopy', 'low canopy' and 'high canopy', suggests that part of the variation in locomotor performance and in hind-limb morphology can be attributed to ecological diversification. In particular, fossorial species had significantly lower jumping performances and shorter hind limbs than other species after controlling for SVL, illustrating how the trade-off between burrowing efficiency and jumping performance has resulted in morphological specialization in this group.
Abstract:
Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.
A bivariate regression model for matched paired survival data: local influence and residual analysis
Abstract:
The use of bivariate distributions plays a fundamental role in survival and reliability studies. In this paper, we consider a location scale model for bivariate survival times based on the proposal of a copula to model the dependence of bivariate survival data. For the proposed model, we consider inferential procedures based on maximum likelihood. Gains in efficiency from bivariate models are also examined in the censored data setting. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and compared to the performance of the bivariate regression model for matched paired survival data. Sensitivity analysis methods such as local and total influence are presented and derived under three perturbation schemes. The martingale marginal and the deviance marginal residual measures are used to check the adequacy of the model. Furthermore, we propose a new measure which we call modified deviance component residual. The methodology in the paper is illustrated on a lifetime data set for kidney patients.
Abstract:
Deviations from the average can provide valuable insights about the organization of natural systems. The present article extends this important principle to the systematic identification and analysis of singular motifs in complex networks. Six measurements quantifying different and complementary features of the connectivity around each node of a network were calculated, and multivariate statistical methods applied to identify singular nodes. The potential of the presented concepts and methodology was illustrated with respect to different types of complex real-world networks, namely the US air transportation network, the protein-protein interactions of the yeast Saccharomyces cerevisiae and the Roget thesaurus networks. The obtained singular motifs possessed unique functional roles in the networks. Three classic theoretical network models were also investigated, with the Barabasi-Albert model resulting in singular motifs corresponding to hubs, confirming the potential of the approach. Interestingly, the number of different types of singular node motifs as well as the number of their instances were found to be considerably higher in the real-world networks than in any of the benchmark networks. Copyright (C) EPLA, 2009
Abstract:
The aim of this study was to evaluate the presence of nutrients and toxic elements in coffees cultivated during the process of conversion to organic agriculture in southwest Bahia, Brazil. Levels of the nutrients and toxic elements were determined in samples of soils and coffee tissues from two transitional organic farms by flame atomic absorption spectrometry (FAAS). The metals in soil samples were extracted by the Mehlich-1 and USEPA-3050 procedures. Coffee samples from both farms presented relatively high levels of Cd, Zn and Cu (0.75, 45.4 and 14.9 μg g⁻¹, respectively), but were still below the limits specified by the Brazilian Food Legislation. The application of statistical methods showed that this finding can be attributed to the addition of high amounts of organic matter during the tree flowering period, which can act on the bioavailability of metal ions in soils. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
An abnormality in neurodevelopment is one of the most robust etiologic hypotheses in schizophrenia (SZ). There is also strong evidence that genetic factors may influence abnormal neurodevelopment in the disease. In SZ patients whose brain structural data had been obtained with magnetic resonance imaging (MRI), the present study evaluated the possible association between structural brain measures and 32 DNA polymorphisms located in 30 genes related to neurogenesis and brain development. DNA was extracted from peripheral blood cells of 25 patients with schizophrenia, genotyping was performed using diverse procedures, and putative associations were evaluated by standard statistical methods (using the software Statistical Package for Social Sciences - SPSS) with a modified Bonferroni adjustment. For reelin (RELN), a protease that guides neurons in the developing brain and underlies neurotransmission and synaptic plasticity in adults, an association was found for a non-synonymous polymorphism (Val997Leu) with left and right ventricular enlargement. A putative association was also found between protocadherin 12 (PCDH12), a cell adhesion molecule involved in axonal guidance and synaptic specificity, and cortical folding (asymmetry coefficient of gyrification index). Although our results are preliminary, due to the small number of individuals analyzed, such an approach could reveal new candidate genes implicated in anomalous neurodevelopment in schizophrenia. (c) 2007 Elsevier Ireland Ltd. All rights reserved.
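The abstract does not say which modified Bonferroni adjustment was applied. As one widely used example of such a procedure, the sketch below implements Holm's step-down adjustment; treating Holm as the relevant variant is purely an assumption, and the function name `holm_adjust` and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so
    on; a running maximum keeps the adjusted values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    # Invented p-values for three hypothetical polymorphism tests.
    raw = [0.01, 0.04, 0.03]
    for p, q in zip(raw, holm_adjust(raw)):
        print(f"raw={p:.3f} adjusted={q:.3f} significant={q <= 0.05}")
```

Holm's procedure controls the family-wise error rate like the plain Bonferroni correction but is never less powerful, which matters when many polymorphisms are tested in a small sample.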
Notions of Class and Gender in the Employment Service Job Descriptions
Abstract:
This article examines whether job descriptions emphasize different characteristics and competences depending on the occupations' social class and gender relations. The study is partly a replication of a similar analysis conducted by Gesser in the 1970s. The purpose is to examine the prevalence of stereotypes in occupational descriptions provided by the Swedish state, and whether the descriptions contribute to class and gender labeling of occupations and, by extension, their practitioners. Previous research has shown that career guidance materials are characterized by notions of the appropriate practitioner's class and gender. In this study we depart from the concept of doxa and argue that stereotypical images of occupations are based on common sense that remains unquestioned. The study draws on a quantitative content analysis of 420 job descriptions analyzed by various statistical methods. The overall results show that there are systematic differences. In general, social class seems to have a greater impact than gender on the kinds of competences that are emphasized in the descriptions. Social skills are emphasized in female-dominated occupations, while physical abilities are highlighted in male-dominated occupations. To some extent, these results are uncontroversial, as the descriptions also portray abilities necessary to do the work in different kinds of occupations.
Abstract:
Background: A test battery consisting of self-assessments and motor tests (tapping and spiral drawing) was developed for a hand computer with touch screen in a telemedicine setting. Objectives: To develop and evaluate a web-based system that delivers decision support information to the treating clinical staff for assessing PD symptoms in their patients based on the test battery data. Methods: The test battery is currently being used in a clinical trial (DAPHNE, EudraCT No. 2005-002654-21) by sixty-five patients with advanced Parkinson's disease (PD) on 9991 test occasions (four tests per day during a total of 362 week-long test periods) at nine clinics around Sweden. Test results are sent continuously from the hand unit over a mobile network to a central computer and processed with statistical methods. They are summarized into scores for different dimensions of the symptom state and an 'overall test score' reflecting the overall condition of the patient during a test period. The information in the web application is organized and presented graphically so that the general overview of the patient's performance per test period is emphasized. The focus is on the overall test score, symptom dimensions and daily summaries. In a recent preliminary user evaluation, the web application was demonstrated to the fifteen study nurses who had used the test battery in the clinical trial. At least one patient per clinic was shown. Results: In general, the responses from nurses were positive. They reported that the test results shown in the system were consistent with their own clinical observations. They could follow complications, changes and trends in their patients. Discussion: In conclusion, the system is able to summarise the various time series of motor test results and self-assessments during test periods and present them in a useful manner. Its main contribution is a novel and reliable way to capture and easily access symptom information from patients' home environment. The convenient access to the current symptom profile as well as symptom history provides a basis for individualized evaluation and adjustment of treatments.