38 resultados para statistical methods
Resumo:
Airborne high resolution in situ measurements of a large set of trace gases including ozone (O3) and total water (H2O) in the upper troposphere and the lowermost stratosphere (UT/LMS) have been performed above Europe within the SPURT project. SPURT provides an extensive data coverage of the UT/LMS in each season within the time period between November 2001 and July 2003. In the LMS a distinct spring maximum and autumn minimum is observed in O3, whereas its annual cycle in the UT is shifted by 2–3 months later towards the end of the year. The more variable H2O measurements reveal a maximum during summer and a minimum during autumn/winter with no phase shift between the two atmospheric compartments. For a comprehensive insight into trace gas composition and variability in the UT/LMS several statistical methods are applied using chemical, thermal and dynamical vertical coordinates. In particular, 2-dimensional probability distribution functions serve as a tool to transform localised aircraft data to a more comprehensive view of the probed atmospheric region. It appears that both trace gases, O3 and H2O, reveal the most compact arrangement and are best correlated in the view of potential vorticity (PV) and distance to the local tropopause, indicating an advanced mixing state on these surfaces. Thus, strong gradients of PV seem to act as a transport barrier both in the vertical and the horizontal direction. The alignment of trace gas isopleths reflects the existence of a year-round extra-tropical tropopause transition layer. The SPURT measurements reveal that this layer is mainly affected by stratospheric air during winter/spring and by tropospheric air during autumn/summer. Normalised mixing entropy values for O3 and H2O in the LMS appear to be maximal during spring and summer, respectively, indicating highest variability of these trace gases during the respective seasons.
Resumo:
The Homeric epics are among the greatest masterpieces of literature, but when they were produced is not known with certainty. Here we apply evolutionary-linguistic phylogenetic statistical methods to differences in Homeric, Modern Greek and ancient Hittite vocabulary items to estimate a date of approximately 710–760 BCE for these great works. Our analysis compared a common set of vocabulary items among the three pairs of languages, recording for each item whether the words in the two languages were cognate – derived from a shared ancestral word – or not. We then used a likelihood-based Markov chain Monte Carlo procedure to estimate the most probable times in years separating these languages given the percentage of words they shared, combined with knowledge of the rates at which different words change. Our date for the epics is in close agreement with historians' and classicists' beliefs derived from historical and archaeological sources.
Resumo:
Market failure can be corrected using different regulatory approaches ranging from high to low intervention. Recently, classic regulations have been criticized as costly and economically irrational and thus policy makers are giving more consideration to soft regulatory techniques such as information remedies. However, despite the plethora of food information conveyed by different media there appears to be a lack of studies exploring how consumers evaluate this information and how trust towards publishers influence their choices for food information. In order to fill such a gap, this study investigates questions related to topics which are more relevant to consumers, who should disseminate trustful food information, and how communication should be conveyed and segmented. Primary data were collected both through qualitative (in depth interviews and focus groups) and quantitative research (web and mail surveys). Attitudes, willingness to pay for food information and trust towards public and private sources conveying information through a new food magazine were assessed using both multivariate statistical methods and econometric analysis. The study shows that consumer attitudes towards food information topics can be summarized along three cognitive-affective dimensions: the agro-food system, enjoyment and wellness. Information related to health risks caused by nutritional disorders and food safety issues caused by bacteria and chemical substances is the most important for about 90% of respondents. Food information related to regulations and traditions is also considered important for more than two thirds of respondents, while information about food production and processing techniques, life style and food fads are considered less important by the majority of respondents. Trust towards food information disseminated by public bodies is higher than that observed for private bodies. This behavior directly affects willingness to pay (WTP) for food information provided by public and private publishers when markets are shocked by a food safety incident. WTP for consumer association (€ 1.80) and the European Food Safety Authority (€ 1.30) are higher than WTP for the independent and food industry publishers which cluster around zero euro. Furthermore, trust towards the type of publisher also plays a key role in food information market segmentation together with socio-demographic and economic variables such as gender, age, presence of children and income. These findings invite policy makers to reflect on the possibility of using information remedies conveyed using trusted sources of information to specific segments of consumers as an interesting soft alternative to the classic way of regulating modern food markets.
Resumo:
Human brain imaging techniques, such as Magnetic Resonance Imaging (MRI) or Diffusion Tensor Imaging (DTI), have been established as scientific and diagnostic tools and their adoption is growing in popularity. Statistical methods, machine learning and data mining algorithms have successfully been adopted to extract predictive and descriptive models from neuroimage data. However, the knowledge discovery process typically requires also the adoption of pre-processing, post-processing and visualisation techniques in complex data workflows. Currently, a main problem for the integrated preprocessing and mining of MRI data is the lack of comprehensive platforms able to avoid the manual invocation of preprocessing and mining tools, that yields to an error-prone and inefficient process. In this work we present K-Surfer, a novel plug-in of the Konstanz Information Miner (KNIME) workbench, that automatizes the preprocessing of brain images and leverages the mining capabilities of KNIME in an integrated way. K-Surfer supports the importing, filtering, merging and pre-processing of neuroimage data from FreeSurfer, a tool for human brain MRI feature extraction and interpretation. K-Surfer automatizes the steps for importing FreeSurfer data, reducing time costs, eliminating human errors and enabling the design of complex analytics workflow for neuroimage data by leveraging the rich functionalities available in the KNIME workbench.
Resumo:
This paper presents an approximate closed form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds-ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed form formula is reasonably accurate when non-inferiority margins are based on odds-ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed form formula increasingly overestimates the sample size irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed form formula is also reasonably accurate in the cases where the score test closed form formula works well. Outside these scenarios, the Wald test closed form formula can either underestimate or overestimate the sample size, depending on the magnitude of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate for all cases, both approaches lead to satisfactory sample size calculation for non-inferiority trials with binary data where the odds ratio is the parameter of interest.
Resumo:
Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform equally or better than non-linear models with or without the frequently used wind-to-power transformation.
Resumo:
Recruitment of patients to a clinical trial usually occurs over a period of time, resulting in the steady accumulation of data throughout the trial's duration. Yet, according to traditional statistical methods, the sample size of the trial should be determined in advance, and data collected on all subjects before analysis proceeds. For ethical and economic reasons, the technique of sequential testing has been developed to enable the examination of data at a series of interim analyses. The aim is to stop recruitment to the study as soon as there is sufficient evidence to reach a firm conclusion. In this paper we present the advantages and disadvantages of conducting interim analyses in phase III clinical trials, together with the key steps to enable the successful implementation of sequential methods in this setting. Examples are given of completed trials, which have been carried out sequentially, and references to relevant literature and software are provided.
Resumo:
A precipitation downscaling method is presented using precipitation from a general circulation model (GCM) as predictor. The method extends a previous method from monthly to daily temporal resolution. The simplest form of the method corrects for biases in wet-day frequency and intensity. A more sophisticated variant also takes account of flow-dependent biases in the GCM. The method is flexible and simple to implement. It is proposed here as a correction of GCM output for applications where sophisticated methods are not available, or as a benchmark for the evaluation of other downscaling methods. Applied to output from reanalyses (ECMWF, NCEP) in the region of the European Alps, the method is capable of reducing large biases in the precipitation frequency distribution, even for high quantiles. The two variants exhibit similar performances, but the ideal choice of method can depend on the GCM/reanalysis and it is recommended to test the methods in each case. Limitations of the method are found in small areas with unresolved topographic detail that influence higher-order statistics (e.g. high quantiles). When used as benchmark for three regional climate models (RCMs), the corrected reanalysis and the RCMs perform similarly in many regions, but the added value of the latter is evident for high quantiles in some small regions.