32 results for statistic
Abstract:
The paper introduces an efficient construction algorithm for obtaining sparse linear-in-the-weights regression models based on an approach of directly optimizing model generalization capability. This is achieved by utilizing the delete-1 cross-validation concept and the associated leave-one-out test error, also known as the predicted residual sums of squares (PRESS) statistic, without resorting to any other validation data set for model evaluation in the model construction process. Computational efficiency is ensured using an orthogonal forward regression, but the algorithm incrementally minimizes the PRESS statistic instead of the usual sum of the squared training errors. A local regularization method can naturally be incorporated into the model selection procedure to further enforce model sparsity. The proposed algorithm is fully automatic, and the user is not required to specify any criterion to terminate the model construction procedure. Comparisons with some of the existing state-of-the-art modeling methods are given, and several examples are included to demonstrate the ability of the proposed algorithm to effectively construct sparse models that generalize well.
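A minimal illustration of the delete-1 idea, assuming an ordinary linear-in-the-weights least-squares fit: the leave-one-out errors, and hence PRESS, can be obtained from the ordinary residuals and the leverages without refitting the model n times. This is only a sketch of the statistic itself, not of the paper's orthogonal-forward-regression algorithm.

```python
import numpy as np

def press_statistic(Phi, y):
    """PRESS (predicted residual sum of squares) for a linear-in-the-weights
    model y ~ Phi @ w, using the hat-matrix shortcut e_i / (1 - h_ii)."""
    # Least-squares weights and ordinary residuals
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    e = y - Phi @ w
    # Leverages h_ii = diag(Phi (Phi^T Phi)^{-1} Phi^T)
    H = Phi @ np.linalg.solve(Phi.T @ Phi, Phi.T)
    h = np.diag(H)
    loo_errors = e / (1.0 - h)          # delete-1 (leave-one-out) residuals
    return np.sum(loo_errors ** 2)

# Toy usage: 50 samples, 5 candidate regressors (synthetic data)
rng = np.random.default_rng(0)
Phi = rng.normal(size=(50, 5))
y = Phi[:, 0] - 2.0 * Phi[:, 2] + 0.1 * rng.normal(size=50)
print(press_statistic(Phi, y))
```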
Abstract:
This letter introduces a new robust nonlinear identification algorithm using the Predicted REsidual Sums of Squares (PRESS) statistic and forward regression. The major contribution is to compute the PRESS statistic within the framework of a forward orthogonalization process and hence to construct a model with good generalization properties. Based on the properties of the PRESS statistic, the proposed algorithm achieves a fully automated procedure without resorting to any other validation data set for iterative model evaluation.
Abstract:
An automatic nonlinear predictive model-construction algorithm is introduced based on forward regression and the predicted-residual-sums-of-squares (PRESS) statistic. The proposed algorithm is based on the fundamental concept of evaluating a model's generalisation capability through cross-validation. This is achieved by using the PRESS statistic as a cost function to optimise model structure. In particular, the proposed algorithm is developed with the aim of achieving computational efficiency, such that the computational effort, which would usually be extensive in the computation of the PRESS statistic, is reduced or minimised. The computation of PRESS is simplified by avoiding a matrix inversion through the use of the orthogonalisation procedure inherent in forward regression, and is further reduced significantly by the introduction of a forward-recursive formula. Based on the properties of the PRESS statistic, the proposed algorithm achieves a fully automated procedure without resorting to any other validation data set for iterative model evaluation. Numerical examples are used to demonstrate the efficacy of the algorithm.
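The forward, PRESS-driven selection loop can be caricatured as follows; this hypothetical sketch omits the orthogonalisation and forward-recursive PRESS updates that make the published algorithm efficient, and simply refits at every step.

```python
import numpy as np

def press(Phi, y):
    """PRESS via the hat-matrix identity (see the earlier sketch)."""
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    e = y - Phi @ w
    h = np.diag(Phi @ np.linalg.solve(Phi.T @ Phi, Phi.T))
    return np.sum((e / (1.0 - h)) ** 2)

def forward_select_by_press(Phi, y):
    """Add regressors one at a time while PRESS keeps decreasing.
    Termination is automatic: stop when no candidate lowers PRESS."""
    n, m = Phi.shape
    selected, best_press = [], np.inf
    while True:
        scores = {j: press(Phi[:, selected + [j]], y)
                  for j in range(m) if j not in selected}
        if not scores:
            break
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best_press:
            break                      # PRESS stopped improving: terminate
        selected.append(j_best)
        best_press = scores[j_best]
    return selected, best_press

# Toy usage on synthetic data with two genuinely relevant regressors
rng = np.random.default_rng(1)
Phi = rng.normal(size=(80, 10))
y = 1.5 * Phi[:, 3] - Phi[:, 7] + 0.2 * rng.normal(size=80)
print(forward_select_by_press(Phi, y))
```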
Abstract:
The clustering in time (seriality) of extratropical cyclones is responsible for large cumulative insured losses in western Europe, though surprisingly little scientific attention has been given to this important property. This study investigates and quantifies the seriality of extratropical cyclones in the Northern Hemisphere using a point-process approach. A possible mechanism for serial clustering is the time-varying effect of the large-scale flow on individual cyclone tracks. Another mechanism is the generation by one parent cyclone of one or more offspring through secondary cyclogenesis. A long cyclone-track database was constructed for extended October-March winters from 1950 to 2003 using 6-h analyses of 850-mb relative vorticity derived from the NCEP-NCAR reanalysis. A dispersion statistic based on the variance-to-mean ratio of monthly cyclone counts was used as a measure of clustering. It reveals extensive regions of statistically significant clustering in the European exit region of the North Atlantic storm track and over the central North Pacific. Monthly cyclone counts were regressed on time-varying teleconnection indices with a log-linear Poisson model. Five independent teleconnection patterns were found to be significant factors over Europe: the North Atlantic Oscillation (NAO), the east Atlantic pattern, the Scandinavian pattern, the east Atlantic-western Russian pattern, and the polar Eurasian pattern. The NAO alone is not sufficient for explaining the variability of cyclone counts in the North Atlantic region and western Europe. Rate dependence on time-varying teleconnection indices accounts for the variability in monthly cyclone counts, so a cluster process did not need to be invoked.
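A sketch of a variance-to-mean dispersion test on synthetic monthly counts, assuming the standard result that (n - 1) s^2 / x̄ is approximately chi-squared with n - 1 degrees of freedom under a homogeneous Poisson null; the cyclone-track database and the regression on teleconnection indices are not reproduced here.

```python
import numpy as np
from scipy import stats

def dispersion_test(counts):
    """Variance-to-mean ratio of counts and a p-value for clustering,
    using (n-1) * s^2 / mean ~ chi2(n-1) under a homogeneous Poisson null."""
    counts = np.asarray(counts, dtype=float)
    n, mean, var = counts.size, counts.mean(), counts.var(ddof=1)
    ratio = var / mean
    chi2 = (n - 1) * ratio
    p_value = stats.chi2.sf(chi2, df=n - 1)   # one-sided: over-dispersion
    return ratio, p_value

# Synthetic example: clustered (negative-binomial) monthly counts
rng = np.random.default_rng(2)
counts = rng.negative_binomial(n=5, p=0.5, size=54 * 6)  # 54 winters x 6 months
print(dispersion_test(counts))
```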
Abstract:
A novel statistic for local wave amplitude of the 500-hPa geopotential height field is introduced. The statistic uses a Hilbert transform to define a longitudinal wave envelope and dynamical latitude weighting to define the latitudes of interest. Here it is used to detect the existence, or otherwise, of multimodality in its distribution function. The empirical distribution function for the 1960-2000 period is close to a Weibull distribution with shape parameters between 2 and 3. There is substantial interdecadal variability but no apparent local multimodality or bimodality. The zonally averaged wave amplitude, akin to the more usual wave amplitude index, is close to being normally distributed. This is consistent with the central limit theorem, which applies to the construction of the wave amplitude index. For the period 1960-70 it is found that there is apparent bimodality in this index. However, the different amplitudes are realized at different longitudes, so there is no bimodality at any single longitude. As a corollary, it is found that many commonly used statistics to detect multimodality in atmospheric fields potentially satisfy the assumptions underlying the central limit theorem and can therefore only show approximately normal distributions. The author concludes that these techniques may therefore be suboptimal for detecting any multimodality.
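The wave-envelope construction can be illustrated with SciPy's Hilbert transform applied along longitude; the synthetic one-latitude field below and the omission of the dynamical latitude weighting are assumptions of this sketch.

```python
import numpy as np
from scipy.signal import hilbert

# Synthetic 500-hPa-like anomaly on one latitude circle, 144 longitudes:
# a zonal wavenumber-5 packet localised near 180 degrees
lons = np.linspace(0.0, 2.0 * np.pi, 144, endpoint=False)
z_anom = 120.0 * np.cos(5 * lons) * np.exp(-((lons - np.pi) ** 2))

# Local wave amplitude = magnitude of the analytic signal along longitude
envelope = np.abs(hilbert(z_anom))
print(envelope.max(), lons[envelope.argmax()])
```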
Abstract:
While over-dispersion in capture–recapture studies is well known to lead to poor estimation of population size, current diagnostic tools to detect the presence of heterogeneity have not been specifically developed for capture–recapture studies. To address this, a simple and efficient method of testing for over-dispersion in zero-truncated count data is developed and evaluated. The proposed method generalizes an over-dispersion test previously suggested for un-truncated count data and may also be used for testing residual over-dispersion in zero-inflated data. Simulations suggest that the asymptotic distribution of the test statistic is standard normal and that this approximation is also reasonable for small sample sizes. The method is also shown to be more efficient than an existing test for over-dispersion adapted for the capture–recapture setting. Studies with zero-truncated and zero-inflated count data are used to illustrate the test procedures.
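For orientation only, a generic over-dispersion statistic of the Böhning type for un-truncated Poisson counts, O = sqrt((n - 1) / 2) * (s^2 / x̄ - 1), which is approximately standard normal under the Poisson null; the zero-truncated generalisation developed in the paper is not reproduced here.

```python
import numpy as np
from scipy import stats

def overdispersion_test(counts):
    """Bohning-type test for over-dispersion in (un-truncated) Poisson counts:
    O = sqrt((n-1)/2) * (s^2 / xbar - 1) is ~ N(0, 1) under the Poisson null."""
    counts = np.asarray(counts, dtype=float)
    n, xbar, s2 = counts.size, counts.mean(), counts.var(ddof=1)
    O = np.sqrt((n - 1) / 2.0) * (s2 / xbar - 1.0)
    return O, stats.norm.sf(O)          # one-sided p-value

rng = np.random.default_rng(3)
print(overdispersion_test(rng.poisson(3.0, size=200)))           # Poisson: should not reject
print(overdispersion_test(rng.negative_binomial(3, 0.4, 200)))   # over-dispersed counts
```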
Abstract:
The skill of numerical Lagrangian drifter trajectories in three numerical models is assessed by comparing these numerically obtained paths to the trajectories of drifting buoys in the real ocean. The skill assessment is performed using the two-sample Kolmogorov–Smirnov statistical test. To demonstrate the assessment procedure, it is applied to three different models of the Agulhas region. The test can either be performed using crossing positions of one-dimensional sections in order to test model performance in specific locations, or using the total two-dimensional data set of trajectories. The test yields four quantities: a binary decision of model skill, a confidence level which can be used as a measure of goodness-of-fit of the model, a test statistic which can be used to determine the sensitivity of the confidence level, and cumulative distribution functions that aid in the qualitative analysis. The ordering of models by their confidence levels is the same as the ordering based on the qualitative analysis, which suggests that the method is suited for model validation. Only one of the three models, a 1/10° two-way nested regional ocean model, might have skill in the Agulhas region. The other two models, a 1/2° global model and a 1/8° assimilative model, might have skill only on some sections in the region.
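The two-sample Kolmogorov–Smirnov comparison of observed and simulated crossing positions can be run directly with SciPy; the positions below are synthetic stand-ins for the buoy and model trajectories.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
# Crossing positions (e.g. latitude at a fixed longitude section), synthetic
obs_crossings = rng.normal(loc=-34.0, scale=2.0, size=200)    # drifting buoys
model_crossings = rng.normal(loc=-33.2, scale=2.6, size=500)  # numerical drifters

stat, p_value = ks_2samp(obs_crossings, model_crossings)
has_skill = p_value > 0.05   # binary skill decision at the 5% level
print(stat, p_value, has_skill)
```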
Abstract:
We introduce a technique for assessing the diurnal development of convective storm systems based on outgoing longwave radiation fields. Using the size distribution of the storms measured from a series of images, we generate an array in the lengthscale-time domain based on the standard score statistic. It demonstrates succinctly the size evolution of storms as well as the dissipation kinematics. It also provides evidence related to the temperature evolution of the cloud tops. We apply this approach to a test case comparing observations made by the Geostationary Earth Radiation Budget instrument to output from the Met Office Unified Model run at two resolutions. The 12 km resolution model produces peak convective activity on all lengthscales significantly earlier in the day than shown by the observations, and shows no evidence of storms growing in size. The 4 km resolution model shows realistic timing and growth evolution, although the dissipation mechanism still differs from the observed data.
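A sketch of the standard-score (z-score) normalisation over a lengthscale-time array, using synthetic storm counts in place of the GERB observations and Unified Model output.

```python
import numpy as np

rng = np.random.default_rng(5)
# Rows: length scales, columns: time of day (hourly); synthetic storm counts
counts = rng.poisson(lam=10.0, size=(20, 24)).astype(float)

# Standard score per length scale: how anomalous each hour is for that scale
z = (counts - counts.mean(axis=1, keepdims=True)) \
    / counts.std(axis=1, keepdims=True, ddof=1)
peak_hour = z.argmax(axis=1)    # hour of peak convective activity per scale
print(peak_hour)
```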
Abstract:
This article explores how data envelopment analysis (DEA), along with a smoothed bootstrap method, can be used in applied analysis to obtain more reliable efficiency rankings for farms. The main focus is the smoothed homogeneous bootstrap procedure introduced by Simar and Wilson (1998) to implement statistical inference for the original efficiency point estimates. Two main model specifications, constant and variable returns to scale, are investigated along with various choices regarding data aggregation. The coefficient of separation (CoS), a statistic that indicates the degree of statistical differentiation within the sample, is used to demonstrate the findings. The CoS suggests a substantive dependency of the results on the methodology and assumptions employed. Accordingly, some observations are made on how to conduct DEA in order to get more reliable efficiency rankings, depending on the purpose for which they are to be used. In addition, attention is drawn to the ability of the SLICE MODEL, implemented in GAMS, to enable researchers to overcome the computational burdens of conducting DEA (with bootstrapping).
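A minimal input-oriented, constant-returns-to-scale DEA efficiency score solved as a linear programme, with made-up farm inputs and outputs; the smoothed bootstrap, the variable-returns specification and the CoS statistic are omitted from this sketch.

```python
import numpy as np
from scipy.optimize import linprog

def dea_crs_efficiency(X, Y, k):
    """Input-oriented CRS (CCR) efficiency of unit k.
    X: (n_units, n_inputs), Y: (n_units, n_outputs).
    Solve: min theta  s.t.  sum_j lam_j x_j <= theta x_k,  sum_j lam_j y_j >= y_k."""
    n, m = X.shape
    s = Y.shape[1]
    # Decision variables: [theta, lam_1, ..., lam_n]
    c = np.r_[1.0, np.zeros(n)]
    # Input constraints:  X^T lam - theta x_k <= 0
    A_in = np.hstack([-X[k].reshape(m, 1), X.T])
    # Output constraints: -Y^T lam <= -y_k
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -Y[k]]
    bounds = [(None, None)] + [(0.0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0]

rng = np.random.default_rng(6)
X = rng.uniform(1.0, 10.0, size=(25, 2))   # e.g. land, labour (hypothetical)
Y = rng.uniform(1.0, 10.0, size=(25, 1))   # e.g. farm output (hypothetical)
print([round(dea_crs_efficiency(X, Y, k), 3) for k in range(5)])
```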
Abstract:
Two models for predicting Septoria tritici on winter wheat (cv. Riband) were developed using a program based on an iterative search of correlations between disease severity and weather. Data from four consecutive cropping seasons (1993/94 to 1996/97) at nine sites throughout England were used. A qualitative model predicted the presence or absence of Septoria tritici (at a 5% severity threshold within the top three leaf layers) using winter temperature (January/February) and wind speed up to about the first node detectable growth stage. For sites above the disease threshold, a quantitative model predicted severity of Septoria tritici using rainfall during stem elongation. A test statistic was derived to test the validity of the iterative search used to obtain both models. This statistic was used in combination with bootstrap analyses in which the search program was rerun using weather data from previous years, and therefore uncorrelated with the disease data, to investigate how likely correlations such as the ones found in our models would have been in the absence of genuine relationships.
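The validity check described, rerunning the search on weather data that cannot be related to the disease data, amounts to building a null distribution for the best correlation the search can find; a hypothetical sketch using random permutations in place of previous years' weather.

```python
import numpy as np

def best_abs_corr(severity, weather):
    """Surrogate for the iterative search: best |correlation| over candidate weather variables."""
    return max(abs(np.corrcoef(severity, w)[0, 1]) for w in weather)

rng = np.random.default_rng(7)
n_sites, n_vars = 36, 40                     # site-seasons x candidate weather windows
weather = rng.normal(size=(n_vars, n_sites))
severity = 0.6 * weather[5] + rng.normal(size=n_sites)   # one genuine relationship

observed = best_abs_corr(severity, weather)
# Null distribution: best correlation achievable when weather is unrelated to disease
null = [best_abs_corr(rng.permutation(severity), weather) for _ in range(999)]
p_value = (1 + sum(b >= observed for b in null)) / (1 + len(null))
print(observed, p_value)
```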
Abstract:
Most statistical methodology for phase III clinical trials focuses on the comparison of a single experimental treatment with a control. An increasing desire to reduce the time before regulatory approval of a new drug is sought has led to the development of two-stage or sequential designs for trials that combine the definitive analysis associated with phase III with the treatment selection element of a phase II study. In this paper we consider a trial in which the most promising of a number of experimental treatments is selected at the first interim analysis. This considerably reduces the computational load associated with the construction of stopping boundaries compared to the approach proposed by Follmann, Proschan and Geller (Biometrics 1994; 50: 325-336). The computational requirement does not exceed that for the sequential comparison of a single experimental treatment with a control. Existing methods are extended in two ways. First, the use of the efficient score as a test statistic makes the analysis of binary, normal or failure-time data, as well as adjustment for covariates or stratification, straightforward. Second, the question of trial power is also considered, enabling the determination of the sample size required to give specified power. Copyright © 2003 John Wiley & Sons, Ltd.
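For binary outcomes, the efficient score and Fisher information for the log-odds ratio take a simple closed form in the usual Whitehead-style sequential formulation; the sketch below shows that standard form, not necessarily the exact quantities used in this paper.

```python
def efficient_score_binary(s_t, n_t, s_c, n_c):
    """Efficient score Z and Fisher information V for the log-odds ratio
    (Whitehead parametrisation) from a 2x2 table at an interim analysis."""
    N = n_t + n_c
    S = s_t + s_c          # total successes
    F = N - S              # total failures
    Z = (n_c * s_t - n_t * s_c) / N
    V = n_t * n_c * S * F / N**3
    return Z, V

# Interim look: 30/60 successes on the selected treatment vs 20/60 on control
Z, V = efficient_score_binary(30, 60, 20, 60)
print(Z, V, Z / V**0.5)    # Z / sqrt(V) is the usual standardised statistic
```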
Abstract:
There is increasing interest in combining Phases II and III of clinical development into a single trial in which one of a small number of competing experimental treatments is ultimately selected and where a valid comparison is made between this treatment and the control treatment. Such a trial usually proceeds in stages, with the least promising experimental treatments dropped as soon as possible. In this paper we present a highly flexible design that uses adaptive group sequential methodology to monitor an order statistic. By using this approach, it is possible to design a trial which can have any number of stages, begins with any number of experimental treatments, and permits any number of these to continue at any stage. The test statistic used is based upon efficient scores, so the method can be easily applied to binary, ordinal, failure time, or normally distributed outcomes. The method is illustrated with an example, and simulations are conducted to investigate its type I error rate and power under a range of scenarios.
Abstract:
Sequential methods provide a formal framework by which clinical trial data can be monitored as they accumulate. The results from interim analyses can be used either to modify the design of the remainder of the trial or to stop the trial as soon as sufficient evidence of either the presence or absence of a treatment effect is available. The circumstances under which the trial will be stopped with a claim of superiority for the experimental treatment must, however, be determined in advance so as to control the overall type I error rate. One approach to calculating the stopping rule is the group-sequential method. A relatively recent alternative to group-sequential approaches is the adaptive design method. This latter approach provides considerable flexibility in changes to the design of a clinical trial at an interim point. However, a criticism is that the method by which evidence from different parts of the trial is combined means that a final comparison of treatments is not based on a sufficient statistic for the treatment difference, suggesting that the method may lack power. The aim of this paper is to compare two adaptive design approaches with the group-sequential approach. We first compare the form of the stopping boundaries obtained using the different methods. We then focus on a comparison of the power of the different trials when they are designed so as to be as similar as possible. We conclude that all methods acceptably control the type I error rate and power when the sample size is modified based on a variance estimate, provided no interim analysis is so small that the asymptotic properties of the test statistic no longer hold. In the latter case, the group-sequential approach is to be preferred. Provided that asymptotic assumptions hold, the adaptive design approaches control the type I error rate even if the sample size is adjusted on the basis of an estimate of the treatment effect, showing that the adaptive designs allow more modifications than the group-sequential method.
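One widely used adaptive combination rule, not necessarily the one examined here, is the weighted inverse-normal combination of stage-wise z-statistics with pre-specified weights, which preserves the type I error rate even when the second-stage sample size is changed at the interim; a minimal sketch.

```python
import numpy as np
from scipy import stats

def inverse_normal_combination(z1, z2, n1, n2_planned):
    """Combine stage-wise z-statistics with pre-specified weights proportional
    to the square roots of the *planned* stage sample sizes.
    The weights stay fixed even if the stage-2 sample size is later modified."""
    w1 = np.sqrt(n1 / (n1 + n2_planned))
    w2 = np.sqrt(n2_planned / (n1 + n2_planned))
    z_comb = w1 * z1 + w2 * z2          # ~ N(0, 1) under the null
    return z_comb, stats.norm.sf(z_comb)   # one-sided p-value

print(inverse_normal_combination(z1=1.2, z2=2.1, n1=100, n2_planned=100))
```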
Abstract:
A score test is developed for binary clinical trial data, which incorporates patient non-compliance while respecting randomization. It is assumed in this paper that compliance is all-or-nothing, in the sense that a patient either accepts all of the treatment assigned as specified in the protocol, or none of it. Direct analytic comparisons of the adjusted test statistic for both the score test and the likelihood ratio test are made with the corresponding test statistics that adhere to the intention-to-treat principle. It is shown that no gain in power over the intention-to-treat analysis is possible by adjusting for patient non-compliance. Sample size formulae are derived, and simulation studies are used to demonstrate that the sample size approximation holds. Copyright © 2003 John Wiley & Sons, Ltd.
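For context on the sample-size discussion, the standard (intention-to-treat style) approximation for comparing two proportions is sketched below; the paper's compliance-adjusted formulae are not reproduced.

```python
from scipy import stats

def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.9):
    """Approximate sample size per arm for a two-sided test of two proportions."""
    z_a = stats.norm.isf(alpha / 2)        # critical value for two-sided alpha
    z_b = stats.norm.isf(1 - power)        # quantile for the required power
    var = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return (z_a + z_b) ** 2 * var / (p_control - p_treatment) ** 2

print(round(n_per_arm(0.30, 0.45)))   # roughly 214 patients per arm for 30% vs 45%
```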