59 resultados para ESTIMATORS
em CentAUR: Central Archive University of Reading - UK
Resumo:
In a recently published paper. spherical nonparametric estimators were applied to feature-track ensembles to determine a range of statistics for the atmospheric features considered. This approach obviates the types of bias normally introduced with traditional estimators. New spherical isotropic kernels with local support were introduced. Ln this paper the extension to spherical nonisotropic kernels with local support is introduced, together with a means of obtaining the shape and smoothing parameters in an objective way. The usefulness of spherical nonparametric estimators based on nonisotropic kernels is demonstrated with an application to an oceanographic feature-track ensemble.
Resumo:
The aim of this paper is essentially twofold: first, to describe the use of spherical nonparametric estimators for determining statistical diagnostic fields from ensembles of feature tracks on a global domain, and second, to report the application of these techniques to data derived from a modern general circulation model. New spherical kernel functions are introduced that are more efficiently computed than the traditional exponential kernels. The data-driven techniques of cross-validation to determine the amount elf smoothing objectively, and adaptive smoothing to vary the smoothing locally, are also considered. Also introduced are techniques for combining seasonal statistical distributions to produce longer-term statistical distributions. Although all calculations are performed globally, only the results for the Northern Hemisphere winter (December, January, February) and Southern Hemisphere winter (June, July, August) cyclonic activity are presented, discussed, and compared with previous studies. Overall, results for the two hemispheric winters are in good agreement with previous studies, both for model-based studies and observational studies.
Resumo:
Two simple and frequently used capture–recapture estimates of the population size are compared: Chao's lower-bound estimate and Zelterman's estimate allowing for contaminated distributions. In the Poisson case it is shown that if there are only counts of ones and twos, the estimator of Zelterman is always bounded above by Chao's estimator. If counts larger than two exist, the estimator of Zelterman is becoming larger than that of Chao's, if only the ratio of the frequencies of counts of twos and ones is small enough. A similar analysis is provided for the binomial case. For a two-component mixture of Poisson distributions the asymptotic bias of both estimators is derived and it is shown that the Zelterman estimator can experience large overestimation bias. A modified Zelterman estimator is suggested and also the bias-corrected version of Chao's estimator is considered. All four estimators are compared in a simulation study.
Resumo:
This paper investigates the applications of capture–recapture methods to human populations. Capture–recapture methods are commonly used in estimating the size of wildlife populations but can also be used in epidemiology and social sciences, for estimating prevalence of a particular disease or the size of the homeless population in a certain area. Here we focus on estimating the prevalence of infectious diseases. Several estimators of population size are considered: the Lincoln–Petersen estimator and its modified version, the Chapman estimator, Chao’s lower bound estimator, the Zelterman’s estimator, McKendrick’s moment estimator and the maximum likelihood estimator. In order to evaluate these estimators, they are applied to real, three-source, capture-recapture data. By conditioning on each of the sources of three source data, we have been able to compare the estimators with the true value that they are estimating. The Chapman and Chao estimators were compared in terms of their relative bias. A variance formula derived through conditioning is suggested for Chao’s estimator, and normal 95% confidence intervals are calculated for this and the Chapman estimator. We then compare the coverage of the respective confidence intervals. Furthermore, a simulation study is included to compare Chao’s and Chapman’s estimator. Results indicate that Chao’s estimator is less biased than Chapman’s estimator unless both sources are independent. Chao’s estimator has also the smaller mean squared error. Finally, the implications and limitations of the above methods are discussed, with suggestions for further development.
Resumo:
This note considers the variance estimation for population size estimators based on capture–recapture experiments. Whereas a diversity of estimators of the population size has been suggested, the question of estimating the associated variances is less frequently addressed. This note points out that the technique of conditioning can be applied here successfully which also allows us to identify sources of variation: the variance due to estimation of the model parameters and the binomial variance due to sampling n units from a population of size N. It is applied to estimators typically used in capture–recapture experiments in continuous time including the estimators of Zelterman and Chao and improves upon previously used variance estimators. In addition, knowledge of the variances associated with the estimators by Zelterman and Chao allows the suggestion of a new estimator as the weighted sum of the two. The decomposition of the variance into the two sources allows also a new understanding of how resampling techniques like the Bootstrap could be used appropriately. Finally, the sample size question for capture–recapture experiments is addressed. Since the variance of population size estimators increases with the sample size, it is suggested to use relative measures such as the observed-to-hidden ratio or the completeness of identification proportion for approaching the question of sample size choice.
Resumo:
Proportion estimators are quite frequently used in many application areas. The conventional proportion estimator (number of events divided by sample size) encounters a number of problems when the data are sparse as will be demonstrated in various settings. The problem of estimating its variance when sample sizes become small is rarely addressed in a satisfying framework. Specifically, we have in mind applications like the weighted risk difference in multicenter trials or stratifying risk ratio estimators (to adjust for potential confounders) in epidemiological studies. It is suggested to estimate p using the parametric family (see PDF for character) and p(1 - p) using (see PDF for character), where (see PDF for character). We investigate the estimation problem of choosing c 0 from various perspectives including minimizing the average mean squared error of (see PDF for character), average bias and average mean squared error of (see PDF for character). The optimal value of c for minimizing the average mean squared error of (see PDF for character) is found to be independent of n and equals c = 1. The optimal value of c for minimizing the average mean squared error of (see PDF for character) is found to be dependent of n with limiting value c = 0.833. This might justifiy to use a near-optimal value of c = 1 in practice which also turns out to be beneficial when constructing confidence intervals of the form (see PDF for character).
Resumo:
Two simple and frequently used capture–recapture estimates of the population size are compared: Chao's lower-bound estimate and Zelterman's estimate allowing for contaminated distributions. In the Poisson case it is shown that if there are only counts of ones and twos, the estimator of Zelterman is always bounded above by Chao's estimator. If counts larger than two exist, the estimator of Zelterman is becoming larger than that of Chao's, if only the ratio of the frequencies of counts of twos and ones is small enough. A similar analysis is provided for the binomial case. For a two-component mixture of Poisson distributions the asymptotic bias of both estimators is derived and it is shown that the Zelterman estimator can experience large overestimation bias. A modified Zelterman estimator is suggested and also the bias-corrected version of Chao's estimator is considered. All four estimators are compared in a simulation study.
Resumo:
This paper investigates the applications of capture-recapture methods to human populations. Capture-recapture methods are commonly used in estimating the size of wildlife populations but can also be used in epidemiology and social sciences, for estimating prevalence of a particular disease or the size of the homeless population in a certain area. Here we focus on estimating the prevalence of infectious diseases. Several estimators of population size are considered: the Lincoln-Petersen estimator and its modified version, the Chapman estimator, Chao's lower bound estimator, the Zelterman's estimator, McKendrick's moment estimator and the maximum likelihood estimator. In order to evaluate these estimators, they are applied to real, three-source, capture-recapture data. By conditioning on each of the sources of three source data, we have been able to compare the estimators with the true value that they are estimating. The Chapman and Chao estimators were compared in terms of their relative bias. A variance formula derived through conditioning is suggested for Chao's estimator, and normal 95% confidence intervals are calculated for this and the Chapman estimator. We then compare the coverage of the respective confidence intervals. Furthermore, a simulation study is included to compare Chao's and Chapman's estimator. Results indicate that Chao's estimator is less biased than Chapman's estimator unless both sources are independent. Chao's estimator has also the smaller mean squared error. Finally, the implications and limitations of the above methods are discussed, with suggestions for further development.
Resumo:
This note considers the variance estimation for population size estimators based on capture–recapture experiments. Whereas a diversity of estimators of the population size has been suggested, the question of estimating the associated variances is less frequently addressed. This note points out that the technique of conditioning can be applied here successfully which also allows us to identify sources of variation: the variance due to estimation of the model parameters and the binomial variance due to sampling n units from a population of size N. It is applied to estimators typically used in capture–recapture experiments in continuous time including the estimators of Zelterman and Chao and improves upon previously used variance estimators. In addition, knowledge of the variances associated with the estimators by Zelterman and Chao allows the suggestion of a new estimator as the weighted sum of the two. The decomposition of the variance into the two sources allows also a new understanding of how resampling techniques like the Bootstrap could be used appropriately. Finally, the sample size question for capture–recapture experiments is addressed. Since the variance of population size estimators increases with the sample size, it is suggested to use relative measures such as the observed-to-hidden ratio or the completeness of identification proportion for approaching the question of sample size choice.
Resumo:
An improved algorithm for the generation of gridded window brightness temperatures is presented. The primary data source is the International Satellite Cloud Climatology Project, level B3 data, covering the period from July 1983 to the present. The algorithm rakes window brightness, temperatures from multiple satellites, both geostationary and polar orbiting, which have already been navigated and normalized radiometrically to the National Oceanic and Atmospheric Administration's Advanced Very High Resolution Radiometer, and generates 3-hourly global images on a 0.5 degrees by 0.5 degrees latitude-longitude grid. The gridding uses a hierarchical scheme based on spherical kernel estimators. As part of the gridding procedure, the geostationary data are corrected for limb effects using a simple empirical correction to the radiances, from which the corrected temperatures are computed. This is in addition to the application of satellite zenith angle weighting to downweight limb pixels in preference to nearer-nadir pixels. The polar orbiter data are windowed on the target time with temporal weighting to account for the noncontemporaneous nature of the data. Large regions of missing data are interpolated from adjacent processed images using a form of motion compensated interpolation based on the estimation of motion vectors using an hierarchical block matching scheme. Examples are shown of the various stages in the process. Also shown are examples of the usefulness of this type of data in GCM validation.
Resumo:
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly more for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.