49 results for ratio estimators
in CentAUR: Central Archive University of Reading - UK
Abstract:
Proportion estimators are quite frequently used in many application areas. The conventional proportion estimator (number of events divided by sample size) encounters a number of problems when the data are sparse, as will be demonstrated in various settings. The problem of estimating its variance when sample sizes become small is rarely addressed in a satisfying framework. Specifically, we have in mind applications like the weighted risk difference in multicenter trials or stratifying risk ratio estimators (to adjust for potential confounders) in epidemiological studies. It is suggested to estimate p using the parametric family (see PDF for character) and p(1 - p) using (see PDF for character), where (see PDF for character). We investigate the estimation problem of choosing c 0 from various perspectives including minimizing the average mean squared error of (see PDF for character), average bias and average mean squared error of (see PDF for character). The optimal value of c for minimizing the average mean squared error of (see PDF for character) is found to be independent of n and equals c = 1. The optimal value of c for minimizing the average mean squared error of (see PDF for character) is found to be dependent on n with limiting value c = 0.833. This might justify using a near-optimal value of c = 1 in practice, which also turns out to be beneficial when constructing confidence intervals of the form (see PDF for character).
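The formulas in this abstract did not survive extraction, so the family it refers to cannot be read from the text. Purely as an illustration, the sketch below assumes a common additive-smoothing form, (X + c) / (n + 2c), which reduces to the conventional estimator at c = 0; this form, the function name, and the example numbers are assumptions and may differ from the paper's definition.

```python
def smoothed_proportion(events: int, n: int, c: float = 1.0) -> float:
    """Assumed illustrative form (events + c) / (n + 2c); not taken from the source."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    if c < 0:
        raise ValueError("c must be non-negative")
    return (events + c) / (n + 2 * c)


# Sparse data: zero events in ten trials.
print(smoothed_proportion(0, 10, c=0.0))  # conventional estimator gives 0.0
print(smoothed_proportion(0, 10, c=1.0))  # smoothed estimate stays away from 0
```

Under this assumed form, a positive c keeps the estimate strictly between 0 and 1 even when no events (or only events) are observed, which is the sparse-data situation the abstract describes.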
Abstract:
Two simple and frequently used capture–recapture estimates of the population size are compared: Chao's lower-bound estimate and Zelterman's estimate allowing for contaminated distributions. In the Poisson case it is shown that if there are only counts of ones and twos, Zelterman's estimator is always bounded above by Chao's estimator. If counts larger than two exist, Zelterman's estimator becomes larger than Chao's, provided that the ratio of the frequencies of counts of twos and ones is small enough. A similar analysis is provided for the binomial case. For a two-component mixture of Poisson distributions the asymptotic bias of both estimators is derived, and it is shown that the Zelterman estimator can experience large overestimation bias. A modified Zelterman estimator is suggested, and the bias-corrected version of Chao's estimator is also considered. All four estimators are compared in a simulation study.
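As a rough illustration of the comparison described above, the sketch below uses the standard textbook forms of the two estimators, which the abstract itself does not spell out: Chao's lower bound n + f1^2 / (2 f2) and Zelterman's n / (1 - exp(-2 f2 / f1)), where f1 and f2 are the numbers of units counted exactly once and exactly twice and n is the number of observed units. The function names and the example frequencies are hypothetical.

```python
import math


def chao_lower_bound(n: int, f1: int, f2: int) -> float:
    """Chao's lower-bound estimate: n + f1^2 / (2 * f2)."""
    if f2 <= 0:
        raise ValueError("f2 must be positive for this form of the estimator")
    return n + f1 ** 2 / (2 * f2)


def zelterman(n: int, f1: int, f2: int) -> float:
    """Zelterman's estimate: n / (1 - exp(-lam)) with lam = 2 * f2 / f1."""
    if f1 <= 0 or f2 <= 0:
        raise ValueError("f1 and f2 must be positive")
    lam = 2 * f2 / f1
    return n / (1 - math.exp(-lam))


# Hypothetical data containing only counts of one and two, so n = f1 + f2.
f1, f2 = 10, 5
n = f1 + f2
print(chao_lower_bound(n, f1, f2))  # 25.0
print(zelterman(n, f1, f2))         # smaller than Chao's value for these data
```

For these made-up frequencies the Zelterman value lies below Chao's, in line with the ones-and-twos case stated in the abstract.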
Abstract:
This note considers the variance estimation for population size estimators based on capture–recapture experiments. Whereas a diversity of estimators of the population size has been suggested, the question of estimating the associated variances is less frequently addressed. This note points out that the technique of conditioning can be applied here successfully, which also allows us to identify sources of variation: the variance due to estimation of the model parameters and the binomial variance due to sampling n units from a population of size N. It is applied to estimators typically used in capture–recapture experiments in continuous time, including the estimators of Zelterman and Chao, and improves upon previously used variance estimators. In addition, knowledge of the variances associated with the estimators by Zelterman and Chao allows the suggestion of a new estimator as the weighted sum of the two. The decomposition of the variance into the two sources also allows a new understanding of how resampling techniques like the bootstrap could be used appropriately. Finally, the sample size question for capture–recapture experiments is addressed. Since the variance of population size estimators increases with the sample size, it is suggested to use relative measures such as the observed-to-hidden ratio or the completeness of identification proportion for approaching the question of sample size choice.
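The conditioning argument can be written, in generic notation that is not taken from the note itself, as the law of total variance applied to a population size estimator $\hat N$ given the number $n$ of observed units:

```latex
\operatorname{Var}(\hat N)
  = \operatorname{E}_n\!\left[\operatorname{Var}(\hat N \mid n)\right]
  + \operatorname{Var}_n\!\left(\operatorname{E}[\hat N \mid n]\right)
```

Read this way, the first term would collect the variation due to estimating model parameters for fixed $n$, and the second the binomial variation of $n$ itself. How each term is evaluated for the Zelterman and Chao estimators is not given in the abstract.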
Abstract:
Statistical graphics are a fundamental, yet often overlooked, set of components in the repertoire of data analytic tools. Graphs are quick and efficient, yet simple instruments of preliminary exploration of a dataset to understand its structure and to provide insight into influential aspects of inference such as departures from assumptions and latent patterns. In this paper, we present and assess a graphical device for choosing a method for estimating population size in capture–recapture studies of closed populations. The basic concept is derived from a homogeneous Poisson distribution, where the ratios of neighboring Poisson probabilities multiplied by the value of the larger neighbor count are constant. This property extends to the zero-truncated Poisson distribution, which is of fundamental importance in capture–recapture studies. In practice, however, this distributional property is often violated. The graphical device developed here, the ratio plot, can be used for assessing specific departures from a Poisson distribution. For example, simple contaminations of an otherwise homogeneous Poisson model can be easily detected and a robust estimator for the population size can be suggested. Several robust estimators are developed and a simulation study is provided to give some guidance on which should be used in practice. More systematic departures can also easily be detected using the ratio plot. In this paper, the focus is on Gamma mixtures of the Poisson distribution, which lead to a linear pattern (called structured heterogeneity) in the ratio plot. More generally, the paper shows that the ratio plot is monotone for arbitrary mixtures of power series densities.
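The constancy property behind the ratio plot can be checked numerically. The sketch below computes r(x) = (x + 1) f(x + 1) / f(x) from a table of frequencies or probabilities; for a Poisson distribution this equals the rate parameter at every x, and the normalising constant of the zero-truncated version cancels in the ratio. The function names and the rate 1.5 are illustrative choices, not taken from the paper.

```python
import math


def ratio_plot_values(freqs: dict) -> dict:
    """Return r(x) = (x + 1) * f(x + 1) / f(x) wherever f(x) > 0 and f(x + 1) is known."""
    ratios = {}
    for x, fx in freqs.items():
        fx1 = freqs.get(x + 1)
        if fx1 is not None and fx > 0:
            ratios[x] = (x + 1) * fx1 / fx
    return ratios


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability of the count x."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


lam = 1.5
# Counts start at 1 because zero counts are unobserved in capture-recapture data.
probs = {x: poisson_pmf(x, lam) for x in range(1, 6)}
print(ratio_plot_values(probs))  # every value is (numerically) the rate 1.5
```

With observed frequencies in place of exact probabilities, a roughly flat sequence would be consistent with a homogeneous Poisson model, while a trend would point to the kinds of departure the abstract discusses.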
Abstract:
In a recently published paper, spherical nonparametric estimators were applied to feature-track ensembles to determine a range of statistics for the atmospheric features considered. This approach obviates the types of bias normally introduced with traditional estimators. New spherical isotropic kernels with local support were introduced. In this paper the extension to spherical nonisotropic kernels with local support is introduced, together with a means of obtaining the shape and smoothing parameters in an objective way. The usefulness of spherical nonparametric estimators based on nonisotropic kernels is demonstrated with an application to an oceanographic feature-track ensemble.
Abstract:
The aim of this paper is essentially twofold: first, to describe the use of spherical nonparametric estimators for determining statistical diagnostic fields from ensembles of feature tracks on a global domain, and second, to report the application of these techniques to data derived from a modern general circulation model. New spherical kernel functions are introduced that are more efficiently computed than the traditional exponential kernels. The data-driven techniques of cross-validation to determine the amount of smoothing objectively, and adaptive smoothing to vary the smoothing locally, are also considered. Also introduced are techniques for combining seasonal statistical distributions to produce longer-term statistical distributions. Although all calculations are performed globally, only the results for the Northern Hemisphere winter (December, January, February) and Southern Hemisphere winter (June, July, August) cyclonic activity are presented, discussed, and compared with previous studies. Overall, results for the two hemispheric winters are in good agreement with previous studies, both for model-based studies and observational studies.
Abstract:
Climate model simulations consistently show that in response to greenhouse gas forcing surface temperatures over land increase more rapidly than over sea. The enhanced warming over land is not simply a transient effect, since it is also present in equilibrium conditions. We examine 20 models from the IPCC AR4 database. The global land/sea warming ratio varies in the range 1.36–1.84, independent of global mean temperature change. In the presence of increasing radiative forcing, the warming ratio for a single model is fairly constant in time, implying that the land/sea temperature difference increases with time. The warming ratio varies with latitude, with a minimum in equatorial latitudes, and maxima in the subtropics. A simple explanation for these findings is provided, and comparisons are made with observations. For the low-latitude (40°S–40°N) mean, the models suggest a warming ratio of 1.51 ± 0.13, while recent observations suggest a ratio of 1.54 ± 0.09.
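The step from a constant warming ratio to a growing land/sea temperature difference can be made explicit. The notation is introduced here for illustration only: $\phi$ is the warming ratio, and $\Delta T_L$ and $\Delta T_S$ are the land and sea temperature changes.

```latex
\phi = \frac{\Delta T_L}{\Delta T_S}
\quad\Longrightarrow\quad
\Delta T_L - \Delta T_S = (\phi - 1)\,\Delta T_S
```

With $\phi$ roughly constant and greater than 1 (the reported range is 1.36–1.84), the difference grows in proportion to the sea warming $\Delta T_S$.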
Abstract:
Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly more for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. 
The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.
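For readers unfamiliar with Matheron's estimator, the sketch below gives its usual method-of-moments form, gamma(h) = sum of squared differences between values a lag h apart, divided by twice the number of pairs. It is restricted to equally spaced one-dimensional data for brevity, whereas the study works with two-dimensional random fields; the function name and the toy transect are hypothetical.

```python
def matheron_semivariance(z, lag):
    """Method-of-moments semivariance at an integer lag for equally spaced 1-D data."""
    if lag < 1 or lag >= len(z):
        raise ValueError("lag must be between 1 and len(z) - 1")
    squared_diffs = [(z[i] - z[i + lag]) ** 2 for i in range(len(z) - lag)]
    return sum(squared_diffs) / (2 * len(squared_diffs))


transect = [1.0, 2.0, 3.0, 4.0]
print(matheron_semivariance(transect, 1))  # 0.5
print(matheron_semivariance(transect, 2))  # 2.0

# Replacing one value with an outlier inflates the estimate at the same lag.
contaminated = [1.0, 2.0, 30.0, 4.0]
print(matheron_semivariance(contaminated, 1))
```

Because every squared difference enters the average with equal weight, a single extreme value can dominate the estimate, which is the sensitivity that motivates the robust alternatives mentioned in the abstract.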
Abstract:
The use of special units for logarithmic ratio quantities is reviewed. The neper is used with a natural logarithm (logarithm to the base e) to express the logarithm of the amplitude ratio of two pure sinusoidal signals, particularly in the context of linear systems where it is desired to represent the gain or loss in amplitude of a single-frequency signal between the input and output. The bel, and its more commonly used submultiple, the decibel, are used with a decadic logarithm (logarithm to the base 10) to measure the ratio of two power-like quantities, such as a mean square signal or a mean square sound pressure in acoustics. Thus two distinctly different quantities are involved. In this review we define the quantities first, without reference to the units, as is standard practice in any system of quantities and units. We show that two different definitions of the quantity power level, or logarithmic power ratio, are possible. We show that this leads to two different interpretations for the meaning and numerical values of the units bel and decibel. We review the question of which of these alternative definitions is actually used, or is used by implication, by workers in the field. Finally, we discuss the relative advantages of the alternative definitions.
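The two kinds of logarithmic ratio described above can be sketched as follows: a level in nepers as the natural logarithm of an amplitude ratio, and a level in decibels as ten times the decadic logarithm of a power ratio. This follows the common convention only; the review notes that alternative definitions of power level exist, and the function names here are hypothetical.

```python
import math


def level_in_nepers(amplitude: float, reference: float) -> float:
    """Natural logarithm of an amplitude ratio, expressed in nepers."""
    return math.log(amplitude / reference)


def level_in_decibels(power: float, reference: float) -> float:
    """Ten times the base-10 logarithm of a power ratio, expressed in decibels."""
    return 10 * math.log10(power / reference)


print(level_in_nepers(math.e, 1.0))   # amplitude ratio of e corresponds to 1 Np
print(level_in_decibels(100.0, 1.0))  # power ratio of 100 corresponds to 20 dB
```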
Abstract:
Comparison between observed and calculated infrared band contours has been made to determine the vibrational transition moment ratio |M10/M9| for the Coriolis interacting ν9 and ν10 perpendicular fundamentals of allene-h4. The ratio obtained is appreciably lower than that of a previous estimate and the result obtained by integrated band intensity measurements of Overend and Crawford. From the best estimate of the ratio, the dipole moment derivatives of the two bands are determined; the value for the weaker band ν9 is subject to a large uncertainty.
Abstract:
This paper investigates the applications of capture–recapture methods to human populations. Capture–recapture methods are commonly used in estimating the size of wildlife populations but can also be used in epidemiology and social sciences, for estimating prevalence of a particular disease or the size of the homeless population in a certain area. Here we focus on estimating the prevalence of infectious diseases. Several estimators of population size are considered: the Lincoln–Petersen estimator and its modified version, the Chapman estimator, Chao’s lower bound estimator, Zelterman’s estimator, McKendrick’s moment estimator and the maximum likelihood estimator. In order to evaluate these estimators, they are applied to real, three-source, capture–recapture data. By conditioning on each of the sources of three-source data, we have been able to compare the estimators with the true value that they are estimating. The Chapman and Chao estimators were compared in terms of their relative bias. A variance formula derived through conditioning is suggested for Chao’s estimator, and normal 95% confidence intervals are calculated for this and the Chapman estimator. We then compare the coverage of the respective confidence intervals. Furthermore, a simulation study is included to compare Chao’s and Chapman’s estimator. Results indicate that Chao’s estimator is less biased than Chapman’s estimator unless both sources are independent. Chao’s estimator also has the smaller mean squared error. Finally, the implications and limitations of the above methods are discussed, with suggestions for further development.
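Two of the estimators named above have simple closed forms for two-source data. The sketch below uses the standard definitions, which the abstract does not restate: with n1 and n2 units identified by each source and m identified by both, Lincoln–Petersen is n1 n2 / m and Chapman's modification is (n1 + 1)(n2 + 1) / (m + 1) - 1. The function names and the counts are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Lincoln-Petersen estimate n1 * n2 / m; undefined when there is no overlap."""
    if m <= 0:
        raise ValueError("m must be positive")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's modified estimate (n1 + 1)(n2 + 1) / (m + 1) - 1; defined even for m = 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical two-source counts.
print(lincoln_petersen(50, 40, 10))  # 200.0
print(chapman(50, 40, 10))           # slightly below the Lincoln-Petersen value
```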
Abstract:
It is accepted that an important source of variation in the response of anoestrous ewes, to the introduction of rams, is the intensity of male stimulation. The aim of this study was to investigate strategies capable of increasing the impact and transmission of the ram stimuli. In Experiment 1, two groups of seven ewes (Bluefaced Leicester male x Swaledale female) were individually penned with one ram and for the next 6 h the rams either remained in the pen or were replaced hourly. Blood samples revealed no difference in the pattern of plasma LH secretion. In Experiment 2, three groups of 16 ewes were either introduced to one ram, individually (H) or in groups of 8 (L), or remained isolated. Ram introduction increased the plasma LH pulsatility (P < 0.001). H ewes displayed more (nine versus six) male-induced LH pulses (pulses occurring within the first 45 min) and more pulses per 8 h intervals than the L group of ewes (1.9 +/- 0.3 versus 1.3 +/- 0.3), but these differences were not significant. It was concluded that (i) frequent replacement of rams within a few hours following ram introduction to ewes does not further improve the response of ewes, especially if the ram:ewe ratio is high; (ii) the characterization of the plasma LH secretion parameters during a period of 6-8 h does not seem to be an effective method to detect small differences in the intensity of stimulation received by the ewes when exposed to rams; (iii) North Country Mule ewes (Bluefaced Leicester male x Swaledale female) in the UK respond to the presence of rams in spring (late oestrous/early anoestrous season) with an elevation in plasma LH secretion. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
We introduce a procedure for association-based analysis of nuclear families that allows for dichotomous and more general measurements of phenotype and inclusion of covariate information. Standard generalized linear models are used to relate phenotype and its predictors. Our test procedure, based on the likelihood ratio, unifies the estimation of all parameters through the likelihood itself and yields maximum likelihood estimates of the genetic relative risk and interaction parameters. Our method has advantages in modelling the covariate and gene-covariate interaction terms over recently proposed conditional score tests that include covariate information via a two-stage modelling approach. We apply our method in a study of human systemic lupus erythematosus and the C-reactive protein that includes sex as a covariate.