136 resultados para covariance estimator
Resumo:
The arthropod species richness of pastures in three Azorean islands was used to examine the relationship between local and regional species richness over two years. Two groups of arthropods, spiders and sucking insects, representing two functionally different but common groups of pasture invertebrates were investigated. The local-regional species richness relationship was assessed over relatively fine scales: quadrats (= local scale) and within pastures (= regional scale). Mean plot species richness was used as a measure of local species richness (= alpha diversity) and regional species richness was estimated at the pasture level (= gamma diversity) with the 'first-order-Jackknife' estimator. Three related issues were addressed: (i) the role of estimated regional species richness and variables operating at the local scale (vegetation structure and diversity) in determining local species richness; (ii) quantification of the relative contributions of alpha and beta diversity to regional diversity using additive partitioning; and (iii) the occurrence of consistent patterns in different years by analysing independently between-year data. Species assemblages of spiders were saturated at the local scale (similar local species richness and increasing beta-diversity in richer regions) and were more dependent on vegetational structure than regional species richness. Sucking insect herbivores, by contrast, exhibited a linear relationship between local and regional species richness, consistent with the proportional sampling model. The patterns were consistent between years. These results imply that for spiders local processes are important, with assemblages in a particular patch being constrained by habitat structure. In contrast, for sucking insects, local processes may be insignificant in structuring communities.
Resumo:
Indirect and direct models of sexual selection make different predictions regarding the quantitative genetic relationships between sexual ornaments and fitness. Indirect models predict that ornaments should have a high heritability and that strong positive genetic covariance should exist between fitness and the ornament. Direct models, on the other hand, make no such assumptions about the level of genetic variance in fitness and the ornament, and are therefore likely to be more important when environmental sources of variation are large. Here we test these predictions in a wild population of the blue tit (Parus caeruleus), a species in which plumage coloration has been shown to be under sexual selection. Using 3 years of cross-fostering data from over 250 breeding attempts, we partition the covariance between parental coloration and aspects of nestling fitness into a genetic and environmental component. Contrary to indirect models of sexual selection, but in agreement with direct models, we show that variation in coloration is only weakly heritable (h(2) < 0.11), and that two components of offspring fitness-nestling size and fledgling recruitment-are strongly dependent on parental effects, rather than genetic effects. Furthermore, there was no evidence of significant positive genetic covariation between parental colour and offspring traits. Contrary to direct benefit models, however, we find little evidence that variation in colour reliably indicates the level of parental care provided by either males or females. Taken together, these results indicate that the assumptions of indirect models of sexual selection are not supported by the genetic basis of the traits reported on here.
Resumo:
Diebold and Lamb (1997) argue that since the long-run elasticity of supply derived from the Nerlovian model entails a ratio of random variables, it is without moments. They propose minimum expected loss estimation to correct this problem but in so-doing ignore the fact that a non white-noise-error is implicit in the model. We show that, as a consequence the estimator is biased and demonstrate that Bayesian estimation which fully accounts for the error structure is preferable.
Resumo:
Conventional seemingly unrelated estimation of the almost ideal demand system is shown to lead to small sample bias and distortions in the size of a Wald test for symmetry and homogeneity when the data are co-integrated. A fully modified estimator is developed in an attempt to remedy these problems. It is shown that this estimator reduces the small sample bias but fails to eliminate the size distortion.. Bootstrapping is shown to be ineffective as a method of removing small sample bias in both the conventional and fully modified estimators. Bootstrapping is effective, however, as a method of removing. size distortion and performs equally well in this respect with both estimators.
Resumo:
In clinical trials, situations often arise where more than one response from each patient is of interest; and it is required that any decision to stop the study be based upon some or all of these measures simultaneously. Theory for the design of sequential experiments with simultaneous bivariate responses is described by Jennison and Turnbull (Jennison, C., Turnbull, B. W. (1993). Group sequential tests for bivariate response: interim analyses of clinical trials with both efficacy and safety endpoints. Biometrics 49:741-752) and Cook and Farewell (Cook, R. J., Farewell, V. T. (1994). Guidelines for monitoring efficacy and toxicity responses in clinical trials. Biometrics 50:1146-1152) in the context of one efficacy and one safety response. These expositions are in terms of normally distributed data with known covariance. The methods proposed require specification of the correlation, ρ between test statistics monitored as part of the sequential test. It can be difficult to quantify ρ and previous authors have suggested simply taking the lowest plausible value, as this will guarantee power. This paper begins with an illustration of the effect that inappropriate specification of ρ can have on the preservation of trial error rates. It is shown that both the type I error and the power can be adversely affected. As a possible solution to this problem, formulas are provided for the calculation of correlation from data collected as part of the trial. An adaptive approach is proposed and evaluated that makes use of these formulas and an example is provided to illustrate the method. Attention is restricted to the bivariate case for ease of computation, although the formulas derived are applicable in the general multivariate case.
Resumo:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. Copyright © 2004 Royal Meteorological Society
Resumo:
Proportion estimators are quite frequently used in many application areas. The conventional proportion estimator (number of events divided by sample size) encounters a number of problems when the data are sparse as will be demonstrated in various settings. The problem of estimating its variance when sample sizes become small is rarely addressed in a satisfying framework. Specifically, we have in mind applications like the weighted risk difference in multicenter trials or stratifying risk ratio estimators (to adjust for potential confounders) in epidemiological studies. It is suggested to estimate p using the parametric family (see PDF for character) and p(1 - p) using (see PDF for character), where (see PDF for character). We investigate the estimation problem of choosing c 0 from various perspectives including minimizing the average mean squared error of (see PDF for character), average bias and average mean squared error of (see PDF for character). The optimal value of c for minimizing the average mean squared error of (see PDF for character) is found to be independent of n and equals c = 1. The optimal value of c for minimizing the average mean squared error of (see PDF for character) is found to be dependent of n with limiting value c = 0.833. This might justifiy to use a near-optimal value of c = 1 in practice which also turns out to be beneficial when constructing confidence intervals of the form (see PDF for character).
Resumo:
The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution assuming mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood and an estimator for individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit in achieving normality when using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.
Resumo:
It is common practice to design a survey with a large number of strata. However, in this case the usual techniques for variance estimation can be inaccurate. This paper proposes a variance estimator for estimators of totals. The method proposed can be implemented with standard statistical packages without any specific programming, as it involves simple techniques of estimation, such as regression fitting.
Resumo:
The systematic sampling (SYS) design (Madow and Madow, 1944) is widely used by statistical offices due to its simplicity and efficiency (e.g., Iachan, 1982). But it suffers from a serious defect, namely, that it is impossible to unbiasedly estimate the sampling variance (Iachan, 1982) and usual variance estimators (Yates and Grundy, 1953) are inadequate and can overestimate the variance significantly (Särndal et al., 1992). We propose a novel variance estimator which is less biased and that can be implemented with any given population order. We will justify this estimator theoretically and with a Monte Carlo simulation study.
Resumo:
We show that the Hájek (Ann. Math Statist. (1964) 1491) variance estimator can be used to estimate the variance of the Horvitz–Thompson estimator when the Chao sampling scheme (Chao, Biometrika 69 (1982) 653) is implemented. This estimator is simple and can be implemented with any statistical packages. We consider a numerical and an analytic method to show that this estimator can be used. A series of simulations supports our findings.
Resumo:
Background: The present paper investigates the question of a suitable basic model for the number of scrapie cases in a holding and applications of this knowledge to the estimation of scrapie-ffected holding population sizes and adequacy of control measures within holding. Is the number of scrapie cases proportional to the size of the holding in which case it should be incorporated into the parameter of the error distribution for the scrapie counts? Or, is there a different - potentially more complex - relationship between case count and holding size in which case the information about the size of the holding should be better incorporated as a covariate in the modeling? Methods: We show that this question can be appropriately addressed via a simple zero-truncated Poisson model in which the hypothesis of proportionality enters as a special offset-model. Model comparisons can be achieved by means of likelihood ratio testing. The procedure is illustrated by means of surveillance data on classical scrapie in Great Britain. Furthermore, the model with the best fit is used to estimate the size of the scrapie-affected holding population in Great Britain by means of two capture-recapture estimators: the Poisson estimator and the generalized Zelterman estimator. Results: No evidence could be found for the hypothesis of proportionality. In fact, there is some evidence that this relationship follows a curved line which increases for small holdings up to a maximum after which it declines again. Furthermore, it is pointed out how crucial the correct model choice is when applied to capture-recapture estimation on the basis of zero-truncated Poisson models as well as on the basis of the generalized Zelterman estimator. Estimators based on the proportionality model return very different and unreasonable estimates for the population sizes. Conclusion: Our results stress the importance of an adequate modelling approach to the association between holding size and the number of cases of classical scrapie within holding. Reporting artefacts and speculative biological effects are hypothesized as the underlying causes of the observed curved relationship. The lack of adjustment for these artefacts might well render ineffective the current strategies for the control of the disease.
Resumo:
Two simple and frequently used capture–recapture estimates of the population size are compared: Chao's lower-bound estimate and Zelterman's estimate allowing for contaminated distributions. In the Poisson case it is shown that if there are only counts of ones and twos, the estimator of Zelterman is always bounded above by Chao's estimator. If counts larger than two exist, the estimator of Zelterman is becoming larger than that of Chao's, if only the ratio of the frequencies of counts of twos and ones is small enough. A similar analysis is provided for the binomial case. For a two-component mixture of Poisson distributions the asymptotic bias of both estimators is derived and it is shown that the Zelterman estimator can experience large overestimation bias. A modified Zelterman estimator is suggested and also the bias-corrected version of Chao's estimator is considered. All four estimators are compared in a simulation study.
Resumo:
This paper investigates the applications of capture-recapture methods to human populations. Capture-recapture methods are commonly used in estimating the size of wildlife populations but can also be used in epidemiology and social sciences, for estimating prevalence of a particular disease or the size of the homeless population in a certain area. Here we focus on estimating the prevalence of infectious diseases. Several estimators of population size are considered: the Lincoln-Petersen estimator and its modified version, the Chapman estimator, Chao's lower bound estimator, the Zelterman's estimator, McKendrick's moment estimator and the maximum likelihood estimator. In order to evaluate these estimators, they are applied to real, three-source, capture-recapture data. By conditioning on each of the sources of three source data, we have been able to compare the estimators with the true value that they are estimating. The Chapman and Chao estimators were compared in terms of their relative bias. A variance formula derived through conditioning is suggested for Chao's estimator, and normal 95% confidence intervals are calculated for this and the Chapman estimator. We then compare the coverage of the respective confidence intervals. Furthermore, a simulation study is included to compare Chao's and Chapman's estimator. Results indicate that Chao's estimator is less biased than Chapman's estimator unless both sources are independent. Chao's estimator has also the smaller mean squared error. Finally, the implications and limitations of the above methods are discussed, with suggestions for further development.
Resumo:
Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.