969 results for covariance estimator


Relevance:

20.00%

Publisher:

Abstract:

The R package “compositions” is a tool for advanced compositional analysis. Its basic functionality has seen some conceptual improvement, and now includes facilities to work with and represent ilr bases built from balances, and an elaborate subsystem for dealing with several kinds of irregular data: (rounded or structural) zeros, incomplete observations and outliers. The general approach to these irregularities is based on subcompositions: for an irregular datum, one can distinguish a “regular” subcomposition (where all parts are actually observed and the datum behaves typically) and a “problematic” subcomposition (with the unobserved, zero or rounded parts, or else where the datum shows erratic or atypical behaviour). Systematic classification schemes are proposed for both outliers and missing values (including zeros), focusing on the nature of the irregularities in the datum subcomposition(s). To compute statistics with values missing at random and structural zeros, a projection approach is implemented: a given datum contributes to the estimation of the desired parameters only on the subcomposition where it was observed. For data sets with values below the detection limit, two different approaches are provided: the well-known imputation technique, and also the projection approach. To compute statistics in the presence of outliers, robust statistics are adapted to the characteristics of compositional data, based on the minimum covariance determinant approach. The outlier classification is based on four different models of outlier occurrence and Monte-Carlo-based tests for their characterization. Furthermore, the package provides special plots that help to understand the nature of outliers in the dataset. Keywords: coda-dendrogram, lost values, MAR, missing data, MCD estimator, robustness, rounded zeros

Relevance:

20.00%

Publisher:

Abstract:

Lecture slides for COMP6235.

Relevance:

20.00%

Publisher:

Abstract:

We study the role of natural resource windfalls in explaining the efficiency of public expenditures. Using a rich dataset of expenditures and public good provision for 1,836 municipalities in Peru for the period 2001-2010, we estimate a non-monotonic relationship between the efficiency of public good provision and the level of natural resource transfers. Local governments that were extremely favored by the boom in mineral prices were more efficient in using fiscal windfalls, whereas those that benefited from modest transfers were more inefficient. These results can be explained by the increase in political competition associated with the boom. However, the fact that increases in efficiency were related to reductions in public good provision casts doubt on the beneficial effects of political competition in promoting efficiency.

Relevance:

20.00%

Publisher:

Abstract:

We consider the case of a multicenter trial in which the center-specific sample sizes are potentially small. Under homogeneity, the conventional procedure is to pool information using a weighted estimator where the weights used are the inverses of the estimated center-specific variances. Whereas this procedure is efficient for conventional asymptotics (e.g. center-specific sample sizes become large, number of centers fixed), it is commonly believed that the efficiency of this estimator also holds for meta-analytic asymptotics (e.g. center-specific sample sizes bounded, potentially small, and number of centers large). In this contribution we demonstrate that this estimator fails to be efficient. In fact, it shows a persistent bias with an increasing number of centers, showing that it is not meta-consistent. In addition, we show that the Cochran and Mantel-Haenszel weighted estimators are meta-consistent and, in more generality, provide conditions on the weights such that the associated weighted estimator is meta-consistent.
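The two weighting schemes contrasted above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes the effect measure is a risk difference between two arms, which the abstract does not state, and all function names are hypothetical.

```python
def risk_difference(x1, n1, x0, n0):
    """Center-specific risk difference: treated event rate minus control event rate."""
    return x1 / n1 - x0 / n0

def inverse_variance_weight(x1, n1, x0, n0):
    """Inverse of the estimated variance of the center-specific risk difference.

    Undefined when the estimated variance is zero, which can happen in small
    centers (all or no events in both arms).
    """
    p1, p0 = x1 / n1, x0 / n0
    var = p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0
    if var == 0:
        raise ValueError("estimated variance is zero; weight undefined")
    return 1.0 / var

def mantel_haenszel_weight(x1, n1, x0, n0):
    """Mantel-Haenszel weight for a risk difference; depends on sample sizes only."""
    return n1 * n0 / (n1 + n0)

def pooled_estimate(centers, weight_fn):
    """Weighted average of center-specific estimates: sum(w_i * d_i) / sum(w_i)."""
    weights = [weight_fn(*c) for c in centers]
    estimates = [risk_difference(*c) for c in centers]
    return sum(w * d for w, d in zip(weights, estimates)) / sum(weights)

# Each center: (events treated, n treated, events control, n control). Made-up data.
centers = [(3, 10, 1, 10), (2, 5, 1, 5)]
pooled_iv = pooled_estimate(centers, inverse_variance_weight)
pooled_mh = pooled_estimate(centers, mantel_haenszel_weight)
```

The inverse-variance weights are themselves random because they depend on the observed event rates, whereas the Mantel-Haenszel weights here depend only on the sample sizes.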

Relevance:

20.00%

Publisher:

Abstract:

The jackknife method is often used for variance estimation in sample surveys but has only been developed for a limited class of sampling designs. We propose a jackknife variance estimator which is defined for any without-replacement unequal probability sampling design. We demonstrate design consistency of this estimator for a broad class of point estimators. A Monte Carlo study shows how the proposed estimator may improve on existing estimators.
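For orientation, the classical delete-one jackknife for a simple random sample can be sketched as below. This is not the estimator proposed in the abstract, which handles unequal probability designs; the function name is hypothetical.

```python
from statistics import mean, variance

def jackknife_variance(sample, estimator):
    """Classical delete-one jackknife variance of estimator(sample)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the estimator with each observation left out in turn.
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    center = mean(leave_one_out)
    return (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)

data = [2.0, 4.0, 4.0, 5.0, 7.0]
v_jack = jackknife_variance(data, mean)
# For the sample mean, the jackknife reproduces the textbook s^2 / n.
v_textbook = variance(data) / len(data)
```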

Relevance:

20.00%

Publisher:

Abstract:

We describe and evaluate a new estimator of the effective population size (N_e), a critical parameter in evolutionary and conservation biology. This new "SummStat" N_e estimator is based upon the use of summary statistics in an approximate Bayesian computation framework to infer N_e. Simulations of a Wright-Fisher population with known N_e show that the SummStat estimator is useful across a realistic range of individuals and loci sampled, generations between samples, and N_e values. We also address the paucity of information about the relative performance of N_e estimators by comparing the SummStat estimator to two recently developed likelihood-based estimators and a traditional moment-based estimator. The SummStat estimator is the least biased of the four estimators compared. In 32 of 36 parameter combinations investigated using initial allele frequencies drawn from a Dirichlet distribution, it has the lowest bias. The relative mean square error (RMSE) of the SummStat estimator was generally intermediate to the others. All of the estimators had RMSE > 1 when small samples (n = 20, five loci) were collected a generation apart. In contrast, when samples were separated by three or more generations and N_e was less than or equal to 50, the SummStat and likelihood-based estimators all had greatly reduced RMSE. Under the conditions simulated, SummStat confidence intervals were more conservative than those of the likelihood-based estimators and more likely to include the true N_e. The greatest strength of the SummStat estimator is its flexible structure. This flexibility allows it to incorporate any potentially informative summary statistic from population genetic data.

Relevance:

20.00%

Publisher:

Abstract:

A novel sparse kernel density estimator is derived based on a regression approach, which selects a very small subset of significant kernels by means of the D-optimality experimental design criterion using an orthogonal forward selection procedure. The weights of the resulting sparse kernel model are calculated using the multiplicative nonnegative quadratic programming algorithm. The proposed method is computationally attractive, in comparison with many existing kernel density estimation algorithms. Our numerical results also show that the proposed method compares favourably with other existing methods, in terms of both test accuracy and model sparsity, for constructing kernel density estimates.

Relevance:

20.00%

Publisher:

Abstract:

This correspondence introduces a new orthogonal forward regression (OFR) model identification algorithm that uses D-optimality for model structure selection and is based on M-estimators of the parameters. The M-estimator is a classical robust parameter estimation technique for tackling bad data conditions such as outliers. Computationally, the M-estimator can be derived using an iterative reweighted least squares (IRLS) algorithm. D-optimality is a model structure robustness criterion in experimental design for tackling ill-conditioning in the model structure. Orthogonal forward regression, often based on the modified Gram-Schmidt procedure, is an efficient method that incorporates structure selection and parameter estimation simultaneously. The basic idea of the proposed approach is to incorporate an IRLS inner loop into the modified Gram-Schmidt procedure. In this manner, the OFR algorithm for parsimonious model structure determination is extended to bad data conditions with improved performance via the derivation of parameter M-estimators with inherent robustness to outliers. Numerical examples are included to demonstrate the effectiveness of the proposed algorithm.
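The IRLS computation of an M-estimator mentioned above can be sketched for a straight-line fit as follows. This is a minimal illustration only: it assumes Huber weights with the common tuning constant 1.345 and a median-based scale estimate, neither of which is specified in the abstract, and it does not include the OFR or D-optimality steps. All names are hypothetical.

```python
from statistics import median

def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def huber_irls(x, y, c=1.345, iterations=50):
    """Huber M-estimate of a line by iteratively reweighted least squares."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = weighted_line_fit(x, y, w)
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust residual scale (assumed choice): median absolute residual / 0.6745.
        scale = max(median([abs(r) for r in residuals]) / 0.6745, 1e-12)
        # Huber weights: 1 for small residuals, shrinking as c*scale/|r| otherwise.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in residuals]
    return weighted_line_fit(x, y, w)

# Points on y = 1 + 2x with one gross outlier at the last position.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
y[9] = 100.0
a_ols, b_ols = weighted_line_fit(x, y, [1.0] * len(x))
a_rob, b_rob = huber_irls(x, y)
```

On this toy data the ordinary least-squares slope is pulled strongly towards the outlier, while the reweighted fit stays closer to the slope of the uncontaminated points.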

Relevance:

20.00%

Publisher:

Abstract:

Estimation of a population size by means of capture-recapture techniques is an important problem occurring in many areas of the life and social sciences. We consider the frequencies-of-frequencies situation, where a count variable is used to summarize how often a unit has been identified in the target population of interest. The distribution of this count variable is zero-truncated since zero identifications do not occur in the sample. As an application we consider the surveillance of scrapie in Great Britain. In this case study, holdings with scrapie that are not identified (zero counts) do not enter the surveillance database. The count variable of interest is the number of scrapie cases per holding. For count distributions a common model is the Poisson distribution and, to adjust for potential heterogeneity, a discrete mixture of Poisson distributions is used. Mixtures of Poissons usually provide an excellent fit, as will be demonstrated in the application of interest. However, as has recently been demonstrated, mixtures also suffer from the so-called boundary problem, resulting in overestimation of population size. It is suggested here to select the mixture model on the basis of the Bayesian Information Criterion. This strategy is further refined by employing a bagging procedure leading to a series of estimates of population size. Using the median of this series, highly influential size estimates are avoided. In limited simulation studies it is shown that the procedure leads to estimates with remarkably small bias.
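The zero-truncation idea can be illustrated with the simplest case, a single homogeneous Poisson rather than the mixture, BIC selection and bagging described above. The sketch below fits the zero-truncated Poisson by maximum likelihood and then scales the observed number of units by the estimated probability of being seen at least once. Function names and data are hypothetical.

```python
import math

def fit_zero_truncated_poisson(counts, tol=1e-12, max_iter=1000):
    """MLE of the Poisson rate from zero-truncated counts.

    The MLE solves mean(counts) = lam / (1 - exp(-lam)); here this is done
    by the fixed-point iteration lam <- mean * (1 - exp(-lam)).
    """
    m = sum(counts) / len(counts)
    if m <= 1.0:
        raise ValueError("mean of truncated counts must exceed 1")
    lam = m
    for _ in range(max_iter):
        new = m * (1.0 - math.exp(-lam))
        if abs(new - lam) < tol:
            return new
        lam = new
    return lam

def estimate_population_size(counts):
    """Horvitz-Thompson-type estimate: observed units / P(count > 0)."""
    lam = fit_zero_truncated_poisson(counts)
    return len(counts) / (1.0 - math.exp(-lam))

# Made-up data: 60 units seen once, 30 seen twice, 10 seen three times.
counts = [1] * 60 + [2] * 30 + [3] * 10
lam_hat = fit_zero_truncated_poisson(counts)
n_hat = estimate_population_size(counts)
```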

Relevance:

20.00%

Publisher:

Abstract:

Dense deployments of wireless local area networks (WLANs) are becoming the norm in many cities around the world. However, increased interference and traffic demands can severely limit the aggregate throughput achievable unless an effective channel assignment scheme is used. In this work, a simple and effective distributed channel assignment (DCA) scheme is proposed. It is shown that in order to maximise throughput, each access point (AP) simply chooses the channel with the minimum number of active neighbour nodes (i.e. nodes associated with neighbouring APs that have packets to send). However, applying such a scheme in practice depends critically on its ability to estimate the number of neighbour nodes in each channel, for which no practical estimator has been proposed before. In view of this, an extended Kalman filter (EKF) estimator and an estimate of the number of nodes by AP are proposed. These not only provide fast and accurate estimates but can also exploit channel switching information of neighbouring APs. Extensive packet-level simulation results show that the proposed minimum neighbour and EKF estimator (MINEK) scheme is highly scalable and can provide significant throughput improvement over other channel assignment schemes.
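The selection rule itself is simple once the per-channel estimates are available. A minimal sketch, assuming the estimated neighbour counts are already given (the EKF estimator is not reproduced here) and using a hypothetical function name:

```python
def choose_channel(neighbour_counts):
    """Return the channel with the fewest estimated active neighbour nodes.

    neighbour_counts maps channel number -> estimated number of active
    neighbour nodes on that channel. Ties go to the first channel listed.
    """
    if not neighbour_counts:
        raise ValueError("no channels to choose from")
    return min(neighbour_counts, key=neighbour_counts.get)

# Made-up estimates for three non-overlapping 2.4 GHz channels.
estimates = {1: 4.2, 6: 1.7, 11: 3.0}
best = choose_channel(estimates)
```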