898 resultados para Nonparametric Estimators
Resumo:
This paper applies the theoretical literature on nonparametric bounds ontreatment effects to the estimation of how limited English proficiency (LEP)affects wages and employment opportunities for Hispanic workers in theUnited States. I analyze the identifying power of several weak assumptionson treatment response and selection, and stress the interactions between LEPand education, occupation and immigration status. I show that thecombination of two weak but credible assumptions provides informative upperbounds on the returns to language skills for certain subgroups of thepopulation. Adding age at arrival as a monotone instrumental variable alsoprovides informative lower bounds.
Resumo:
In the fixed design regression model, additional weights areconsidered for the Nadaraya--Watson and Gasser--M\"uller kernel estimators.We study their asymptotic behavior and the relationships between new andclassical estimators. For a simple family of weights, and considering theIMSE as global loss criterion, we show some possible theoretical advantages.An empirical study illustrates the performance of the weighted estimatorsin finite samples.
Resumo:
In this article we propose using small area estimators to improve the estimatesof both the small and large area parameters. When the objective is to estimateparameters at both levels accurately, optimality is achieved by a mixed sampledesign of fixed and proportional allocations. In the mixed sample design, oncea sample size has been determined, one fraction of it is distributedproportionally among the different small areas while the rest is evenlydistributed among them. We use Monte Carlo simulations to assess theperformance of the direct estimator and two composite covariant-freesmall area estimators, for different sample sizes and different sampledistributions. Performance is measured in terms of Mean Squared Errors(MSE) of both small and large area parameters. It is found that the adoptionof small area composite estimators open the possibility of 1) reducingsample size when precision is given, or 2) improving precision for a givensample size.
Resumo:
Most methods for small-area estimation are based on composite estimators derived from design- or model-based methods. A composite estimator is a linear combination of a direct and an indirect estimator with weights that usually depend on unknown parameters which need to be estimated. Although model-based small-area estimators are usually based on random-effects models, the assumption of fixed effects is at face value more appropriate.Model-based estimators are justified by the assumption of random (interchangeable) area effects; in practice, however, areas are not interchangeable. In the present paper we empirically assess the quality of several small-area estimators in the setting in which the area effects are treated as fixed. We consider two settings: one that draws samples from a theoretical population, and another that draws samples from an empirical population of a labor force register maintained by the National Institute of Social Security (NISS) of Catalonia. We distinguish two types of composite estimators: a) those that use weights that involve area specific estimates of bias and variance; and, b) those that use weights that involve a common variance and a common squared bias estimate for all the areas. We assess their precision and discuss alternatives to optimizing composite estimation in applications.
Resumo:
This paper proposes a nonparametric test in order to establish the level of accuracy of theforeign trade statistics of 17 Latin American countries when contrasted with the trade statistics of the main partners in 1925. The Wilcoxon Matched-Pairs Ranks test is used to determine whether the differences between the data registered by exporters and importers are meaningful, and if so, whether the differences are systematic in any direction. The paper tests for the reliability of the data registered for two homogeneous products, petroleum and coal, both in volume and value. The conclusion of the several exercises performed is that we cannot accept the existence of statistically significant differences between the data provided by the exporters and the registered by the importing countries in most cases. The qualitative historiography of Latin American describes its foreign trade statistics as mostly unusable. Our quantitative results contest this view.
Resumo:
This paper investigates the comparative performance of five small areaestimators. We use Monte Carlo simulation in the context of boththeoretical and empirical populations. In addition to the direct andindirect estimators, we consider the optimal composite estimator withpopulation weights, and two composite estimators with estimatedweights: one that assumes homogeneity of within area variance andsquare bias, and another one that uses area specific estimates ofvariance and square bias. It is found that among the feasibleestimators, the best choice is the one that uses area specificestimates of variance and square bias.
Resumo:
We continue the development of a method for the selection of a bandwidth or a number of design parameters in density estimation. We provideexplicit non-asymptotic density-free inequalities that relate the $L_1$ error of the selected estimate with that of the best possible estimate,and study in particular the connection between the richness of the classof density estimates and the performance bound. For example, our methodallows one to pick the bandwidth and kernel order in the kernel estimatesimultaneously and still assure that for {\it all densities}, the $L_1$error of the corresponding kernel estimate is not larger than aboutthree times the error of the estimate with the optimal smoothing factor and kernel plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the family of kernels used in the estimate. Further applications include multivariate kernel estimates, transformed kernel estimates, and variablekernel estimates.
Resumo:
The OLS estimator of the intergenerational earnings correlation is biased towards zero, while the instrumental variables estimator is biased upwards. The first of these results arises because of measurement error, while the latter rests on the presumption that the education of the parent family is an invalid instrument. We propose a panel data framework for quantifying the asymptotic biases of these estimators, as well as a mis-specification test for the IV estimator. [Author]
Resumo:
We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of seminaive Bayesian classifiers. Altogether, 16 model selection and weighing schemes, 58 benchmark data sets, and various statistical tests are employed. This paper's main contributions are threefold. First, it formally presents each scheme's definition, rationale, and time complexity and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme's classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms which have an immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.
Resumo:
An e cient procedure for the blind inversion of a nonlinear Wiener system is proposed. We proved that the problem can be expressed as a problem of blind source separation in nonlinear mixtures, for which a solution has been recently proposed. Based on a quasi-nonparametric relative gradient descent, the proposed algorithm can perform e ciently even in the presence of hard distortions.
Resumo:
Microsatellite loci mutate at an extremely high rate and are generally thought to evolve through a stepwise mutation model. Several differentiation statistics taking into account the particular mutation scheme of the microsatellite have been proposed. The most commonly used is R(ST) which is independent of the mutation rate under a generalized stepwise mutation model. F(ST) and R(ST) are commonly reported in the literature, but often differ widely. Here we compare their statistical performances using individual-based simulations of a finite island model. The simulations were run under different levels of gene flow, mutation rates, population number and sizes. In addition to the per locus statistical properties, we compare two ways of combining R(ST) over loci. Our simulations show that even under a strict stepwise mutation model, no statistic is best overall. All estimators suffer to different extents from large bias and variance. While R(ST) better reflects population differentiation in populations characterized by very low gene-exchange, F(ST) gives better estimates in cases of high levels of gene flow. The number of loci sampled (12, 24, or 96) has only a minor effect on the relative performance of the estimators under study. For all estimators there is a striking effect of the number of samples, with the differentiation estimates showing very odd distributions for two samples.
Resumo:
This paper is concerned with the derivation of new estimators and performance bounds for the problem of timing estimation of (linearly) digitally modulated signals. The conditional maximum likelihood (CML) method is adopted, in contrast to the classical low-SNR unconditional ML (UML) formulationthat is systematically applied in the literature for the derivationof non-data-aided (NDA) timing-error-detectors (TEDs). A new CML TED is derived and proved to be self-noise free, in contrast to the conventional low-SNR-UML TED. In addition, the paper provides a derivation of the conditional Cramér–Rao Bound (CRB ), which is higher (less optimistic) than the modified CRB (MCRB)[which is only reached by decision-directed (DD) methods]. It is shown that the CRB is a lower bound on the asymptotic statisticalaccuracy of the set of consistent estimators that are quadratic with respect to the received signal. Although the obtained boundis not general, it applies to most NDA synchronizers proposed in the literature. A closed-form expression of the conditional CRBis obtained, and numerical results confirm that the CML TED attains the new bound for moderate to high Eg/No.
Resumo:
We propose robust estimators of the generalized log-gamma distribution and, more generally, of location-shape-scale families of distributions. A (weighted) Q tau estimator minimizes a tau scale of the differences between empirical and theoretical quantiles. It is n(1/2) consistent; unfortunately, it is not asymptotically normal and, therefore, inconvenient for inference. However, it is a convenient starting point for a one-step weighted likelihood estimator, where the weights are based on a disparity measure between the model density and a kernel density estimate. The one-step weighted likelihood estimator is asymptotically normal and fully efficient under the model. It is also highly robust under outlier contamination. Supplementary materials are available online.
Resumo:
En aquest treball estudiem si el valor intrínsec de Tubacex entre 1994-2013 coincideix amb la seva tendència bursàtil a llarg termini, tenint en compte part de la teoria defensada per Shiller. També verifiquem la possible infravaloració de l’acció de Tubacex a 31/12/13. A la primera part expliquem els principals mètodes de valoració d’empreses y a la segona part fem una anàlisi del sector en el que opera Tubacex (acer inoxidable) i calculem el valor de l’acció de Tubacex per mitjà de tres mètodes de valoració (Free Cash Flow, Cash Flow i Valor en Llibres). Apliquem aquests tres mètodes de valoració per verificar si com a mínim algun d’ells coincideix amb la tendència bursàtil a llarg termini.