150 results for robust estimation statistics
Abstract:
Connections between Statistics and Archaeology have always appeared very fruitful. The objective of this paper is to offer an overview of some statistical techniques that have been developed in recent years and that can be of interest to archaeologists in the short run.
Abstract:
Most methods for small-area estimation are based on composite estimators derived from design- or model-based methods. A composite estimator is a linear combination of a direct and an indirect estimator with weights that usually depend on unknown parameters which need to be estimated. Although model-based small-area estimators are usually based on random-effects models, the assumption of fixed effects is at face value more appropriate. Model-based estimators are justified by the assumption of random (interchangeable) area effects; in practice, however, areas are not interchangeable. In the present paper we empirically assess the quality of several small-area estimators in the setting in which the area effects are treated as fixed. We consider two settings: one that draws samples from a theoretical population, and another that draws samples from an empirical population of a labor force register maintained by the National Institute of Social Security (NISS) of Catalonia. We distinguish two types of composite estimators: (a) those whose weights involve area-specific estimates of bias and variance; and (b) those whose weights involve a common variance and a common squared-bias estimate for all the areas. We assess their precision and discuss alternatives for optimizing composite estimation in applications.
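A composite estimator of this kind can be sketched in a few lines. The weight below minimizes the MSE of the combination under a simplifying assumption that is ours, not the paper's (the indirect estimator's variance is treated as negligible, so only its squared bias enters); all numbers are illustrative.

```python
def composite_estimate(direct, indirect, var_direct, sq_bias_indirect):
    """Combine a direct and an indirect small-area estimator.

    The weight on the direct estimator,
        w = msq(indirect) / (var(direct) + msq(indirect)),
    minimizes the MSE of the linear combination when the indirect
    estimator's variance is negligible (an illustrative assumption).
    """
    w = sq_bias_indirect / (var_direct + sq_bias_indirect)
    return w * direct + (1.0 - w) * indirect, w

# Illustrative values: a noisy direct estimate and a biased indirect one.
est, w = composite_estimate(direct=12.0, indirect=10.0,
                            var_direct=4.0, sq_bias_indirect=1.0)
```

With these numbers the weight on the direct estimator is small (0.2), so the composite leans on the more stable indirect estimator.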
Abstract:
Agent-based computational economics is becoming widely used in practice. This paper explores the consistency of some of its standard techniques. We focus in particular on prevailing wholesale electricity trading simulation methods. We include different supply and demand representations and propose the Experience-Weighted Attraction method to include several behavioural algorithms. We compare the results across assumptions and against economic theory predictions. The match is good under best-response and reinforcement learning but not under fictitious play. The simulations perform well under flat and upward-sloping supply bidding, and also for plausible demand elasticity assumptions. Learning is influenced by the number of bids per plant and the initial conditions. The overall conclusion is that agent-based simulation assumptions are far from innocuous. We link their performance to underlying features, and identify those that are better suited to model wholesale electricity markets.
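The Experience-Weighted Attraction rule mentioned above is the Camerer-Ho learning model, which nests reinforcement learning and belief learning as special cases of its parameters. A minimal sketch of one attraction update, with parameter values chosen purely for illustration:

```python
def ewa_update(attractions, chosen, payoffs, N, phi=0.9, delta=0.5, rho=0.9):
    """One Experience-Weighted Attraction (Camerer-Ho) update.

    attractions[j] : current attraction of strategy j
    payoffs[j]     : payoff strategy j would have earned this round
    chosen         : index of the strategy actually played
    N              : experience weight; updated as N' = rho * N + 1
    Each attraction becomes
        (phi * N * A_j + (delta + (1 - delta) * I[j == chosen]) * pi_j) / N'.
    """
    N_new = rho * N + 1.0
    updated = []
    for j, (A, pi) in enumerate(zip(attractions, payoffs)):
        weight = delta + (1.0 - delta) * (1.0 if j == chosen else 0.0)
        updated.append((phi * N * A + weight * pi) / N_new)
    return updated, N_new

# One illustrative round: strategy 0 was played, strategy 1 would have paid more.
attr, N = ewa_update([0.0, 0.0], chosen=0, payoffs=[1.0, 2.0], N=1.0)
```

Setting delta = 0 recovers pure reinforcement learning (only the played strategy is updated); delta = 1 gives belief-based weighting of all foregone payoffs.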
Abstract:
In this paper we explore the effects of the minimum pension program on welfare and retirement in Spain. This is done with a stylized life-cycle model which provides a convenient analytical characterization of optimal behavior. We use data from the Spanish Social Security to estimate the behavioral parameters of the model and then simulate the changes induced by the minimum pension in aggregate retirement patterns. The impact is substantial: there is a threefold increase in retirement at 60 (the age of first entitlement) with respect to the economy without minimum pensions, and total early retirement (at or before 60) is almost 50% larger.
Abstract:
We obtain minimax lower and upper bounds for the expected distortion redundancy of empirically designed vector quantizers. We show that the mean squared distortion of a vector quantizer designed from $n$ i.i.d. data points using any design algorithm is at least $\Omega(n^{-1/2})$ away from the optimal distortion for some distribution on a bounded subset of ${\cal R}^d$. Together with existing upper bounds, this result shows that the minimax distortion redundancy for empirical quantizer design, as a function of the size of the training data, is asymptotically on the order of $n^{-1/2}$. We also derive a new upper bound for the performance of the empirically optimal quantizer.
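Empirical quantizer design of the kind analyzed here is typically carried out with Lloyd's algorithm (k-means), which the bounds take as a black box. A minimal one-dimensional sketch on illustrative data:

```python
import random

def lloyd(points, k, iters=50):
    """Fit k codepoints to 1-D data by alternating nearest-codepoint
    assignment and centroid update; returns the codebook and the mean
    squared distortion on the training points."""
    codebook = sorted(random.sample(points, k))
    for _ in range(iters):
        cells = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda j: (x - codebook[j]) ** 2)
            cells[nearest].append(x)
        # Keep a codepoint unchanged if its cell is empty.
        codebook = [sum(c) / len(c) if c else codebook[j]
                    for j, c in enumerate(cells)]
    distortion = sum(min((x - c) ** 2 for c in codebook)
                     for x in points) / len(points)
    return codebook, distortion

random.seed(0)
data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]  # two well-separated clusters
codebook, mse = lloyd(data, k=2)
```

On this toy sample the codebook converges to the two cluster means; the training distortion is the empirical quantity whose gap to the optimal distortion the paper bounds.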
Abstract:
This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables and preserves all the good properties of principal-axis methods such as principal component analysis and correspondence analysis, based on the singular value decomposition, including the decomposition of variance into components along principal axes, which provides the numerical diagnostics known as contributions. The idea is inspired by the chi-square distance in correspondence analysis, which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters, estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path of decomposing a matrix and displaying its rows and columns in biplots.
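The coordinate weighting at the heart of WMDS is a weighted Euclidean metric. A minimal sketch (the weights here are fixed for illustration, whereas the paper estimates them from the data to fit the original distances):

```python
import math

def weighted_dist(x, y, w):
    """Weighted Euclidean distance: coordinate k contributes
    w[k] * (x[k] - y[k])**2, in the spirit of chi-square weighting."""
    return math.sqrt(sum(wk * (xk - yk) ** 2
                         for xk, yk, wk in zip(x, y, w)))

# Doubling the weight on a coordinate stretches distances along it.
d = weighted_dist([1.0, 0.0], [0.0, 0.0], [4.0, 1.0])
```

With unit weights this reduces to the ordinary Euclidean distance, the special case WMDS generalizes.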
Abstract:
A tool for user choice of the local bandwidth function for a kernel density estimate is developed using KDE, a graphical object-oriented package for interactive kernel density estimation written in LISP-STAT. The bandwidth function is a cubic spline, whose knots are manipulated by the user in one window, while the resulting estimate appears in another window. A real data illustration of this method raises concerns, because an extremely large family of estimates is available.
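The local-bandwidth idea can be sketched with a Gaussian kernel whose bandwidth is supplied by an arbitrary callable standing in for the paper's user-manipulated cubic spline. This is an illustrative balloon-type variant (bandwidth evaluated at the query point), not the KDE package itself:

```python
import math

def balloon_kde(x, data, bandwidth_at):
    """Gaussian kernel density estimate at x, with the bandwidth given
    by a user-supplied local bandwidth function evaluated at x."""
    h = bandwidth_at(x)
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

# A constant bandwidth function recovers the ordinary fixed-bandwidth KDE.
density = balloon_kde(0.0, [0.0], lambda x: 1.0)
```

Replacing the lambda with a spline evaluated at x reproduces the interactive setup described above, where moving a knot reshapes the estimate locally.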
Abstract:
This work is part of a project studying the performance of model-based estimators in a small-area context. We have chosen a simple statistical application in which we estimate the growth rate of occupation for several regions of Spain. We compare three estimators: the direct one, based on straightforward results from the survey (which is unbiased), and a third one which is based on a statistical model and minimizes the mean square error.
Abstract:
Given $n$ independent replicates of a jointly distributed pair $(X,Y) \in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity-penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite-sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near-optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
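The final selection step can be sketched as picking the candidate rule that minimizes empirical risk plus complexity penalty. All names and numbers below are illustrative, not from the paper; in the actual procedure the penalty comes from the size of each class's empirical cover:

```python
def select(candidates):
    """candidates: iterable of (name, empirical_risk, complexity_penalty)
    tuples, one per model class; returns the name whose penalized risk
    (risk + penalty) is smallest."""
    return min(candidates, key=lambda c: c[1] + c[2])[0]

best = select([
    ("class_1", 0.30, 0.01),  # simple class: underfits, tiny penalty
    ("class_2", 0.12, 0.05),  # moderate class: best tradeoff
    ("class_3", 0.10, 0.20),  # rich class: low risk, large penalty
])
```

The penalized sums are 0.31, 0.17, and 0.30, so the moderate class wins: exactly the approximation/estimation tradeoff the bounds above formalize.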
Abstract:
We analyze the spatial accuracy of European foreign trade statistics compared with Latin American ones. We also include US data because of the importance of that country in Latin American trade. We have developed a method for mapping discrepancies between exporters and importers, trying to isolate systematic spatial deviations. Although our results do not allow a unique explanation, they offer some interesting clues about distribution channels in the Latin American continent as well as some spatial deviations in the statistics of individual countries. Connecting our results with the literature on the accuracy of foreign trade statistics, we can revisit Morgenstern (1963) as well as Federico and Tena (1991). Morgenstern took a deeply pessimistic view of the reliability of this statistical source, but his main warning concerned trade balances, not gross export or import values. Federico and Tena (1991) demonstrated how accuracy increases with aggregation, both geographical and by product. But they remain pessimistic regarding distribution questions, remarking that it may be more accurate to use import sources in this case. We find that the data set drawn from foreign trade statistics for a 1925 sample, whether from exporters or importers, is a valuable tool for the geography of trade patterns, although in some specific cases it needs spatial adjustments.
Abstract:
We analyze the effects of neutral and investment-specific technology shocks on hours and output. Long cycles in hours are captured in a variety of ways. Hours robustly fall in response to neutral shocks and robustly increase in response to investment-specific shocks. The percentage of the variance of hours (output) explained by neutral shocks is small (large); the opposite is true for investment-specific shocks. News shocks are uncorrelated with the estimated technology shocks.
Abstract:
We set up a dynamic model of firm investment in which liquidity constraints enter explicitly into the firm's maximization problem. The optimal policy rules are incorporated into a maximum likelihood procedure which estimates the structural parameters of the model. Investment is positively related to the firm's internal financial position when the firm is relatively poor. This relationship disappears for wealthy firms, which can reach their desired level of investment. Borrowing is an increasing function of financial position for poor firms. This relationship is reversed as a firm's financial position improves, and large firms hold little debt. Liquidity-constrained firms may have unused credit lines and the capacity to invest further if they desire. However, the fear that liquidity constraints will become binding in the future induces them to invest only when internal resources increase. We estimate the structural parameters of the model and use them to quantify the importance of liquidity constraints for firms' investment. We find that liquidity constraints matter significantly for the investment decisions of firms. If firms can finance investment by issuing fresh equity, rather than with internal funds or debt, the average capital stock is almost 35% higher over a period of 20 years. Transitory shocks to internal funds have a sustained effect on the capital stock. This effect lasts for several periods and is more persistent for small firms than for large firms. A 10% negative shock to firm fundamentals reduces the capital stock of firms which face liquidity constraints by almost 8% over a period, as opposed to only 3.5% for firms which do not face these constraints.
Abstract:
We develop a general error analysis framework for the Monte Carlo simulation of densities of functionals in Wiener space. We also study variance reduction methods with the help of Malliavin derivatives. For this, we give some general heuristic principles which are applied to diffusion processes. A comparison with kernel density estimates is made.
Abstract:
We propose a method to estimate time-invariant cyclical DSGE models using the information provided by a variety of filters. We treat data filtered with alternative procedures as contaminated proxies of the relevant model-based quantities and estimate structural and non-structural parameters jointly using a signal extraction approach. We employ simulated data to illustrate the properties of the procedure and compare our conclusions with those obtained when just one filter is used. We revisit the role of money in the transmission of monetary business cycles.
Abstract:
This paper proposes a nonparametric test in order to establish the level of accuracy of the foreign trade statistics of 17 Latin American countries when contrasted with the trade statistics of their main partners in 1925. The Wilcoxon matched-pairs signed-ranks test is used to determine whether the differences between the data registered by exporters and importers are meaningful, and if so, whether the differences are systematic in any direction. The paper tests the reliability of the data registered for two homogeneous products, petroleum and coal, both in volume and in value. The conclusion of the several exercises performed is that in most cases we cannot accept the existence of statistically significant differences between the data provided by the exporters and those registered by the importing countries. The qualitative historiography of Latin America describes its foreign trade statistics as mostly unusable. Our quantitative results contest this view.
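The Wilcoxon matched-pairs signed-ranks statistic used here compares paired records, here exporter versus importer figures. A self-contained sketch on purely illustrative numbers (a real analysis would use a library routine such as scipy.stats.wilcoxon, which also supplies the p-value):

```python
def wilcoxon_signed_rank(x, y):
    """Smaller of the positive- and negative-rank sums for paired data;
    zero differences are dropped and tied |d| receive average ranks."""
    d = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    w_minus = sum(r for r, di in zip(ranks, d) if di < 0)
    return min(w_plus, w_minus)

# Illustrative paired values (e.g. exporter vs importer figures).
W = wilcoxon_signed_rank([125, 115, 130, 140], [110, 122, 125, 120])
```

A small W relative to its null distribution signals systematic differences between the paired series; a W near the middle of its range supports the paper's finding of no significant discrepancy.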