94 results for Nonparametric estimation
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
A method to estimate an extreme quantile that requires no distributional assumptions is presented. The approach is based on transformed kernel estimation of the cumulative distribution function (cdf). The proposed method consists of a double transformation kernel estimation. We derive optimal bandwidth selection methods that have a direct expression for the smoothing parameter. The bandwidth can be adapted to the given quantile level. The procedure is useful for large data sets and improves quantile estimation compared to other methods in heavy-tailed distributions. Implementation is straightforward and R programs are available.
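As a rough illustration of the underlying idea, the sketch below estimates a cdf with a plain Gaussian kernel and inverts it by bisection to obtain a quantile. It is not the double transformation estimator described in the abstract: the rule-of-thumb bandwidth, the function names `kernel_cdf` and `kernel_quantile`, and the simulated lognormal sample are all assumptions made for the example.

```python
import math
import random
import statistics

def kernel_cdf(x, sample, bandwidth):
    """Gaussian-kernel cdf estimate at x: average of Phi((x - X_i) / b)."""
    scale = bandwidth * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - xi) / scale)) for xi in sample) / len(sample)

def kernel_quantile(p, sample, bandwidth, tol=1e-8):
    """Invert the smooth cdf estimate by bisection to get the p-quantile."""
    lo = min(sample) - 10.0 * bandwidth   # estimated cdf is essentially 0 here
    hi = max(sample) + 10.0 * bandwidth   # estimated cdf is essentially 1 here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, sample, bandwidth) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical heavy-tailed data: 2000 lognormal draws.
rng = random.Random(0)
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]

# Rule-of-thumb bandwidth (an assumption, not the optimal choice from the paper).
b = 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)

q99 = kernel_quantile(0.99, sample, b)
print("estimated 99% quantile:", q99)
```

A transformed estimator would first map the data through a suitable transformation, apply the same smoothing on the transformed scale, and map the quantile back.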
Abstract:
The aim of this article is to assess the effects of several territorial characteristics, specifically agglomeration economies, on industrial location processes in the Spanish region of Catalonia. Theoretically, the level of agglomeration causes economies which favour the location of new establishments, but an excessive level of agglomeration might cause diseconomies, since congestion effects arise. The empirical evidence on this matter is inconclusive, probably because the models used so far are not suitable enough. We use a more flexible semiparametric specification, which allows us to study the nonlinear relationship between the different types of agglomeration levels and location processes. Our main statistical source is the REIC (Catalan Manufacturing Establishments Register), which has plant-level microdata on location of new industrial establishments. Keywords: agglomeration economies, industrial location, Generalized Additive Models, nonparametric estimation, count data models.
Abstract:
The last 20 years have seen a significant evolution in the literature on horizontal inequity (HI) and have generated two major and "rival" methodological strands, namely, classical HI and reranking. We propose in this paper a class of ethically flexible tools that integrate these two strands. This is achieved using a measure of inequality that merges the well-known Gini coefficient and Atkinson indices, and that allows a decomposition of the total redistributive effect of taxes and transfers into a vertical equity effect and a loss of redistribution due to either classical HI or reranking. An inequality-change approach and a money-metric cost-of-inequality approach are developed. The latter approach makes aggregate classical HI decomposable across groups. As in recent work, equals are identified through a nonparametric estimation of the joint density of gross and net incomes. An illustration using Canadian data from 1981 to 1994 shows a substantial, and increasing, robust erosion of redistribution attributable both to classical HI and to reranking, but does not reveal which of reranking or classical HI is more important since this requires a judgement that is fundamentally normative in nature.
Abstract:
Given a model that can be simulated, conditional moments at a trial parameter value can be calculated with high accuracy by applying kernel smoothing methods to a long simulation. With such conditional moments in hand, standard method of moments techniques can be used to estimate the parameter. Since conditional moments are calculated using kernel smoothing rather than simple averaging, it is not necessary that the model be simulable subject to the conditioning information that is used to define the moment conditions. For this reason, the proposed estimator is applicable to general dynamic latent variable models. Monte Carlo results show that the estimator performs well in comparison to other estimators that have been proposed for estimation of general DLV models.
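The kernel-smoothing step can be sketched as follows. This is only a minimal illustration under stated assumptions: the AR(1) model, the parameter value, the bandwidth and the function names `simulate_ar1` and `kernel_conditional_mean` are hypothetical and are not taken from the paper. The point is that a conditional moment at a chosen conditioning value is recovered from one long unconditional simulation with a Nadaraya-Watson smoother, without simulating from the conditional distribution.

```python
import math
import random

def simulate_ar1(theta, n, seed=0):
    """Long unconditional simulation of y_t = theta * y_{t-1} + e_t, e_t ~ N(0, 1)."""
    rng = random.Random(seed)
    path = [0.0]
    for _ in range(n):
        path.append(theta * path[-1] + rng.gauss(0.0, 1.0))
    return path

def kernel_conditional_mean(x0, x, y, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)

theta = 0.5                        # trial parameter value (illustrative)
path = simulate_ar1(theta, 20000)
lagged, current = path[:-1], path[1:]

# Conditional mean of y_t given y_{t-1} = 1.0; for this model the true value is theta.
m_hat = kernel_conditional_mean(1.0, lagged, current, bandwidth=0.2)
print("smoothed conditional mean:", m_hat)
```

In a method of moments setting, such smoothed moments would be evaluated at the observed conditioning values for each trial parameter and compared with the data.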
Abstract:
Given a model that can be simulated, conditional moments at a trial parameter value can be calculated with high accuracy by applying kernel smoothing methods to a long simulation. With such conditional moments in hand, standard method of moments techniques can be used to estimate the parameter. Because conditional moments are calculated using kernel smoothing rather than simple averaging, it is not necessary that the model be simulable subject to the conditioning information that is used to define the moment conditions. For this reason, the proposed estimator is applicable to general dynamic latent variable models. It is shown that as the number of simulations diverges, the estimator is consistent and a higher-order expansion reveals the stochastic difference between the infeasible GMM estimator based on the same moment conditions and the simulated version. In particular, we show how to adjust standard errors to account for the simulations. Monte Carlo results show how the estimator may be applied to a range of dynamic latent variable (DLV) models, and that it performs well in comparison to several other estimators that have been proposed for DLV models.
Abstract:
This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and nonparametric methods. The methods are implemented using the statistical package R. Parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The nonparametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel-based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness of fit analysis and, importantly, statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis so that readers, both students and educators, can fully explore the techniques described.
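The log-transformation step can be illustrated with a short sketch. The paper's own code is in R; the Python version below is an independent illustration, and the simulated claim amounts, the rule-of-thumb bandwidth and the function names are assumptions. The density is estimated on the log scale, where skewed claim data are closer to symmetric, and then mapped back through the change of variables f_X(x) = f_Y(log x) / x.

```python
import math
import random
import statistics

def gaussian_kde(y0, data, bandwidth):
    """Gaussian kernel density estimate at y0."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((y0 - yi) / bandwidth) ** 2) for yi in data) / norm

def log_transformed_kde(x0, sample):
    """Density estimate at x0 > 0: smooth log(sample), then back-transform."""
    logs = [math.log(xi) for xi in sample]
    # Rule-of-thumb bandwidth on the log scale (an assumption for this sketch).
    bandwidth = 1.06 * statistics.stdev(logs) * len(logs) ** (-0.2)
    # Change of variables: f_X(x) = f_Y(log x) / x.
    return gaussian_kde(math.log(x0), logs, bandwidth) / x0

# Hypothetical claim severities: 500 lognormal draws.
rng = random.Random(1)
claims = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]

density_at_one = log_transformed_kde(1.0, claims)
print("estimated density at 1.0:", density_at_one)
```

Because the smoothing happens on the log scale, the back-transformed estimate places no mass on negative claim amounts.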
Abstract:
This paper applies the theoretical literature on nonparametric bounds on treatment effects to the estimation of how limited English proficiency (LEP) affects wages and employment opportunities for Hispanic workers in the United States. I analyze the identifying power of several weak assumptions on treatment response and selection, and stress the interactions between LEP and education, occupation and immigration status. I show that the combination of two weak but credible assumptions provides informative upper bounds on the returns to language skills for certain subgroups of the population. Adding age at arrival as a monotone instrumental variable also provides informative lower bounds.
Abstract:
We continue the development of a method for the selection of a bandwidth or a number of design parameters in density estimation. We provide explicit non-asymptotic density-free inequalities that relate the $L_1$ error of the selected estimate with that of the best possible estimate, and study in particular the connection between the richness of the class of density estimates and the performance bound. For example, our method allows one to pick the bandwidth and kernel order in the kernel estimate simultaneously and still assure that for {\it all densities}, the $L_1$ error of the corresponding kernel estimate is not larger than about three times the error of the estimate with the optimal smoothing factor and kernel plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the family of kernels used in the estimate. Further applications include multivariate kernel estimates, transformed kernel estimates, and variable kernel estimates.
Abstract:
The most suitable method for estimation of size diversity is investigated. Size diversity is computed on the basis of the Shannon diversity expression adapted for continuous variables, such as size. It takes the form of an integral involving the probability density function (pdf) of the size of the individuals. Different approaches for the estimation of the pdf are compared: parametric methods, assuming that data come from a specified family of pdfs, and nonparametric methods, where the pdf is estimated using some kind of local evaluation. Exponential, generalized Pareto, normal, and log-normal distributions have been used to generate simulated samples using estimated parameters from real samples. Nonparametric methods include discrete computation of data histograms based on size intervals and continuous kernel estimation of the pdf. The kernel approach gives an accurate estimation of size diversity, whilst parametric methods are only useful when the reference distribution has a shape similar to the real one. Special attention is given to data standardization. The division of data by the sample geometric mean is proposed as the most suitable standardization method, which shows additional advantages: the same size diversity value is obtained when using original size or log-transformed data, and size measurements with different dimensionality (lengths, areas, volumes or biomasses) may be immediately compared with the simple addition of ln k, where k is the dimensionality (1, 2, or 3, respectively). Thus, kernel estimation, after data standardization by division by the sample geometric mean, arises as the most reliable and generalizable method of size diversity evaluation.
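A minimal sketch of the kernel-based computation, under stated assumptions: sizes are divided by their geometric mean, the pdf is estimated with a Gaussian kernel and a rule-of-thumb bandwidth, and the Shannon integral is approximated with a midpoint rule on a grid. The function name `size_diversity`, the bandwidth rule, the grid size and the simulated sizes are choices made for the example, not details taken from the paper. The Gaussian kernel may also place a little mass below zero, which a more careful implementation would handle.

```python
import math
import random
import statistics

def size_diversity(sizes, grid_points=400):
    """Approximate minus the integral of p(x) ln p(x) for geometric-mean-scaled sizes."""
    geometric_mean = math.exp(sum(math.log(s) for s in sizes) / len(sizes))
    x = [s / geometric_mean for s in sizes]            # standardized sizes
    n = len(x)
    h = 1.06 * statistics.stdev(x) * n ** (-0.2)       # rule-of-thumb bandwidth (assumption)
    lo, hi = min(x) - 4.0 * h, max(x) + 4.0 * h
    step = (hi - lo) / grid_points
    norm = n * h * math.sqrt(2.0 * math.pi)
    total = 0.0
    for j in range(grid_points):
        g = lo + (j + 0.5) * step                      # midpoint of grid cell j
        p = sum(math.exp(-0.5 * ((g - xi) / h) ** 2) for xi in x) / norm
        if p > 0.0:                                    # skip cells where the estimate underflows
            total -= p * math.log(p) * step
    return total

# Hypothetical body sizes: 300 lognormal draws.
rng = random.Random(2)
sizes = [rng.lognormvariate(0.0, 0.8) for _ in range(300)]

diversity = size_diversity(sizes)
rescaled = size_diversity([3.7 * s for s in sizes])    # same data in different units
print(diversity, rescaled)
```

Because the data are divided by their geometric mean before smoothing, changing the measurement units leaves the computed value unchanged up to floating-point error.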
Abstract:
This comment corrects the errors in the estimation process that appear in Martins (2001). The first error is in the parametric probit estimation, as the previously presented results do not maximize the log-likelihood function. In the global maximum more variables become significant. As for the semiparametric estimation method, the kernel function used in Martins (2001) can take on both positive and negative values, which implies that the participation probability estimates may be outside the interval [0,1]. We have solved the problem by applying local smoothing in the kernel estimation, as suggested by Klein and Spady (1993).
Abstract:
Lean meat percentage (LMP) is an important carcass quality parameter. The aim of this work is to obtain a calibration equation for Computed Tomography (CT) scans with the Partial Least Squares Regression (PLS) technique in order to predict the LMP of the carcass and of the different cuts, and to study and compare two different methodologies for selecting the variables to be included in the prediction equation: Variable Importance for Projection (VIP) and stepwise selection. The error of prediction with cross-validation (RMSEPCV) of the LMP obtained with PLS and selection based on the VIP value was 0.82%, and for stepwise selection it was 0.83%. The prediction of the LMP from scanning only the ham had an RMSEPCV of 0.97%, and if the ham and the loin were scanned the RMSEPCV was 0.90%. Results indicate that for CT data both VIP and stepwise selection are good methods. Moreover, scanning only the ham allowed us to obtain a good prediction of the LMP of the whole carcass.
Abstract:
Properties of GMM estimators for panel data, which have become very popular in the empirical economic growth literature, are not well known when the number of individuals is small. This paper analyses through Monte Carlo simulations the properties of various GMM and other estimators when the number of individuals is the one typically available in country growth studies. It is found that, provided that some persistency is present in the series, the system GMM estimator has a lower bias and higher efficiency than all the other estimators analysed, including the standard first-differences GMM estimator.
Abstract:
This is a guide that explains how to use software that implements the simulated nonparametric moments (SNM) estimator proposed by Creel and Kristensen (2009). The guide shows how results of that paper may easily be replicated, and explains how to install and use the software for estimation of simulable econometric models.
Abstract:
This paper analyses the impact of using different correlation assumptions between lines of business when estimating the risk-based capital reserve, the Solvency Capital Requirement (SCR), under Solvency II regulations. A case study is presented and the SCR is calculated according to the Standard Model approach. Alternatively, the requirement is then calculated using an Internal Model based on a Monte Carlo simulation of the net underwriting result at a one-year horizon, with copulas being used to model the dependence between lines of business. To address the impact of these model assumptions on the SCR we conduct a sensitivity analysis. We examine changes in the correlation matrix between lines of business and address the choice of copulas. Drawing on aggregate historical data from the Spanish non-life insurance market between 2000 and 2009, we conclude that modifications of the correlation and dependence assumptions have a significant impact on SCR estimation.
Abstract:
Report for the scientific sojourn at the Philipps-Universität Marburg, Germany, from September to December 2007. In the first work, we employed the Energy-Decomposition Analysis (EDA) to investigate aromaticity in Fischer carbenes, as it is related to all the reaction mechanisms studied in my PhD thesis. This powerful tool, compared with other well-known aromaticity indices in the literature such as NICS, is useful not only for quantitative results but also to measure the degree of conjugation or hyperconjugation in molecules. Our results showed, for the annelated benzenoid systems studied here, that electron density is more concentrated on the outer rings than in the central one. The strain-induced bond localization plays a major role as a driving force to keep the more substituted ring as the less aromatic. The discussion presented in this work was contrasted at different levels of theory to calibrate the method and ensure the consistency of our results. We think these conclusions can also be extended to arene chemistry for explaining the aromaticity and regioselectivity of reactions found in those systems. In the second work, we employed the Turbomole program package and the best-performing state-of-the-art density functionals to explore reaction mechanisms in noble gas chemistry. In particular, we were interested in compounds of the form H--Ng--Ng--F (where Ng (Noble Gas) = Ar, Kr and Xe) and we investigated the relative stability of these species. Our quantum chemical calculations predict that the dixenon compound HXeXeF has an activation barrier for decomposition of 11 kcal/mol, which should be large enough to identify the molecule in a low-temperature matrix. The other noble gases present lower activation barriers and are therefore more labile and more difficult to observe experimentally.