97 results for Nonparametric regression techniques
Abstract:
How much would output increase if underdeveloped economies were to increase their levels of schooling? We contribute to the development accounting literature by describing a non-parametric upper bound on the increase in output that can be generated by more schooling. One advantage of our approach is that the upper bound is valid for any number of schooling levels with arbitrary patterns of substitution/complementarity. Another advantage is that the upper bound is robust to certain forms of endogenous technology response to changes in schooling. We also quantify the upper bound for all economies with the necessary data, compare our results with the standard development accounting approach, and provide an update on the results using the standard approach for a large sample of countries.
Abstract:
The paper develops a method to solve higher-dimensional stochastic control problems in continuous time. A finite difference type approximation scheme is used on a coarse grid of low discrepancy points, while the value function at intermediate points is obtained by regression. The stability properties of the method are discussed, and applications are given to test problems of up to 10 dimensions. Accurate solutions to these problems can be obtained on a personal computer.
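As a rough illustration of the two ingredients named in this abstract, the sketch below generates a coarse set of low discrepancy points and estimates a function at an intermediate point by regression. The abstract does not say which point set or which regression is used; the Halton sequence, the Gaussian kernel regression, the bandwidth, and all function names here are assumptions made for illustration only.

```python
import math

def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton_points(count, primes=(2, 3)):
    """First `count` Halton points in the unit cube (assumed point set)."""
    return [tuple(radical_inverse(i, b) for b in primes)
            for i in range(1, count + 1)]

def kernel_regress(query, points, values, bandwidth=0.15):
    """Gaussian kernel regression estimate at `query` (assumed regression type)."""
    weights = [
        math.exp(-sum((q - p) ** 2 for q, p in zip(query, pt))
                 / (2.0 * bandwidth ** 2))
        for pt in points
    ]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Stand-in "value function" known only at the coarse grid points.
grid = halton_points(200)
grid_values = [x + y for x, y in grid]
estimate = kernel_regress((0.5, 0.5), grid, grid_values)
```

In an actual solver the grid values would come from the finite difference scheme rather than from a known function.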
Abstract:
Connections between Statistics and Archaeology have always appeared very fruitful. The objective of this paper is to offer an overview of some statistical techniques that have been developed in recent years and that may be of interest to archaeologists in the short run.
Abstract:
This paper applies the theoretical literature on nonparametric bounds on treatment effects to the estimation of how limited English proficiency (LEP) affects wages and employment opportunities for Hispanic workers in the United States. I analyze the identifying power of several weak assumptions on treatment response and selection, and stress the interactions between LEP and education, occupation and immigration status. I show that the combination of two weak but credible assumptions provides informative upper bounds on the returns to language skills for certain subgroups of the population. Adding age at arrival as a monotone instrumental variable also provides informative lower bounds.
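To make the idea of nonparametric bounds concrete, the following sketch computes worst-case (no-assumption) bounds on an average treatment effect for a bounded outcome such as an employment indicator. It does not implement the specific assumptions studied in the paper (treatment response, selection, or the monotone instrumental variable); the function name and the toy data are hypothetical.

```python
def worst_case_bounds(outcomes, treated, y_min=0.0, y_max=1.0):
    """Worst-case bounds on E[Y(1)] - E[Y(0)] for an outcome in [y_min, y_max].

    The unobserved counterfactual mean of each group is replaced by the
    extreme values y_min and y_max; no other assumption is imposed.
    """
    n = len(outcomes)
    y_treated = [y for y, d in zip(outcomes, treated) if d]
    y_control = [y for y, d in zip(outcomes, treated) if not d]
    if not y_treated or not y_control:
        raise ValueError("both groups must be non-empty")
    p1 = len(y_treated) / n
    p0 = 1.0 - p1
    mean1 = sum(y_treated) / len(y_treated)
    mean0 = sum(y_control) / len(y_control)
    # Bounds on each potential-outcome mean.
    y1_low, y1_high = mean1 * p1 + y_min * p0, mean1 * p1 + y_max * p0
    y0_low, y0_high = mean0 * p0 + y_min * p1, mean0 * p0 + y_max * p1
    return y1_low - y0_high, y1_high - y0_low

# Hypothetical toy data: employment indicator and a binary treatment flag.
employed = [1, 1, 0, 1, 0, 0, 1, 1]
flag = [1, 1, 1, 1, 0, 0, 0, 0]
lower, upper = worst_case_bounds(employed, flag)
```

The interval always has width `y_max - y_min`, which is why additional assumptions of the kind discussed in the abstract are needed to obtain informative bounds.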
Abstract:
In this paper we examine the determinants of wages and decompose the observed differences across genders into the "explained by different characteristics" and "explained by different returns" components using a sample of Spanish workers. Apart from the conditional expectation of wages, we estimate the conditional quantile functions for men and women and find that both the absolute wage gap and the part attributed to different returns at each of the quantiles, far from being well represented by their counterparts at the mean, are greater as we move up in the wage range.
Abstract:
We present an exact test for whether two random variables that have known bounds on their support are negatively correlated. The alternative hypothesis is that they are not negatively correlated. No assumptions are made on the underlying distributions. We show by example that the Spearman rank correlation test, the competing exact test of correlation in nonparametric settings, rests on an additional assumption on the data generating process without which it is not valid as a test for correlation. We then show how to test for the significance of the slope in a linear regression analysis that involves a single independent variable and where outcomes of the dependent variable belong to a known bounded set.
Abstract:
There are two fundamental puzzles about trade credit: why does it appear to be so expensive, and why do input suppliers engage in the business of lending money? This paper addresses and answers both questions by analysing the interaction between the financial and the industrial aspects of the supplier-customer relationship. It examines how, in a context of limited enforceability of contracts, suppliers may have a comparative advantage over banks in lending to their customers because they hold the extra threat of stopping the supply of intermediate goods. Suppliers may also act as lenders of last resort, providing insurance against liquidity shocks that may endanger the survival of their customers. The relatively high implicit interest rates of trade credit result from the existence of default and insurance premia. The implications of the model are examined empirically using parametric and nonparametric techniques on a panel of UK firms.
Abstract:
This paper presents a comparative analysis of linear and mixed models for short-term forecasting of a real data series with a high percentage of missing data. The data are the series of significant wave heights registered at regular periods of three hours by a buoy placed in the Bay of Biscay. The series is interpolated with a linear predictor which minimizes the forecast mean square error. The linear models are seasonal ARIMA models and the mixed models have a linear component and a non-linear seasonal component. The non-linear component is estimated by a non-parametric regression of data versus time. Short-term forecasts, no more than two days ahead, are of interest because they can be used by the port authorities to notify the fleet. Several models are fitted and compared by their forecasting behavior.
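A non-parametric regression of data versus time, as mentioned in this abstract, can be sketched with a kernel smoother that simply skips missing observations. The abstract does not specify the estimator; the Nadaraya-Watson form, the Gaussian kernel, the bandwidth, and the synthetic series below are all assumptions for illustration.

```python
import math

def kernel_smooth(t_query, times, values, bandwidth):
    """Nadaraya-Watson estimate at t_query; observations equal to None are skipped."""
    numerator = denominator = 0.0
    for t, y in zip(times, values):
        if y is None:  # missing record, contributes nothing
            continue
        weight = math.exp(-0.5 * ((t_query - t) / bandwidth) ** 2)
        numerator += weight * y
        denominator += weight
    if denominator == 0.0:
        raise ValueError("no usable observations near t_query")
    return numerator / denominator

# Synthetic three-hourly series (hours, metres) with gaps; not real buoy data.
hours = list(range(0, 48, 3))
heights = [2.0 + math.sin(2.0 * math.pi * h / 24.0) for h in hours]
heights[3] = heights[4] = heights[10] = None
fitted = [kernel_smooth(h, hours, heights, bandwidth=4.0) for h in hours]
```

The bandwidth controls how far the smoother reaches across a gap: larger values bridge longer runs of missing data at the cost of a flatter fitted curve.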
Abstract:
This paper proposes a nonparametric test in order to establish the level of accuracy of the foreign trade statistics of 17 Latin American countries when contrasted with the trade statistics of the main partners in 1925. The Wilcoxon Matched-Pairs Ranks test is used to determine whether the differences between the data registered by exporters and importers are meaningful, and if so, whether the differences are systematic in any direction. The paper tests for the reliability of the data registered for two homogeneous products, petroleum and coal, both in volume and value. The conclusion of the several exercises performed is that we cannot accept the existence of statistically significant differences between the data provided by the exporters and those registered by the importing countries in most cases. The qualitative historiography of Latin America describes its foreign trade statistics as mostly unusable. Our quantitative results contest this view.
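The matched-pairs test named in this abstract can be sketched as follows: rank the absolute exporter-importer differences, sum the ranks of the positive and negative differences, and compare against the null distribution. The sketch uses the large-sample normal approximation without a tie correction, which is an assumption; the function names and the figures are hypothetical and are not taken from the paper.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus, two-sided p-value by normal approximation)."""
    diffs = [a - b for a, b in zip(first, second) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4.0
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / std
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return w_plus, w_minus, p_value

# Hypothetical paired records of the same trade flows.
exporter_records = [10, 12, 9, 14, 11]
importer_records = [9, 10, 10, 10, 6]
w_plus, w_minus, p_value = wilcoxon_signed_rank(exporter_records, importer_records)
```

For samples this small an exact null distribution would normally be preferred over the normal approximation; the sign of `w_plus - w_minus` indicates the direction of any systematic discrepancy.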
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared with error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validation methods did not reveal statistically significant differences between LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
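The resampling schemes compared in this abstract differ only in how the 167 cases are split into training and test sets. The sketch below generates index splits for three-fold cross-validation, leave-one-out, and a plain bootstrap with out-of-bag testing. The abstract does not say which three bootstrap variants were used, so the out-of-bag version, the seeds, and the function names are assumptions.

```python
import random

def k_fold_splits(n, k, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def leave_one_out_splits(n):
    """Yield (train, test) index lists with a single held-out case each."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]

def bootstrap_splits(n, rounds, seed=0):
    """Yield (train, test): train is drawn with replacement, test is out-of-bag."""
    rng = random.Random(seed)
    for _ in range(rounds):
        train = [rng.randrange(n) for _ in range(n)]
        chosen = set(train)
        yield train, [i for i in range(n) if i not in chosen]

n_patients = 167  # sample size stated in the abstract
three_fold = list(k_fold_splits(n_patients, 3))
loo = list(leave_one_out_splits(n_patients))
boot = list(bootstrap_splits(n_patients, rounds=50))
```

Each model would then be fitted on `train` and scored on `test` for every split, with the error rate and Az averaged over splits.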
Abstract:
We continue the development of a method for the selection of a bandwidth or a number of design parameters in density estimation. We provide explicit non-asymptotic density-free inequalities that relate the $L_1$ error of the selected estimate with that of the best possible estimate, and study in particular the connection between the richness of the class of density estimates and the performance bound. For example, our method allows one to pick the bandwidth and kernel order in the kernel estimate simultaneously and still assure that for {\it all densities}, the $L_1$ error of the corresponding kernel estimate is not larger than about three times the error of the estimate with the optimal smoothing factor and kernel plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the family of kernels used in the estimate. Further applications include multivariate kernel estimates, transformed kernel estimates, and variable kernel estimates.
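The performance bound described in words in this abstract can be written schematically as follows. The notation is assumed for illustration and does not come from the source: $f$ is the unknown density, $f_{n,\theta}$ the kernel estimate with parameters $\theta$ (bandwidth and kernel) ranging over a class $\Theta$, $\hat\theta$ the selected parameters, and $C$ a constant depending only on the complexity of the kernel family. The factor 3 stands in for the "about three times" of the abstract and should not be read as the exact constant.

```latex
\int \bigl| f_{n,\hat\theta} - f \bigr|
  \;\le\; 3 \, \min_{\theta \in \Theta} \int \bigl| f_{n,\theta} - f \bigr|
  \;+\; C \sqrt{\frac{\log n}{n}}
  \qquad \text{for all densities } f .
```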
Abstract:
This paper proposes a common and tractable framework for analyzing different definitions of fixed and random effects in a constant-slope variable-intercept model. It is shown that, regardless of whether effects (i) are treated as parameters or as an error term, (ii) are estimated in different stages of a hierarchical model, or whether (iii) correlation between effects and regressors is allowed, when the same information on effects is introduced into all estimation methods, the resulting slope estimator is also the same across methods. If different methods produce different results, it is ultimately because different information is being used for each method.
Abstract:
This paper shows how recently developed regression-based methods for the decomposition of health inequality can be extended to incorporate individual heterogeneity in the responses of health to the explanatory variables. We illustrate our method with an application to the Canadian NPHS of 1994. Our strategy for the estimation of heterogeneous responses is based on the quantile regression model. The results suggest that there is an important degree of heterogeneity in the association of health to explanatory variables which, in turn, accounts for a substantial percentage of inequality in observed health. A particularly interesting finding is that the marginal response of health to income is zero for healthy individuals but positive and significant for unhealthy individuals. The heterogeneity in the income response reduces both overall health inequality and income related health inequality.
Abstract:
The current operational very short-term and short-term quantitative precipitation forecast (QPF) at the Meteorological Service of Catalonia (SMC) is made by three different methodologies: advection of the radar reflectivity field (ADV); identification, tracking and forecasting of convective structures (CST); and numerical weather prediction (NWP) models using observational data assimilation (radar, satellite, etc.). These precipitation forecasts have different characteristics, lead times and spatial resolutions. The objective of this study is to combine these methods in order to obtain a single, optimized QPF at each lead time. This combination (blending) of the radar forecasts (ADV and CST) and the precipitation forecast from the NWP model is carried out by means of different methodologies according to the prediction horizon. First, in order to take advantage of the rainfall location and intensity from radar observations, a phase correction technique is applied to the NWP output to derive an additional corrected forecast (MCO). To select the best precipitation estimate in the first and second hours (t+1 h and t+2 h), the information from radar advection (ADV) and the corrected outputs from the model (MCO) are mixed using weights that vary dynamically according to indexes that quantify the quality of these predictions. This procedure integrates the skill in rainfall location and patterns given by the advection of the radar reflectivity field with the capacity of the NWP models to generate new precipitation areas. From the third hour (t+3 h), as radar-based forecasting generally has low skill, only the quantitative precipitation forecast from the model is used. This blending of different sources of prediction is verified for different types of episodes (convective, moderately convective and stratiform) to obtain a robust methodology that can be implemented in an operational and dynamic way.
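The lead-time-dependent blending described in this abstract can be sketched as a weighted average for the first two hours and a fall-back to the model forecast from the third hour on. The abstract does not give the weighting rule or the quality indexes; weights proportional to the quality index, the equal-weight fall-back, and the function name are assumptions made for illustration.

```python
def blend_qpf(lead_hours, adv, mco, nwp, quality_adv, quality_mco):
    """Blend gridded precipitation forecasts (flat lists of values, in mm).

    For lead times below 3 h, ADV and MCO are mixed with weights proportional
    to their quality indexes (assumed rule). From 3 h on, only NWP is used.
    """
    if lead_hours >= 3:
        return list(nwp)
    total = quality_adv + quality_mco
    # Fall back to equal weights when neither index is informative.
    weight_adv = 0.5 if total <= 0 else quality_adv / total
    return [weight_adv * a + (1.0 - weight_adv) * m for a, m in zip(adv, mco)]

# Hypothetical two-cell fields and quality indexes.
hour_1 = blend_qpf(1, [2.0, 0.0], [0.0, 4.0], [9.0, 9.0], 3.0, 1.0)
hour_3 = blend_qpf(3, [2.0, 0.0], [0.0, 4.0], [9.0, 9.0], 3.0, 1.0)
```

In an operational setting the quality indexes would be recomputed at every forecast cycle, which is what makes the weights dynamic.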