50 results for Statistics, Nonparametric
Abstract:
To detect directional couplings from time series, various measures based on distances in reconstructed state spaces have been introduced. These measures can, however, be biased by asymmetries in the dynamics' structure, noise color, or noise level, which are ubiquitous in experimental signals. Using theoretical reasoning and results from model systems, we identify the various sources of bias and show that most of them can be eliminated by an appropriate normalization. We further reduce the remaining biases by introducing a measure based on ranks of distances. This rank-based measure outperforms existing distance-based measures in both sensitivity and specificity for directional couplings. Our findings are therefore relevant for the reliable detection of directional couplings from experimental signals.
Abstract:
The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as information or likelihood, can be identified in the algebraic structure of A2(P), along with their corresponding notions in compositional data analysis, such as the Aitchison distance or the centered log-ratio transform. In this way, very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information such as Bayesian updating, combination of likelihoods, and robust M-estimation functions are simple additions/perturbations in A2(Pprior). Weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood-based statistics for general exponential families turn out to have a particularly easy interpretation in terms of A2(P). Regular exponential families form finite-dimensional linear subspaces of A2(P), and they correspond to finite-dimensional subspaces formed by their posterior in the dual information space A2(Pprior). The Aitchison norm can be identified with mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be Kullback-Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P)-valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.
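For readers unfamiliar with the finite-dimensional case that this abstract generalizes, the following minimal Python sketch shows the centered log-ratio transform, the Aitchison distance, and Bayesian updating written as a perturbation on a three-part simplex. It is not taken from the paper: the function names and the numbers in `prior` and `likelihood` are hypothetical, chosen only for illustration.

```python
import math

def closure(x):
    """Rescale positive components so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]

def clr(x):
    """Centered log-ratio transform: log components minus their mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

def perturb(x, y):
    """Aitchison addition: componentwise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

# Hypothetical example: three hypotheses with a prior and a likelihood vector.
prior = [0.5, 0.3, 0.2]
likelihood = [0.10, 0.40, 0.70]

# Bayes' rule on a finite space is exactly a perturbation of the prior.
posterior = perturb(prior, likelihood)
```

In clr coordinates the perturbation becomes ordinary vector addition, which is the sense in which updating is "a simple addition" in the finite case.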
Abstract:
In this paper we propose a subsampling estimator for the distribution of statistics diverging at either known or unknown rates when the underlying time series is strictly stationary and strong mixing. Based on our results, we provide a detailed discussion of how to estimate extreme order statistics with dependent data and present two applications to assessing financial market risk. Our method performs well in estimating Value at Risk and provides a superior alternative to Hill's estimator in operationalizing Safety First portfolio selection.
Abstract:
We introduce simple nonparametric density estimators that generalize the classical histogram and frequency polygon. The new estimators are expressed as linear combinations of density functions that are piecewise polynomials, where the coefficients are optimally chosen in order to minimize the integrated square error of the estimator. We establish the asymptotic behaviour of the proposed estimators, and study their performance in a simulation study.
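As background, the two classical estimators that the abstract generalizes can be sketched in a few lines of Python. This shows only the textbook histogram and frequency polygon on equal-width bins, not the new estimators of the paper; the data values, bin origin, and bin width are hypothetical, and all data are assumed to fall inside the binned range.

```python
def histogram_density(data, left, width, n_bins):
    """Bin heights of a histogram density estimate on equal-width bins."""
    counts = [0] * n_bins
    for x in data:
        k = int((x - left) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    n = len(data)
    return [c / (n * width) for c in counts]

def frequency_polygon(heights, left, width, x):
    """Linear interpolation of histogram heights between bin midpoints."""
    # Pad with one empty bin on each side so the polygon falls to zero.
    padded = [0.0] + list(heights) + [0.0]
    mids = [left + (k - 0.5) * width for k in range(len(padded))]
    if x <= mids[0] or x >= mids[-1]:
        return 0.0
    k = min(int((x - mids[0]) // width), len(mids) - 2)
    t = (x - mids[k]) / width
    return (1.0 - t) * padded[k] + t * padded[k + 1]

# Hypothetical sample and binning.
data = [0.1, 0.4, 0.5, 0.9, 1.2, 1.3, 1.7, 2.6]
left, width, n_bins = 0.0, 1.0, 3

heights = histogram_density(data, left, width, n_bins)
value_at_one = frequency_polygon(heights, left, width, 1.0)
```

The histogram is piecewise constant and the frequency polygon piecewise linear, which is why both are special cases of estimators built from piecewise polynomials.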
Abstract:
How much would output increase if underdeveloped economies were to increase their levels of schooling? We contribute to the development accounting literature by describing a non-parametric upper bound on the increase in output that can be generated by more schooling. The advantage of our approach is that the upper bound is valid for any number of schooling levels with arbitrary patterns of substitution/complementarity. Another advantage is that the upper bound is robust to certain forms of endogenous technology response to changes in schooling. We also quantify the upper bound for all economies with the necessary data, compare our results with the standard development accounting approach, and provide an update on the results using the standard approach for a large sample of countries.
Abstract:
This paper deals with the impact of "early" nineteenth-century globalization (c.1815-1860) on foreign trade in the Southern Cone (SC). Most of the evidence is drawn from bilateral trade between Britain and the SC, at a time when Britain was the main commercial partner of the new republics. The main conclusion is that early globalization had a positive impact on foreign trade in the SC, for three reasons: the SC's terms of trade improved during this period; the SC's per capita consumption of textiles (the main manufacture traded on world markets at that time) increased substantially, at a time when clothing was one of the main items of SC household budgets; and British merchants brought with them capital, shipping, and insurance, and also facilitated the formation of vast global networks, which further promoted the SC's exports to a wider range of outlets.
Abstract:
We introduce several exact nonparametric tests for finite sample multivariate linear regressions, and compare their powers. This fills an important gap in the literature where the only known nonparametric tests are either asymptotic, or assume one covariate only.
Abstract:
Random coefficient regression models have been applied in different fields and they constitute a unifying setup for many statistical problems. The nonparametric study of this model started with Beran and Hall (1992) and it has become a fruitful framework. In this paper we propose and study statistics for testing a basic hypothesis concerning this model: the constancy of coefficients. The asymptotic behavior of the statistics is investigated and bootstrap approximations are used in order to determine the critical values of the test statistics. A simulation study illustrates the performance of the proposals.
Abstract:
This paper applies the theoretical literature on nonparametric bounds on treatment effects to the estimation of how limited English proficiency (LEP) affects wages and employment opportunities for Hispanic workers in the United States. I analyze the identifying power of several weak assumptions on treatment response and selection, and stress the interactions between LEP and education, occupation and immigration status. I show that the combination of two weak but credible assumptions provides informative upper bounds on the returns to language skills for certain subgroups of the population. Adding age at arrival as a monotone instrumental variable also provides informative lower bounds.
Abstract:
In the fixed design regression model, additional weights are considered for the Nadaraya-Watson and Gasser-Müller kernel estimators. We study their asymptotic behavior and the relationships between new and classical estimators. For a simple family of weights, and considering the IMSE as global loss criterion, we show some possible theoretical advantages. An empirical study illustrates the performance of the weighted estimators in finite samples.
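For orientation, the classical (unweighted) Nadaraya-Watson estimator on a fixed, equally spaced design can be sketched as follows. This is only the textbook estimator, not the weighted variants studied in the paper; the Gaussian kernel, the bandwidth of 0.05, and the noise-free sine response are assumptions made for illustration.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of the responses around the point x0."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total

# Fixed design: n equally spaced points on (0, 1); hypothetical response.
n = 50
xs = [(i + 0.5) / n for i in range(n)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]

fit_at_quarter = nadaraya_watson(0.25, xs, ys, bandwidth=0.05)
```

Because the estimate is a weighted average of the observed responses, it always lies between their minimum and maximum; smoothing pulls the fitted value at the peak slightly below the true value of 1.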
Abstract:
This paper presents a comparative analysis of linear and mixed models for short-term forecasting of a real data series with a high percentage of missing data. The data are the series of significant wave heights registered at regular periods of three hours by a buoy placed in the Bay of Biscay. The series is interpolated with a linear predictor which minimizes the forecast mean square error. The linear models are seasonal ARIMA models and the mixed models have a linear component and a nonlinear seasonal component. The nonlinear component is estimated by a nonparametric regression of data versus time. Short-term forecasts, no more than two days ahead, are of interest because they can be used by the port authorities to notify the fleet. Several models are fitted and compared by their forecasting behavior.
Abstract:
We analyze the spatial accuracy of European foreign trade statistics compared with Latin American ones. We also include data for the USA because of the importance of this country in Latin American trade. We develop a method for mapping discrepancies between exporters and importers, trying to isolate systematic spatial deviations. Although our results do not allow a unique explanation, they offer some interesting clues about the distribution channels in the Latin American continent, as well as some spatial deviations in the statistics of individual countries. Connecting our results with the literature on the accuracy of foreign trade statistics, we can revisit Morgenstern (1963) as well as Federico and Tena (1991). Morgenstern had a very pessimistic view of the reliability of this statistical source, but his main warning focused on trade balances, not on gross export or import values. Federico and Tena (1991) demonstrated how accuracy increases with aggregation, both geographical and by product. But they remain pessimistic regarding distribution questions, remarking that it may be more accurate to use import sources in this case. We find that the data set derived from foreign trade statistics for a sample in 1925, whether from exporters or importers, is a valuable tool for studying the geography of trade patterns, although in some specific cases it needs some spatial adjustments.
Abstract:
We investigate on-line prediction of individual sequences. Given a class of predictors, the goal is to predict as well as the best predictor in the class, where the loss is measured by the self information (logarithmic) loss function. The excess loss (regret) is closely related to the redundancy of the associated lossless universal code. Using Shtarkov's theorem and tools from empirical process theory, we prove a general upper bound on the best possible (minimax) regret. The bound depends on certain metric properties of the class of predictors. We apply the bound to both parametric and nonparametric classes of predictors. Finally, we point out a suboptimal behavior of the popular Bayesian weighted average algorithm.
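To make the notions of logarithmic loss and regret concrete, here is a minimal Python sketch of the Bayesian weighted average predictor for a finite class of constant binary predictors. It is an illustration, not the paper's construction: the expert probabilities and the bit sequence are hypothetical. For a finite class of N predictors with a uniform prior, the regret of this mixture is known to be at most ln N.

```python
import math

def bayes_mixture_losses(expert_probs, sequence):
    """Cumulative log loss of the Bayesian mixture and of each expert.

    expert_probs[i] is the fixed probability expert i assigns to bit 1.
    """
    n = len(expert_probs)
    weights = [1.0 / n] * n  # uniform prior over the experts
    mixture_loss = 0.0
    expert_loss = [0.0] * n
    for bit in sequence:
        # Probability each expert assigns to the bit that actually occurred.
        likes = [p if bit == 1 else 1.0 - p for p in expert_probs]
        # The mixture predicts with the posterior-weighted average.
        p_mix = sum(w * l for w, l in zip(weights, likes))
        mixture_loss += -math.log(p_mix)
        for i in range(n):
            expert_loss[i] += -math.log(likes[i])
        # Posterior update: reweight by likelihood and renormalize.
        weights = [w * l / p_mix for w, l in zip(weights, likes)]
    return mixture_loss, expert_loss

# Hypothetical class of five experts and a short binary sequence.
experts = [0.1, 0.3, 0.5, 0.7, 0.9]
sequence = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]

mixture_loss, expert_loss = bayes_mixture_losses(experts, sequence)
regret = mixture_loss - min(expert_loss)
```

The ln N bound follows because the probability the mixture assigns to the whole sequence is the average of the experts' probabilities, hence at least 1/N times the best one.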
Abstract:
A national survey designed for estimating a specific population quantity is sometimes also used to estimate this quantity for a small area, such as a province. Budget constraints do not allow a greater sample size for the small area, and so other means of improving estimation have to be devised. We investigate such methods and assess them by a Monte Carlo study. We explore how a complementary survey can be exploited in small area estimation. We use the context of the Spanish Labour Force Survey (EPA) and the Barometer in Spain for our study.
Abstract:
We propose a new family of density functions that possess both flexibility and closed form expressions for moments and anti-derivatives, making them particularly appealing for applications. We illustrate its usefulness by applying our new family to obtain density forecasts of U.S. inflation. Our methods generate forecasts that improve on standard methods based on AR-ARCH models relying on normal or Student's t-distributional assumptions.