32 results for kernel estimate

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

70.00%

Abstract:

For the standard kernel density estimate, it is known that one can tune the bandwidth such that the expected L1 error is within a constant factor of the optimal L1 error (obtained when one is allowed to choose the bandwidth with knowledge of the density). In this paper, we pose the same problem for variable bandwidth kernel estimates where the bandwidths are allowed to depend upon the location. We show in particular that for positive kernels on the real line, for any data-based bandwidth, there exists a density for which the ratio of expected L1 error over optimal L1 error tends to infinity. Thus, the problem of tuning the variable bandwidth in an optimal manner is "too hard". Moreover, from the class of counterexamples exhibited in the paper, it appears that placing conditions on the densities (monotonicity, convexity, smoothness) does not help.
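
To make the object of study concrete, here is a minimal sketch of a kernel estimate whose bandwidth depends on the evaluation point. It assumes a Gaussian kernel and the form f_n(x) = (1/(n h(x))) * sum K((x - X_i)/h(x)); the bandwidth rule `local_h` and the sample values are hypothetical and are not taken from the paper.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used here as a positive kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def variable_bandwidth_kde(x, sample, bandwidth_at):
    """Kernel estimate at x with a location-dependent bandwidth h(x)."""
    h = bandwidth_at(x)
    total = sum(gaussian_kernel((x - xi) / h) for xi in sample)
    return total / (len(sample) * h)

# Hypothetical data and bandwidth rule (assumptions for illustration only).
sample = [-1.2, -0.4, 0.0, 0.3, 0.9, 2.5]
local_h = lambda x: 0.5 + 0.2 * abs(x)  # smooths more away from the origin
density_at_zero = variable_bandwidth_kde(0.0, sample, local_h)
```

Passing a constant function as `bandwidth_at` recovers the standard fixed-bandwidth estimate, which is the case where tuning within a constant factor is known to be possible.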

Relevance:

60.00%

Abstract:

Let a class $\mathcal{F}$ of densities be given. We draw an i.i.d. sample from a density $f$ which may or may not be in $\mathcal{F}$. After every $n$, one must make a guess whether $f \in \mathcal{F}$ or not. A class is almost surely testable if there exists such a testing sequence such that for any $f$, we make finitely many errors almost surely. In this paper, several results are given that allow one to decide whether a class is almost surely testable. For example, continuity and square integrability are not testable, but unimodality, log-concavity, and boundedness by a given constant are.

Relevance:

60.00%

Abstract:

We continue the development of a method for the selection of a bandwidth or a number of design parameters in density estimation. We provide explicit non-asymptotic density-free inequalities that relate the $L_1$ error of the selected estimate with that of the best possible estimate, and study in particular the connection between the richness of the class of density estimates and the performance bound. For example, our method allows one to pick the bandwidth and kernel order in the kernel estimate simultaneously and still assure that for all densities, the $L_1$ error of the corresponding kernel estimate is not larger than about three times the error of the estimate with the optimal smoothing factor and kernel plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the family of kernels used in the estimate. Further applications include multivariate kernel estimates, transformed kernel estimates, and variable kernel estimates.

Relevance:

20.00%

Abstract:

This study examines the evolution of labor productivity across Spanish regions during the period from 1977 to 2002. By applying the kernel technique, we estimate the effects of the Transition process on labor productivity and its main sources. We find that Spanish regions experienced a major convergence process in labor productivity and in human capital in the 1977-1993 period. We also pinpoint the existence of a transition co-movement between labor productivity and human capital. Conversely, the dynamics of investment in physical capital seem unrelated to the transition dynamics of labor productivity. This lack of co-evolution can be regarded as one of the causes of the current slowdown in productivity. JEL classification: J24, N34, N940, O18, O52, R10

Relevance:

20.00%

Abstract:

Given a model that can be simulated, conditional moments at a trial parameter value can be calculated with high accuracy by applying kernel smoothing methods to a long simulation. With such conditional moments in hand, standard method of moments techniques can be used to estimate the parameter. Since conditional moments are calculated using kernel smoothing rather than simple averaging, it is not necessary that the model be simulable subject to the conditioning information that is used to define the moment conditions. For this reason, the proposed estimator is applicable to general dynamic latent variable models. Monte Carlo results show that the estimator performs well in comparison to other estimators that have been proposed for estimation of general DLV models.

Relevance:

20.00%

Abstract:

Given a model that can be simulated, conditional moments at a trial parameter value can be calculated with high accuracy by applying kernel smoothing methods to a long simulation. With such conditional moments in hand, standard method of moments techniques can be used to estimate the parameter. Because conditional moments are calculated using kernel smoothing rather than simple averaging, it is not necessary that the model be simulable subject to the conditioning information that is used to define the moment conditions. For this reason, the proposed estimator is applicable to general dynamic latent variable models. It is shown that as the number of simulations diverges, the estimator is consistent and a higher-order expansion reveals the stochastic difference between the infeasible GMM estimator based on the same moment conditions and the simulated version. In particular, we show how to adjust standard errors to account for the simulations. Monte Carlo results show how the estimator may be applied to a range of dynamic latent variable (DLV) models, and that it performs well in comparison to several other estimators that have been proposed for DLV models.

Relevance:

20.00%

Abstract:

See the abstract at the beginning of the document in the attached file.

Relevance:

20.00%

Abstract:

The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author's style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998).

Tirant lo Blanc, a chivalry book, is the main work in Catalan literature and it was hailed as "the best book of its kind in the world" by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490.

There is an intense and long-lasting debate around its authorship, sprouting from its first edition, where its introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last one fourth of the book is by Galba (?-1490), after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).

Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author. By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate that stylistic boundary to be near chapter 383.

Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, that leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations.

Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of the estimation of a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile, and the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
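
The sequence used in Section 5, the chi-square distance between each row profile and the average profile, can be sketched as below. This is an illustration using the usual correspondence-analysis definition, which is assumed here rather than quoted from the paper; the counts are made up, and the table is assumed to have no empty rows or columns.

```python
import math

def chi_square_distances(table):
    """Chi-square distance from each row profile to the average profile.

    table: list of rows of counts (e.g. one row per chapter).
    """
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    # Average profile: column totals divided by the grand total.
    average_profile = [sum(col) / grand_total for col in zip(*table)]
    distances = []
    for row, row_total in zip(table, row_totals):
        profile = [count / row_total for count in row]
        squared = sum((p - a) ** 2 / a for p, a in zip(profile, average_profile))
        distances.append(math.sqrt(squared))
    return distances

# Hypothetical counts: rows are chapters, columns are word-length categories.
chapters = [[30, 50, 20], [28, 52, 20], [45, 40, 15]]
distance_sequence = chi_square_distances(chapters)
```

A change-point model would then be fitted to `distance_sequence`; that fitting step is not shown here.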

Relevance:

20.00%

Abstract:

A method to estimate an extreme quantile that requires no distributional assumptions is presented. The approach is based on transformed kernel estimation of the cumulative distribution function (cdf). The proposed method consists of a double transformation kernel estimation. We derive optimal bandwidth selection methods that have a direct expression for the smoothing parameter. The bandwidth can adapt to the given quantile level. The procedure is useful for large data sets and improves quantile estimation compared to other methods in heavy-tailed distributions. Implementation is straightforward and R programs are available.

Relevance:

20.00%

Abstract:

In a seminal paper, Aitchison and Lauder (1985) introduced classical kernel density estimation techniques in the context of compositional data analysis. Indeed, they gave two options for the choice of the kernel to be used in the kernel estimator. One of these kernels is based on the use of the alr transformation on the simplex $S^D$ jointly with the normal distribution on $\mathbb{R}^{D-1}$. However, these authors themselves recognized that this method has some deficiencies. A method for overcoming these difficulties based on recent developments for compositional data analysis and multivariate kernel estimation theory, combining the ilr transformation with the use of the normal density with a full bandwidth matrix, was recently proposed in Martín-Fernández, Chacón and Mateu-Figueras (2006). Here we present an extensive simulation study that compares both methods in practice, thus exploring the finite-sample behaviour of both estimators.
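
As a reminder of what the alr step does, the following sketch maps a D-part composition to D-1 real coordinates and back. It uses the standard additive log-ratio definition with the last part as reference; the choice of reference part and the example composition are assumptions, and the kernel estimation itself is not shown.

```python
import math

def alr(composition):
    """Additive log-ratio transform, last part taken as the reference."""
    reference = composition[-1]
    return [math.log(part / reference) for part in composition[:-1]]

def alr_inverse(coordinates):
    """Map alr coordinates back to a composition summing to one."""
    exps = [math.exp(c) for c in coordinates] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical three-part composition (strictly positive, sums to one).
composition = [0.25, 0.25, 0.5]
coordinates = alr(composition)        # two real coordinates
recovered = alr_inverse(coordinates)  # back on the simplex
```

A normal kernel would then be placed on the transformed coordinates; the ilr variant replaces `alr` with an orthonormal log-ratio basis.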

Relevance:

20.00%

Abstract:

We use aggregate GDP data and within-country income shares for the period 1970-1998 to assign a level of income to each person in the world. We then estimate the Gaussian kernel density function for the worldwide distribution of income. We compute world poverty rates by integrating the density function below the poverty lines. The $1/day poverty rate has fallen from 20% to 5% over the last twenty-five years. The $2/day rate has fallen from 44% to 18%. There are between 300 and 500 million fewer poor people in 1998 than there were in the 1970s. We estimate global income inequality using seven different popular indexes: the Gini coefficient, the variance of log-income, two of Atkinson's indexes, the Mean Logarithmic Deviation, the Theil index and the coefficient of variation. All indexes show a reduction in global income inequality between 1980 and 1998. We also find that most global disparities can be accounted for by across-country, not within-country, inequalities. Within-country disparities have increased slightly during the sample period, but not nearly enough to offset the substantial reduction in across-country disparities. The across-country reductions in inequality are driven mainly, but not fully, by the large growth rate of the incomes of the 1.2 billion Chinese citizens. Unless Africa starts growing in the near future, we project that income inequalities will start rising again. If Africa does not start growing, then China, India, the OECD and the rest of middle-income and rich countries diverge away from it, and global inequality will rise. Thus, the aggregate GDP growth of the African continent should be the priority of anyone concerned with increasing global income inequality.
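
The step "integrating the density function below the poverty line" has a closed form for a Gaussian kernel: the integral of the kernel estimate up to z is the average of Phi((z - x_i)/h) over the observations. The sketch below illustrates that identity only; the income values, the bandwidth, and the absence of population weights are assumptions, not the paper's data or procedure.

```python
import math

def normal_cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def poverty_rate(incomes, poverty_line, bandwidth):
    """Mass of a Gaussian kernel density estimate lying below poverty_line."""
    below = sum(normal_cdf((poverty_line - x) / bandwidth) for x in incomes)
    return below / len(incomes)

# Hypothetical incomes on an arbitrary scale (for illustration only).
incomes = [0.8, 1.5, 2.2, 3.0, 4.1, 6.5]
rate_low_line = poverty_rate(incomes, poverty_line=1.0, bandwidth=0.5)
rate_high_line = poverty_rate(incomes, poverty_line=2.0, bandwidth=0.5)
```

Raising the poverty line can only increase the estimated rate, since each term is a cumulative distribution function in the line.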

Relevance:

20.00%

Abstract:

In the fixed design regression model, additional weights are considered for the Nadaraya-Watson and Gasser-Müller kernel estimators. We study their asymptotic behavior and the relationships between new and classical estimators. For a simple family of weights, and considering the IMSE as global loss criterion, we show some possible theoretical advantages. An empirical study illustrates the performance of the weighted estimators in finite samples.
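
For orientation, here is a minimal sketch of a Nadaraya-Watson estimate with an optional extra weight per design point. The abstract does not say how the additional weights enter, so multiplying each kernel term by a weight is an assumption made here for illustration, as are the Gaussian kernel and the data.

```python
import math

def gaussian_kernel(u):
    """Standard normal density used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def weighted_nadaraya_watson(x, design, responses, bandwidth, weights=None):
    """Locally weighted average of responses around x.

    With weights=None every point gets weight 1, which gives the
    classical Nadaraya-Watson estimator.
    """
    if weights is None:
        weights = [1.0] * len(design)
    kernel_weights = [
        w * gaussian_kernel((x - xi) / bandwidth)
        for xi, w in zip(design, weights)
    ]
    total = sum(kernel_weights)
    return sum(kw * y for kw, y in zip(kernel_weights, responses)) / total

# Hypothetical fixed design on [0, 1] with made-up responses.
design = [0.0, 0.25, 0.5, 0.75, 1.0]
responses = [0.1, 0.4, 0.5, 0.8, 1.1]
fit_mid = weighted_nadaraya_watson(0.5, design, responses, bandwidth=0.2)
```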

Relevance:

20.00%

Abstract:

A tool for user choice of the local bandwidth function for a kernel density estimate is developed using KDE, a graphical object-oriented package for interactive kernel density estimation written in LISP-STAT. The bandwidth function is a cubic spline, whose knots are manipulated by the user in one window, while the resulting estimate appears in another window. A real data illustration of this method raises concerns, because an extremely large family of estimates is available.

Relevance:

20.00%

Abstract:

The classical binary classification problem is investigated when it is known in advance that the posterior probability function (or regression function) belongs to some class of functions. We introduce and analyze a method which effectively exploits this knowledge. The method is based on minimizing the empirical risk over a carefully selected "skeleton" of the class of regression functions. The skeleton is a covering of the class based on a data-dependent metric, especially fitted for classification. A new scale-sensitive dimension is introduced which is more useful for the studied classification problem than other, previously defined, dimension measures. This fact is demonstrated by performance bounds for the skeleton estimate in terms of the new dimension.

Relevance:

20.00%

Abstract:

A transformed kernel estimator suitable for heavy-tailed distributions is presented. Using a transformation based on the Beta probability distribution, the choice of the bandwidth parameter is very straightforward. An application to insurance data is presented, and it is shown how to compute the Value at Risk.