10 results for estimate
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author.
For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998).
Tirant lo Blanc, a chivalry book, is the main work in Catalan literature, and it was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490.
There is an intense and long-lasting debate around its authorship, stemming from its first edition, where the introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last quarter of the book is by Galba (?-1490), written after the death of Martorell. Some of the authors who support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990).
Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author. By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate the stylistic boundary to be near chapter 383.
Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations.
Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile; and the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis, as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
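The idea of a sudden change-point in a sequence of multinomial observations can be illustrated with a minimal sketch. This is not the paper's method (which relies on generalized linear models); it is a plain profile-likelihood search, and the function names and the toy counts below are assumptions made purely for illustration. Each row stands for the category counts of one chapter, and the split that maximizes the sum of the two pooled multinomial log-likelihoods is taken as the estimated boundary.

```python
import math


def pooled_loglik(rows):
    """Multinomial log-likelihood (up to an additive constant) of a group
    of count rows that are assumed to share a single category profile."""
    totals = [sum(col) for col in zip(*rows)]
    n = sum(totals)
    # Plug in the maximum likelihood estimate p_j = total_j / n.
    return sum(c * math.log(c / n) for c in totals if c > 0)


def estimate_change_point(rows):
    """Return k such that fitting one profile to rows[:k] and another to
    rows[k:] gives the highest total log-likelihood."""
    best_k, best_ll = None, -math.inf
    for k in range(1, len(rows)):
        ll = pooled_loglik(rows[:k]) + pooled_loglik(rows[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Hypothetical toy data: 5 "chapters", 3 categories, profile shifts after row 3.
counts = [
    [30, 50, 20],
    [28, 52, 20],
    [31, 49, 20],
    [50, 30, 20],
    [52, 28, 20],
]
change = estimate_change_point(counts)
print(change)
```

On this toy table the search places the boundary after the third row. A real analysis would also need a measure of uncertainty around the estimate, which this sketch does not provide.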
Abstract:
The classical binary classification problem is investigated when it is known in advance that the posterior probability function (or regression function) belongs to some class of functions. We introduce and analyze a method which effectively exploits this knowledge. The method is based on minimizing the empirical risk over a carefully selected "skeleton" of the class of regression functions. The skeleton is a covering of the class based on a data-dependent metric, especially fitted for classification. A new scale-sensitive dimension is introduced which is more useful for the studied classification problem than other, previously defined, dimension measures. This fact is demonstrated by performance bounds for the skeleton estimate in terms of the new dimension.
Abstract:
This work proposes a parallel architecture for a motion estimation algorithm. It is well known that image processing requires a huge amount of computation, mainly in low-level processing, where the algorithms deal with a great number of pixels. One of the solutions for estimating motion involves detecting the correspondences between two images. Due to its regular processing scheme, a parallel implementation of the correspondence problem can be an adequate approach to reducing the computation time. This work introduces a parallel and real-time implementation of such low-level tasks, to be carried out from the moment the current image is acquired by the camera until the pairs of point matches are detected.
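As a rough illustration of what detecting correspondences between two images can involve, the following sketch uses exhaustive block matching with a sum of absolute differences (SAD) cost. The abstract does not say which matching criterion or architecture the authors use, so the technique, the function names and the toy images here are all assumptions. Each block is matched independently of the others, which is the kind of regular structure that lends itself to parallel execution.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block of `prev` at (top, left)
    and the block of `curr` displaced by (dy, dx)."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(prev[top + r][left + c]
                         - curr[top + dy + r][left + dx + c])
    return total


def match_block(prev, curr, top, left, size, radius):
    """Return the displacement (dy, dx) within the search radius that
    minimises the SAD cost for one block."""
    h, w = len(curr), len(curr[0])
    best, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidate positions that fall outside the image.
            if top + dy < 0 or left + dx < 0:
                continue
            if top + dy + size > h or left + dx + size > w:
                continue
            cost = sad(prev, curr, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


# Hypothetical toy frames: a 2x2 patch moves one row down, two columns right.
prev_frame = [[0] * 6 for _ in range(6)]
curr_frame = [[0] * 6 for _ in range(6)]
patch = [[9, 8], [7, 6]]
for r in range(2):
    for c in range(2):
        prev_frame[1 + r][1 + c] = patch[r][c]
        curr_frame[2 + r][3 + c] = patch[r][c]

motion = match_block(prev_frame, curr_frame, top=1, left=1, size=2, radius=2)
print(motion)
```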
Abstract:
Testing whether or not data could have been generated by a family of extreme value copulas is difficult. We generalize a test and we prove that it can be applied whatever the alternative hypothesis. We also study the effect of using different extreme value copulas in the context of risk estimation. To measure the risk we use a quantile. Our results are motivated by a bivariate sample of losses from a real database of auto insurance claims. Methods are implemented in R.
Abstract:
This paper examines the quantitative effects of gender gaps in entrepreneurship and labor force participation on aggregate productivity and income per capita. We simulate an occupational choice model with heterogeneous agents in entrepreneurial ability, where agents choose to be workers, self-employed or employers. The model assumes that men and women have the same talent distribution, but we impose several frictions on women's opportunities and pay in the labor market. In particular, we restrict the fraction of women participating in the labor market.
Abstract:
A maximum entropy statistical treatment of an inverse problem concerning frame theory is presented. The problem arises from the fact that a frame is an overcomplete set of vectors that defines a mapping with no unique inverse. Although any vector in the concomitant space can be expressed as a linear combination of frame elements, the coefficients of the expansion are not unique. Frame theory guarantees the existence of a set of coefficients which is “optimal” in a minimum norm sense. We show here that these coefficients are also “optimal” from a maximum entropy viewpoint.
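The minimum-norm coefficients mentioned above can be computed in a small case. The sketch below covers only the minimum-norm side (it does not reproduce the maximum entropy argument), and it assumes a hypothetical three-vector frame of the plane chosen for illustration. The coefficients are obtained as inner products of each frame element with S⁻¹x, where S is the frame operator.

```python
def min_norm_coefficients(frame, x):
    """Minimum-norm coefficients c with sum_i c[i] * frame[i] == x,
    for a frame of the plane given as a list of 2-vectors."""
    # Frame operator S = sum_i f_i f_i^T, a symmetric 2x2 matrix.
    s11 = sum(f[0] * f[0] for f in frame)
    s12 = sum(f[0] * f[1] for f in frame)
    s22 = sum(f[1] * f[1] for f in frame)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("the vectors do not span the plane, so they are not a frame")
    # Solve S y = x using the closed-form inverse of a 2x2 matrix.
    y0 = (s22 * x[0] - s12 * x[1]) / det
    y1 = (-s12 * x[0] + s11 * x[1]) / det
    # c_i = <f_i, S^{-1} x>
    return [f[0] * y0 + f[1] * y1 for f in frame]


# Hypothetical overcomplete set: three vectors spanning a 2-dimensional space.
frame = [(1, 0), (0, 1), (1, 1)]
x = (3, 0)
coeffs = min_norm_coefficients(frame, x)
print(coeffs)
```

For this example the expansion x = 3·(1, 0) is also valid, but its coefficient vector (3, 0, 0) has squared norm 9, whereas the coefficients returned by the function have squared norm 6.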
Abstract:
In this work, a LIDAR-based 3D Dynamic Measurement System is presented and evaluated for the geometric characterization of tree crops. Using this measurement system, trees were scanned from two opposing sides to obtain two three-dimensional point clouds. After registration of the point clouds, a simple and easily obtainable parameter is the number of impacts received by the scanned vegetation. The work in this study is based on the hypothesis of the existence of a linear relationship between the number of impacts of the LIDAR sensor laser beam on the vegetation and the tree leaf area. Tests performed under laboratory conditions using an ornamental tree and, subsequently, in a pear tree orchard demonstrate the correct operation of the measurement system presented in this paper. The results from both the laboratory and field tests confirm the initial hypothesis and the 3D Dynamic Measurement System is validated in field operation. This opens the door to new lines of research centred on the geometric characterization of tree crops in the field of agriculture and, more specifically, in precision fruit growing.