5 resultados para Alternative splicing

em Universitat de Girona, Spain


Relevância:

20.00% 20.00%

Publicador:

Resumo:

We compare correspondance análisis to the logratio approach based on compositional data. We also compare correspondance análisis and an alternative approach using Hellinger distance, for representing categorical data in a contingency table. We propose a coefficient which globally measures the similarity between these approaches. This coefficient can be decomposed into several components, one component for each principal dimension, indicating the contribution of the dimensions to the difference between the two representations. These three methods of representation can produce quite similar results. One illustrative example is given

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author, with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in catalan literature and it was hailed to be “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writters like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490. There is an intense and long lasting debate around its authorship sprouting from its first edition, where its introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last one fourth of the book is by Galba (?-1490), after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting the double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis can not be used to help classify chapters by author. By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimates that stylistic boundary to be near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and into the use of vowels in each chapter of the book. Given that the features selected are categorical, that leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of the estimation of a suden change-point in those sequences, in the following sections we propose various ways to estimate change-points in multinomial sequences; the method in section 4 involves fitting models for polytomous data, the one in Section 5 fits gamma models onto the sequence of Chi-square distances between each row profiles and the average profile, the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In CoDaWork’05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This new proposed methodology provides an alternative, more acceptable approach for geologists as their focus is on mapping the unknown tephra with relevant eruptive events rather than estimating the age of unknown tephra. Kew words: Tephrochronology; Segmented regression

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The pedagogical and didactic dynamic system is focused on individual learning process and aims at the development of artistic knowledge, helping and guiding learners through different strategies or individual support, thus reinforcing the process. In consequence, this presentation looks for an alternative to the intercommunication student-teacher supported on the educational paradigm, through textual analyses of the daily diaries, developped by teacher and students, so as to discover successes or difficulties