1000 results for Literary style -- Statistical methods
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and their variation between authors should be larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in Catalan literature, and it was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Dámaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490. There has been an intense and long-lasting debate around its authorship, stemming from its first edition, whose introduction states that the whole book is the work of Martorell (1413?-1468), while the end states that the last quarter of the book is by Galba (?-1490), written after the death of Martorell. Some of the authors who support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author. 
By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate the stylistic boundary to be near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables with ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile; the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis, as well as onto sequences of other summary measures such as the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
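As a rough illustration of two ideas named in this abstract, the sketch below computes the chi-square distance between each row profile of a contingency table and the average profile, and then locates a single change-point by a plain maximum-likelihood split of the multinomial sequence. This is an assumption-laden toy, not the paper's method: the paper fits gamma and generalized linear models, whereas the function names, the toy table and the likelihood-split criterion here are all illustrative choices.

```python
import numpy as np

def chi_square_distances(table):
    """Chi-square distance between each row profile and the average profile."""
    table = np.asarray(table, dtype=float)
    row_profiles = table / table.sum(axis=1, keepdims=True)
    avg_profile = table.sum(axis=0) / table.sum()
    return ((row_profiles - avg_profile) ** 2 / avg_profile).sum(axis=1)

def pooled_multinomial_loglik(rows):
    """Maximised multinomial log-likelihood (up to a constant) of pooled rows."""
    totals = rows.sum(axis=0)
    probs = totals / totals.sum()
    mask = totals > 0  # skip empty categories to avoid log(0)
    return float((totals[mask] * np.log(probs[mask])).sum())

def estimate_change_point(table):
    """Number of rows k in the first segment that maximises the split likelihood."""
    table = np.asarray(table, dtype=float)
    n_rows = table.shape[0]
    scores = [
        pooled_multinomial_loglik(table[:k]) + pooled_multinomial_loglik(table[k:])
        for k in range(1, n_rows)
    ]
    return 1 + int(np.argmax(scores))

# Hypothetical table: rows are chapters, columns are counts of three categories
# (for example short, medium and long words). The last two rows follow a
# different profile from the first four.
toy_table = [
    [30, 50, 20],
    [28, 52, 20],
    [31, 49, 20],
    [29, 51, 20],
    [50, 30, 20],
    [52, 28, 20],
]
distances = chi_square_distances(toy_table)
boundary = estimate_change_point(toy_table)
```

With this toy table the distances of the last two rows stand out from the first four, and the estimated boundary falls after the fourth row.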
Abstract:
The aim of this study, which is part of a research project on functional decline and mortality in frail elderly people, is to build a predictive survival process that takes into account the functional and nutritional evolution of patients over time. The study involves the analysis of survival data and repeated measures, but the usual statistical methods for the joint treatment of this type of data are not appropriate in this case. As an alternative, we use multi-state survival models to assess the association between mortality and the recovery, or not, of the functional and nutritional levels considered normal. After estimating the model and identifying the prognostic factors of mortality, it is possible to obtain a predictive process that yields predictions of patient survival based on each patient's specific history up to a given time. This allows a more precise prognosis for each group of patients, which can be very useful to health professionals when making clinical decisions.
Abstract:
We give a 5-approximation algorithm for the rooted Subtree-Prune-and-Regraft (rSPR) distance between two phylogenies, which was recently shown to be NP-complete by Bordewich and Semple [5]. This paper presents the first approximation result for this important tree distance. The algorithm follows a standard format for tree distances such as Rodrigues et al. [24] and Hein et al. [13]. The novel ideas are in the analysis. In the analysis, the cost of the algorithm uses a “cascading” scheme that accounts for possible wrong moves. This accounting is missing from previous analyses of tree distance approximation algorithms. Further, we show how all algorithms of this type can be implemented in linear time and give experimental results.
Abstract:
The aim of the present study is to characterize and assess the degree of heavy-metal contamination of the soils adjacent to a road in the Barcelona metropolitan area. This characterization includes comparing the measured values with the legal maximum levels and determining the anthropogenic or lithogenic origin of the metals by means of statistical methods. To achieve this aim, field-portable X-ray fluorescence (FPXRF) will be used, so the study will also include the validation of the results obtained with this technique by comparison with a reference method, as well as the study of the technique's main limitation, which is the presence of moisture in the samples.
Abstract:
Study carried out during a stay at the Computer Science and Artificial Intelligence Lab of the Massachusetts Institute of Technology between 2006 and 2008. The research developed in this project focuses on machine learning methods for the syntactic analysis of language. As a starting point, we establish that the complexity of language requires not only understanding the computational processes associated with language but also understanding how the knowledge needed to carry out these processes can be learned automatically.
Abstract:
This technical report was prepared as a deliverable [D4.3 Report of the Interlinkages and forecasting prototype tool] of an EU project, DECOIN (Project No. 044428 - FP6-2005-SSP-5A). The text is divided into four sections: (1) this short introductory section explains the purpose of the report; (2) the second section provides a general discussion of a systemic problem found in existing quantitative analyses of sustainability. It addresses the epistemological implications of complexity, which entail the need to deal with multiple scales and non-equivalent narratives (multiple dimensions/attributes) used to define sustainability issues. There is an unavoidable tension between a “steady-state view” (the perception of what is going on now, reflecting a PAST --> PRESENT view of reality) and an “evolutionary view” (the unknown transformation that we have to expect in the process of becoming of the observed reality and of the observer, reflecting a PRESENT --> FUTURE view of reality). The section ends by listing the implications of these points for the choice of integrated packages of sustainability indicators; (3) the third section illustrates the potential of the DECOIN toolkit for the study of sustainability trade-offs and linkages across indicators, using quantitative examples taken from case studies of another EU project (SMILE). In particular, this section starts by addressing the existence of internal constraints to sustainability (economic versus social aspects). The narrative chosen for this discussion focuses on the dark side of ageing and immigration for the economic viability of social systems. The section then explores external constraints to sustainability (economic development versus the environment). 
The narrative chosen for this discussion focuses on the dark side of the current strategy of economic development based on externalization and the “bubbles-disease”; (4) the last section presents a critical appraisal of the quality of energy data found in energy statistics. It starts with a discussion of the general goal of statistical accounting. Then it introduces the concept of multipurpose grammars. The second part uses the experience gained in the activities of the DECOIN project to answer the question: how useful are EUROSTAT energy statistics? The answer starts with an analysis of basic epistemological problems associated with the accounting of energy. This discussion leads to the acknowledgment of an important epistemological problem: the unavoidable bifurcations in the mechanism of accounting needed to generate energy statistics. Using numerical examples, the text deals with the following issues: (i) the pitfalls of the current system of accounting in energy statistics; (ii) a critical appraisal of the current system of accounting in BP statistics; (iii) a critical appraisal of the current system of accounting in Eurostat statistics. The section ends by proposing an innovative method of representing energy statistics that may be more useful for those wishing to develop sustainability indicators.
Abstract:
This document presents an integrated analysis of the performance of Catalonia based on how energy consumption (measured at the societal level for Catalan society) is used within both the productive sectors of the economy and the household sector to generate added value and jobs and to guarantee a given material standard of living for the population. The trends found in Catalonia are compared with those of other European countries to contextualize the performance of Catalonia with respect to societies that have followed different paths of economic development. The first part of the document presents the Multi-Scale Integrated Analysis of Societal and Ecosystem Metabolism (MuSIASEM) approach, which has been used to provide this integrated analysis of Catalan society across different scales (starting from an analysis of the specific sectors of the Catalan economy as an Autonomous Community and scaling up to an intra-regional (European Union 14) comparison) and across different dimensions of analysis of energy consumption coupled with added value generation. Within the scope of this study, we observe the trajectories of change in the metabolic pattern of Catalonia and the EU14 countries in the Paid Work sectors, namely the Agricultural Sector, the Productive Sector and the Services and Government Sector, in comparison with the changes in the household sector. The flow intensities of exosomatic energy and of added value generated for each specific sector are defined per hour of human activity, and are thus characterized as Exosomatic Metabolic Rate (MJ/hour) and Economic Labour Productivity (€/hour) across multiple levels. The second part of the document explores the possible use of the MuSIASEM approach for land use analysis (using a multi-level matrix of land use categories).
Abstract:
This study presents a first attempt to extend the “Multi-scale integrated analysis of societal and ecosystem metabolism (MuSIASEM)” approach to a spatial dimension using GIS techniques in the Metropolitan area of Barcelona. We use a combination of census and commercial databases along with a detailed land cover map to create a layer of Common Geographic Units, which we populate with the local values of human time spent in different activities according to the MuSIASEM hierarchical typology. In this way, we map the hours of available human time against the working hours spent in different locations, highlighting the gradients in spatial density between the residential location of workers (which generates the work supply) and the places where the working hours actually take place. We found a strong trimodal pattern of clumps of areas with different combinations of values of time spent on household activities and on paid work. We also measured and mapped spatial segregation between these two activities and put forward the conjecture that this segregation increases with higher energy throughput, as the size of the functional units must be able to cope with the flow of exosomatic energy. Finally, we discuss the effectiveness of the approach by comparing our geographic representation of exosomatic throughput with the one obtained from conventional methods.
Abstract:
Method validation is one of the fundamental pillars of quality assurance in analytical laboratories, as reflected in the ISO/IEC 17025 standard. It is therefore a topic that must be addressed in the curricula of present and future Chemistry degrees. There is a large body of literature on method validation, but it is very often little used because of the evident difficulty of processing all the available information and applying it in the laboratory and to specific problems. Another limitation in this field is the lack of software adapted to the needs of the laboratory. Many of the statistical routines used in method validation are adaptations made with Microsoft Excel or come built into very large, highly complex statistical packages. For this reason, the aim of the project was to produce software for method validation and for quality assurance of analytical results that incorporates only the necessary routines. Specifically, the software includes the statistical functions needed to verify the accuracy and evaluate the precision of an analytical method. The programming language chosen was Java, version 6. The creation of the software consisted of the following stages: requirements gathering, requirements analysis, modular design of the software, programming of the program's functions and of the graphical interface, creation of integration tests and testing with real users and, finally, deployment of the software (creation of the installer and distribution of the software).
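The abstract does not say which statistical routines the software implements. As a hedged illustration of what "verifying accuracy and evaluating precision" can look like, the sketch below computes a one-sample t statistic of replicate results against a reference value and the relative standard deviation of the replicates. Both routines are common choices assumed here, the function names and data are hypothetical, and the sketch is in Python rather than the Java 6 used by the project.

```python
import math
import statistics

def trueness_t_statistic(measurements, reference_value):
    """One-sample t statistic comparing the replicate mean with a reference value."""
    mean = statistics.mean(measurements)
    std_dev = statistics.stdev(measurements)  # sample standard deviation (n - 1)
    return (mean - reference_value) / (std_dev / math.sqrt(len(measurements)))

def relative_standard_deviation(measurements):
    """Precision expressed as relative standard deviation, in percent."""
    return 100.0 * statistics.stdev(measurements) / statistics.mean(measurements)

# Hypothetical replicate results for a reference material certified at 10.0
replicates = [10.1, 9.9, 10.0, 10.2, 9.8]
t_stat = trueness_t_statistic(replicates, reference_value=10.0)
rsd_percent = relative_standard_deviation(replicates)
```

The t statistic would then be compared with a critical value for n - 1 degrees of freedom at the chosen confidence level, and the RSD with whatever precision limit the laboratory has set.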
Abstract:
In this paper we address the complexity of the analysis of water use in relation to the issue of sustainability. The flows of water on our planet represent a complex reality that can be studied using many different perceptions and narratives referring to different scales and dimensions of analysis. For this reason, a quantitative analysis of water use has to be based on analytical methods that are semantically open: they must be able to define what we mean by the term “water” when crossing different scales of analysis. We propose here a definition of water as a resource that deals with the many services it provides to humans and ecosystems. We argue that water can fulfil so many of them because it has many characteristics that allow the resource to be labelled with different attributes depending on the end use, such as drinking. Since the services for humans and the functions for ecosystems associated with water flows are defined on different scales but are still interconnected, it is necessary to organize our assessment of water use across different hierarchical levels. To do so, we define how to approach the study of water use in the Societal Metabolism by proposing the Water Metabolism, organized in three levels: societal level, ecosystem level and global level. The possible end uses we distinguish for society are personal/physiological use, household use and economic use. Organizing the study of “water use” across all these levels increases the usefulness of the quantitative analysis and the possibility of finding relevant and comparable results. To achieve this, we adapted a method developed for multi-level, multi-scale analysis, the Multi-Scale Integrated Analysis of Societal and Ecosystem Metabolism (MuSIASEM) approach, to the analysis of water metabolism. 
In this paper, we discuss the peculiar analytical identity that “water” shows within multi-scale metabolic studies: water represents a flow-element when considering the metabolism of social systems (at a small scale, when describing the water metabolism inside society) and a fund-element when considering the metabolism of ecosystems (at a larger scale, when describing the water metabolism outside society). The theoretical analysis is illustrated using two case studies, which characterize the metabolic patterns of water use for a productive system in Catalonia and for a water management policy in the Andarax River Basin in Andalusia.
Abstract:
Research project carried out during a stay at the Universidad Politécnica de Madrid, Spain, between September and December 2007. The aerospace and aeronautical industry currently prioritizes improving the reliability of its structures through the development of new systems for monitoring and detecting impacts. There are several potentially useful techniques, and their applicability in a particular situation depends critically on the size of defect that the structure tolerates. Any defect will change the vibrational response of the structural element, as well as the transient of the wave that propagates through the elastic structure. Correlating these changes, which can be detected experimentally, with the occurrence of the defect, its location and its quantification is a very complex problem. This work explores the use of Principal Component Analysis (PCA), based on the formulation of the T2 and Q statistics, to detect and distinguish defects in the structure by correlating them with changes in the vibrational response. The structure used for the study is a turbine blade from a commercial aircraft. The blade is excited at one end with a shaker, and seven PZT sensors are bonded to its surface. A known signal is applied and the responses are analysed. A PCA model is built using data from the undamaged structure. To test the model, a piece of aluminium is attached at four different positions. The data from the tests on the damaged structure are projected onto the model. The principal components and the Q-residual and Hotelling T2 distances are used to analyse the incidents. The Q-residual indicates how well each sample fits the PCA model, since it measures the difference, or residual, between the sample and its projection onto the principal components retained in the model. 
The Hotelling T2 distance measures the variation of each sample within the PCA model or, equivalently, its distance from the centre of the PCA model.
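The two statistics described in this abstract can be sketched in a few lines. The example below fits a PCA model to baseline data, projects new samples onto it, and computes the Hotelling T2 (scaled distance to the model centre within the retained components) and the Q-residual (squared norm of what the model leaves unexplained). The synthetic data, function names and the choice of one retained component are assumptions for illustration only; they do not reproduce the sensor data or the preprocessing used in the study.

```python
import numpy as np

def fit_pca(baseline, n_components):
    """Fit a PCA model: return the mean, loadings and score variances."""
    baseline = np.asarray(baseline, dtype=float)
    mean = baseline.mean(axis=0)
    centred = baseline - mean
    # Rows of vt are the principal directions of the centred data
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    loadings = vt[:n_components].T  # shape: (features, components)
    variances = singular_values[:n_components] ** 2 / (baseline.shape[0] - 1)
    return mean, loadings, variances

def t2_and_q(samples, mean, loadings, variances):
    """Hotelling T2 and Q-residual of each sample with respect to the model."""
    centred = np.asarray(samples, dtype=float) - mean
    scores = centred @ loadings
    t2 = (scores ** 2 / variances).sum(axis=1)
    residual = centred - scores @ loadings.T
    q = (residual ** 2).sum(axis=1)
    return t2, q

# Synthetic "undamaged" data lying close to a single direction in 3-D
rng = np.random.default_rng(0)
latent = rng.normal(size=50)
baseline = np.outer(latent, [1.0, 1.0, 0.0]) + 0.01 * rng.normal(size=(50, 3))

mean, loadings, variances = fit_pca(baseline, n_components=1)
base_t2, base_q = t2_and_q(baseline, mean, loadings, variances)

# First sample leaves the model subspace (large Q); second stays in it but
# lies far from the centre (large T2).
new_t2, new_q = t2_and_q([[0.0, 0.0, 1.0], [10.0, 10.0, 0.0]],
                         mean, loadings, variances)
```

In this toy setting a sample that departs from the baseline subspace shows up in Q, while a sample that follows the baseline pattern with unusual amplitude shows up in T2, which is why the two statistics are typically monitored together.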
Abstract:
The main purpose of the Log2XML application is to transform plain-text log files with delimiter-separated fields into a standardized XML format. So that the application can work with logs from different systems or applications, it provides a template system (specifying the field order and the separator character) that defines the minimal structure needed to extract information from any type of log based on field separators. Finally, the application allows the extracted information to be processed to generate reports and statistics. In addition, the project explores the Grails technology in depth.
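The template idea described here, a field order plus a separator character, can be sketched as follows. This is a minimal illustration and not the Log2XML implementation: the function name, the `log` and `entry` element names and the sample log line are hypothetical, and the sketch is written in Python rather than with Grails.

```python
import xml.etree.ElementTree as ET

def log_to_xml(lines, field_names, separator):
    """Convert delimiter-separated log lines to XML using a field-order template."""
    root = ET.Element("log")  # element names are illustrative assumptions
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        values = line.split(separator)
        entry = ET.SubElement(root, "entry")
        # Pair each value with the field name at the same position in the template
        for name, value in zip(field_names, values):
            ET.SubElement(entry, name).text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical template and log line
template = {"fields": ["date", "level", "message"], "separator": ";"}
xml_text = log_to_xml(
    ["2010-01-01;INFO;service started"],
    template["fields"],
    template["separator"],
)
```

Keeping the template as plain data (a list of names and one separator) is what lets the same conversion routine handle logs from different systems without code changes.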
Abstract:
This presentation aims to explain the use and application context of two webometrics techniques, log analysis and Google Analytics, which currently coexist in the Virtual Library of the UOC. First, a comprehensive introduction to webometrics is provided; then the case of the UOC's Virtual Library is analysed, focusing on the assimilation of these techniques and the considerations underlying their use, and covering in a holistic way the process of gathering, processing and exploiting data. Finally, guidelines are provided for interpreting the metric variables obtained.
Abstract:
This study examines how structural determinants influence intermediary factors of child health inequities and how they operate through the communities where children live. In particular, we explore individual, family and community level characteristics associated with a composite indicator that quantitatively measures intermediary determinants of early childhood health in Colombia. We use data from the 2010 Colombian Demographic and Health Survey (DHS). Adopting the conceptual framework of the Commission on Social Determinants of Health (CSDH), three dimensions related to child health are represented in the index: behavioural factors, psychosocial factors and the health system. To generate the weights of the variables while taking into account the discrete nature of the data, principal component analysis (PCA) using polychoric correlations is employed in the index construction. Weighted multilevel models are used to examine community effects. The results show that the effect of household SES is attenuated when community characteristics are included, indicating the importance that the level of community development may have in mediating individual and family characteristics. The findings indicate that there is significant variance between communities in intermediary determinants of child health, especially for those determinants linked to the health system, even after controlling for individual, family and community characteristics. These results likely reflect that, whilst the community context can exert a greater influence on intermediary factors linked directly to health, in the case of psychosocial factors and parents’ behaviours the family context can be more important. This underlines the importance of distinguishing between community and family intervention programmes.