4 results for Generalized linear mixed model
at Universitat de Girona, Spain
Abstract:
Comparison of donor-acceptor electronic couplings calculated within two-state and three-state models suggests that the two-state treatment can give unreliable estimates of $V_{da}$ because it neglects multistate effects. We show that in most cases accurate values of the electronic coupling in a π stack, where donor and acceptor are separated by a bridging unit, can be obtained as $\tilde{V}_{da} = (E_2 - E_1)\,\mu_{12}/R_{da} + (2E_3 - E_1 - E_2)\,2\mu_{13}\mu_{23}/R_{da}^{2}$, where $E_1$, $E_2$, and $E_3$ are the adiabatic energies of the ground, charge-transfer, and bridge states, respectively, $\mu_{ij}$ is the transition dipole moment between states $i$ and $j$, and $R_{da}$ is the distance between the planes of donor and acceptor. In this expression, based on the generalized Mulliken-Hush approach, the first term corresponds to the coupling derived within a two-state model, whereas the second term is the superexchange correction accounting for the bridge effect. The formula is extended to bridges consisting of several subunits. The influence of the donor-acceptor energy mismatch on the excess charge distribution, adiabatic dipole and transition moments, and electronic couplings is examined. A diagnostic is developed to determine whether the two-state approach can be applied. Based on numerical results, we show that the superexchange correction considerably improves estimates of the donor-acceptor coupling derived within a two-state approach. In most cases where the two-state scheme fails, the formula gives reliable results that are in good agreement (within 5%) with the data of the three-state generalized Mulliken-Hush model.
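As a rough illustration of the coupling expression in this abstract, the sketch below evaluates its two terms separately. It assumes that the first term is divided by Rda and the second by Rda squared (the dimensionally consistent reading of the formula), and that all quantities are given in one consistent unit system such as atomic units. The function name and the input values are hypothetical and are not taken from the paper.

```python
def coupling_with_superexchange(e1, e2, e3, mu12, mu13, mu23, r_da):
    """Evaluate the two-state term and the superexchange correction.

    e1, e2, e3: adiabatic energies of the ground, charge-transfer and
                bridge states.
    mu12, mu13, mu23: transition dipole moments between those states.
    r_da: distance between the donor and acceptor planes.
    Assumption: the terms are divided by r_da and r_da**2 respectively.
    """
    if r_da <= 0:
        raise ValueError("r_da must be positive")
    # Coupling derived within the two-state model.
    two_state = (e2 - e1) * mu12 / r_da
    # Superexchange correction accounting for the bridge state.
    superexchange = (2.0 * e3 - e1 - e2) * 2.0 * mu13 * mu23 / r_da ** 2
    return two_state, superexchange, two_state + superexchange


# Made-up inputs, chosen only to show the call.
two_state, correction, total = coupling_with_superexchange(
    e1=0.0, e2=1.0, e3=2.0, mu12=2.0, mu13=1.0, mu23=1.0, r_da=2.0
)
print(two_state, correction, total)
```

Returning the two terms separately makes it easy to see how large the bridge correction is relative to the two-state estimate.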
Abstract:
Survival studies are concerned with the time that elapses from the start of the study (diagnosis of the disease, start of treatment, ...) until the event of interest occurs (death, cure, improvement, ...). Often, however, this event is observed more than once in the same individual during the follow-up period (multivariate survival data). In that case, a methodology different from that of standard survival analysis is needed. The main problem in studying this type of data is that the observations may not be independent. Until now, this problem has been addressed in two different ways, depending on the dependent variable. If this variable follows a distribution in the exponential family, generalized linear mixed models (GLMM) are used; if the variable is time, whose probability distribution does not belong to that family, multivariate survival analysis is used. The aim of this thesis is to unify these two approaches, that is, to use time as the dependent variable, with groupings of individuals or of observations, within a GLMM, in order to introduce new methods for the treatment of this type of data.
Abstract:
The statistical analysis of literary style is the part of stylometry that compares measurable characteristics of a text that are rarely controlled by the author with those of other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and their variation between authors should be larger than their variation within comparable texts by the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in Catalan literature and was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of the book was written between 1460 and 1465, but it was not printed until 1490. There has been an intense and long-lasting debate about its authorship, stemming from its first edition, whose introduction states that the whole book is the work of Martorell (1413?-1468), while its ending states that the last quarter of the book is by Galba (?-1490), after the death of Martorell. Some of the authors who support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author.
By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate the stylistic boundary to be near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the selected features are categorical, this leads to three contingency tables with ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. The following sections propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models to the sequence of chi-square distances between each row profile and the average profile; and the one in Section 6 fits models to the sequence of values taken by the first component of the correspondence analysis, as well as to sequences of other summary measures such as the average word length. In Section 7 we fit models to the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
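The sequence of chi-square distances between row profiles and the average profile, mentioned for Section 5, can be sketched as follows. This is only an illustration using the standard correspondence-analysis definition of the distance. It assumes every row and every column of the table has a non-zero total. The function name and the toy counts are hypothetical, and the paper's own model fitting is not reproduced here.

```python
import math


def chi_square_distances(table):
    """Chi-square distance of each row profile to the average profile.

    table: list of rows of counts, one row per chapter and one column
           per category (for example, word-length classes).
    Assumption: every row total and every column total is non-zero.
    """
    grand_total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    # Average profile: column totals divided by the grand total.
    average = [sum(row[j] for row in table) / grand_total
               for j in range(n_cols)]
    distances = []
    for row in table:
        row_total = sum(row)
        profile = [count / row_total for count in row]
        # Squared differences weighted by the inverse average profile.
        squared = sum((p - a) ** 2 / a for p, a in zip(profile, average))
        distances.append(math.sqrt(squared))
    return distances


# Toy table with two "chapters" and two categories (made-up counts).
print(chi_square_distances([[30, 10], [10, 30]]))
```

A shift in this sequence from one run of chapters to the next is the kind of pattern a change-point model would then try to locate.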
Abstract:
In an earlier investigation (Burger et al., 2000), five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied by applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) to extract endmembers and to evaluate the spatial and temporal variation of geochemical signals. The marine geologists expected three main factors of sedimentation: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used, with the cosine-theta coefficient as similarity measure. During the last decade, considerable progress was made in compositional data analysis, and many case studies were published using new tools for the exploratory analysis of these data. It therefore makes sense to check whether the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions. In this paper we follow the lines of a paper by R. Tolosana-Delgado et al. (2005), starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: the compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Keywords: compositional data analysis, biplot, deep sea sediments
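For readers unfamiliar with the ilr-transformation mentioned in this abstract, here is a minimal sketch of one common choice of isometric log-ratio coordinates. It assumes strictly positive parts and a particular sequential (pivot-style) basis. The abstract does not say which basis the authors used, so that choice and the function name are assumptions.

```python
import math


def ilr(composition):
    """Isometric log-ratio coordinates of a D-part composition.

    Coordinate i (for i = 1 .. D-1) is
        sqrt(i / (i + 1)) * ln(gmean(x_1 .. x_i) / x_{i+1}),
    which is one standard orthonormal basis of the simplex.
    Assumption: all parts are strictly positive.
    """
    parts = [float(x) for x in composition]
    if len(parts) < 2 or any(x <= 0.0 for x in parts):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(x) for x in parts]
    coords = []
    for i in range(1, len(parts)):
        # Log of the geometric mean of the first i parts.
        mean_log = sum(logs[:i]) / i
        coords.append(math.sqrt(i / (i + 1)) * (mean_log - logs[i]))
    return coords


# The result depends only on ratios, so rescaling leaves it unchanged.
print(ilr([1.0, 2.0, 3.0]))
print(ilr([10.0, 20.0, 30.0]))
```

A D-part composition yields D-1 unconstrained real coordinates, so ordinary multivariate tools can be applied to them.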