884 results for Problem analysis


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The first discussion of compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on the algebraic structure of the simplex, and more than twenty years after Aitchison’s idea of log-transformations of closed data, the scientific literature is again full of statistical treatments of this type of data using traditional methodologies. This is particularly true in environmental geochemistry where, besides the problem of closure, the spatial structure (dependence) of the data has to be considered. In this work we propose the use of log-contrast values, obtained by a simplicial principal component analysis, as indicators of given environmental conditions. The investigation of the log-contrast frequency distributions allows us to point out the statistical laws able to generate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or other reference parameters, allow the definition of monitors to be used to assess the extent of possible environmental contamination. A case study on running and ground waters from Chiavenna Valley (Northern Italy), using Na+, K+, Ca2+, Mg2+, HCO3-, SO4 2- and Cl- concentrations, will be illustrated.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Hydrogeological research usually includes some statistical studies devised to elucidate the mean background state, characterise relationships among different hydrochemical parameters, and show the influence of human activities. These goals are achieved either by means of a statistical approach or by mixing models between end-members. Compositional data analysis has proved to be effective with the first approach, but there is no commonly accepted solution to the end-member problem in a compositional framework. We present here a possible solution based on factor analysis of compositions, illustrated with a case study. We find two factors on the compositional bi-plot by fitting two non-centered orthogonal axes to the most representative variables. Each of these axes defines a subcomposition, grouping those variables that lie nearest to it. With each subcomposition a log-contrast is computed and rewritten as an equilibrium equation. These two factors can be interpreted as the isometric log-ratio (ilr) coordinates of three hidden components, which can be plotted in a ternary diagram. These hidden components might be interpreted as end-members. We have analysed 14 molarities at 31 sampling stations all along the Llobregat River and its tributaries, with monthly measurements over two years. We have obtained a bi-plot with 57% of the total variance explained, from which we have extracted two factors: factor G, reflecting geological background enhanced by potash mining; and factor A, essentially controlled by urban and/or farming wastewater. Graphical representation of these two factors allows us to identify three extreme samples, corresponding to pristine waters, potash mining influence and urban sewage influence. To confirm this, we have available analyses of diffuse and widespread point sources identified in the area: springs, potash mining leachates, sewage, and fertilisers.
Each of these sources shows a clear link with one of the extreme samples, except fertilisers, owing to the heterogeneity of their composition. This approach is a useful tool to distinguish end-members and characterise them, an issue generally difficult to solve. It is worth noting that the end-member composition cannot be fully estimated but only characterised through log-ratio relationships among components. Moreover, the influence of each end-member in a given sample must be evaluated relative to the other samples. These limitations are intrinsic to the relative nature of compositional data.
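The step of reading two factor scores as ilr coordinates of three hidden components can be sketched as follows. This is only an illustration under stated assumptions: the orthonormal basis V below is one common choice for three parts and is not necessarily the one used in the study, and the function names (closure, ilr, ilr_inv) are hypothetical.

```python
import numpy as np

# Assumed orthonormal basis of the clr plane for D = 3 parts.
# Each column sums to zero and has unit length; other bases are equally valid.
V = np.array([
    [ 1 / np.sqrt(2), 1 / np.sqrt(6)],
    [-1 / np.sqrt(2), 1 / np.sqrt(6)],
    [ 0.0,           -2 / np.sqrt(6)],
])

def closure(x):
    """Rescale the parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def ilr(x):
    """Map 3-part compositions to two ilr coordinates."""
    logx = np.log(closure(x))
    clr = logx - logx.mean(axis=-1, keepdims=True)
    return clr @ V

def ilr_inv(z):
    """Map two ilr coordinates back to a 3-part composition (ternary diagram point)."""
    return closure(np.exp(np.asarray(z, dtype=float) @ V.T))

# Two hypothetical factor scores interpreted as ilr coordinates.
scores = np.array([[0.0, 0.0], [1.2, -0.4]])
hidden = ilr_inv(scores)   # rows are 3-part compositions summing to 1
print(hidden)
```

Because only log-ratios are recovered, the resulting three parts are relative proportions, which is consistent with the limitation stated in the abstract that end-members can only be characterised through log-ratio relationships.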

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The low levels of unemployment recorded in the UK in recent years are widely cited as evidence of the country’s improved economic performance, and the apparent convergence of unemployment rates across the country’s regions is used to suggest that the longstanding divide in living standards between the relatively prosperous ‘south’ and the more depressed ‘north’ has been substantially narrowed. Dissenters from these conclusions have drawn attention to the greatly increased extent of non-employment (around a quarter of the UK’s working age population are not in employment) and the marked regional dimension in its distribution across the country. Amongst these dissenters it is generally agreed that non-employment is concentrated amongst older males previously employed in the now very much smaller ‘heavy’ industries (e.g. coal, steel, shipbuilding). This paper uses the tools of compositional data analysis to provide a much richer picture of non-employment, one which challenges the conventional wisdom about UK labour market performance as well as the dissenters’ view of the nature of the problem. It is shown that, associated with the striking ‘north/south’ divide in non-employment rates, there is a statistically significant relationship between the size of the non-employment rate and the composition of non-employment. Specifically, it is shown that the share of unemployment in non-employment is negatively correlated with the overall non-employment rate: in regions where the non-employment rate is high the share of unemployment is relatively low. So the unemployment rate is not a very reliable indicator of regional disparities in labour market performance.
Even more importantly from a policy viewpoint, a significant positive relationship is found between the size of the non-employment rate and the share of those not employed by reason of sickness or disability, and it seems (contrary to the dissenters) that this connection is just as strong for women as it is for men.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Compositional random vectors are fundamental tools in the Bayesian analysis of categorical data. Many of the issues that are discussed with reference to the statistical analysis of compositional data have a natural counterpart in the construction of a Bayesian statistical model for categorical data. This note builds on the idea of cross-fertilization of the two areas recommended by Aitchison (1986) in his seminal book on compositional data. Particular emphasis is put on the problem of what parameterization to use.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

At CoDaWork'03 we presented work on the analysis of archaeological glass compositional data. Such data typically consist of geochemical compositions involving 10-12 variables and approximate completely compositional data if the main component, silica, is included. We suggested that what has been termed 'crude' principal component analysis (PCA) of standardized data often identified interpretable pattern in the data more readily than analyses based on log-ratio transformed data (LRA). The fundamental problem is that, in LRA, minor oxides with high relative variation, that may not be structure carrying, can dominate an analysis and obscure pattern associated with variables present at higher absolute levels. We investigate this further using sub-compositional data relating to archaeological glasses found on Israeli sites. A simple model for glass-making is that it is based on a 'recipe' consisting of two 'ingredients', sand and a source of soda. Our analysis focuses on the sub-composition of components associated with the sand source. A 'crude' PCA of standardized data shows two clear compositional groups that can be interpreted in terms of different recipes being used at different periods, reflected in absolute differences in the composition. LRA analysis can be undertaken either by normalizing the data or defining a 'residual'. In either case, after some 'tuning', these groups are recovered. The results from the normalized LRA are differently interpreted as showing that the source of sand used to make the glass differed. These results are complementary. One relates to the recipe used. The other relates to the composition (and presumed sources) of one of the ingredients. It seems to be axiomatic in some expositions of LRA that statistical analysis of compositional data should focus on relative variation via the use of ratios. Our analysis suggests that absolute differences can also be informative.
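The contrast between 'crude' PCA of standardized data and log-ratio analysis can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are synthetic (three major parts plus one minor part with high relative variation), the function names are hypothetical, and LRA is taken to mean PCA of centred log-ratio (clr) transformed data.

```python
import numpy as np

def closure(x):
    """Rescale each row so its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=1, keepdims=True)

def pca(M):
    """PCA by SVD of the column-centred matrix.

    Returns scores, loading vectors (as rows) and the share of variance per axis.
    """
    M = M - M.mean(axis=0)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U * s, Vt, s**2 / np.sum(s**2)

def crude_pca(X):
    """'Crude' PCA: standardize each raw component, then apply PCA."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return pca(Z)

def logratio_pca(X):
    """Log-ratio analysis, taken here as PCA of clr-transformed rows."""
    L = np.log(X)
    return pca(L - L.mean(axis=1, keepdims=True))

# Synthetic data (hypothetical): two 'recipes' differ in the second major part,
# while the fourth, minor part has large relative variation unrelated to the groups.
rng = np.random.default_rng(0)
n = 200
group = np.repeat([0.0, 1.0], n // 2)
log_major = np.log([60.0, 25.0, 14.0]) + rng.normal(0.0, 0.02, size=(n, 3))
log_major[:, 1] += 0.2 * group
log_minor = np.log(0.1) + rng.normal(0.0, 1.0, size=(n, 1))
X = closure(np.exp(np.hstack([log_major, log_minor])))

_, crude_loadings, crude_share = crude_pca(X)
_, lra_loadings, lra_share = logratio_pca(X)
print("crude PCA, first loading vector:", np.round(crude_loadings[0], 2))
print("clr PCA, first loading vector:  ", np.round(lra_loadings[0], 2))
```

In this construction the first clr axis is dominated by the noisy minor part, which is the kind of behaviour the abstract attributes to minor oxides in LRA; real glass data may of course behave differently.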

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The statistical analysis of literary style is the part of stylometry that compares measurable characteristics in a text that are rarely controlled by the author with those in other texts. When the goal is to settle authorship questions, these characteristics should relate to the author’s style and not to the genre, epoch or editor, and they should be such that their variation between authors is larger than the variation within comparable texts from the same author. For an overview of the literature on stylometry and some of the techniques involved, see for example Mosteller and Wallace (1964, 82), Herdan (1964), Morton (1978), Holmes (1985), Oakes (1998) or Lebart, Salem and Berry (1998). Tirant lo Blanc, a chivalry book, is the main work in Catalan literature and was hailed as “the best book of its kind in the world” by Cervantes in Don Quixote. Considered by writers like Vargas Llosa or Damaso Alonso to be the first modern novel in Europe, it has been translated several times into Spanish, Italian and French, with modern English translations by Rosenthal (1996) and La Fontaine (1993). The main body of this book was written between 1460 and 1465, but it was not printed until 1490. There has been an intense and long-lasting debate around its authorship, stemming from its first edition, whose introduction states that the whole book is the work of Martorell (1413?-1468), while at the end it is stated that the last quarter of the book is by Galba (?-1490), written after the death of Martorell. Some of the authors that support the theory of single authorship are Riquer (1990), Chiner (1993) and Badia (1993), while some of those supporting double authorship are Riquer (1947), Coromines (1956) and Ferrando (1995). For an overview of this debate, see Riquer (1990). Neither of the two candidate authors left any text comparable to the one under study, and therefore discriminant analysis cannot be used to help classify chapters by author.
By using sample texts encompassing about ten percent of the book, and looking at word length and at the use of 44 conjunctions, prepositions and articles, Ginebra and Cabos (1998) detect heterogeneities that might indicate the existence of two authors. By analyzing the diversity of the vocabulary, Riba and Ginebra (2000) estimate that stylistic boundary to be near chapter 383. Following the lead of the extensive literature, this paper looks into word length, the use of the most frequent words and the use of vowels in each chapter of the book. Given that the features selected are categorical, this leads to three contingency tables of ordered rows and therefore to three sequences of multinomial observations. Section 2 explores these sequences graphically, observing a clear shift in their distribution. Section 3 describes the problem of estimating a sudden change-point in those sequences. In the following sections we propose various ways to estimate change-points in multinomial sequences: the method in Section 4 involves fitting models for polytomous data; the one in Section 5 fits gamma models onto the sequence of chi-square distances between each row profile and the average profile; the one in Section 6 fits models onto the sequence of values taken by the first component of the correspondence analysis as well as onto sequences of other summary measures like the average word length. In Section 7 we fit models onto the marginal binomial sequences to identify the features that distinguish the chapters before and after that boundary. Most methods rely heavily on the use of generalized linear models.
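The idea of estimating a single sudden change-point in a sequence of multinomial observations can be sketched with a simple profile-likelihood search. This is not one of the models proposed in the paper (which relies on generalized linear models); it is a minimal stand-in that assumes one change-point and constant category probabilities on each side, and the function names and the toy table are hypothetical.

```python
import numpy as np

def segment_loglik(counts):
    """Multinomial log-likelihood kernel of a segment at its pooled MLE proportions.

    Multinomial coefficients are omitted because they do not depend on the split.
    """
    tot = counts.sum(axis=0)
    p = tot / tot.sum()
    nz = tot > 0                      # skip empty categories to avoid log(0)
    return float(np.sum(tot[nz] * np.log(p[nz])))

def estimate_change_point(table):
    """Return (tau, loglik): rows [0, tau) and [tau, n) maximize the split likelihood."""
    table = np.asarray(table, dtype=float)
    n = table.shape[0]
    best_tau, best_ll = None, -np.inf
    for tau in range(1, n):           # both segments must be non-empty
        ll = segment_loglik(table[:tau]) + segment_loglik(table[tau:])
        if ll > best_ll:
            best_tau, best_ll = tau, ll
    return best_tau, best_ll

# Toy contingency table: each row is one chapter, columns are two categories
# (for example short and long words). The profile shifts after the fifth row.
toy = [[80, 20]] * 5 + [[20, 80]] * 3
tau, ll = estimate_change_point(toy)
print(tau)
```

For real chapter-level tables one would also want a measure of uncertainty around the estimated boundary, which this sketch does not provide.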

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In an earlier investigation (Burger et al., 2000) five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) for the extraction of end-members and for evaluating the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used, with the cosine-theta coefficient as similarity measure. During the last decade considerable progress in compositional data analysis was made and many case studies were published using new tools for exploratory analysis of these data. Therefore it makes sense to check whether the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions. In this paper we follow the lines of a paper by R. Tolosana-Delgado et al. (2005), starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: the compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Key words: compositional data analysis, biplot, deep sea sediments

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Factor analysis, a frequently used technique for multivariate data inspection, is also widely used for compositional data analysis. The usual way is to use a centered logratio (clr) transformation to obtain the random vector y of dimension D. The factor model is then y = Λf + e (1), with the factors f of dimension k < D, the error term e, and the loadings matrix Λ. Using the usual model assumptions (see, e.g., Basilevsky, 1994), the factor analysis model (1) can be written as Cov(y) = ΛΛ^T + ψ (2), where ψ = Cov(e) has a diagonal form. The diagonal elements of ψ as well as the loadings matrix Λ are estimated from an estimate of Cov(y), given observed clr-transformed data Y as realizations of the random vector y. Outliers or deviations from the idealized model assumptions of factor analysis can severely affect the parameter estimation. As a way out, robust estimation of the covariance matrix of Y will lead to robust estimates of Λ and ψ in (2), see Pison et al. (2003). Well-known robust covariance estimators with good statistical properties, like the MCD or the S-estimators (see, e.g., Maronna et al., 2006), rely on a full-rank data matrix Y, which is not the case for clr-transformed data (see, e.g., Aitchison, 1986). The isometric logratio (ilr) transformation (Egozcue et al., 2003) solves this singularity problem. The data matrix Y is transformed to a matrix Z by using an orthonormal basis of lower dimension. Using the ilr-transformed data, a robust covariance matrix C(Z) can be estimated. The result can be back-transformed to the clr space by C(Y) = V C(Z) V^T, where the matrix V with orthonormal columns comes from the relation between the clr and the ilr transformation. Now the parameters in the model (2) can be estimated (Basilevsky, 1994) and the results have a direct interpretation since the links to the original variables are still preserved. The above procedure will be applied to data from geochemistry.
Our special interest is in comparing the results with those of Reimann et al. (2002) for the Kola project data.
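The back-transformation C(Y) = V C(Z) V^T described above can be sketched as follows. The assumptions are: the basis V is a Helmert-type orthonormal basis (one of many valid choices), the function names are hypothetical, and the classical sample covariance stands in by default for the robust estimator, since a robust estimator such as the MCD would be passed in through the cov_estimator argument.

```python
import numpy as np

def ilr_basis(D):
    """Helmert-type D x (D-1) matrix with orthonormal, zero-sum columns (assumed choice)."""
    V = np.zeros((D, D - 1))
    for i in range(1, D):
        V[:i, i - 1] = 1.0 / np.sqrt(i * (i + 1))
        V[i, i - 1] = -i / np.sqrt(i * (i + 1))
    return V

def clr(X):
    """Centred log-ratio transform of each row."""
    L = np.log(np.asarray(X, dtype=float))
    return L - L.mean(axis=1, keepdims=True)

def clr_covariance_via_ilr(X, cov_estimator=None):
    """Estimate the covariance in ilr space and back-transform it to clr space.

    cov_estimator maps an (n, D-1) matrix to a (D-1, D-1) covariance matrix.
    The default is the classical sample covariance; a robust estimator would be
    supplied here in a robust analysis.
    """
    if cov_estimator is None:
        cov_estimator = lambda Z: np.cov(Z, rowvar=False)
    X = np.asarray(X, dtype=float)
    V = ilr_basis(X.shape[1])
    Z = clr(X) @ V                 # ilr coordinates, full rank in general
    C_Z = cov_estimator(Z)
    C_Y = V @ C_Z @ V.T            # singular D x D covariance in clr space
    return C_Y, C_Z

# Hypothetical positive data: 50 samples with 5 parts.
rng = np.random.default_rng(1)
X = rng.lognormal(mean=0.0, sigma=0.5, size=(50, 5))
C_Y, C_Z = clr_covariance_via_ilr(X)
print(np.round(C_Y.sum(axis=0), 12))   # columns of C(Y) sum to zero
```

With the classical estimator the back-transformed matrix coincides with the sample covariance of the clr data, which is a convenient sanity check before swapping in a robust estimator.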

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We will discuss several examples and research efforts related to the small world problem and set the ground for our discussion of network theory and social network analysis. Readings: An Experimental Study of the Small World Problem, J. Travers and S. Milgram Sociometry 32 425-443 (1969) [Protected Access] Optional: The Strength of Weak Ties, M.S. Granovetter The American Journal of Sociology 78 1360--1380 (1973) [Protected Access] Optional: Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network, J. Leskovec and E. Horvitz MSR-TR-2006-186. Microsoft Research, June 2007. [Web Link, the most recent and comprehensive study on the subject!] Originally from: http://kmi.tugraz.at/staff/markus/courses/SS2008/707.000_web-science/

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Human leukocyte antigen (HLA) has been described in many cases as a prognostic factor for cancer. The main characteristic of the HLA genes, located on chromosome 6 (6p21.3), is their numerous polymorphisms. Nucleotide sequence analyses show that the variation is predominantly restricted to the exons that encode the peptide-binding domains of the protein. Therefore, HLA polymorphism defines the repertoire of peptides that bind to HLA allotypes, and this in turn defines an individual's ability to respond to exposure to many infectious agents during their life. HLA typing has become an important analysis in clinical practice. Formalin-fixed, paraffin-embedded (FFPE) tissue samples are routinely collected in oncology. This procedure could be used as a good source of DNA, given that in past studies DNA collection assays were not normally carried out on almost any tissue or sample in regular clinical procedures. Bearing in mind that the most important problem with DNA from FFPE samples is fragmentation, we proposed a new method for typing the HLA-A allele from FFPE samples based on the sequences of exons 2, 3 and 4. We designed a set of 12 primers: four for HLA-A exon 2, three for HLA-A exon 3 and five for HLA-A exon 4, each according to the flanking sequences of its respective exon and the sequence variation between different alleles. Seventeen FFPE samples collected at the Karolinska University Hospital in Stockholm, Sweden, were subjected to PCR and the products were sequenced. Finally, all the sequences obtained were analysed and compared with the IMGT-HLA database. The FFPE samples had previously been HLA typed, and those results were compared with the results of this method.
According to our results, the samples could be correctly sequenced. With this procedure, we can conclude that our study is the first sequence-based typing method that allows the analysis of old DNA samples for which no other source is available. This study also opens the possibility of developing analyses to establish new relationships between HLA and different diseases such as cancer.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A clinical case of compulsive gambling is presented in this article. The subject’s playing behaviour had both positive and negative consequences. The subject tried to exercise control over the urge to play. As shown in the functional analysis, the failure to control the urge despite the subject's best efforts worsened the problem due to its negative consequences. A variation of acceptance and commitment therapy (ACT) was applied in order to break the fight-surrender vicious circle of the playing behaviour. Two treatment strategies were agreed upon: acceptance of the fact that both playing and not playing had negative consequences, and commitment to one of these options despite its disadvantages. Finally, it is proposed that this acceptance and commitment therapy is a useful therapeutic process as it decreases the subject's suffering and eliminates the problem: playing behaviour.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We study a particular restitution problem where there is an indivisible good (land or property) over which two agents have rights: the dispossessed agent and the owner. A third party, possibly the government, seeks to resolve the situation by assigning rights to one and compensating the other. There is also a maximum amount of money available for the compensation. We characterize a family of asymmetrically fair rules that are immune to strategic behavior, guarantee minimal welfare levels for the agents, and satisfy the budget constraint.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The World Bank Report 2012 starts with this statement: “Gender equality matters in itself and it matters for development because, in today’s globalized worlds, countries that use the skills and talents of their women would have an advantage over those which do not use it.” Within the frame that gender equality matters, this paper describes some policy alternatives oriented to overcoming gender disadvantages in the incorporation of urban middle-class women into the formal labor market in Colombia. On balance, the final recommendation suggests that it is desirable to adopt policy alternatives such as Community Centers, which are programs oriented to a social redistribution of domestic work as a way to encourage women’s participation in the formal labor market with the social support of the members of their own community. The problem that social policy needs to address is the segregation of women in the formal labor market in Colombia. Although the evidence shows that women have overcome the educational gap by performing better in education than their male peers, women are still segregated from the labor market. The persistence of high rates of unemployment among the female population, the prevalence of the informal labor market as a women’s labor market, and the pay gap between men and women with similar professional training are circumstances that sustain the segregation statement. These circumstances are inefficient for society because an economic analysis shows that the cost of maintaining the status quo is externalized to the social security system, which includes health, pension and maternity leave regimes.
Therefore, the segregation of women involves a market failure. This paper evaluates five policy alternatives, each directed at a different causal dimension of the problem: (i) quotas in the private market, (ii) flexible working hours, (iii) replacing maternity leave with family leave, (iv) increasing the Community Centers for redistributing care work, and (v) equal pay enforcement. The first alternative seeks to increase women’s participation in the formal labor market. The second, third, and fourth alternatives constitute a package aimed at redistributing care work by reducing women’s responsibility for reproductive work in the household with the help of husbands and the local government. The fifth alternative intervenes to resolve the equal pay problem. After an evaluation on four criteria, which measure effectiveness, robustness and improbability in implementation, efficiency, and political acceptability or social opposition, the strongest alternative is the fostering of Community Centers that promote a redistribution of care work. This policy performs well in the assessment process because it combines a gender focus with important indirect effects: child support and human capabilities. The policy also shows a bottom-up implementation process that overcomes the main adoption difficulties in gender-focused programs and is supported by strong evidence of success in the Colombian context; this evidence is produced both by transnational actors such as the World Bank and by local accountability reports produced by local institutions like the Colombian Institute of Family Welfare (ICBF).

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Abstract taken from the publication.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The purpose of this study was to examine objective and subjective distortion present when frequency modulation (FM) systems were coupled with four digital signal processing (DSP) hearing aids. Electroacoustic analysis and subjective listening tests by experienced audiologists revealed that distortion levels varied across hearing aids and channels.