10 results for Statistical approach
at Universitat de Girona, Spain
Abstract:
The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets in monitoring programs. From a general point of view, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce a convincing mathematical-statistical model. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time over the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes, represented by the 2000 and 2007 samples respectively, by using inverse modelling methodologies for compositional data. Data pertaining to fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater respectively, have been used to achieve the proposed aim. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features useful for understanding how the volcanic system works, opening new perspectives in sampling strategies and in the evaluation of the natural risk related to a quiescent volcano.
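A minimal numerical sketch of the inverse-mixing idea described above: an intermediate composition is expressed as a combination of two temporal end-members in log-ratio space. All compositions below are hypothetical illustrations, not the published Vulcano data.

```python
import numpy as np

# Hypothetical 3-part gas compositions (e.g. CO2, SO2, H2S fractions);
# the numbers are illustrative only.
x_2000 = np.array([0.80, 0.15, 0.05])   # end-member A (year 2000)
x_2007 = np.array([0.60, 0.30, 0.10])   # end-member B (year 2007)
x_mid  = np.array([0.70, 0.22, 0.08])   # an intermediate sample

def clr(x):
    """Centred log-ratio transform of a composition."""
    lx = np.log(x)
    return lx - lx.mean()

# In clr space a binary mixture is a linear combination:
# clr(x_mid) ~ a * clr(x_2000) + (1 - a) * clr(x_2007).
# Least-squares estimate of the mixing coefficient a:
d = clr(x_2000) - clr(x_2007)
a = float(np.dot(clr(x_mid) - clr(x_2007), d) / np.dot(d, d))
print(round(a, 3))
```

A coefficient between 0 and 1 reads as the relative contribution of the 2000 end-member to the intermediate sample.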
Abstract:
Hydrogeological research usually includes statistical studies devised to elucidate the mean background state, characterise relationships among different hydrochemical parameters, and show the influence of human activities. These goals are achieved either by means of a statistical approach or by mixing models between end-members. Compositional data analysis has proved to be effective with the first approach, but there is no commonly accepted solution to the end-member problem in a compositional framework. We present here a possible solution based on factor analysis of compositions, illustrated with a case study. We find two factors on the compositional bi-plot by fitting two non-centered orthogonal axes to the most representative variables. Each of these axes defines a subcomposition, grouping the variables that lie nearest to it. For each subcomposition a log-contrast is computed and rewritten as an equilibrium equation. These two factors can be interpreted as the isometric log-ratio (ilr) coordinates of three hidden components, which can be plotted in a ternary diagram. These hidden components might be interpreted as end-members. We have analysed 14 molarities at 31 sampling stations along the Llobregat River and its tributaries, with monthly measurements over two years. We have obtained a bi-plot with 57% of the total variance explained, from which we have extracted two factors: factor G, reflecting the geological background enhanced by potash mining; and factor A, essentially controlled by urban and/or farming wastewater. Graphical representation of these two factors allows us to identify three extreme samples, corresponding to pristine waters, potash mining influence and urban sewage influence. To confirm this, we have available analyses of the diffuse and point sources identified in the area: springs, potash mining leachates, sewage, and fertilisers.
Each of these sources shows a clear link with one of the extreme samples, except fertilisers, owing to the heterogeneity of their composition. This approach is a useful tool to distinguish end-members and characterise them, an issue generally difficult to solve. It is worth noting that the end-member composition cannot be fully estimated, only characterised through log-ratio relationships among components. Moreover, the influence of each end-member in a given sample must be evaluated relative to the other samples. These limitations are intrinsic to the relative nature of compositional data.
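A small sketch of the ilr coordinates and log-contrasts mentioned above, for a 3-part composition. The parts and their values are hypothetical stand-ins, not the Llobregat measurements; the balance basis used is the standard one that contrasts part 1 against part 2, then parts {1,2} against part 3.

```python
import numpy as np

# Illustrative 3-part composition (hypothetical molar proportions).
x = np.array([0.5, 0.3, 0.2])

# ilr coordinates with a standard balance basis:
# z1 contrasts part 1 against part 2,
# z2 contrasts the geometric mean of parts {1, 2} against part 3.
z1 = np.sqrt(1 / 2) * np.log(x[0] / x[1])
z2 = np.sqrt(2 / 3) * np.log(np.sqrt(x[0] * x[1]) / x[2])

# Each coordinate is a log-contrast: a log-linear combination of parts
# whose coefficients sum to zero, which is why it can be rewritten as an
# equilibrium equation between the grouped variables.
print(round(z1, 4), round(z2, 4))
```

Because the coefficients of each log-contrast sum to zero, the coordinates are invariant under rescaling of the composition, the key property exploited in the factor analysis described above.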
Abstract:
In this study, the toxicity of several heavy metals and arsenic was analysed using different biological models. In the first part of this work, the Microtox toxicity bioassay, based on the variation in the light emission of the luminescent bacterium Vibrio fischeri, was used to establish dose-response curves for different toxic elements, namely Zn(II), Pb(II), Cu(II), Hg(II), Ag(I), Co(II), Cd(II), Cr(VI), As(V) and As(III), in aqueous solutions. The experiments were carried out at pH 6.0 and 7.0 in order to show that pH can influence the final measured toxicity of some metals owing to changes related to their chemical speciation. Different types of dose-response curves were found depending on the metal analysed and the pH of the medium. In the case of arsenic, the effect of pH on the toxicity of arsenate and arsenite was investigated using the Microtox assay over a pH range from 5.0 to 9.0. The EC50 values determined for As(V) decrease, reflecting an increase in toxicity, as the pH of the solution increases, whereas in the case of As(III) the EC50 values hardly vary between pH 6.0 and 8.0 and only decrease at pH 9.0. HAsO42- and H2AsO3- were identified as the most toxic species. Likewise, a statistical analysis revealed an antagonistic effect between the arsenate chemical species that coexist at pH 6.0 and 7.0. On the other hand, the results of two statistical methods for predicting the toxicity and possible interactions of Co(II), Cd(II), Cu(II), Zn(II) and Pb(II) in equitoxic binary mixtures were compared with the toxicity observed on the bacterium Vibrio fischeri. The combined effect of these metals proved to be antagonistic for the Co(II)-Cd(II), Cd(II)-Zn(II), Cd(II)-Pb(II) and Cu(II)-Pb(II) mixtures, synergistic for Co(II)-Cu(II) and Zn(II)-Pb(II), and additive in the remaining cases, revealing a complex pattern of possible interactions.
The synergistic effect of the Co(II)-Cu(II) combination and the strong decrease in Pb(II) toxicity in the presence of Cd(II) deserve more attention when environmental safety regulations are established. The sensitivity of the Microtox assay was also determined. The EC20 values, which represent the measurable toxicity threshold, were determined for each element individually and were found to increase in the following order: Pb(II) < Ag(I) < Hg(II) ≈ Cu(II) < Zn(II) < As(V) < Cd(II) ≈ Co(II) < As(III) < Cr(VI). These values were compared with the concentrations permitted in industrial wastewater established by the official regulations of Catalonia (Spain). The Microtox assay proved to be sufficiently sensitive to detect the tested elements with respect to the official pollution-control standards, except in the cases of cadmium, mercury, arsenate, arsenite and chromate. In the second part of this work, complementing the previous results obtained with the Microtox acute toxicity assay, the chronic effects of Cd(II), Cr(VI) and As(V) on growth rate and viability were analysed in the same biological model. Surprisingly, these harmful chemicals proved to be only slightly toxic to this bacterium when their effect is measured after long exposure times. Nevertheless, in the case of Cr(VI), the viability inhibition assay proved to be more sensitive than the Microtox acute toxicity assay. Likewise, a clear hormesis phenomenon could be observed, especially in the case of Cd(II), when the viability inhibition assay is used. In addition, several experiments were carried out to try to explain the lack of Cr(VI) toxicity shown by the bacterium Vibrio fischeri. The resistance shown by this bacterium could be attributed to its capacity to convert Cr(VI) into the less toxic form Cr(III).
This reduction capacity was found to depend on the composition of the culture medium, the initial Cr(VI) concentration, the incubation time and the presence of a carbon source. In the third part of this work, the human cell line HT29 and primary cultures of blood cells from Sparus sarba were used in vitro to detect threshold metal toxicity by measuring the overexpression of stress proteins. Sludge extracts from several wastewater treatment plants and different metals, individually or in combination, were tested on human cell cultures to evaluate their effect on the growth rate and their capacity to induce the synthesis of the stress-related Hsp72 proteins. No significant adverse effects were found when the components were tested individually. Nevertheless, when present together, an adverse effect is produced on both the growth rate and the expression of stress proteins. On the other hand, blood cells from Sparus sarba were exposed in vitro to different concentrations of cadmium, lead and chromium. The stress protein HSP70 was significantly overexpressed after exposure to concentrations as low as 0.1 µM. Under our working conditions, no overexpression of metallothioneins was observed. Nevertheless, fish blood cells proved to be an interesting biological model for toxicity analyses. Both biological models proved to be well suited to accurately detecting metal-induced toxicity. In general, toxicity assessment based on the analysis of stress protein overexpression is more sensitive than toxicity assessment performed at the organism level.
From the results obtained, we can conclude that a battery of bioassays is truly necessary to accurately evaluate metal toxicity, since there are large variations among the toxicity values obtained with different organisms and many environmental factors can influence and modify the results obtained.
Abstract:
This paper is a first draft of the principle of statistical modelling on coordinates. Several causes, which would be long to detail, have led to this situation close to the deadline for submitting papers to CODAWORK'03. The main one is the fast development of the approach over the last months, which has made previous drafts appear obsolete. The present paper contains the essential parts of the state of the art of this approach from my point of view. I would like to acknowledge many clarifying discussions with the group of people working in this field in Girona, Barcelona, Carrick Castle, Firenze, Berlin, Göttingen, and Freiberg. They have given a lot of suggestions and ideas. Nevertheless, there might still be errors or unclear aspects which are exclusively my fault. I hope this contribution serves as a basis for further discussions and new developments.
Abstract:
The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as information or likelihood, can be identified in the algebraic structure of A2(P) and related to their corresponding notions in compositional data analysis, such as the Aitchison distance or the centered log-ratio transform. In this way very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information such as Bayesian updating, the combination of likelihoods, and robust M-estimation functions are simple additions/perturbations in A2(Pprior). Weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood-based statistics for general exponential families turn out to have a particularly easy interpretation in terms of A2(P). Regular exponential families form finite-dimensional linear subspaces of A2(P), and they correspond to finite-dimensional subspaces formed by their posteriors in the dual information space A2(Pprior). The Aitchison norm can be identified with mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be the Kullback-Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P)-valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.
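A tiny finite-dimensional illustration of the claim that Bayesian updating is a perturbation: on a discrete state space, multiplying a prior by a (closed) likelihood and renormalising is exactly the Aitchison perturbation on the simplex. The states and numbers below are hypothetical.

```python
import numpy as np

# Toy discrete example: a prior over three states and a likelihood vector
# p(data | state), both hypothetical.
prior = np.array([0.5, 0.3, 0.2])
lik   = np.array([0.2, 0.5, 0.3])

def perturb(p, q):
    """Aitchison perturbation: componentwise product, renormalised."""
    r = p * q
    return r / r.sum()

posterior = perturb(prior, lik)
bayes = prior * lik / np.sum(prior * lik)   # Bayes' rule, written out
print(np.allclose(posterior, bayes))        # True: the two coincide
```

The infinite-dimensional A2(P) construction generalizes exactly this operation from the simplex to spaces of densities and likelihoods.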
Abstract:
The preceding two editions of CoDaWork included talks on the possible consideration of densities as infinite compositions: Egozcue and Díaz-Barrero (2003) extended the Euclidean structure of the simplex to a Hilbert space structure on the set of densities within a bounded interval, and van den Boogaart (2005) generalized this to the set of densities bounded by an arbitrary reference density. From the many variations of the Hilbert structures available, we work with three cases. For bounded variables, a basis derived from Legendre polynomials is used. For variables with a lower bound, we standardize them with respect to an exponential distribution and express their densities as coordinates in a basis derived from Laguerre polynomials. Finally, for unbounded variables, a normal distribution is used as reference, and coordinates are obtained with respect to a Hermite-polynomial-based basis. To obtain the coordinates, several approaches can be considered. A numerical accuracy problem occurs if one estimates the coordinates directly by using discretized scalar products. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictor variables and weights are proportional to the reference density. Finally, for the case of 2-order Hermite polynomials (normal reference) and 1-order Laguerre polynomials (exponential reference), one can also derive the coordinates from their relationships to the classical mean and variance. Apart from these theoretical issues, this contribution focuses on the application of this theory to two main problems in sedimentary geology: the comparison of several grain-size distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock or sediment, such as their composition.
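A sketch of the weighted-regression idea for the normal-reference case (an assumed setup, not the authors' exact procedure): the log-ratio of a density to a standard-normal reference is regressed on probabilists' Hermite polynomials, with weights proportional to the reference density. For a shifted normal the log-ratio is linear, so the 0- and 1-order coordinates recover it exactly.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

t = np.linspace(-4, 4, 401)
phi = np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)        # N(0,1) reference
f = np.exp(-(t - 0.5)**2 / 2) / np.sqrt(2 * np.pi)  # target density: N(0.5, 1)

y = np.log(f / phi)                 # log-ratio whose coordinates we want
X = hermevander(t, 2)               # columns: He_0, He_1, He_2
w = np.sqrt(phi)                    # sqrt-weights => weights prop. to phi
coef, *_ = np.linalg.lstsq(w[:, None] * X, w * y, rcond=None)
print(np.round(coef, 3))            # ~ [-0.125, 0.5, 0.]
```

Here log(f/phi) = 0.5 t - 0.125 exactly, so the He_1 coordinate (0.5) carries the mean shift and the He_2 coordinate vanishes, illustrating the stated link between low-order coordinates and the classical mean and variance.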
Abstract:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature applied here to a bag-of-visual-words representation for each image, and subsequently training a multiway classifier on the topic distribution vector for each image. We compare this approach with representing each image by a bag-of-visual-words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve classification performance superior to recent publications that have used a bag-of-visual-words representation, in all cases using the authors' own data sets and testing protocols. We also investigate the gain from adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
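A minimal pLSA sketch of the dimensionality-reduction step (illustrative only, not the paper's pipeline): EM factorises a documents-by-words count matrix into topic word distributions P(w|z) and per-document topic mixtures P(z|d); the P(z|d) vectors are what a downstream classifier would be trained on. The tiny count matrix below is hypothetical.

```python
import numpy as np

# Toy "images" x "visual words" count matrix, hypothetical.
N = np.array([[5, 4, 0, 0],
              [4, 5, 1, 0],
              [0, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)
D, W, Z = N.shape[0], N.shape[1], 2

rng = np.random.default_rng(0)
p_wz = rng.random((Z, W)); p_wz /= p_wz.sum(1, keepdims=True)   # P(w|z)
p_zd = rng.random((D, Z)); p_zd /= p_zd.sum(1, keepdims=True)   # P(z|d)

for _ in range(100):
    # E-step: responsibilities P(z|d,w) for every (d, w) pair.
    joint = p_zd[:, :, None] * p_wz[None, :, :]        # shape (D, Z, W)
    post = joint / joint.sum(1, keepdims=True)
    # M-step: re-estimate both distributions from expected counts.
    nz = N[:, None, :] * post                          # shape (D, Z, W)
    p_wz = nz.sum(0); p_wz /= p_wz.sum(1, keepdims=True)
    p_zd = nz.sum(2); p_zd /= p_zd.sum(1, keepdims=True)

topics = p_zd.argmax(1)     # dominant topic per "image"
print(topics)
```

Each image is thus reduced from a W-dimensional word histogram to a Z-dimensional topic mixture, the representation whose benefit the paper evaluates.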
Abstract:
This paper focuses on one of the methods for bandwidth allocation in an ATM network: the convolution approach. The convolution approach permits an accurate study of the system load in statistical terms by accumulated calculations, since probabilistic results of the bandwidth allocation can be obtained. Nevertheless, the convolution approach has a high cost in terms of calculation and storage requirements. This makes real-time calculations difficult, so many authors do not consider this approach. With the aim of reducing the cost, we propose to use the multinomial distribution function: the enhanced convolution approach (ECA). This permits direct computation of the associated probabilities of the instantaneous bandwidth requirements and makes a simple deconvolution process possible. The ECA is used in connection acceptance control, and some results are presented.
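The basic convolution approach can be sketched in a few lines (toy numbers, not an ATM trace): each on/off source transmits its peak bandwidth with some activity probability; convolving the per-source distributions accumulates them into the distribution of the aggregate demand, from which an overflow probability is read off directly.

```python
import numpy as np

link_capacity = 6
# Hypothetical on/off sources: (peak bandwidth in cells, activity probability).
sources = [(2, 0.3), (3, 0.2), (2, 0.5)]

agg = np.array([1.0])                      # distribution of total demand
for peak, p in sources:
    dist = np.zeros(peak + 1)
    dist[0], dist[peak] = 1 - p, p         # source is either idle or at peak
    agg = np.convolve(agg, dist)           # accumulate one source at a time

p_overflow = float(agg[link_capacity + 1:].sum())
print(round(p_overflow, 4))                # 0.03
```

It is exactly this accumulate-one-source-at-a-time structure that makes the cost grow with the number of sources, motivating the multinomial shortcut (ECA) proposed in the paper.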
Abstract:
Compositional data, also called multiplicative ipsative data, are common in survey research instruments in areas such as time use, budget expenditure and social networks. Compositional data are usually expressed as proportions of a total, whose sum can only be 1. Owing to their constrained nature, statistical analysis in general, and estimation of measurement quality with a confirmatory factor analysis model for multitrait-multimethod (MTMM) designs in particular, are challenging tasks. Compositional data are highly non-normal, as they range within the 0-1 interval. One component can only increase if some other(s) decrease, which results in spurious negative correlations among components that cannot be accounted for by the MTMM model parameters. In this article we show how researchers can use the correlated uniqueness model for MTMM designs in order to evaluate the measurement quality of compositional indicators. We suggest using the additive log-ratio transformation of the data, discuss several approaches to dealing with zero components and explain how the interpretation of MTMM designs differs from the application to standard unconstrained data. We show an illustration of the method on data of social network composition expressed in percentages of partner, family, friends and other members, in which we conclude that the face-to-face collection mode is generally superior to the telephone mode, although primacy effects are higher in the face-to-face mode. Compositions of strong ties (such as partner) are measured with higher quality than those of weaker ties (such as other network members).
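A short sketch of the additive log-ratio (alr) transformation with one common way of handling zero components (multiplicative replacement). The shares below are hypothetical, and the choice of the last part as alr divisor and of the replacement value delta are illustrative assumptions, not the article's settings.

```python
import numpy as np

# Hypothetical network shares: partner, family, friends, other (percent).
x = np.array([40.0, 30.0, 0.0, 30.0]) / 100.0

delta = 0.005                              # small value replacing zeros
zeros = x == 0
x_repl = np.where(zeros, delta, x)
x_repl[~zeros] *= 1 - delta * zeros.sum()  # shrink non-zeros so the sum stays 1
x_repl /= x_repl.sum()

# alr: log-ratios of the first D-1 parts to the last part, giving
# unconstrained coordinates suitable for a factor-analysis model.
alr = np.log(x_repl[:-1] / x_repl[-1])
print(np.round(alr, 3))
```

The three alr coordinates are no longer bound to a unit sum, which is what makes the correlated uniqueness MTMM model applicable.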
Abstract:
The basic idea of vibration-based damage detection in Structural Health Monitoring (SHM) is that damage alters the stiffness, mass or energy dissipation properties of a system, which in turn alters its dynamic response. Within the context of pattern recognition, this thesis presents a hybrid reasoning methodology for assessing damage in structures, combining the use of a model of the structure and/or previous experiments with a knowledge-based reasoning scheme to evaluate whether damage is present, along with its severity and location. The methodology involves elements of vibration analysis, mathematics (wavelets, statistical process control), signal and/or pattern analysis and processing (case-based reasoning, self-organising maps), smart structures and damage detection. The techniques are validated numerically and experimentally, considering corrosion, mass loss, mass accumulation and impacts. The structures used in this work are: a cantilever truss structure, an aluminium beam, two pipe sections and part of the wing of a commercial aircraft.
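A toy illustration of the statistical-process-control ingredient mentioned above: control limits are built from a healthy baseline of a vibration feature, and new readings outside them are flagged as potential damage. All numbers (feature, baseline, readings) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
# Healthy baseline of a vibration feature, e.g. a modal frequency in Hz.
baseline = rng.normal(10.0, 0.5, size=50)

mean, std = baseline.mean(), baseline.std(ddof=1)
ucl, lcl = mean + 3 * std, mean - 3 * std   # 3-sigma control limits

# New monitoring readings; the last one has drifted well below the baseline.
new = np.array([10.2, 9.9, 8.1])
flags = (new < lcl) | (new > ucl)           # True => potential damage
print(flags)
```

In the hybrid methodology, such out-of-control flags would feed the knowledge-based reasoning stage rather than constitute a diagnosis on their own.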