63 resultados para Acceleration data structure
em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Resumo:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) has been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusterizes and reduces the dimensionality of input EEMs without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples, as well as to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
Resumo:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) has been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusterizes and reduces the dimensionality of input EEMs without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples, as well as to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
Resumo:
En aquest projecte s'ha desenvolupat una interfície web per calcular rutes a la ciutat de Barcelona. Les rutes calculades són a peu, entre un punt d'origen qualsevol i un punt d'interès turístic de la ciutat com a destí. Per això s'han extret les dades dels carrers de Barcelona d'OpenStreetMap i s'han insertat a una base de dades postgreSQL/postGIS, juntament amb una capa vectorial de punts d'interès turístic que s'ha creat amb el SIG d'escriptori qGIS. El càlcul de les rutes amb les dades de la base de dades s'ha realitzat amb l'extensió pgRouting, i la interfície web per seleccionar els punts d'origen i destí, mostrar els mapes, i mostrar les rutes resultat, s'ha desenvolupat utilitzant la llibreria OpenLayers.
Resumo:
We present building blocks for algorithms for the efficient reduction of square factor, i.e. direct repetitions in strings. So the basic problem is this: given a string, compute all strings that can be obtained by reducing factors of the form zz to z. Two types of algorithms are treated: an offline algorithm is one that can compute a data structure on the given string in advance before the actual search for the square begins; in contrast, online algorithms receive all input only at the time when a request is made. For offline algorithms we treat the following problem: Let u and w be two strings such that w is obtained from u by reducing a square factor zz to only z. If we further are given the suffix table of u, how can we derive the suffix table for w without computing it from scratch? As the suffix table plays a key role in online algorithms for the detection of squares in a string, this derivation can make the iterated reduction of squares more efficient. On the other hand, we also show how a suffix array, used for the offline detection of squares, can be adapted to the new string resulting from the deletion of a square. Because the deletion is a very local change, this adaption is more eficient than the computation of the new suffix array from scratch.
Resumo:
Precision of released figures is not only an important quality feature of official statistics,it is also essential for a good understanding of the data. In this paper we show a casestudy of how precision could be conveyed if the multivariate nature of data has to betaken into account. In the official release of the Swiss earnings structure survey, the totalsalary is broken down into several wage components. We follow Aitchison's approachfor the analysis of compositional data, which is based on logratios of components. Wefirst present diferent multivariate analyses of the compositional data whereby the wagecomponents are broken down by economic activity classes. Then we propose a numberof ways to assess precision
Resumo:
In this paper we analyse, using Monte Carlo simulation, the possible consequences of incorrect assumptions on the true structure of the random effects covariance matrix and the true correlation pattern of residuals, over the performance of an estimation method for nonlinear mixed models. The procedure under study is the well known linearization method due to Lindstrom and Bates (1990), implemented in the nlme library of S-Plus and R. Its performance is studied in terms of bias, mean square error (MSE), and true coverage of the associated asymptotic confidence intervals. Ignoring other criteria like the convenience of avoiding over parameterised models, it seems worst to erroneously assume some structure than do not assume any structure when this would be adequate.
Resumo:
The recently measured inclusive electron-proton cross section in the nucleon resonance region, performed with the CLAS detector at the Thomas Jefferson Laboratory, has provided new data for the nucleon structure function F2 with previously unavailable precision. In this paper we propose a description of these experimental data based on a Regge-dual model for F2. The basic inputs in the model are nonlinear complex Regge trajectories producing both isobar resonances and a smooth background. The model is tested against the experimental data, and the Q2 dependence of the moments is calculated. The fitted model for the structure function (inclusive cross section) is a limiting case of the more general scattering amplitude equally applicable to deeply virtual Compton scattering. The connection between the two is discussed.
Resumo:
This paper investigates experimentally how organisational decision processes affect the moral motivations of actors inside a firm that must forego profits to reduce harming a third party. In a "vertical" treatment, one insider unilaterally sets the harm-reduction strategy; the other can only accept or quit. In a "horizontal" treatment, the insiders decide by consensus. Our 2-by-2 design also controls for communication effects. In our data, communication makes vertical firms more ethical; voice appears to mitigate "responsibility-alleviation" in that subordinates with voice feel responsible for what their firms do. Vertical firms are then more ethical than the horizontal firms for which our bargaining data reveal a dynamic form of responsibility-alleviation and our chat data indicate a strong "insider-outsider" effect.
Resumo:
We analyse the implications of optimal taxation for the stochastic behaviour of debt. We show that when a government pursues an optimal fiscal policy under complete markets, the value of debt has the same or less persistence than other variables in the economy and it declines in response to shocks that cause the deficit to increase. By contrast, under incomplete markets debt shows more persistence than other variables and it increases in response to shocks that cause a higher deficit. Data for US government debt reveals diametrically opposite results from those of complete markets and is much more supportive of bond market incompleteness.
Resumo:
Report for the scientific sojourn at the University of Reading, United Kingdom, from January until May 2008. The main objectives have been firstly to infer population structure and parameters in demographic models using a total of 13 microsatellite loci for genotyping approximately 30 individuals per population in 10 Palinurus elephas populations both from Mediterranean and Atlantic waters. Secondly, developing statistical methods to identify discrepant loci, possibly under selection and implement those methods using the R software environment. It is important to consider that the calculation of the probability distribution of the demographic and mutational parameters for a full genetic data set is numerically difficult for complex demographic history (Stephens 2003). The Approximate Bayesian Computation (ABC), based on summary statistics to infer posterior distributions of variable parameters without explicit likelihood calculations, can surmount this difficulty. This would allow to gather information on different demographic prior values (i.e. effective population sizes, migration rate, microsatellite mutation rate, mutational processes) and assay the sensitivity of inferences to demographic priors by assuming different priors.
Resumo:
Planners in public and private institutions would like coherent forecasts of the components of age-specic mortality, such as causes of death. This has been di cult toachieve because the relative values of the forecast components often fail to behave ina way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It hasbeen shown that cause-specic mortality forecasts are pessimistic when compared withall-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approachof using log mortality rates and forecasts the density of deaths in the life table. Sincethese values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbingstate), they are intrinsically relative rather than absolute values across decrements aswell as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison(1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that theunit sum constraint is honoured. The structure of the best-known, single-decrementmortality-rate forecasting model, devised by Lee and Carter (1992), is expressed incompositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortalityby cause of death for Japan
Resumo:
Simpson's paradox, also known as amalgamation or aggregation paradox, appears whendealing with proportions. Proportions are by construction parts of a whole, which canbe interpreted as compositions assuming they only carry relative information. TheAitchison inner product space structure of the simplex, the sample space of compositions, explains the appearance of the paradox, given that amalgamation is a nonlinearoperation within that structure. Here we propose to use balances, which are specificelements of this structure, to analyse situations where the paradox might appear. Withthe proposed approach we obtain that the centre of the tables analysed is a naturalway to compare them, which avoids by construction the possibility of a paradox.Key words: Aitchison geometry, geometric mean, orthogonal projection
Resumo:
As stated in Aitchison (1986), a proper study of relative variation in a compositional data set should be based on logratios, and dealing with logratios excludes dealing with zeros. Nevertheless, it is clear that zero observations might be present in real data sets, either because the corresponding part is completelyabsent –essential zeros– or because it is below detection limit –rounded zeros. Because the second kind of zeros is usually understood as “a trace too small to measure”, it seems reasonable to replace them by a suitable small value, and this has been the traditional approach. As stated, e.g. by Tauber (1999) and byMartín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000), the principal problem in compositional data analysis is related to rounded zeros. One should be careful to use a replacement strategy that does not seriously distort the general structure of the data. In particular, the covariance structure of the involvedparts –and thus the metric properties– should be preserved, as otherwise further analysis on subpopulations could be misleading. Following this point of view, a non-parametric imputation method isintroduced in Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000). This method is analyzed in depth by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2003) where it is shown that thetheoretical drawbacks of the additive zero replacement method proposed in Aitchison (1986) can be overcome using a new multiplicative approach on the non-zero parts of a composition. The new approachhas reasonable properties from a compositional point of view. In particular, it is “natural” in the sense thatit recovers the “true” composition if replacement values are identical to the missing values, and it is coherent with the basic operations on the simplex. This coherence implies that the covariance structure of subcompositions with no zeros is preserved. As a generalization of the multiplicative replacement, in thesame paper a substitution method for missing values on compositional data sets is introduced
Resumo:
One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which there are components which are essential zeros. By anessential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur inmany compositional situations, such as household budget patterns, time budgets,palaeontological zonation studies, ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful insuch situations. From consideration of such examples it seems sensible to build up amodel in two stages, the first determining where the zeros will occur and the secondhow the unit available is distributed among the non-zero parts. In this paper we suggest two such models, an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential zero-compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data
Resumo:
First discussion on compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on algebraic structure of the simplex, more than twenty years after Aitchison’s idea of log-transformations of closed data, scientific literature is again full of statistical treatments of this type of data by using traditional methodologies. This is particularly true in environmental geochemistry where besides the problem of the closure, the spatial structure (dependence) of the data have to be considered. In this work we propose the use of log-contrast values, obtained by asimplicial principal component analysis, as LQGLFDWRUV of given environmental conditions. The investigation of the log-constrast frequency distributions allows pointing out the statistical laws able togenerate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or other reference parameters, allow definingmonitors to be used to assess the extent of possible environmental contamination. Case study on running and ground waters from Chiavenna Valley (Northern Italy) by using Na+, K+, Ca2+, Mg2+, HCO3-, SO4 2- and Cl- concentrations will be illustrated