71 results for compositional variations


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Our essay aims at studying suitable statistical methods for the clustering of compositional data in situations where observations are constituted by trajectories of compositional data, that is, by sequences of composition measurements along a domain. Observed trajectories are known as “functional data” and several methods have been proposed for their analysis. In particular, methods for clustering functional data, known as Functional Cluster Analysis (FCA), have been applied by practitioners and scientists in many fields. To our knowledge, FCA techniques have not been extended to cope with the problem of clustering compositional data trajectories. In order to extend FCA techniques to the analysis of compositional data, FCA clustering techniques have to be adapted by using a suitable compositional algebra. The present work centres on the following question: given a sample of compositional data trajectories, how can we formulate a segmentation procedure giving homogeneous classes? To address this problem we follow the steps described below. First of all we adapt the well-known spline smoothing techniques in order to cope with the smoothing of compositional data trajectories. In fact, an observed curve can be thought of as the sum of a smooth part plus some noise due to measurement errors. Spline smoothing techniques are used to isolate the smooth part of the trajectory: clustering algorithms are then applied to these smooth curves. The second step consists in building suitable metrics for measuring the dissimilarity between trajectories: we propose a metric that accounts for difference in both shape and level, and a metric accounting for differences in shape only. A simulation study is performed in order to evaluate the proposed methodologies, using both hierarchical and partitional clustering algorithms. The quality of the obtained results is assessed by means of several indices.
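The abstract does not define its two metrics, so the following is only a minimal sketch of one plausible reading. It assumes trajectories sampled on a common grid, uses the centred log-ratio (clr) transform as the compositional algebra, and skips the spline smoothing step. The function names are hypothetical and are not taken from the work itself.

```python
import numpy as np


def clr(x):
    """Centred log-ratio transform of compositions along the last axis."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)


def level_and_shape_distance(traj_a, traj_b):
    """Root-mean-square Aitchison distance between two (T, D) trajectories.

    Assumption: both trajectories are observed at the same T grid points.
    """
    diff = clr(traj_a) - clr(traj_b)
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


def shape_only_distance(traj_a, traj_b):
    """Same distance after removing each trajectory's mean clr vector.

    Subtracting the mean over the domain discards a constant perturbation
    (a difference in level), so only differences in shape remain.
    """
    za = clr(traj_a)
    zb = clr(traj_b)
    za = za - za.mean(axis=0, keepdims=True)
    zb = zb - zb.mean(axis=0, keepdims=True)
    return float(np.sqrt(np.mean(np.sum((za - zb) ** 2, axis=1))))
```

Under these assumptions, two trajectories that differ only by a constant perturbation are separated by the first distance and coincide under the second. Either distance matrix could then be passed to a hierarchical or partitional clustering routine.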

Relevance:

20.00% 20.00%

Publisher:

Abstract:

One of the tantalising remaining problems in compositional data analysis lies in how to deal with data sets in which there are components which are essential zeros. By an essential zero we mean a component which is truly zero, not something recorded as zero simply because the experimental design or the measuring instrument has not been sufficiently sensitive to detect a trace of the part. Such essential zeros occur in many compositional situations, such as household budget patterns, time budgets, palaeontological zonation studies, ecological abundance studies. Devices such as nonzero replacement and amalgamation are almost invariably ad hoc and unsuccessful in such situations. From consideration of such examples it seems sensible to build up a model in two stages, the first determining where the zeros will occur and the second how the unit available is distributed among the non-zero parts. In this paper we suggest two such models, an independent binomial conditional logistic normal model and a hierarchical dependent binomial conditional logistic normal model. The compositional data in such modelling consist of an incidence matrix and a conditional compositional matrix. Interesting statistical problems arise, such as the question of estimability of parameters, the nature of the computational process for the estimation of both the incidence and compositional parameters caused by the complexity of the subcompositional structure, the formation of meaningful hypotheses, and the devising of suitable testing methodology within a lattice of such essential zero-compositional hypotheses. The methodology is illustrated by application to both simulated and real compositional data.
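The two-stage data layout mentioned in the abstract (an incidence matrix plus a conditional compositional matrix) can be sketched as below. This only illustrates the data preparation, not the binomial conditional logistic normal models themselves. The function names and the toy budget data are assumptions made for the example.

```python
import numpy as np


def split_essential_zeros(x):
    """Split raw non-negative data into incidence and conditional compositions.

    incidence[i, j] is 1 where part j is present in row i, else 0.
    conditional[i] distributes the unit among the non-zero parts of row i,
    so zeros stay zero and each row sums to 1.
    """
    x = np.asarray(x, dtype=float)
    totals = x.sum(axis=1, keepdims=True)
    if np.any(totals <= 0):
        raise ValueError("every row needs at least one non-zero part")
    incidence = (x > 0).astype(int)
    conditional = x / totals
    return incidence, conditional


def group_by_zero_pattern(incidence):
    """Map each zero pattern (as a tuple) to the row indices sharing it."""
    patterns = {}
    for i, row in enumerate(incidence):
        patterns.setdefault(tuple(int(v) for v in row), []).append(i)
    return patterns


# Toy household budgets (hypothetical numbers): three spending categories.
budgets = [[2.0, 0.0, 2.0], [1.0, 1.0, 2.0], [0.0, 3.0, 1.0]]
incidence, conditional = split_essential_zeros(budgets)
patterns = group_by_zero_pattern(incidence)
```

Grouping rows by zero pattern makes the subcompositional structure explicit: each pattern defines the subcomposition within which the second-stage model would operate.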

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The first discussion of compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on the algebraic structure of the simplex, more than twenty years after Aitchison’s idea of log-transformations of closed data, the scientific literature is again full of statistical treatments of this type of data by using traditional methodologies. This is particularly true in environmental geochemistry where, besides the problem of closure, the spatial structure (dependence) of the data has to be considered. In this work we propose the use of log-contrast values, obtained by a simplicial principal component analysis, as indicators of given environmental conditions. The investigation of the log-contrast frequency distributions allows pointing out the statistical laws able to generate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or other reference parameters, allow defining monitors to be used to assess the extent of possible environmental contamination. A case study on running and ground waters from Chiavenna Valley (Northern Italy), using Na⁺, K⁺, Ca²⁺, Mg²⁺, HCO₃⁻, SO₄²⁻ and Cl⁻ concentrations, will be illustrated.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Single nucleotide polymorphisms (SNPs) are the most frequent type of sequence variation between individuals, and represent a promising tool for finding genetic determinants of complex diseases and understanding the differences in drug response. In this regard, it is of particular interest to study the effect of non-synonymous SNPs in the context of biological networks such as cell signalling pathways. UniProt provides curated information about the functional and phenotypic effects of sequence variation, including SNPs, as well as on mutations of protein sequences. However, no strategy has been developed to integrate this information with biological networks, with the ultimate goal of studying the impact of the functional effect of SNPs in the structure and dynamics of biological networks. Results: First, we identified the different challenges posed by the integration of the phenotypic effect of sequence variants and mutations with biological networks. Second, we developed a strategy for the combination of data extracted from public resources, such as UniProt, NCBI dbSNP, Reactome and BioModels. We generated attribute files containing phenotypic and genotypic annotations to the nodes of biological networks, which can be imported into network visualization tools such as Cytoscape. These resources allow the mapping and visualization of mutations and natural variations of human proteins and their phenotypic effect on biological networks (e.g. signalling pathways, protein-protein interaction networks, dynamic models). Finally, an example on the use of the sequence variation data in the dynamics of a network model is presented. Conclusion: In this paper we present a general strategy for the integration of pathway and sequence variation data for visualization, analysis and modelling purposes, including the study of the functional impact of protein sequence variations on the dynamics of signalling pathways. 
This is of particular interest when the SNP or mutation is known to be associated with disease. We expect that this approach will help in the study of the functional impact of disease-associated SNPs on the behaviour of cell signalling pathways, which ultimately will lead to a better understanding of the mechanisms underlying complex diseases.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

One of the disadvantages of old age is that there is more past than future: this, however, may be turned into an advantage if the wealth of experience and, hopefully, wisdom gained in the past can be reflected upon and throw some light on possible future trends. To an extent, then, this talk is necessarily personal, certainly nostalgic, but also self critical and inquisitive about our understanding of the discipline of statistics. A number of almost philosophical themes will run through the talk: search for appropriate modelling in relation to the real problem envisaged, emphasis on sensible balances between simplicity and complexity, the relative roles of theory and practice, the nature of communication of inferential ideas to the statistical layman, the inter-related roles of teaching, consultation and research. A list of keywords might be: identification of sample space and its mathematical structure, choices between transform and stay, the role of parametric modelling, the role of a sample space metric, the underused hypothesis lattice, the nature of compositional change, particularly in relation to the modelling of processes. While the main theme will be relevance to compositional data analysis we shall point to substantial implications for general multivariate analysis arising from experience of the development of compositional data analysis…

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Modern methods of compositional data analysis are not well known in biomedical research. Moreover, there appear to be few mathematical and statistical researchers working on compositional biomedical problems. Like the earth and environmental sciences, biomedicine has many problems in which the relevant scientific information is encoded in the relative abundance of key species or categories. I introduce three problems in cancer research in which analysis of compositions plays an important role. The problems involve 1) the classification of serum proteomic profiles for early detection of lung cancer, 2) inference of the relative amounts of different tissue types in a diagnostic tumor biopsy, and 3) the subcellular localization of the BRCA1 protein, and its role in breast cancer patient prognosis. For each of these problems I outline a partial solution. However, none of these problems is “solved”. I attempt to identify areas in which additional statistical development is needed with the hope of encouraging more compositional data analysts to become involved in biomedical research.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.
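A weighted log-ratio analysis of this kind can be sketched as follows. The sketch assumes the weights are the row and column masses of the table (its margins divided by the grand total), which is the usual choice in correspondence analysis. The abstract does not spell out the weights, so this choice and the function name are assumptions.

```python
import numpy as np


def weighted_lra(table):
    """Weighted log-ratio analysis of a strictly positive two-way table.

    Returns the singular values, row principal coordinates and column
    standard coordinates of the weighted, double-centred log table.
    """
    n = np.asarray(table, dtype=float)
    p = n / n.sum()
    r = p.sum(axis=1)                  # row masses
    c = p.sum(axis=0)                  # column masses
    logn = np.log(n)
    row_means = logn @ c               # weighted mean of each row
    col_means = r @ logn               # weighted mean of each column
    grand_mean = r @ logn @ c
    # Weighted double-centring of the log-transformed table.
    y = logn - row_means[:, None] - col_means[None, :] + grand_mean
    s = np.sqrt(r)[:, None] * y * np.sqrt(c)[None, :]
    u, sv, vt = np.linalg.svd(s, full_matrices=False)
    row_principal = (u * sv) / np.sqrt(r)[:, None]
    col_standard = vt.T / np.sqrt(c)[:, None]
    return sv, row_principal, col_standard
```

Distributional equivalence can be checked numerically with this sketch: splitting one column into two proportional columns leaves the non-zero singular values unchanged, because the two centred columns are identical and their masses add up to the mass of the original column.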

Relevance:

20.00% 20.00%

Publisher:

Abstract:

During the 1990s, studies of management accounting practices in Europe and in Latin America have given us data on 23 countries. In this paper we use this data to identify five distinct aspects of national management accounting culture:
1. The influence of regulations on official recommendations;
2. The source of management accountants;
3. Influence from one country to another;
4. Variations in use of specific techniques;
5. Variations in the objectives of the management accounting system.
We then identify seven significant implications for the manager operating in the multinational environment.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We analyze the impact of a minimum price variation (tick) and time priority on the dynamics of quotes and the trading costs when competition for the order flow is dynamic. We find that convergence to competitive outcomes can take time and that the speed of convergence is influenced by the tick size, the priority rule and the characteristics of the order arrival process. We show also that a zero minimum price variation is never optimal when competition for the order flow is dynamic. We compare the trading outcomes with and without time priority. Time priority is shown to guarantee that uncompetitive spreads cannot be sustained over time. However it can sometimes result in higher trading costs. Empirical implications are proposed. In particular, we relate the size of the trading costs to the frequency of new offers and the dynamics of the inside spread to the state of the book.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Comparative national management accounting is the least developed aspect in the field of international accounting. Only during the second half of the 1990s have some comparisons of national management accounting practice been published, and only at the regional level. In this paper a range of factors that give rise to variations in national management accounting practice are postulated. We support this list with examples from a range of analyses of national management accounting practices, drawing particularly on the work of Lizcano (1996) and Bhimani (1996). Finally, twelve key factors are identified as influencing an individual country's approach to management accounting.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
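As a rough illustration of how such a biplot could be computed, the sketch below applies the singular value decomposition to the centred log-ratio (clr) data after centring each column. The scaling (rows in principal coordinates, columns in standard coordinates) is one common convention and is an assumption here, as is the function name.

```python
import numpy as np


def relative_variation_biplot(x, n_dims=2):
    """Biplot coordinates for compositional data via SVD of centred clr data.

    x is an (n_samples, n_parts) array of strictly positive compositions.
    Returns row coordinates, column coordinates and the proportion of
    variance explained by every dimension.
    """
    logx = np.log(np.asarray(x, dtype=float))
    z = logx - logx.mean(axis=1, keepdims=True)   # clr transform per sample
    z = z - z.mean(axis=0, keepdims=True)         # centre each part
    u, sv, vt = np.linalg.svd(z, full_matrices=False)
    rows = u[:, :n_dims] * sv[:n_dims]            # samples, principal coords
    cols = vt[:n_dims].T                          # parts, standard coords
    explained = sv ** 2 / np.sum(sv ** 2)
    return rows, cols, explained
```

In this construction the difference between two column points represents a log-ratio: projecting the samples onto `cols[j] - cols[k]` approximates the centred values of log(x_j / x_k), and the approximation is exact when all dimensions are kept.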

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The origin of the microscopic inhomogeneities in InₓGa₁₋ₓAs layers grown on GaAs by molecular beam epitaxy is analyzed through the optical absorption spectra near the band gap. It is seen that, for relaxed thick layers of about 2.8 μm, composition inhomogeneities are responsible for the band edge smoothing across the whole compositional range (0.05 < x < 0.8). On the other hand, in thin enough layers strain inhomogeneities are dominant. This evolution in line with layer thickness is due to the atomic diffusion at the surface during growth, induced by the strain inhomogeneities that arise from stress relaxation. In consequence, the strain variations present in the layer are converted into composition variations during growth. This process is energetically favorable as it diminishes elastic energy. Additional support for this hypothesis is given by a clear proportionality between the magnitude of the composition variations and the mean strain.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The approaches of comparative studies and profile measurements, often used in order to detect post-depositional alterations of ceramics, have been applied simultaneously to two sets of Roman pottery, both of which include altered individuals. Neutron Activation Analysis and X-Ray Diffraction were used as analytical techniques. The two approaches lead to substantially different results. This shows that they detect different levels of alteration and should complement each other rather than being used exclusively. For the special process of a glassy phase decomposition followed by a crystallization of the Na-zeolite analcime, the results suggest that it changes high-fired calcareous pottery rapidly, and so fundamentally that the results of various archaeometric techniques can be severely disturbed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this work, zinc indium tin oxide layers with different compositions are used as the active layer of thin-film transistors. This multicomponent transparent conductive oxide is gaining great interest due to its reduced content of the scarce element indium. Experimental data indicate that the incorporation of zinc promotes the creation of oxygen vacancies. In thin-film transistors this effect leads to higher threshold voltage values. The field-effect mobility is also strongly degraded, probably due to Coulomb scattering by ionized defects. A post-deposition annealing in air reduces the density of oxygen vacancies and improves the field-effect mobility by orders of magnitude. Finally, the electrical characteristics of the fabricated thin-film transistors have been analyzed to estimate the density of states in the gap of the active layers. These measurements reveal a clear peak located at 0.3 eV from the conduction band edge that could be attributed to oxygen vacancies.