882 resultados para methods: data analysis
Resumo:
First discussion on compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on algebraic structure of the simplex, more than twenty years after Aitchison’s idea of log-transformations of closed data, scientific literature is again full of statistical treatments of this type of data by using traditional methodologies. This is particularly true in environmental geochemistry where besides the problem of the closure, the spatial structure (dependence) of the data have to be considered. In this work we propose the use of log-contrast values, obtained by a simplicial principal component analysis, as LQGLFDWRUV of given environmental conditions. The investigation of the log-constrast frequency distributions allows pointing out the statistical laws able to generate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or other reference parameters, allow defining monitors to be used to assess the extent of possible environmental contamination. Case study on running and ground waters from Chiavenna Valley (Northern Italy) by using Na+, K+, Ca2+, Mg2+, HCO3-, SO4 2- and Cl- concentrations will be illustrated
Resumo:
R from http://www.r-project.org/ is ‘GNU S’ – a language and environment for statistical computing and graphics. The environment in which many classical and modern statistical techniques have been implemented, but many are supplied as packages. There are 8 standard packages and many more are available through the cran family of Internet sites http://cran.r-project.org . We started to develop a library of functions in R to support the analysis of mixtures and our goal is a MixeR package for compositional data analysis that provides support for operations on compositions: perturbation and power multiplication, subcomposition with or without residuals, centering of the data, computing Aitchison’s, Euclidean, Bhattacharyya distances, compositional Kullback-Leibler divergence etc. graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features: barycenter, geometric mean of the data set, the percentiles lines, marking and coloring of subsets of the data set, theirs geometric means, notation of individual data in the set . . . dealing with zeros and missing values in compositional data sets with R procedures for simple and multiplicative replacement strategy, the time series analysis of compositional data. We’ll present the current status of MixeR development and illustrate its use on selected data sets
Resumo:
The low levels of unemployment recorded in the UK in recent years are widely cited as evidence of the country’s improved economic performance, and the apparent convergence of unemployment rates across the country’s regions used to suggest that the longstanding divide in living standards between the relatively prosperous ‘south’ and the more depressed ‘north’ has been substantially narrowed. Dissenters from these conclusions have drawn attention to the greatly increased extent of non-employment (around a quarter of the UK’s working age population are not in employment) and the marked regional dimension in its distribution across the country. Amongst these dissenters it is generally agreed that non-employment is concentrated amongst older males previously employed in the now very much smaller ‘heavy’ industries (e.g. coal, steel, shipbuilding). This paper uses the tools of compositiona l data analysis to provide a much richer picture of non-employment and one which challenges the conventional analysis wisdom about UK labour market performance as well as the dissenters view of the nature of the problem. It is shown that, associated with the striking ‘north/south’ divide in nonemployment rates, there is a statistically significant relationship between the size of the non-employment rate and the composition of non-employment. Specifically, it is shown that the share of unemployment in non-employment is negatively correlated with the overall non-employment rate: in regions where the non-employment rate is high the share of unemployment is relatively low. So the unemployment rate is not a very reliable indicator of regional disparities in labour market performance. Even more importantly from a policy viewpoint, a significant positive relationship is found between the size of the non-employment rate and the share of those not employed through reason of sickness or disability and it seems (contrary to the dissenters) that this connection is just as strong for women as it is for men
Resumo:
Usually, psychometricians apply classical factorial analysis to evaluate construct validity of order rank scales. Nevertheless, these scales have particular characteristics that must be taken into account: total scores and rank are highly relevant
Resumo:
First application of compositional data analysis techniques to Australian election data
Resumo:
Isotopic data are currently becoming an important source of information regarding sources, evolution and mixing processes of water in hydrogeologic systems. However, it is not clear how to treat with statistics the geochemical data and the isotopic data together. We propose to introduce the isotopic information as new parts, and apply compositional data analysis with the resulting increased composition. Results are equivalent to downscale the classical isotopic delta variables, because they are already relative (as needed in the compositional framework) and isotopic variations are almost always very small. This methodology is illustrated and tested with the study of the Llobregat River Basin (Barcelona, NE Spain), where it is shown that, though very small, isotopic variations comp lement geochemical principal components, and help in the better identification of pollution sources
Resumo:
notes for class discussion and exercise
Resumo:
data analysis table
Resumo:
Resumen basado en el de la publicación
Resumo:
In this paper, we address issues in segmentation Of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, which were captured by airborne laser scanner, contain 2.5 dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat field, river, basin, etc. Our aim in this paper is to segment ground (flat field)from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on the image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first and second order of statistics is used as a measure to determine the surface properties. The test results have shown that ground areas can successfully be segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning Gabor parameters.
Resumo:
Sensitive methods that are currently used to monitor proteolysis by plasmin in milk are limited due to 7 their high cost and lack of standardisation for quality assurance in the various dairy laboratories. In 8 this study, four methods, trinitrobenzene sulphonic acid (TNBS), reverse phase high pressure liquid 9 chromatography (RP-HPLC), gel electrophoresis and fluorescamine, were selected to assess their 10 suitability for the detection of proteolysis in milk by plasmin. Commercial UHT milk was incubated 11 with plasmin at 37 °C for one week. Clarification was achieved by isoelectric precipitation (pH 4·6 12 soluble extracts)or 6% (final concentration) trichloroacetic acid (TCA). The pH 4·6 and 6% TCA 13 soluble extracts of milk showed high correlations (R2 > 0·93) by the TNBS, fluorescamine and 14 RP-HPLC methods, confirming increased proteolysis during storage. For gel electrophoresis,15 extensive proteolysis was confirmed by the disappearance of α- and β-casein bands on the seventh 16 day, which was more evident in the highest plasmin concentration. This was accompanied by the 17 appearance of α- and β-casein proteolysis products with higher intensities than on previous days, 18 implying that more products had been formed as a result of casein breakdown. The fluorescamine 19 method had a lower detection limit compared with the other methods, whereas gel electrophoresis 20 was the best qualitative method for monitoring β-casein proteolysis products. Although HPLC was the 21 most sensitive, the TNBS method is recommended for use in routine laboratory analysis on the basis 22 of its accuracy, reliability and simplicity.
Resumo:
The principle aim of this research is to elucidate the factors driving the total rate of return of non-listed funds using a panel data analytical framework. In line with previous results, we find that core funds exhibit lower yet more stable returns than value-added and, in particular, opportunistic funds, both cross-sectionally and over time. After taking into account overall market exposure, as measured by weighted market returns, the excess returns of value-added and opportunity funds are likely to stem from: high leverage, high exposure to development, active asset management and investment in specialized property sectors. A random effects estimation of the panel data model largely confirms the findings obtained from the fixed effects model. Again, the country and sector property effect shows the strongest significance in explaining total returns. The stock market variable is negative which hints at switching effects between competing asset classes. For opportunity funds, on average, the returns attributable to gearing are three times higher than those for value added funds and over five times higher than for core funds. Overall, there is relatively strong evidence indicating that country and sector allocation, style, gearing and fund size combinations impact on the performance of unlisted real estate funds.