76 results for INDEPENDENT COMPONENT ANALYSIS (ICA)
Abstract:
Goats’ milk is responsible for unique traditional products such as Halloumi cheese. The characteristics of Halloumi depend on the original features of the milk and on the conditions under which the milk has been produced, such as the feeding regime of the animals or the region of production. Using a range of milk (33) and Halloumi (33) samples collected over a year from three locations in Cyprus (A, Anogyra; K, Kofinou; P, Paphos), the potential of fingerprint VOC analysis as a marker to authenticate Halloumi was investigated. The unique setup consists of an in-injector thermal desorption device (VOCtrap needle) and a chromatofocusing system based on mass spectrometry (VOCscanner). The mass spectra of all analyzed samples were treated by multivariate analysis (principal component analysis and discriminant function analysis). Results showed that the highland area of production (P) is clearly identified in the milks produced (discriminant score 67%). It is interesting to note that the greater similarity found between milks from regions “A” and “K” (with P being distinctive; discriminant score 80%) is not ‘carried over’ to the cheeses (greater similarity between regions “A” and “P”, with “K” distinctive). Data were also broken down into three seasons. Similarly, the seasonality differences observed between milks are not necessarily reflected in the cheeses produced from them. This is expected, given the different VOC signatures developed in cheese as part of the numerous biochemical changes during its production compared to milk. VOC analysis is nevertheless an additional analytical tool that can aid the identification of region of origin in dairy products.
Abstract:
Single-component geochemical maps are the most basic representation of spatial elemental distributions and are commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant sum problem (closure) and to the inherently multivariate relative information conveyed by compositional data. Well known, for instance, is the tendency of all heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect), or the contrary effect, apparent enrichment in many elements due to removal of potassium during weathering. The validity of classical single-component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g. dilution with SiO2 or volatiles). This is similar to the classical approach of normalising against a single element. The second approach uses so-called log-contrasts, which employ suitable statistical methods (such as classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that, if a compositional approach is not used, it becomes difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant sum constraint. In summary, the authors recommend a chain of enquiry that involves searching for the appropriate statistical method to answer the required geological or geochemical question whilst maintaining the integrity of the compositional nature of the data. The required log-ratio transformations should be applied, followed by the chosen statistical method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
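The two transformation styles described in the abstract (knowledge-driven log-ratios and centred log-ratios) can be sketched in a few lines. This is a minimal illustration in Python with NumPy on toy data; the variable names and the 3-part composition are illustrative, not taken from the paper:

```python
import numpy as np

def clr(X):
    """Centred log-ratio: log of each part relative to the row geometric mean."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

# toy 3-part compositions (each row sums to 1; clr itself does not require closure)
comp = np.array([[0.2, 0.3, 0.5],
                 [0.1, 0.6, 0.3]])
Z = clr(comp)
print(np.allclose(Z.sum(axis=1), 0.0))  # True: clr rows sum to zero by construction

# a knowledge-driven log-ratio, e.g. one element normalised against a diluent part
ratio = np.log(comp[:, 0] / comp[:, 2])
```

The zero row-sum is the defining property of clr coordinates; downstream multivariate methods are then applied to `Z` rather than to the raw closed data.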
Abstract:
Statistics are regularly used to make some form of comparison between trace evidence samples or to deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence is routinely the result of particle-size, chemical or modal analyses and as such constitutes compositional data. The issue is that compositional data, including percentages, parts per million, etc., carry only relative information. This may be problematic where a comparison of percentages and other constrained (closed) data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. Notwithstanding an awareness of the constant sum problem since the seminal works of Pearson (1896) and Chayes (1960), and the introduction of log-ratio techniques (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013), the fact that a constant sum destroys the potential independence of variances and covariances required for correlation and regression analysis and for empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation) is all too often not acknowledged in the statistical treatment of trace evidence. Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle-size and mineral analyses as trace evidence are used to demonstrate a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.
Abstract:
A compositional multivariate approach is used to analyse regional-scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey of Northern Ireland (GSNI). The multi-element total concentration data comprise XRF analyses of 6862 rural soil samples collected at 20 cm depth on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI, initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or the superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional principal component analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis were used to determine the influence of the underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was captured by the first four principal components (PCs), implying significant structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To improve on PCA by using the spatial relationships of the data, a classification based on MAF analysis was undertaken using the 6 most dominant factors.
Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for the LDA classification improved from 60.98%, based on PCA using 10 principal components, to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat-covered areas since the creation of the superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this on the classification analysis.
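The clr-then-PCA pipeline used above can be sketched on synthetic closed data (this is an illustrative reconstruction, not the Tellus measurements; the explained-variance ratios here carry no geological meaning):

```python
import numpy as np

def clr(X):
    """Centred log-ratio transform (rows are strictly positive compositions)."""
    logX = np.log(X)
    return logX - logX.mean(axis=1, keepdims=True)

def explained_variance(Z):
    """Fraction of total variance carried by each principal component."""
    Zc = Z - Z.mean(axis=0)
    vals = np.linalg.eigvalsh(np.cov(Zc, rowvar=False))[::-1]  # descending
    return vals / vals.sum()

rng = np.random.default_rng(0)
raw = rng.lognormal(size=(200, 5))
comp = raw / raw.sum(axis=1, keepdims=True)   # closed 5-part compositions
ratios = explained_variance(clr(comp))
print(ratios.round(3))
```

Note that the clr covariance matrix is singular (the transformed rows sum to zero), so one eigenvalue is effectively zero; this is expected and harmless for selecting the leading components.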
Abstract:
We formally compare fundamental-factor and latent-factor approaches to oil price modelling. Fundamental modelling has a long history in seeking to understand oil price movements, while latent factor modelling has a more recent and limited history, but has gained popularity in other financial markets. The two approaches, though competing, have not previously been formally compared for effectiveness. For a range of short-, medium- and long-dated WTI oil futures we test a recently proposed five-factor fundamental model and a principal component analysis latent factor model. Our findings demonstrate that there is no discernible difference between the two techniques in a dynamic setting. We conclude that this implies some advantage in adopting the latent factor approach, given the difficulty of determining a well-specified fundamental model.
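A toy illustration of the latent-factor idea: futures curves simulated from a random-walk "level" and a slowly moving "slope" factor, with PCA recovering the low-dimensional structure. Everything here is simulated and assumed; nothing reproduces the paper's WTI data or its five-factor fundamental model:

```python
import numpy as np

rng = np.random.default_rng(4)
n_days, maturities = 500, np.arange(1.0, 9.0)   # 8 hypothetical contract maturities
# two latent factors driving the whole curve
level = np.cumsum(rng.normal(scale=0.5, size=n_days))
slope = np.cumsum(rng.normal(scale=0.1, size=n_days))
curves = 50 + level[:, None] + slope[:, None] * maturities \
         + 0.05 * rng.normal(size=(n_days, maturities.size))

# PCA on the cross-section of prices: two components should dominate
vals = np.linalg.eigvalsh(np.cov(curves, rowvar=False))[::-1]
explained = vals[:2].sum() / vals.sum()
print(round(float(explained), 4))
```

The point of the sketch is that the latent factors are extracted from the prices alone, with no need to specify a fundamental model.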
Abstract:
This paper presents a statistically based fault diagnosis scheme for application to internal combustion engines. The scheme relies on an identified model that describes the relationships between a set of recorded engine variables using principal component analysis (PCA). Since combustion cycles are complex in nature and produce nonlinear relationships between the recorded engine variables, the paper proposes the use of nonlinear PCA (NLPCA). The paper further justifies the use of NLPCA by comparing the model accuracy of the NLPCA model with that of a linear PCA model. A new nonlinear variable reconstruction algorithm and bivariate scatter plots are proposed for fault isolation, following the application of NLPCA. The proposed technique allows the diagnosis of different fault types under steady-state operating conditions. More precisely, nonlinear variable reconstruction can remove the fault signature from the recorded engine data, which allows the identification and isolation of the root cause of abnormal engine behaviour. The paper shows that this can lead to (i) an enhanced identification of potential root causes of abnormal events and (ii) the masking of faulty sensor readings. The effectiveness of the enhanced NLPCA-based monitoring scheme is illustrated by its application to a sensor fault and a process fault. The sensor fault relates to a drift in the fuel flow reading, whilst the process fault relates to a partial blockage of the intercooler. These faults are introduced to a Volkswagen TDI 1.9 litre diesel engine mounted on an experimental engine test bench facility.
Abstract:
Raman microscopy, based upon the inelastic (Raman) scattering of light by molecular species, has been applied as a specific structural probe in a wide range of biomedical samples. The purpose of the present investigation was to assess the potential of the technique for spectral characterization of the porcine outer retina derived from the area centralis, which contains the highest cone:rod cell ratio in the pig retina. METHODS: Retinal cross-sections, immersion-fixed in 4% (w/v) PFA and cryoprotected, were placed on silanized slides and air-dried prior to direct Raman microscopic analysis at three excitation wavelengths: 785 nm, 633 nm, and 514 nm. RESULTS: Raman spectra of the photoreceptor inner and outer segments (PIS, POS) and of the outer nuclear layer (ONL) of the retina acquired at 785 nm were dominated by vibrational features characteristic of proteins and lipids. There was a clear difference between the inner and outer domains in the spectroscopic regions, amide I and III, known to be sensitive to protein conformation. The spectra recorded with 633 nm excitation mirrored those observed at 785 nm for the amide I region, but showed an additional pattern of bands in the spectra of the PIS region, attributed to cytochrome c. The same features were further enhanced in spectra recorded with 514 nm excitation. A significant nucleotide contribution was observed in the spectra recorded for the ONL at all three excitation wavelengths. A Raman map was constructed of the major spectral components found in the retinal outer segments, as predicted by principal component analysis of the data acquired using 633 nm excitation. Comparison of the Raman map with its histological counterpart revealed a strong correlation between the two images.
CONCLUSIONS: It has been demonstrated that Raman spectroscopy offers a unique insight into the biochemical composition of the light-sensing cells of the retina following the application of standard histological protocols. The present study points to the considerable promise of Raman microscopy as a component-specific probe of retinal tissue.
Abstract:
This paper points out a serious flaw in dynamic multivariate statistical process control (MSPC). The principal component analysis of a linear time series model, employed to capture auto- and cross-correlation in recorded data, may produce a considerable number of variables to be analysed. To give a dynamic representation of the data (based on variable correlation) and circumvent the production of a large time-series structure, a linear state space model is used here instead. The paper demonstrates that, by incorporating a state space model, the number of variables to be analysed dynamically can be considerably reduced compared to conventional dynamic MSPC techniques.
Abstract:
Contrary to the traditional view, recent studies suggest that diabetes mellitus has an adverse influence on male reproductive function. Our aim was to determine the effect of diabetes on the testicular environment by identifying and then assessing perturbations in small-molecule metabolites. Testes were obtained from control and streptozotocin-induced diabetic C57BL/6 mice two, four and eight weeks post treatment. Diabetic status was confirmed by HbA1c, non-fasting blood glucose, physiological condition and body weight. Protein-free, low-molecular-weight, water-soluble extracts were assessed using 1H NMR spectroscopy. Principal component analysis of the derived profiles was used to classify any variations, and specific metabolites were identified based on their spectral patterns. Characteristic metabolite profiles were identified for control and diabetic animals, with the most distinctive being from mice with the greatest physical deterioration and loss of body weight. Eight streptozotocin-treated animals did not develop diabetes and displayed profiles similar to controls. Diabetic mice had decreases in creatine, choline and carnitine and increases in lactate, alanine and myo-inositol. Betaine levels were increased in the majority of diabetic mice but decreased in two animals with severe loss of body weight and physical condition. The association of perturbations in a number of small-molecule metabolites known to be influential in sperm function with diabetic status and physiological condition adds further impetus to the proposal that diabetes influences important spermatogenic pathways and mechanisms in a subtle and previously unrecognised manner.
Abstract:
Raman spectroscopy has been used for the first time to predict the FA composition of unextracted adipose tissue of pork, beef, lamb, and chicken. It was found that the bulk unsaturation parameters could be predicted successfully [R² = 0.97, root mean square error of prediction (RMSEP) = 4.6% of 4σ], with cis unsaturation, which accounted for the majority of the unsaturation, giving similar correlations. The combined abundance of all measured PUFA (≥2 double bonds per chain) was also well predicted, with R² = 0.97 and RMSEP = 4.0% of 4σ. Trans unsaturation was not as well modeled (R² = 0.52, RMSEP = 18% of 4σ); this reduced prediction ability can be attributed to the low levels of trans FA found in adipose tissue (0.035 times the cis unsaturation level). For the individual FA, the average partial least squares (PLS) regression coefficient of the 18 most abundant FA (relative abundances ranging from 0.1 to 38.6% of the total FA content) was R² = 0.73; the average RMSEP = 11.9% of 4σ. Regression coefficients and prediction errors for the five most abundant FA were all better than the average value (in some cases as low as RMSEP = 4.7% of 4σ). Cross-correlation between the abundances of the minor FA and the more abundant acids could be determined by principal component analysis methods, and the resulting groups of correlated compounds were also well predicted using PLS. The accuracy of the prediction of individual FA was at least as good as that of other spectroscopic methods, and the extremely straightforward sampling method meant that very rapid analysis of samples at ambient temperature was easily achieved. This work shows that Raman profiling of hundreds of samples per day is easily achievable with an automated sampling system.
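The PLS regression used throughout the abstract can be sketched with a minimal single-response NIPALS implementation on simulated "spectra" (the data, component count and coefficient values are illustrative assumptions, not the paper's adipose-tissue measurements):

```python
import numpy as np

def pls1(X, y, n_comp):
    """Minimal PLS1 (NIPALS) returning regression coefficients for one response."""
    Xd, yd = X.copy(), y.copy()
    W, P, Q = [], [], []
    for _ in range(n_comp):
        w = Xd.T @ yd                 # weight: covariance direction with y
        w /= np.linalg.norm(w)
        t = Xd @ w                    # scores
        tt = t @ t
        p = Xd.T @ t / tt             # X loadings
        q = (yd @ t) / tt             # y loading
        Xd -= np.outer(t, p)          # deflate
        yd -= t * q
        W.append(w); P.append(p); Q.append(q)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(Q))

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 10))        # stand-in "spectral" predictors
y = X @ np.r_[1.0, -2.0, 0.5, np.zeros(7)] + 0.01 * rng.normal(size=100)
beta = pls1(X, y, n_comp=3)
r2 = 1 - np.sum((y - X @ beta) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(float(r2), 3))
```

In routine use the predictors and response would be mean-centred and the number of components chosen by cross-validation; RMSEP figures like those quoted above come from predictions on held-out samples.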
Abstract:
Treasure et al. (2004) recently proposed a new subspace-monitoring technique, based on the N4SID algorithm, within the multivariate statistical process control framework. This dynamic monitoring method requires considerably fewer variables to be analysed when compared with dynamic principal component analysis (PCA). The contribution charts and variable reconstruction, traditionally employed for static PCA, are analysed in a dynamic context. The contribution charts and variable reconstruction may be affected by the ratio of the number of retained components to the total number of analysed variables. Particular problems arise if this ratio is large, and a new reconstruction chart is introduced to overcome these. The utility of such a dynamic contribution chart and variable reconstruction is shown in a simulation and by application to industrial data from a distillation unit.
Abstract:
This paper analyses multivariate statistical techniques for identifying and isolating abnormal process behaviour. These techniques include contribution charts and variable reconstructions that relate to the application of principal component analysis (PCA). The analysis reveals, firstly, that contribution charts produce variable contributions that are linearly dependent and may lead to an incorrect diagnosis if the number of principal components retained is close to the number of recorded process variables. The analysis secondly shows that variable reconstruction affects the geometry of the PCA decomposition. The paper further introduces an improved variable reconstruction method for identifying multiple sensor and process faults and for isolating their influence upon the recorded process variables. It is shown that this can accommodate the effect of reconstruction, i.e. changes in the covariance matrix of the sensor readings, by correctly redefining the PCA-based monitoring statistics and their confidence limits.
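The reconstruction idea behind fault isolation can be sketched with linear PCA: reconstruct each variable in turn along its axis and see which reconstruction removes the most squared prediction error (SPE); the faulty variable gives the largest drop. This is a simplified single-fault sketch on simulated data, not the paper's improved multi-fault method:

```python
import numpy as np

rng = np.random.default_rng(3)
# training data: four correlated variables driven by two latent factors
A = rng.normal(size=(300, 2))
L = np.array([[1.0, 0.8, -0.5, 0.3],
              [0.1, -0.4, 0.9, 0.7]])
X = A @ L + 0.05 * rng.normal(size=(300, 4))
mu, sd = X.mean(0), X.std(0)
Vt = np.linalg.svd((X - mu) / sd, full_matrices=False)[2]
R = np.eye(4) - Vt[:2].T @ Vt[:2]      # projector onto the residual subspace

def spe_drop_by_reconstruction(x):
    """SPE reduction achieved by optimally reconstructing each variable in turn."""
    r = R @ ((x - mu) / sd)
    return r ** 2 / np.diag(R)

sample = X[0].copy()
sample[1] += 4.0                       # inject a bias on variable index 1
print(int(np.argmax(spe_drop_by_reconstruction(sample))))  # 1: the biased variable
```

The per-variable drop `(R x)_j^2 / R_jj` follows from minimising the SPE over the reconstructed value of variable j; for a single sensor fault the faulty variable maximises this drop.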
Abstract:
This paper introduces a fast algorithm for moving window principal component analysis (MWPCA) which adapts a principal component model. It incorporates the concept of recursive adaptation within a moving window to (i) adapt the mean and variance of the process variables, (ii) adapt the correlation matrix, and (iii) adjust the PCA model by recomputing the decomposition. The paper shows that the new algorithm is computationally faster than conventional moving window techniques if the window size exceeds three times the number of variables, and that its cost is not affected by the window size. A further contribution is the introduction of an N-step-ahead horizon into the process monitoring, whereby the PCA model identified N steps earlier is used to analyse the current observation. For monitoring complex chemical systems, this work shows that the use of the horizon improves the ability to detect slowly developing drifts.
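The recursive window adaptation of the mean and covariance (step (i)/(ii) above) can be sketched as a rank-one downdate of the leaving sample followed by a rank-one update of the entering sample; this is a generic sliding-window recursion under my own naming, not the paper's exact algorithm:

```python
import numpy as np

def downdate(mean, S, x, n):
    """Remove x from a window of size n; S is the scatter matrix sum((x-m)(x-m)^T)."""
    mean_new = (n * mean - x) / (n - 1)
    S_new = S - np.outer(x - mean_new, x - mean)
    return mean_new, S_new

def update(mean, S, x, n):
    """Add x, giving a window of size n (Welford-style rank-one update)."""
    mean_new = mean + (x - mean) / n
    S_new = S + np.outer(x - mean, x - mean_new)
    return mean_new, S_new

rng = np.random.default_rng(5)
stream, w = rng.normal(size=(50, 3)), 20
win = stream[:w]
wm = win.mean(0)
mean, S = wm, (win - wm).T @ (win - wm)
for i in range(w, len(stream)):
    mean, S = downdate(mean, S, stream[i - w], w)   # oldest sample leaves
    mean, S = update(mean, S, stream[i], w)          # newest sample enters
direct = np.cov(stream[len(stream) - w:], rowvar=False)
print(np.allclose(S / (w - 1), direct))  # True: recursion matches recomputation
```

Each window step costs O(m²) for m variables instead of the O(w·m²) of recomputing from scratch, which is where the speed-up over conventional moving-window PCA comes from; the eigendecomposition of the adapted matrix is then refreshed as in step (iii).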
Abstract:
Populations of Gammarus duebeni celticus, previously the only amphipod species resident in the rivers of the Lough Neagh catchment, N. Ireland, have been subjected to invasion by G. pulex from the British mainland. Numerous previous studies have investigated the potential behavioural mechanisms, principally differential mutual predation, underlying the replacement of G. d. celticus by G. pulex in Irish waters, and the mutually exclusive distributions of these species in Britain and mainland Europe. However, the relative influence of abiotic versus biotic factors in structuring these amphipod communities remains unresolved. This study used principal component analysis (PCA) to distinguish physico-chemical parameters that have significant roles in determining the current distribution of G. pulex relative to G. d. celticus in L. Neagh rivers. We show that the original domination of rivers by the native G. d. celticus has changed radically, with many sites in several rivers containing either both species or only G. pulex. G. pulex was more abundant than G. d. celticus at sites with low dissolved oxygen levels. This was reflected in the macroinvertebrate assemblages associated with G. pulex at these sites, which tended to be those tolerant of low biological water quality. The present study thus emphasizes the importance of the habitat template, particularly water quality, for Gammarus spp. interactions. If rivers become increasingly stressed by organic pollution, it is probable that the range expansion of G. pulex will continue. Because these two species are not ecological equivalents, the outcomes of G. pulex incursions into G. d. celticus sites may ultimately depend on the prevailing physico-chemical regime at each site.
Abstract:
Subspace monitoring has recently been proposed as a condition monitoring tool that requires considerably fewer variables to be analysed compared to dynamic principal component analysis (PCA). This paper analyses subspace monitoring in identifying and isolating fault conditions and reveals that the existing work suffers from inherent limitations if complex fault scenarios arise. Based on the assumption that the fault signature is deterministic while the monitored variables are stochastic, the paper introduces a regression-based reconstruction technique to overcome these limitations. The utility of the proposed fault identification and isolation method is shown using a simulation example and the analysis of experimental data from an industrial reactive distillation unit.