78 results for principle component analysis
Abstract:
In this paper, our previous work on a Principal Component Analysis (PCA) based fault detection method is extended to the dynamic monitoring and detection of loss-of-main in power systems using wide-area synchrophasor measurements. In the previous work, a static PCA model was built and verified to be capable of detecting and extracting system fault events; however, its false alarm rate was high. To address this problem, this paper uses the well-known ‘time lag shift’ method to include dynamic behavior in the PCA model based on the synchronized measurements from Phasor Measurement Units (PMUs); the resulting approach is named Dynamic Principal Component Analysis (DPCA). Compared with the static PCA approach as well as the traditional passive mechanisms of loss-of-main detection, the proposed DPCA procedure describes how the synchrophasors are linearly auto- and cross-correlated, based on conducting the singular value decomposition on the augmented time-lagged synchrophasor matrix. As in the static PCA method, two statistics, namely T² and Q, are calculated with confidence limits to form intuitive charts for engineers or operators to monitor the loss-of-main situation in real time. The effectiveness of the proposed methodology is evaluated on the loss-of-main monitoring of a real system, where the historic data were recorded from PMUs installed in several locations in the UK/Ireland power system.
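The time-lag augmentation, singular value decomposition and the two monitoring statistics mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`lagged_matrix`, `fit_dpca`, `monitor`), the number of lags and the number of retained components are all hypothetical choices made for illustration.

```python
import numpy as np

def lagged_matrix(X, lags):
    """Augment each sample with its predecessors: [x(t), x(t-1), ..., x(t-lags)]."""
    n = X.shape[0]
    return np.hstack([X[lags - k : n - k] for k in range(lags + 1)])

def fit_dpca(X, lags=2, n_components=3):
    """Fit a PCA model on the standardised time-lagged matrix via SVD.

    `lags` and `n_components` are illustrative defaults, not values from the paper.
    """
    Xa = lagged_matrix(X, lags)
    mean, std = Xa.mean(axis=0), Xa.std(axis=0, ddof=1)
    Z = (Xa - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T                             # retained loading vectors
    eigvals = s[:n_components] ** 2 / (Z.shape[0] - 1)  # variances of the scores
    return {"lags": lags, "mean": mean, "std": std, "P": P, "eigvals": eigvals}

def monitor(model, X):
    """Return the T^2 and Q statistics for every lagged sample of X."""
    Z = (lagged_matrix(X, model["lags"]) - model["mean"]) / model["std"]
    T = Z @ model["P"]                                  # scores in the model subspace
    t2 = np.sum(T**2 / model["eigvals"], axis=1)        # Hotelling's T^2
    residual = Z - T @ model["P"].T                     # part not explained by the model
    q = np.sum(residual**2, axis=1)                     # Q (squared prediction error)
    return t2, q
```

A control limit could then be set, for instance, as an empirical percentile of the statistics computed on fault-free training data. The abstract does not state how its confidence limits were derived, so that choice is an assumption as well.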
Abstract:
This is the first paper that introduces a nonlinearity test for principal component models. The methodology divides the data space into disjoint regions, each of which is analysed using principal component analysis under the cross-validation principle. Several toy examples have been successfully analysed and the nonlinearity test has subsequently been applied to data from an internal combustion engine.
Abstract:
Our review and meta-analysis examined the association between a posteriori–derived dietary patterns (DPs) and risk of type 2 diabetes mellitus. MEDLINE and EMBASE were searched for articles published up to July 2012 and data were extracted by two independent reviewers. Overall, 19 cross-sectional, 12 prospective cohort, and two nested case-control studies were eligible for inclusion. Results from cross-sectional studies reported an inconsistent association between DPs and measures of insulin resistance and/or glucose abnormalities, or prevalence of type 2 diabetes. A meta-analysis was carried out on nine prospective cohort studies that had examined DPs derived by principal component/factor analysis and the incidence of type 2 diabetes (totaling 309,430 participants and 16,644 incident cases). Multivariate-adjusted odds ratios were combined using a random-effects meta-analysis. Two broad DPs (Healthy/Prudent and Unhealthy/Western) were identified based on food factor loadings published in original studies. Pooled results indicated a 15% lower type 2 diabetes risk for those in the highest category of Healthy/Prudent pattern compared with those in the lowest category (95% CI 0.80 to 0.91; P<0.0001). Compared with the lowest category of Unhealthy/Western DP, those in the highest category had a 41% increased risk of type 2 diabetes (95% CI 1.32 to 1.52; P<0.0001). These results provide evidence that DPs are consistently associated with risk of type 2 diabetes even when other lifestyle factors are controlled for. Thus, greater adherence to a DP characterized by high intakes of fruit, vegetables, and complex carbohydrate and low intakes of refined carbohydrate, processed meat, and fried food may be one strategy that could have a positive influence on the global public health burden of type 2 diabetes.
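To make the pooling step concrete, the following sketch combines ratio estimates with a random-effects model. The abstract does not name the estimator used, so the DerSimonian-Laird method here is an assumption, as are the function name `pool_random_effects` and the reliance on 95% confidence intervals to recover standard errors.

```python
import numpy as np

def pool_random_effects(estimates, ci_low, ci_high):
    """Pool ratio estimates (e.g. odds ratios) with a DerSimonian-Laird model.

    Assumes each study reports a point estimate with a 95% confidence interval.
    Returns the pooled ratio, its 95% CI bounds and the between-study variance.
    """
    y = np.log(estimates)                                     # work on the log scale
    se = (np.log(ci_high) - np.log(ci_low)) / (2 * 1.96)      # SE from the CI width
    w = 1.0 / se**2                                           # fixed-effect weights
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)                        # Cochran's Q
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0 # between-study variance
    w_re = 1.0 / (se**2 + tau2)                               # random-effects weights
    y_re = np.sum(w_re * y) / np.sum(w_re)
    se_re = np.sqrt(1.0 / np.sum(w_re))
    return (np.exp(y_re), np.exp(y_re - 1.96 * se_re),
            np.exp(y_re + 1.96 * se_re), tau2)
```

Any inputs passed to this function would need to come from the individual study reports; none of the per-study values are given in the abstract.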
Abstract:
Statistics are regularly used to make some form of comparison between trace evidence or to deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence routinely consists of the results of particle size, chemical or modal analyses and as such constitutes compositional data. The issue is that compositional data, including percentages, parts per million etc., only carry relative information. This may be problematic where a comparison of percentages and other constrained/closed data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. The constant sum problem has been known since the seminal works of Pearson (1896) and Chayes (1960), and log-ratio techniques have been introduced to address it (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013). Nevertheless, the problem that a constant sum destroys the potential independence of variances and covariances required for correlation regression analysis and empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation) is all too often not acknowledged in the statistical treatment of trace evidence. Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle size and mineral analyses as trace evidence are used to demonstrate the use of a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.
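The centred log-ratio transformation named in this abstract subtracts the mean log of each composition from the log of each of its parts. The sketch below illustrates it; the function name `clr` is chosen for illustration, and zero or censored parts are rejected because the abstract does not say how such values were handled.

```python
import numpy as np

def clr(parts):
    """Centred log-ratio transform of one composition per row.

    Each part is replaced by its log minus the mean log of its row, which
    equals the log of the part divided by the row's geometric mean.
    """
    parts = np.asarray(parts, dtype=float)
    if np.any(parts <= 0):
        # Zeros must be imputed before a log-ratio transform; not handled here.
        raise ValueError("clr requires strictly positive parts")
    logs = np.log(parts)
    return logs - logs.mean(axis=-1, keepdims=True)
```

Because only ratios between parts matter, the same sample expressed in percent or in parts per million gives identical clr coordinates, and each transformed row sums to zero.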
Abstract:
This paper theoretically analyses the recently proposed "Extended Partial Least Squares" (EPLS) algorithm. After pointing out some conceptual deficiencies, a revised algorithm is introduced that covers the middle ground between Partial Least Squares and Principal Component Analysis. It maximises a covariance criterion between a cause and an effect variable set (partial least squares) and allows a complete reconstruction of the recorded data (principal component analysis). The new and conceptually simpler EPLS algorithm has successfully been applied in detecting and diagnosing various fault conditions, where the original EPLS algorithm offered only fault detection.
Abstract:
This paper presents two new approaches for use in complete process monitoring. The first concerns the identification of nonlinear principal component models. This involves the application of linear principal component analysis (PCA), prior to the identification of a modified autoassociative neural network (AAN) as the required nonlinear PCA (NLPCA) model. The benefits are that (i) the number of the reduced set of linear principal components (PCs) is smaller than the number of recorded process variables, and (ii) the set of PCs is better conditioned as redundant information is removed. The result is a new set of input data for a modified neural representation, referred to as a T2T network. The T2T NLPCA model is then used for complete process monitoring, involving fault detection, identification and isolation. The second approach introduces a new variable reconstruction algorithm, developed from the T2T NLPCA model. Variable reconstruction can enhance the findings of the contribution charts still widely used in industry by reconstructing the outputs from faulty sensors to produce more accurate fault isolation. These ideas are illustrated using recorded industrial data relating to developing cracks in an industrial glass melter process. A comparison of linear and nonlinear models, together with the combined use of contribution charts and variable reconstruction, is presented.
Abstract:
Geologic and environmental factors acting over varying spatial scales can control trace element distribution and mobility in soils. In turn, the mobility of an element in soil will affect its oral bioaccessibility. Geostatistics, kriging and principal component analysis (PCA) were used to explore factors and spatial ranges of influence over a suite of 8 element oxides, soil organic carbon (SOC), pH, and the trace elements nickel (Ni), vanadium (V) and zinc (Zn). Bioaccessibility testing was carried out previously using the Unified BARGE Method on a sub-set of 91 soil samples from the Northern Ireland Tellus soil archive. Initial spatial mapping of total Ni, V and Zn concentrations shows their distributions are correlated spatially with local geologic formations, and prior correlation analyses showed that statistically significant controls were exerted over trace element bioaccessibility by the 8 oxides, SOC and pH. PCA applied to the geochemistry parameters of the bioaccessibility sample set yielded three principal components accounting for 77% of cumulative variance in the data set. Geostatistical analysis of oxide, trace element, SOC and pH distributions using 6862 sample locations also identified distinct spatial ranges of influence for these variables, concluded to arise from geologic forming processes, weathering processes, and localised soil chemistry factors. Kriging was used to conduct a spatial PCA of Ni, V and Zn distributions, which identified two factors accounting for the majority of distribution variance. The first was spatially associated with basalt rock types, and the second with sandstone and limestone in the region. The results suggest that trace element bioaccessibility and distribution are controlled by chemical and geologic processes which occur over variable spatial ranges of influence.
Abstract:
Biosignal measurement and processing is increasingly being deployed in ambulatory situations, particularly in connected health applications. Such an environment dramatically increases the likelihood of artifacts which can occlude features of interest and reduce the quality of information available in the signal. If multichannel recordings are available for a given signal source, then there is currently a considerable range of methods which can suppress or in some cases remove the distorting effect of such artifacts. There are, however, considerably fewer techniques available if only a single-channel measurement is available, and yet single-channel measurements are important where minimal instrumentation complexity is required. This paper describes a novel artifact removal technique for use in such a context. The technique, known as ensemble empirical mode decomposition with canonical correlation analysis (EEMD-CCA), is capable of operating on single-channel measurements. The EEMD technique is first used to decompose the single-channel signal into a multidimensional signal. The CCA technique is then employed to isolate the artifact components from the underlying signal using second-order statistics. The new technique is tested against the currently available wavelet denoising and EEMD-ICA techniques using both electroencephalography and functional near-infrared spectroscopy data and is shown to produce significantly improved results.
Abstract:
The use of handheld near infrared (NIR) instrumentation, as a tool for rapid analysis, has the potential to be used widely in the animal feed sector. A comparison was made between handheld NIR and benchtop instruments in terms of proximate analysis of poultry feed using off-the-shelf calibration models and including statistical analysis. Additionally, melamine adulterated soya bean products were used to develop qualitative and quantitative calibration models from the NIRS spectral data, with excellent calibration models and prediction statistics obtained. With regard to the quantitative approach, the coefficients of determination (R²) were found to be 0.94-0.99, and the corresponding root mean square errors of calibration and prediction were found to be 0.081-0.215 % and 0.095-0.288 %, respectively. In addition, cross validation was used to further validate the models, with the root mean square error of cross validation found to be 0.101-0.212 %. Furthermore, by adopting a qualitative approach with the spectral data and applying Principal Component Analysis, it was possible to discriminate between adulterated and pure samples.
Abstract:
Single component geochemical maps are the most basic representation of spatial elemental distributions and commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant sum problem (closure), and the inherently multivariate relative information conveyed by compositional data. Well known is, for instance, the tendency of all heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect); or the contrary effect, apparent enrichment in many elements due to removal of potassium during weathering. The validity of classical single component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended such method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g. dilution with SiO2 or volatiles). This is similar to the classical normalisation approach to a single element. The second approach uses the (so called) log-contrasts, that employ suitable statistical methods (such as classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that if a compositional approach is not used, it becomes difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant sum constraint. In summary the authors recommend a chain of enquiry that involves searching for the appropriate statistical method that can answer the required geological or geochemical question whilst maintaining the integrity of the compositional nature of the data. 
The required log-ratio transformations should be applied followed by the chosen statistical method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
Abstract:
A compositional multivariate approach is used to analyse regional scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey of Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20 cm depths on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI, initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic, up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional Principal Component Analysis (PCA) and the minimum/maximum autocorrelation factor (MAF) analysis method were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PCs), implying “significant” structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors.
Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat covered areas since the creation of the superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this in classification analysis.
Abstract:
We formally compare fundamental factor and latent factor approaches to oil price modelling. Fundamental modelling has a long history in seeking to understand oil price movements, while latent factor modelling has a more recent and limited history, but has gained popularity in other financial markets. The two approaches, though competing, have not formally been compared as to effectiveness. For a range of short-, medium- and long-dated WTI oil futures we test a recently proposed five-factor fundamental model and a Principal Component Analysis latent factor model. Our findings demonstrate that there is no discernible difference between the two techniques in a dynamic setting. We conclude that this implies some advantages in adopting the latent factor approach, due to the difficulty in determining a well specified fundamental model.
Abstract:
This paper presents a statistical-based fault diagnosis scheme for application to internal combustion engines. The scheme relies on an identified model that describes the relationships between a set of recorded engine variables using principal component analysis (PCA). Since combustion cycles are complex in nature and produce nonlinear relationships between the recorded engine variables, the paper proposes the use of nonlinear PCA (NLPCA). The paper further justifies the use of NLPCA by comparing the model accuracy of the NLPCA model with that of a linear PCA model. A new nonlinear variable reconstruction algorithm and bivariate scatter plots are proposed for fault isolation, following the application of NLPCA. The proposed technique allows the diagnosis of different fault types under steady-state operating conditions. More precisely, nonlinear variable reconstruction can remove the fault signature from the recorded engine data, which allows the identification and isolation of the root cause of abnormal engine behaviour. The paper shows that this can lead to (i) an enhanced identification of potential root causes of abnormal events and (ii) the masking of faulty sensor readings. The effectiveness of the enhanced NLPCA based monitoring scheme is illustrated by its application to a sensor fault and a process fault. The sensor fault relates to a drift in the fuel flow reading, whilst the process fault relates to a partial blockage of the intercooler. These faults are introduced to a Volkswagen TDI 1.9 Litre diesel engine mounted on an experimental engine test bench facility.
Abstract:
Raman microscopy, based upon the inelastic (Raman) scattering of light by molecular species, has been applied as a specific structural probe in a wide range of biomedical samples. The purpose of the present investigation was to assess the potential of the technique for spectral characterization of the porcine outer retina derived from the area centralis, which has the highest cone:rod cell ratio in the pig retina. METHODS: Retinal cross-sections, immersion-fixed in 4% (w/v) PFA and cryoprotected, were placed on silanized slides and air-dried prior to direct Raman microscopic analysis at three excitation wavelengths, 785 nm, 633 nm, and 514 nm. RESULTS: Raman spectra of each of the photoreceptor inner and outer segments (PIS, POS) and of the outer nuclear layer (ONL) of the retina acquired at 785 nm were dominated by vibrational features characteristic of proteins and lipids. There was a clear difference between the inner and outer domains in the spectroscopic regions, amide I and III, known to be sensitive to protein conformation. The spectra recorded with 633 nm excitation mirrored those observed at 785 nm excitation for the amide I region, but with an additional pattern of bands in the spectra of the PIS region, attributed to cytochrome c. The same features were even more enhanced in spectra recorded with 514 nm excitation. A significant nucleotide contribution was observed in the spectra recorded for the ONL at all three excitation wavelengths. A Raman map was constructed of the major spectral components found in the retinal outer segments, as predicted by principal component analysis of the data acquired using 633 nm excitation. Comparison of the Raman map with its histological counterpart revealed a strong correlation between the two images.
CONCLUSIONS: It has been demonstrated that Raman spectroscopy offers a unique insight into the biochemical composition of the light-sensing cells of the retina following the application of standard histological protocols. The present study points to the considerable promise of Raman microscopy as a component-specific probe of retinal tissue.