77 results for principal component regression
Abstract:
Systematic principal component analysis (PCA) methods are presented in this paper for reliable islanding detection in power systems with significant penetration of distributed generation (DG), where synchrophasors recorded by Phasor Measurement Units (PMUs) are used for system monitoring. Existing islanding detection methods such as rate-of-change-of-frequency (ROCOF) and Vector Shift are fast at processing local information; however, with the growth in installed DG capacity, they suffer from several drawbacks. Incumbent genset islanding detection cannot distinguish a system-wide disturbance from an islanding event, leading to mal-operation. The problem is even more significant when, owing to the high penetration of DG, the grid lacks sufficient inertia to limit frequency divergence during system faults or stress. To tackle these problems, this paper introduces PCA methods for islanding detection. A simple control chart is established for intuitive visualization of the transients. A Recursive PCA (RPCA) scheme is proposed as a reliable extension of the PCA method that reduces false alarms for time-varying processes. To further reduce the computational burden, approximate linear dependence condition (ALDC) errors are calculated to update the associated PCA model. The proposed PCA and RPCA methods are verified by detecting abnormal transients occurring in the UK utility network.
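The control-chart idea in this abstract can be illustrated with a minimal numpy sketch (not the paper's implementation; the simulated data, the number of retained components, and the 99th-percentile control limit are all assumptions for illustration): fit PCA on normal-operation data, then flag samples whose squared prediction error (the Q statistic) exceeds the limit.

```python
# Illustrative PCA control chart for anomaly detection, numpy only.
import numpy as np

rng = np.random.default_rng(0)

# Simulated "normal operation" training data: 500 samples, 6 correlated channels
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
X_train = latent @ mixing + 0.05 * rng.normal(size=(500, 6))

# Fit PCA on mean-centred data
mu = X_train.mean(axis=0)
Xc = X_train - mu
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
n_comp = 2                      # retained principal components (assumed)
P = Vt[:n_comp].T               # loading matrix

def spe(x):
    """Squared prediction error (Q statistic) of one sample."""
    xc = x - mu
    resid = xc - P @ (P.T @ xc)
    return float(resid @ resid)

# Empirical control limit: the 99th percentile of the training SPE values
limit = np.percentile([spe(x) for x in X_train], 99)

normal_sample = X_train[0]
faulty_sample = X_train[0] + np.array([0, 0, 3.0, 0, 0, 0])  # injected disturbance
print(spe(normal_sample), spe(faulty_sample), limit)
```

A disturbance that leaves the subspace spanned by normal operation produces a large residual, so its SPE lands far above the control limit while normal samples stay near the noise floor.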
Abstract:
A novel model-based principal component analysis (PCA) method is proposed in this paper for wide-area power system monitoring, aiming to tackle one of the critical drawbacks of conventional PCA, i.e. its inability to handle non-Gaussian distributed variables. It is a significant extension of the original PCA method, which has already been shown to outperform traditional methods such as rate-of-change-of-frequency (ROCOF). The ROCOF method is quick at processing local information, but its threshold is difficult to determine and nuisance tripping may easily occur. The proposed model-based PCA method uses a radial basis function neural network (RBFNN) model to handle the nonlinearity in the data set and thereby resolve the non-Gaussian issue, before the PCA method is applied for islanding detection. To build an effective RBFNN model, this paper first uses a fast input selection method to remove insignificant neural inputs. Next, a heuristic optimization technique, namely Teaching-Learning-Based Optimization (TLBO), is adopted to tune the nonlinear parameters in the RBF neurons to build the optimized model. The novel RBFNN-based PCA monitoring scheme is then employed for wide-area monitoring using the residuals between the model outputs and the real PMU measurements. Experimental results confirm the efficiency and effectiveness of the proposed method in monitoring a suite of process variables with different distribution characteristics, showing that the proposed RBFNN PCA method is a reliable and effective extension of the linear PCA method.
Abstract:
In this paper, our previous work on Principal Component Analysis (PCA) based fault detection is extended to the dynamic monitoring and detection of loss-of-main in power systems using wide-area synchrophasor measurements. In the previous work, a static PCA model was built and verified to be capable of detecting and extracting system faulty events; however, the false alarm rate was high. To address this problem, this paper uses the well-known ‘time lag shift’ method to include the dynamic behavior of the PCA model based on the synchronized measurements from Phasor Measurement Units (PMUs), an approach named Dynamic Principal Component Analysis (DPCA). Compared with the static PCA approach, as well as the traditional passive mechanisms of loss-of-main detection, the proposed DPCA procedure describes how the synchrophasors are linearly auto- and cross-correlated, based on conducting a singular value decomposition on the augmented, time-lagged synchrophasor matrix. As in the static PCA method, two statistics, namely T² and Q, with confidence limits are calculated to form intuitive charts for engineers or operators to monitor the loss-of-main situation in real time. The effectiveness of the proposed methodology is evaluated on the loss-of-main monitoring of a real system, where the historical data were recorded by PMUs installed at several locations in the UK/Ireland power system.
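The ‘time lag shift’ step can be sketched in a few lines of numpy (an illustrative simplification, not the paper's code; the lag order, retained components, and simulated data are assumptions): each sample is augmented with its lagged copies, and SVD of the augmented matrix yields the dynamic model and a T² statistic.

```python
# Illustrative sketch of the time-lag augmentation used in Dynamic PCA (DPCA).
import numpy as np

def lag_augment(X, lags):
    """Stack X(t), X(t-1), ..., X(t-lags) column-wise."""
    n, m = X.shape
    cols = [X[lags - k : n - k] for k in range(lags + 1)]
    return np.hstack(cols)        # shape: (n - lags, m * (lags + 1))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))     # stand-in for PMU measurements
Xa = lag_augment(X, lags=2)

# SVD of the mean-centred augmented matrix gives the dynamic PCA model
Xc = Xa - Xa.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:4].T                      # e.g. retain 4 dynamic components (assumed)

# Hotelling's T^2 for one augmented sample: scores scaled by score variances
t = (Xc[0] @ P) / s[:4] * np.sqrt(len(Xc) - 1)
T2 = float(t @ t)
print(Xa.shape, T2)
```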
Abstract:
Subspace monitoring has recently been proposed as a condition monitoring tool that requires considerably fewer variables to be analysed compared to dynamic principal component analysis (PCA). This paper analyses subspace monitoring in identifying and isolating fault conditions, and reveals that the existing work suffers from inherent limitations when complex fault scenarios arise. Based on the assumption that the fault signature is deterministic while the monitored variables are stochastic, the paper introduces a regression-based reconstruction technique to overcome these limitations. The utility of the proposed fault identification and isolation method is shown using a simulation example and the analysis of experimental data from an industrial reactive distillation unit.
Abstract:
Single component geochemical maps are the most basic representation of spatial elemental distributions and are commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant sum problem (closure), and to the inherently multivariate relative information conveyed by compositional data. Well known, for instance, is the tendency of heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect), or the contrary effect, apparent enrichment in many elements due to the removal of potassium during weathering. The validity of classical single component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g. dilution with SiO2 or volatiles); this is similar to the classical approach of normalising to a single element. The second approach uses so-called log-contrasts, which employ suitable statistical methods (such as classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that, unless a compositional approach is used, it is difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant sum constraint. In summary, the authors recommend a chain of enquiry that involves searching for the appropriate statistical method that can answer the required geological or geochemical question whilst maintaining the integrity of the compositional nature of the data. The required log-ratio transformations should be applied, followed by the chosen statistical method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
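The two recommended transformations can be sketched in numpy (an illustration only; the element names and values below are made up): a knowledge-driven log-ratio normalising to a chosen element, and the centred log-ratio (clr), whose rows sum to zero and are free of the constant-sum constraint.

```python
# Illustrative knowledge-driven log-ratio and centred log-ratio (clr) transform.
import numpy as np

# A tiny closed composition (rows sum to 100%): SiO2, Al2O3, Fe2O3, K2O
comp = np.array([
    [70.0, 15.0, 10.0, 5.0],
    [50.0, 25.0, 20.0, 5.0],
])

# Knowledge-driven log-ratio, e.g. normalising Fe2O3 by the SiO2 diluent
fe_over_si = np.log(comp[:, 2] / comp[:, 0])

def clr(x):
    """Centred log-ratio: log of each part over the row's geometric mean."""
    g = np.exp(np.mean(np.log(x), axis=1, keepdims=True))  # geometric mean
    return np.log(x / g)

Z = clr(comp)
# Each clr row sums to zero (to floating point precision)
print(np.allclose(Z.sum(axis=1), 0.0))
```

Statistical methods such as PCA or clustering would then be applied to `Z` rather than to the raw percentages.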
Abstract:
Raman spectroscopy has been used for the first time to predict the FA composition of unextracted adipose tissue of pork, beef, lamb, and chicken. It was found that the bulk unsaturation parameters could be predicted successfully [R² = 0.97, root mean square error of prediction (RMSEP) = 4.6% of 4σ], with cis unsaturation, which accounted for the majority of the unsaturation, giving similar correlations. The combined abundance of all measured PUFA (≥ 2 double bonds per chain) was also well predicted, with R² = 0.97 and RMSEP = 4.0% of 4σ. Trans unsaturation was not as well modeled (R² = 0.52, RMSEP = 18% of 4σ); this reduced prediction ability can be attributed to the low levels of trans FA found in adipose tissue (0.035 times the cis unsaturation level). For the individual FA, the average partial least squares (PLS) regression coefficient of the 18 most abundant FA (relative abundances ranging from 0.1 to 38.6% of the total FA content) was R² = 0.73, with an average RMSEP = 11.9% of 4σ. Regression coefficients and prediction errors for the five most abundant FA were all better than the average value (in some cases as low as RMSEP = 4.7% of 4σ). Cross-correlation between the abundances of the minor FA and the more abundant acids could be determined by principal component analysis methods, and the resulting groups of correlated compounds were also well predicted using PLS. The accuracy of the prediction of individual FA was at least as good as that of other spectroscopic methods, and the extremely straightforward sampling method meant that very rapid analysis of samples at ambient temperature was easily achieved. This work shows that Raman profiling of hundreds of samples per day is easily achievable with an automated sampling system.
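A PLS calibration of the kind described here can be sketched with a numpy-only NIPALS PLS1 (one response variable). This is an illustrative sketch, not the paper's model: the synthetic "spectra", the sparse true coefficients, and the choice of five components are all assumptions.

```python
# Minimal NIPALS PLS1 regression, numpy only, for illustration.
import numpy as np

def pls1_fit(X, y, n_comp):
    """Fit a PLS1 model; returns (regression vector, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    m = X.shape[1]
    W = np.zeros((m, n_comp)); P = np.zeros((m, n_comp)); b = np.zeros(n_comp)
    for a in range(n_comp):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)          # weight vector
        t = Xr @ w                      # score vector
        tt = t @ t
        p = Xr.T @ t / tt               # loading vector
        b[a] = (yr @ t) / tt            # inner regression coefficient
        Xr = Xr - np.outer(t, p)        # deflate X
        yr = yr - b[a] * t              # deflate y
        W[:, a], P[:, a] = w, p
    B = W @ np.linalg.solve(P.T @ W, b) # overall regression vector
    return B, x_mean, y_mean

def pls1_predict(model, X):
    B, x_mean, y_mean = model
    return (X - x_mean) @ B + y_mean

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 40))             # stand-in for Raman intensities
beta = np.zeros(40); beta[[3, 7, 20]] = [1.0, -0.5, 2.0]
y = X @ beta + 0.1 * rng.normal(size=120)  # stand-in for an FA abundance

model = pls1_fit(X[:80], y[:80], n_comp=5)
yhat = pls1_predict(model, X[80:])
rmsep = float(np.sqrt(np.mean((yhat - y[80:]) ** 2)))
print("RMSEP:", round(rmsep, 3))
```

The held-out RMSEP is computed exactly as in the abstract: the root mean square error between predicted and reference values on a prediction set.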
Abstract:
Logistic regression and Gaussian mixture model (GMM) classifiers have been trained to estimate the probability of acute myocardial infarction (AMI) in patients based upon the concentrations of a panel of cardiac markers. The panel consists of two new markers, fatty acid binding protein (FABP) and glycogen phosphorylase BB (GPBB), in addition to the traditional cardiac troponin I (cTnI), creatine kinase MB (CKMB) and myoglobin. The effect of using principal component analysis (PCA) and Fisher discriminant analysis (FDA) to preprocess the marker concentrations was also investigated. The need for classifiers to give an accurate estimate of the probability of AMI is argued, and three categories of performance measure are described, namely discriminatory ability, sharpness, and reliability. Numerical performance measures for each category are given and applied. The optimum classifier, based solely upon the samples taken on admission, was the logistic regression classifier using FDA preprocessing. This gave an accuracy of 0.85 (95% confidence interval: 0.78-0.91) and a normalised Brier score of 0.89. When samples at both admission and a further time, 1-6 h later, were included, the performance increased significantly, showing that logistic regression classifiers can indeed use the information from the five cardiac markers to accurately and reliably estimate the probability of AMI. © Springer-Verlag London Limited 2008.
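The probability-scoring idea can be sketched briefly (an illustration, not the paper's classifier; the coefficients and simulated marker data are invented): a logistic model maps marker levels to P(AMI), and the Brier score compares predicted probabilities against 0/1 outcomes, with a constant-prevalence prediction as the non-informative baseline.

```python
# Illustrative logistic probability estimates scored with the Brier score.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def brier(p, y):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return float(np.mean((p - y) ** 2))

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 5))               # stand-in for 5 marker levels
w_true = np.array([1.5, -0.8, 0.6, 0.0, 2.0])
p_true = sigmoid(X @ w_true)
y = (rng.random(300) < p_true).astype(float)

# A well-calibrated model (here, the true one) vs. a non-informative baseline
bs_model = brier(sigmoid(X @ w_true), y)
bs_base = brier(np.full(300, y.mean()), y)
print(bs_model < bs_base)   # informative probabilities score better
```

A normalised Brier score, as reported in the abstract, rescales `bs_model` against such a reference score so that higher values indicate more reliable probability estimates.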
Abstract:
In this paper, we present a Statistical Shape Model for Human Figure Segmentation in gait sequences. Point Distribution Models (PDM) generally use Principal Component Analysis (PCA) to describe the main directions of variation in the training set. However, PCA assumes a number of restrictions on the data that do not always hold. In this work, we explore the potential of Independent Component Analysis (ICA) as an alternative shape decomposition for PDM-based Human Figure Segmentation. The shape model obtained enables accurate estimation of human figures despite segmentation errors in the input silhouettes and shows good convergence properties.
Abstract:
Purpose. To examine the association between a posteriori–derived dietary patterns (DP) and retinal vessel caliber in an elderly population.
Methods. This was a cross-sectional study of 288 elderly adults (>65 years) who participated in the European Eye study (EUREYE) Northern Irish cohort. DP were extracted using principal component analysis from completed food frequency questionnaires. Semi-automated computer grading was used to determine the mean retinal vessel diameters (central retinal arteriole equivalent [CRAE] and central retinal venule equivalent [CRVE]) from digitized visual field one images using a standard measurement protocol.
Results. Three major DP were identified in this population, which accounted for 21% of the total variance: a “healthy” pattern with high factor loadings for oily fish, fruits and vegetables, and olive oil; an “unhealthy” pattern with high factor loadings for red and processed meat, refined grains, eggs, butter, sugar and sweets; and a “snack and beverage” pattern with high factor loadings for pizza, nuts, and coffee. Multivariable linear regression analysis indicated no significant association between the major identified DP and mean CRAE or CRVE in any model.
Conclusions. This is the first study to investigate associations between a posteriori–derived DP and retinal vessel caliber. There was no evidence of a relationship between extracted DP and retinal vessel measurements in this population. However, it is possible that potentially important relationships exist between single nutrients or foods and vessel diameters that cannot be identified using a DP approach. Further studies to examine the role of dietary factors in the microcirculation are required.
Abstract:
Reducing wafer metrology continues to be a major target in semiconductor manufacturing efficiency initiatives because metrology is a high-cost, non-value-added operation that impacts cycle time and throughput. However, metrology cannot be eliminated completely, given the important role it plays in process monitoring and advanced process control. To achieve the required manufacturing precision, measurements are typically taken at multiple sites across a wafer. The selection of these sites is usually based on a priori knowledge of wafer failure patterns and spatial variability, with additional sites added over time in response to process issues. As a result, significant redundancy often exists in the wafer measurement plans of mature processes. This paper proposes a novel methodology based on Forward Selection Component Analysis (FSCA) for analyzing historical metrology data in order to determine the minimum set of wafer sites needed for process monitoring. The paper also introduces a virtual metrology (VM) based approach for reconstructing the complete wafer profile from the optimal sites identified by FSCA. The proposed methodology is tested and validated on a wafer manufacturing metrology dataset. © 2012 IEEE.
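The greedy idea behind forward selection of measurement sites can be sketched as follows (a simplified reading of FSCA for illustration, not the paper's algorithm; the simulated wafer data and site count are assumptions): repeatedly add the site whose inclusion best reconstructs all sites by least squares.

```python
# Illustrative greedy forward selection of columns (measurement sites).
import numpy as np

def forward_select(X, n_sites):
    """Greedily pick columns that minimise the least-squares
    reconstruction error of the full (centred) data matrix."""
    Xc = X - X.mean(axis=0)
    total = np.sum(Xc ** 2)
    selected = []
    for _ in range(n_sites):
        best_j, best_err = None, np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            S = Xc[:, selected + [j]]
            # Reconstruct every column from the candidate subset
            coef = np.linalg.lstsq(S, Xc, rcond=None)[0]
            err = np.sum((Xc - S @ coef) ** 2)
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    return selected, 1.0 - best_err / total   # sites, variance explained

rng = np.random.default_rng(4)
base = rng.normal(size=(60, 2))
# 9 highly redundant "sites" driven by 2 underlying spatial factors
X = base @ rng.normal(size=(2, 9)) + 0.01 * rng.normal(size=(60, 9))
sites, explained = forward_select(X, n_sites=2)
print(sites, round(explained, 4))
```

The least-squares reconstruction step is also, in spirit, the virtual-metrology idea: the full wafer profile is regenerated from the retained sites.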
Abstract:
Statistics are regularly used to make some form of comparison between trace evidence or to deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence routinely comprises the results of particle size, chemical or modal analyses and as such constitutes compositional data. The issue is that compositional data, including percentages, parts per million, etc., carry only relative information. This may be problematic where a comparison of percentages and other constrained/closed data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. Notwithstanding an awareness of the constant sum problem since the seminal works of Pearson (1896) and Chayes (1960), and the introduction of log-ratio techniques (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013), the fact that a constant sum destroys the potential independence of variances and covariances required for correlation and regression analysis and for empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation) is all too often not acknowledged in the statistical treatment of trace evidence. Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle size and mineral analyses as trace evidence are used to demonstrate a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.
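The constant-sum pitfall is easy to demonstrate numerically (an illustration with invented data, not the paper's case studies): closing independent components to 100% induces spurious correlation between them, whereas log-ratios are unchanged by the closure and so remain safe inputs for multivariate analysis.

```python
# Illustrative demonstration of the constant-sum (closure) problem.
import numpy as np

rng = np.random.default_rng(5)
# Three independent "absolute" components (e.g. mineral masses)
raw = np.exp(rng.normal(size=(500, 3)))
closed = 100.0 * raw / raw.sum(axis=1, keepdims=True)  # percentages

# Closure forces components to compete: spurious negative correlation
r_closed = np.corrcoef(closed[:, 0], closed[:, 1])[0, 1]
print(r_closed < 0.0)

# Log-ratios are invariant to the closure operation
lr_raw = np.log(raw[:, 0] / raw[:, 1])
lr_closed = np.log(closed[:, 0] / closed[:, 1])
print(np.allclose(lr_raw, lr_closed))
```

The invariance in the second check is exact: dividing two parts of a composition cancels the common closure constant, which is why log-ratio methods preserve the relative information without the constraint.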
Abstract:
This paper presents a statistical-based fault diagnosis scheme for application to internal combustion engines. The scheme relies on an identified model that describes the relationships between a set of recorded engine variables using principal component analysis (PCA). Since combustion cycles are complex in nature and produce nonlinear relationships between the recorded engine variables, the paper proposes the use of nonlinear PCA (NLPCA). The paper further justifies the use of NLPCA by comparing the model accuracy of the NLPCA model with that of a linear PCA model. A new nonlinear variable reconstruction algorithm and bivariate scatter plots are proposed for fault isolation, following the application of NLPCA. The proposed technique allows the diagnosis of different fault types under steady-state operating conditions. More precisely, nonlinear variable reconstruction can remove the fault signature from the recorded engine data, which allows the identification and isolation of the root cause of abnormal engine behaviour. The paper shows that this can lead to (i) an enhanced identification of potential root causes of abnormal events and (ii) the masking of faulty sensor readings. The effectiveness of the enhanced NLPCA based monitoring scheme is illustrated by its application to a sensor fault and a process fault. The sensor fault relates to a drift in the fuel flow reading, whilst the process fault relates to a partial blockage of the intercooler. These faults are introduced to a Volkswagen TDI 1.9 Litre diesel engine mounted on an experimental engine test bench facility.
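The variable-reconstruction idea for fault isolation can be sketched with a linear-PCA simplification of the nonlinear scheme above (an illustration only; the simulated engine-like data and fault magnitude are assumptions): adjusting a suspected variable to minimise the squared prediction error (SPE) removes its fault signature, and the variable whose reconstruction restores a near-normal SPE is the candidate root cause.

```python
# Illustrative PCA variable reconstruction for fault isolation.
import numpy as np

rng = np.random.default_rng(6)
latent = rng.normal(size=(400, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(400, 5))

mu = X.mean(axis=0)
Vt = np.linalg.svd(X - mu, full_matrices=False)[2]
P = Vt[:2].T
R = np.eye(5) - P @ P.T          # projection onto the residual subspace

def spe(x):
    e = R @ (x - mu)
    return float(e @ e)

def reconstruct(x, j):
    """Adjust variable j to minimise the SPE (its model-based reconstruction)."""
    x = x.copy()
    e = R @ (x - mu)
    x[j] -= e[j] / R[j, j]
    return x

faulty = X[0].copy()
faulty[3] += 4.0                  # simulated sensor fault on variable 3

spe_before = spe(faulty)
spe_fixed = spe(reconstruct(faulty, 3))   # reconstructing the true culprit
spe_wrong = spe(reconstruct(faulty, 0))   # reconstructing an innocent variable
print(spe_fixed < spe_wrong < spe_before)
```

Reconstructing the truly faulty variable collapses the SPE back towards the noise level, while reconstructing any other variable leaves most of the fault signature in place, which is the basis of the isolation logic.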
Abstract:
During lateral leg raising, a synergistic inclination of the supporting leg and trunk in the opposite direction to the leg movement is performed in order to preserve equilibrium. As first hypothesized by Pagano and Turvey (J Exp Psychol Hum Percept Perform, 1995, 21:1070-1087), the perception of limb orientation could be based on the orientation of the limb's inertia tensor. The purpose of this study was thus to explore whether the final upper body orientation (trunk inclination relative to vertical) depends on changes in the trunk inertia tensor. We imposed a loading condition, with total mass of 4 kg added to the subject's trunk in either a symmetrical or asymmetrical configuration. This changed the orientation of the trunk inertia tensor while keeping the total trunk mass constant. In order to separate any effects of the inertia tensor from the effects of gravitational torque, the experiment was carried out in normo- and microgravity. The results indicated that in normogravity the same final upper body orientation was maintained irrespective of the loading condition. In microgravity, regardless of loading conditions the same (but different from the normogravity) orientation of the upper body was achieved through different joint organizations: two joints (the hip and ankle joints of the supporting leg) in the asymmetrical loading condition, and one (hip) in the symmetrical loading condition. In order to determine whether the different orientations of the inertia tensor were perceived during the movement, the interjoint coordination was quantified by performing a principal components analysis (PCA) on the supporting and moving hips and on the supporting ankle joints. It was expected that different loading conditions would modify the principal component of the PCA. In normogravity, asymmetrical loading decreased the coupling between joints, while in microgravity a strong coupling was preserved whatever the loading condition. 
It was concluded that the trunk inertia tensor did not play a role during the lateral leg raising task, because even in the absence of gravitational torque the final upper body orientation and the interjoint coupling were not influenced by the loading conditions.
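The use of PCA to quantify interjoint coordination can be sketched as follows (an illustration only; the simulated joint-angle traces are invented, not the study's kinematic data): when joints are strongly coupled, the first principal component of the joint-angle time series captures almost all of the variance, so its variance share serves as a simple coupling index.

```python
# Illustrative PCA-based coupling index for joint-angle time series.
import numpy as np

rng = np.random.default_rng(7)
t = np.linspace(0, 2 * np.pi, 200)
base = np.sin(t)

# Strong coupling: three joint angles follow one underlying pattern
coupled = np.column_stack([1.0 * base, 0.8 * base, -0.6 * base])
coupled += 0.05 * rng.normal(size=coupled.shape)

def first_pc_share(A):
    """Fraction of total variance captured by the first principal component."""
    Ac = A - A.mean(axis=0)
    s = np.linalg.svd(Ac, compute_uv=False)
    var = s ** 2
    return float(var[0] / var.sum())

share = first_pc_share(coupled)
print(share > 0.9)   # one component dominates when joints are coupled
```

A drop in this share under a given loading condition would indicate a decoupling of the joints, which is how changes in the principal component structure were interpreted in the study.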