103 results for Multivariate statistics

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance: 100.00%

Abstract:

This paper presents a statistical-based fault diagnosis scheme for application to internal combustion engines. The scheme relies on an identified model that describes the relationships between a set of recorded engine variables using principal component analysis (PCA). Since combustion cycles are complex in nature and produce nonlinear relationships between the recorded engine variables, the paper proposes the use of nonlinear PCA (NLPCA). The paper further justifies the use of NLPCA by comparing the model accuracy of the NLPCA model with that of a linear PCA model. A new nonlinear variable reconstruction algorithm and bivariate scatter plots are proposed for fault isolation, following the application of NLPCA. The proposed technique allows the diagnosis of different fault types under steady-state operating conditions. More precisely, nonlinear variable reconstruction can remove the fault signature from the recorded engine data, which allows the identification and isolation of the root cause of abnormal engine behaviour. The paper shows that this can lead to (i) an enhanced identification of potential root causes of abnormal events and (ii) the masking of faulty sensor readings. The effectiveness of the enhanced NLPCA based monitoring scheme is illustrated by its application to a sensor fault and a process fault. The sensor fault relates to a drift in the fuel flow reading, whilst the process fault relates to a partial blockage of the intercooler. These faults are introduced to a Volkswagen TDI 1.9 Litre diesel engine mounted on an experimental engine test bench facility.

Relevance: 100.00%

Abstract:

Treasure et al. (2004) recently proposed a new subspace monitoring technique, based on the N4SID algorithm, within the multivariate statistical process control framework. This dynamic monitoring method requires considerably fewer variables to be analysed than dynamic principal component analysis (PCA). The contribution charts and variable reconstruction, traditionally employed for static PCA, are analysed in a dynamic context. Both may be affected by the ratio of the number of retained components to the total number of analysed variables. Particular problems arise if this ratio is large, and a new reconstruction chart is introduced to overcome them. The utility of such a dynamic contribution chart and variable reconstruction is shown in a simulation and by application to industrial data from a distillation unit.

Relevance: 100.00%

Abstract:

This paper analyses multivariate statistical techniques for identifying and isolating abnormal process behaviour. These techniques include contribution charts and variable reconstructions that relate to the application of principal component analysis (PCA). The analysis reveals firstly that contribution charts produce variable contributions which are linearly dependent and may lead to an incorrect diagnosis, if the number of principal components retained is close to the number of recorded process variables. The analysis secondly yields that variable reconstruction affects the geometry of the PCA decomposition. The paper further introduces an improved variable reconstruction method for identifying multiple sensor and process faults and for isolating their influence upon the recorded process variables. It is shown that this can accommodate the effect of reconstruction, i.e. changes in the covariance matrix of the sensor readings and correctly re-defining the PCA-based monitoring statistics and their confidence limits. (c) 2006 Elsevier Ltd. All rights reserved.
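The monitoring statistics and contribution charts discussed in this abstract can be illustrated with a minimal sketch. It is not the paper's improved reconstruction method. It only shows the standard PCA-based Hotelling's T² and SPE (Q) statistics and the per-variable SPE contributions, on synthetic data. The function names `fit_pca` and `monitor`, the data, and the simulated sensor bias are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on reference data X (rows = samples) after autoscaling."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # SVD of the scaled data: rows of Vt are the loading vectors
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T
    eigvals = (s[:n_components] ** 2) / (X.shape[0] - 1)
    return mean, std, P, eigvals

def monitor(x, mean, std, P, eigvals):
    """Return T^2, SPE and the per-variable SPE contributions for one sample."""
    z = (x - mean) / std
    t = P.T @ z                      # scores in the retained subspace
    residual = z - P @ t             # part of z the model cannot explain
    t2 = float(np.sum(t ** 2 / eigvals))
    contributions = residual ** 2    # one squared residual per variable
    spe = float(contributions.sum())
    return t2, spe, contributions

# Synthetic reference data: 5 variables driven by 2 latent sources (assumption)
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(500, 5))
model = fit_pca(X, n_components=2)

faulty = X[0].copy()
faulty[3] += 5.0                     # simulated bias on the sensor at index 3
t2, spe, contrib = monitor(faulty, *model)
normal_spe = monitor(X[1], *model)[1]
```

Because the residual of every variable depends on all the others through the projection, a fault on one sensor also changes the contributions of the remaining variables. This is the kind of effect that can make a contribution chart misleading, as the abstract notes.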

Relevance: 60.00%

Abstract:

Element profiles were investigated for their use in tracing the geographical origin of rice (Oryza sativa L.) samples. The concentrations of 13 elements (calcium (Ca), potassium (K), magnesium (Mg), phosphorus (P), boron (B), manganese (Mn), iron (Fe), nickel (Ni), copper (Cu), arsenic (As), selenium (Se), molybdenum (Mo), and cadmium (Cd)) were determined in the rice samples by inductively coupled plasma optical emission and mass spectrometry. Most of the elements essential for human health were within normal ranges in the rice, except for Mo and Se. Mo concentrations were twice as high as those in rice from Vietnam and Spain. Meanwhile, Se concentrations across the whole province were three times lower than the Chinese average level of 0.088 mg/kg. About 12% of the rice samples failed the Chinese national food safety standard of 0.2 mg/kg for Cd. Principal component analysis (PCA), discriminant function analysis (DFA) and Fibonacci index analysis (FIA) were applied to the multi-elemental profile of the rice to discriminate the geographical origins of the samples. Results indicated that the FIA method could achieve a more effective geographical origin classification than PCA and DFA, because it could still group the samples when the elemental variability was so high that PCA and DFA showed little discriminatory power. Furthermore, Ca, Ni, Fe and Cd were identified as the most powerful indicators of geographical origin. This suggests that the newly established FIA methodology, based on the ionome profile, can be applied to determine the geographical origin of rice.

Relevance: 60.00%

Abstract:

Research over the past two decades on the Holocene sediments from the tide dominated west side of the lower Ganges delta has focussed on constraining the sedimentary environment through grain size distributions (GSD). GSD has traditionally been assessed through the use of probability density function (PDF) methods (e.g. log-normal, log skew-Laplace functions), but these approaches do not acknowledge the compositional nature of the data, which may compromise outcomes in lithofacies interpretations. The use of PDF approaches in GSD analysis poses a series of challenges for the development of lithofacies models, such as equifinal distribution coefficients and obscuring the empirical data variability. In this study a methodological framework for characterising GSD is presented through compositional data analysis (CODA) plus a multivariate statistical framework. This provides a statistically robust analysis of the fine tidal estuary sediments from the West Bengal Sundarbans, relative to alternative PDF approaches.
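To make the "compositional nature of the data" concrete, here is a minimal sketch of the centred log-ratio (clr) transform, one common transform in compositional data analysis. The abstract does not say which log-ratio transform the study used, so the choice of clr, the function names, and the sand/silt/clay values are assumptions made for illustration.

```python
import numpy as np

def closure(parts):
    """Rescale each row so that its parts sum to 1 (a composition)."""
    parts = np.asarray(parts, dtype=float)
    return parts / parts.sum(axis=-1, keepdims=True)

def clr(parts):
    """Centred log-ratio: log of each part minus the row mean of the logs."""
    log_comp = np.log(closure(parts))
    return log_comp - log_comp.mean(axis=-1, keepdims=True)

# Hypothetical sand / silt / clay percentages for two samples
samples = np.array([[10.0, 60.0, 30.0],
                    [5.0, 55.0, 40.0]])
clr_values = clr(samples)
```

The transformed rows sum to zero and do not depend on the units of the input (percentages or fractions), which is why standard multivariate methods can be applied afterwards. Parts equal to zero would need separate treatment before taking logarithms.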

Relevance: 30.00%

Abstract:

This paper introduces the application of linear multivariate statistical techniques, including partial least squares (PLS), canonical correlation analysis (CCA) and reduced rank regression (RRR), into the area of Systems Biology. This new approach aims to extract the important proteins embedded in complex signal transduction pathway models.

The analysis is performed on a model of intracellular signalling along the Janus-associated kinases/signal transducers and transcription factors (JAK/STAT) and mitogen activated protein kinases (MAPK) signal transduction pathways in interleukin-6 (IL6) stimulated hepatocytes, which produce signal transducer and activator of transcription factor 3 (STAT3).

A region of redundancy within the MAPK pathway that does not affect the STAT3 transcription was identified using CCA. This is the core finding of this analysis and cannot be obtained by inspecting the model by eye. In addition, RRR was found to isolate terms that do not significantly contribute to changes in protein concentrations, while the application of PLS does not provide such a detailed picture by virtue of its construction.

This analysis has a similar objective to conventional model reduction techniques with the advantage of maintaining the meaning of the states prior to and after the reduction process. A significant model reduction is performed, with a marginal loss in accuracy, offering a more concise model while maintaining the main influencing factors on the STAT3 transcription.

The findings offer a deeper understanding of the reaction terms involved, confirm the relevance of several proteins to the production of Acute Phase Proteins and complement existing findings regarding cross-talk between the two signalling pathways.

Relevance: 30.00%

Abstract:

Motivation: Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, there is still a need to quantify their relative performance when they are testing different null hypotheses.

Results: In this article, we compare the performance of univariate and multivariate tests on both simulated and biological data. In the simulation study we demonstrate that high correlations affect the power of univariate and multivariate tests equally. In addition, for most of them the power is similarly affected by the dimensionality of the gene set and by the percentage of genes in the set whose expression changes between the two phenotypes. The application of different test statistics to biological data reveals that three statistics (sum of squared t-tests, Hotelling's T², N-statistic), testing different null hypotheses, find some common but also some complementary differentially expressed gene sets under specific settings. This demonstrates that, because their null hypotheses are complementary, each test captures different aspects of the data, and that for the analysis of biological data it is beneficial to use all three tests together instead of focusing exclusively on one.
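Two of the three statistics named in the abstract can be sketched briefly. The sketch below uses the textbook definitions: pooled-variance two-sample t-statistics summed over genes, and the two-sample Hotelling's T² with a pooled covariance matrix. Whether the article used exactly these variants is an assumption, as are the function names and the simulated expression data. The N-statistic is not shown.

```python
import numpy as np

def sum_squared_t(a, b):
    """Sum over genes of squared pooled-variance two-sample t-statistics."""
    n1, n2 = a.shape[0], b.shape[0]
    pooled_var = ((n1 - 1) * a.var(axis=0, ddof=1)
                  + (n2 - 1) * b.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return float(np.sum(t ** 2))

def hotelling_t2(a, b):
    """Two-sample Hotelling's T^2; needs more samples than genes in the set."""
    n1, n2 = a.shape[0], b.shape[0]
    diff = a.mean(axis=0) - b.mean(axis=0)
    pooled_cov = ((n1 - 1) * np.atleast_2d(np.cov(a, rowvar=False))
                  + (n2 - 1) * np.atleast_2d(np.cov(b, rowvar=False))) / (n1 + n2 - 2)
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(pooled_cov, diff))

# Simulated gene set of 4 genes, rows = samples (illustrative data only)
rng = np.random.default_rng(1)
control = rng.normal(size=(20, 4))
treated = rng.normal(size=(25, 4)) + np.array([0.8, 0.0, 0.0, -0.5])
ss_t = sum_squared_t(control, treated)
t2 = hotelling_t2(control, treated)
```

The sum of squared t-statistics ignores correlations between genes, while Hotelling's T² accounts for them through the pooled covariance. For a gene set containing a single gene the two statistics coincide.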

Relevance: 30.00%

Abstract:

The monitoring of multivariate systems that exhibit non-Gaussian behavior is addressed. Existing work advocates the use of independent component analysis (ICA) to extract the underlying non-Gaussian data structure. Since some of the source signals may be Gaussian, the use of principal component analysis (PCA) is proposed to capture the Gaussian and non-Gaussian source signals. A subsequent application of ICA then allows the extraction of non-Gaussian components from the retained principal components (PCs). A further contribution is the utilization of a support vector data description to determine a confidence limit for the non-Gaussian components. Finally, a statistical test is developed for determining how many non-Gaussian components are encapsulated within the retained PCs, and associated monitoring statistics are defined. The utility of the proposed scheme is demonstrated by a simulation example, and the analysis of recorded data from an industrial melter.

Relevance: 30.00%

Abstract:

We propose a simple and flexible framework for forecasting the joint density of asset returns. The multinormal distribution is augmented with a polynomial in (time-varying) non-central co-moments of assets. We estimate the coefficients of the polynomial via the Method of Moments for a carefully selected set of co-moments. In an extensive empirical study, we compare the proposed model with a range of other models widely used in the literature. Employing a recently proposed technique as well as standard techniques to evaluate multivariate forecasts, we conclude that the augmented joint density provides highly accurate forecasts of the “negative tail” of the joint distribution.

Relevance: 30.00%

Abstract:

This research aims to use the multivariate geochemical dataset, generated by the Tellus project, to investigate the appropriate use of transformation methods to maintain the integrity of geochemical data and inherent constrained behaviour in multivariate relationships. The widely used normal score transform is compared with the use of a stepwise conditional transform technique. The Tellus Project, managed by GSNI and funded by the Department of Enterprise Trade and Development and the EU’s Building Sustainable Prosperity Fund, involves the most comprehensive geological mapping project ever undertaken in Northern Ireland. Previous studies have demonstrated spatial variability in the Tellus data, but geostatistical analysis and interpretation of the datasets require an appropriate methodology that reproduces the inherently complex multivariate relations. Previous investigation of the Tellus geochemical data has included use of Gaussian-based techniques. However, earth science variables are rarely Gaussian, hence transformation of data is integral to the approach. The multivariate geochemical dataset generated by the Tellus project provides an opportunity to investigate the appropriate use of transformation methods, as required for Gaussian-based geostatistical analysis. In particular, the stepwise conditional transform is investigated and developed for the geochemical datasets obtained as part of the Tellus project. The transform is applied to four variables in a bivariate nested fashion due to the limited availability of data. Simulation of these transformed variables is then carried out, along with a corresponding back transformation to original units. Results show that the stepwise transform is successful in reproducing both univariate statistics and the complex bivariate relations exhibited by the data. Greater fidelity to multivariate relationships will improve uncertainty models, which are required for consequent geological, environmental and economic inferences.
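The normal score transform mentioned in this abstract, together with its back transformation to original units, can be sketched as follows. This is a rank-based version for a single variable. The plotting position `(rank + 0.5) / n`, the function names, and the lognormal test data are assumptions made for illustration, and tied values are ranked in order of appearance. The stepwise conditional transform is not implemented here. As generally described, it applies a transform of this kind to the second variable separately within classes of the first.

```python
import numpy as np
from statistics import NormalDist

def normal_score_transform(values):
    """Map each value to the standard normal quantile of its rank."""
    values = np.asarray(values, dtype=float)
    n = values.size
    order = np.argsort(values, kind="stable")
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)           # rank 0 = smallest value
    probs = (ranks + 0.5) / n             # strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in probs])

def back_transform(scores, ref_values, ref_scores):
    """Return to original units by interpolating in the reference table."""
    order = np.argsort(ref_scores)
    return np.interp(scores, ref_scores[order], ref_values[order])

# Skewed synthetic "concentration" data (illustrative only)
rng = np.random.default_rng(2)
concentrations = rng.lognormal(mean=0.0, sigma=1.0, size=200)
scores = normal_score_transform(concentrations)
recovered = back_transform(scores, concentrations, scores)
```

In a simulation workflow, `back_transform` would be called on simulated Gaussian values rather than on the original scores. Values beyond the range of the reference scores are clamped to the smallest or largest observed value by `np.interp`.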

Relevance: 30.00%

Abstract:

Slow release drugs must be manufactured to meet target specifications with respect to dissolution curve profiles. In this paper we consider the problem of identifying the drivers of dissolution curve variability of a drug from historical manufacturing data. Several data sources are considered: raw material parameters, coating data, loss on drying and pellet size statistics. The methodology employed is to develop predictive models using LASSO, a powerful machine learning algorithm for regression with high-dimensional datasets. LASSO provides sparse solutions facilitating the identification of the most important causes of variability in the drug fabrication process. The proposed methodology is illustrated using manufacturing data for a slow release drug.
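The sparsity property that makes LASSO useful for finding drivers of variability can be shown with a small sketch. It minimises the usual LASSO objective by cyclic coordinate descent on synthetic data. The solver, the penalty value `alpha`, and the data are assumptions made for illustration, not the paper's implementation or its manufacturing data.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold; exactly zero inside the band."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - X w||^2 + alpha * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y - X @ w
    for _ in range(n_iter):
        for j in range(p):
            residual += X[:, j] * w[j]            # remove feature j's current effect
            rho = X[:, j] @ residual / n          # correlation with partial residual
            w[j] = soft_threshold(rho, alpha) / col_sq[j]
            residual -= X[:, j] * w[j]            # add the updated effect back
    return w

# Synthetic data: 10 candidate drivers, only columns 0 and 4 matter (assumption)
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise the predictors
y = 3.0 * X[:, 0] - 2.0 * X[:, 4] + 0.1 * rng.normal(size=200)
y = y - y.mean()                                  # centre the response, no intercept
w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = np.flatnonzero(w)
```

Coefficients whose correlation with the partial residual stays below `alpha` are set exactly to zero, so the indices in `selected` are the candidate drivers the model retains. The retained coefficients are shrunk towards zero by the penalty.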