918 results for improved principal components analysis (IPCA) algorithm


Abstract:

In this paper we develop a new approach to sparse principal component analysis (sparse PCA). We propose two single-unit and two block optimization formulations of the sparse PCA problem, aimed at extracting a single sparse dominant principal component of a data matrix, or more components at once, respectively. While the initial formulations involve nonconvex functions, and are therefore computationally intractable, we rewrite them into the form of an optimization program involving maximization of a convex function on a compact set. The dimension of the search space is decreased enormously if the data matrix has many more columns (variables) than rows. We then propose and analyze a simple gradient method suited for the task. It appears that our algorithm has the best convergence properties when either the objective function or the feasible set is strongly convex, which is the case with our single-unit formulations and can be enforced in the block case. Finally, we demonstrate numerically on a set of random and gene expression test problems that our approach outperforms existing algorithms both in quality of the obtained solution and in computational speed. © 2010 Michel Journée, Yurii Nesterov, Peter Richtárik and Rodolphe Sepulchre.
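The idea of maximizing a convex function over a compact set with a simple gradient method can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes an l1-penalised single-unit objective of the form sum_i max(|a_i·x| - gamma, 0)^2 maximized over the unit sphere, and the names `sparse_component`, `columns` and `gamma` are hypothetical.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sparse_component(columns, gamma, iters=100):
    """Return a unit-norm sparse loading vector (one weight per variable).

    columns: list of variables, each a list of sample values (assumed layout).
    gamma:   sparsity penalty (assumed); larger values zero out more weights.
    """
    def soft(c):
        # Soft-threshold: shrink |c| by gamma and keep the sign.
        return math.copysign(max(abs(c) - gamma, 0.0), c)

    # Start from the direction of the largest-norm variable.
    start = max(columns, key=lambda col: _dot(col, col))
    norm = math.sqrt(_dot(start, start))
    if norm == 0.0:
        return [0.0] * len(columns)
    x = [v / norm for v in start]

    for _ in range(iters):
        weights = [soft(_dot(col, x)) for col in columns]
        # Gradient direction of the assumed objective, up to a constant factor.
        grad = [sum(w * col[k] for w, col in zip(weights, columns))
                for k in range(len(x))]
        norm = math.sqrt(_dot(grad, grad))
        if norm == 0.0:  # gamma too large: every weight was thresholded away
            break
        x = [g / norm for g in grad]  # step back onto the unit sphere

    weights = [soft(_dot(col, x)) for col in columns]
    norm = math.sqrt(_dot(weights, weights))
    return [w / norm for w in weights] if norm > 0.0 else weights
```

Note that the iterate `x` lives in sample space (one entry per row), which is why the search space is small when there are many more variables than rows; the sparse loadings are recovered from `x` only at the end.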

Abstract:

This paper emerged from work supported by EPSRC grant GR/S84354/01 and proposes a method of determining principal curves, using spline functions, in principal component analysis (PCA) for the representation of non-linear behaviour in process monitoring. Although principal curves are well established, they are difficult to implement in practice if a large number of variables are analysed. The significant contribution of this paper is that the proposed method has minimal complexity, assuming simple spline geometry, thus enabling efficient computation. The paper provides a foundation for further work where multiple curves may be required to represent underlying non-linear information in complex data.

Abstract:

In recent years, Independent Components Analysis (ICA) has proven itself to be a powerful signal-processing technique for solving Blind-Source Separation (BSS) problems in different scientific domains. In the present work, an application of ICA for processing NIR hyperspectral images to detect traces of peanut in wheat flour is presented. Processing was performed without a priori knowledge of the chemical composition of the two food materials. The aim was to extract the source signals of the different chemical components from the initial data set and to use them to determine the distribution of peanut traces in the hyperspectral images. To determine the optimal number of independent components to be extracted, the Random ICA by blocks method was used. This method is based on the repeated calculation of several models using an increasing number of independent components after randomly segmenting the data matrix into two blocks and then calculating the correlations between the signals extracted from the two blocks. The extracted ICA signals were interpreted and their ability to classify peanut and wheat flour was studied. Finally, all the extracted ICs were used to construct a single synthetic signal that could be used directly with the hyperspectral images to enhance the contrast between the peanut and the wheat flours in a real multi-use industrial environment. Furthermore, feature extraction methods (a connected components labelling algorithm followed by a flood fill method to extract object contours) were applied in order to target the spatial location of the peanut traces. A good visualization of the distributions of peanut traces was thus obtained.

Abstract:

As one of the newest members in the field of artificial immune systems (AIS), the Dendritic Cell Algorithm (DCA) is based on behavioural models of natural dendritic cells (DCs). Unlike other AIS, the DCA does not rely on training data; instead, domain or expert knowledge is required to predetermine the mapping between input signals from a particular instance and the three categories used by the DCA. This data preprocessing phase has been criticised for manually over-fitting the data to the algorithm, which is undesirable. Therefore, in this paper we attempt to ascertain whether principal component analysis (PCA) techniques can automatically categorise input data while still generating useful and accurate classification results. The integrated system is tested with a biometrics dataset for the stress recognition of automobile drivers. The experimental results show that applying PCA to the DCA for automated data preprocessing is successful.

Abstract:

Objective: To explore the characteristics of the regional distribution of cancer deaths in Shandong Province using principal component analysis. Methods: Principal component analysis based on the covariance matrix was carried out separately on the age-adjusted mortality rates and the proportions of 20 types of cancer in 22 counties (cities) for 2004 to 2006, using SAS software. Results: Over 90% of the total information was captured by the top 3 principal components, and the first principal component alone represented more than half of the overall regional variance. The first component mainly reflected the area differences in esophageal cancer. The second component mainly reflected the area differences in lung cancer, stomach cancer and liver cancer. The first principal component scores showed a clear trend, with higher values in the west and lower values in the east. Based on the top two components, the 22 counties (cities) could be divided into several geographical clusters. Conclusion: The overall difference in the regional distribution of cancers in Shandong is dominated by a few major cancers, including esophageal cancer, lung cancer, stomach cancer and liver cancer. Among them, esophageal cancer makes the largest contribution. If the range of counties (cities) analyzed could be widened further, the characteristics of the regional distribution of cancer mortality could be examined more thoroughly.
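The statement that the first component alone represents more than half of the variance corresponds to the ratio of the largest eigenvalue of the covariance matrix to its trace. A minimal sketch of that calculation follows, using plain Python and power iteration instead of SAS; the function names and the toy data in the usage note are hypothetical and are not the Shandong data.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of rows (one row per region, one column per variable)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def first_component_share(rows, iters=500):
    """Fraction of total variance carried by the first principal component."""
    cov = covariance_matrix(rows)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))  # trace = total variance
    if total == 0.0:
        return 0.0
    # Power iteration from a uniform start; this assumes the start vector
    # is not orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue for a unit vector v.
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue / total
```

For the toy rows `[(2, 0), (-2, 0), (0, 1), (0, -1)]` the variances are 8/3 and 2/3 with zero covariance, so the first component carries 80% of the total. Because the covariance matrix is used rather than the correlation matrix, variables with large variance dominate the leading components.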

Abstract:

We study unsupervised learning in a probabilistic generative model for occlusion. The model uses two types of latent variables: one indicates which objects are present in the image, and the other how they are ordered in depth. This depth order then determines how the positions and appearances of the objects present, specified in the model parameters, combine to form the image. We show that the object parameters can be learnt from an unlabelled set of images in which objects occlude one another. Exact maximum-likelihood learning is intractable. However, we show that tractable approximations to Expectation Maximization (EM) can be found if the training images each contain only a small number of objects on average. In numerical experiments it is shown that these approximations recover the correct set of object parameters. Experiments on a novel version of the bars test using colored bars, and experiments on more realistic data, show that the algorithm performs well in extracting the generating causes. Experiments based on the standard bars benchmark test for object learning show that the algorithm performs well in comparison to other recent component extraction approaches. The model and the learning algorithm thus connect research on occlusion with the research field of multiple-causes component extraction methods.

Abstract:

This paper shows that current multivariate statistical monitoring technology may detect neither incipient changes in the variable covariance structure nor changes in the geometry of the underlying variable decomposition. To overcome these deficiencies, the local approach is incorporated into the multivariate statistical monitoring framework to define two new univariate statistics for fault detection. Fault isolation is achieved by constructing a fault diagnosis chart which reveals changes in the covariance structure resulting from the presence of a fault. A theoretical analysis is presented and the proposed monitoring approach is exemplified using application studies involving recorded data from two complex industrial processes. © 2007 Elsevier Ltd. All rights reserved.

Abstract:

In this paper, we propose a multispectral analysis system using wavelet-based Principal Component Analysis (PCA) to improve brain tissue classification from MRI images. Global transforms like PCA often neglect small but significant abnormality details when dealing with a massive amount of multispectral data. To resolve this issue, the input dataset is expanded with detail coefficients from multisignal wavelet analysis. Then, PCA is applied to the new dataset to perform feature analysis. Finally, an unsupervised classification with the Fuzzy C-Means clustering algorithm is used to measure the improvement in reproducibility and accuracy of the results. A detailed comparative analysis of the classified tissues against those from conventional PCA is also carried out. The proposed method yielded a good improvement in the classification of small abnormalities, with high sensitivity/accuracy values of 98.9/98.3 for clinical analysis. Experimental results from synthetic and clinical data recommend the new method as a promising approach in brain tissue analysis.

Abstract:

In human population genetics, routine applications of principal component techniques are often required. Population biologists make widespread use of certain discrete classifications of human samples into haplotypes, the monophyletic units of phylogenetic trees constructed from several single nucleotide bimorphisms hierarchically ordered. Compositional frequencies of the haplotypes are recorded within the different samples. Principal component techniques are then required as a dimension-reducing strategy to bring the dimension of the problem to a manageable level, say two, to allow for graphical analysis. Population biologists at large are not aware of the special features of compositional data and normally make use of the crude covariance of compositional relative frequencies to construct principal components. In this short note we present our experience with using traditional linear principal components or compositional principal components based on logratios, with reference to a specific dataset.
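One common way to base principal components on logratios is to apply the centered log-ratio (clr) transform to each composition before computing covariances. The abstract does not say which logratio transform was used, so the choice of clr here is an assumption, and the function name `clr` is hypothetical. A minimal sketch:

```python
import math

def clr(composition):
    """Centered log-ratio transform of one composition (strictly positive parts)."""
    if any(part <= 0 for part in composition):
        # Logarithms are undefined for zero frequencies; zeros need separate handling.
        raise ValueError("clr needs strictly positive parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log equals dividing each part by the geometric mean.
    return [value - mean_log for value in logs]
```

The transformed values sum to zero and do not change when the composition is rescaled, so `clr([7, 2, 1])` and `clr([0.7, 0.2, 0.1])` agree. Principal components would then be computed from the covariance of the clr-transformed samples rather than from the raw relative frequencies.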

Abstract:

Constrained principal component analysis (CPCA) with a finite impulse response (FIR) basis set was used to reveal functionally connected networks and their temporal progression over a multistage verbal working memory trial in which memory load was varied. Four components were extracted, and all showed statistically significant sensitivity to the memory load manipulation. Additionally, two of the four components sustained this peak activity, both for approximately 3 s (Components 1 and 4). The functional networks that showed sustained activity were characterized by increased activations in the dorsal anterior cingulate cortex, right dorsolateral prefrontal cortex, and left supramarginal gyrus, and decreased activations in the primary auditory cortex and "default network" regions. The functional networks that did not show sustained activity were instead dominated by increased activation in occipital cortex, dorsal anterior cingulate cortex, sensori-motor cortical regions, and superior parietal cortex. The response shapes suggest that although all four components appear to be invoked at encoding, the two sustained-peak components are likely to be additionally involved in the delay period. Our investigation provides a unique view of the contributions made by a network of brain regions over the course of a multiple-stage working memory trial.

Abstract:

This paper presents a new multivariate process capability index (MPCI) which is based on principal component analysis (PCA) and depends on a parameter (Formula presented.) which can take on any real number. This MPCI generalises some existing multivariate indices based on PCA proposed by several authors when (Formula presented.) or (Formula presented.). One of the key contributions of this paper is to show that there is a direct correspondence between this MPCI and process yield for a unique value of (Formula presented.). This result is used to establish a relationship between the capability status of the process and to show that, under some mild conditions, the estimator of this MPCI is consistent and converges to a normal distribution. This is then applied to perform tests of statistical hypotheses and to determine sample sizes. Several numerical examples are presented with the objective of illustrating the procedures and demonstrating how they can be applied to determine the viability and capacity of different manufacturing processes.

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Abstract:

A total of 61,528 weight records from 22,246 Nellore animals born between 1984 and 2002 were used to compare different multiple-trait analysis methods for birth to mature weights. The following models were used: a standard multivariate model (MV), five reduced-rank models fitting the first 1, 2, 3, 4 and 5 genetic principal components, and five models using factor analysis with 1, 2, 3, 4 and 5 factors. Direct additive genetic random effects and residual effects were included in all models. In addition, maternal genetic and maternal permanent environmental effects were included as random effects for birth and weaning weight. The models included contemporary group as a fixed effect, and age of animal at recording (except for birth weight) and age of dam at calving as linear and quadratic effects (for birth weight and weaning weight). The maternal genetic, maternal permanent environmental and residual (co)variance matrices were assumed to be full rank. According to model selection criteria, the model fitting the first three principal components (PC3) provided the best fit, without the need for factor analysis models. Similar estimates of phenotypic, direct additive and maternal genetic, maternal permanent environmental and residual (co)variances were obtained with models MV and PC3. Direct heritability ranged from 0.21 (birth weight) to 0.45 (weight at 6 years of age). The genetic and phenotypic correlations obtained with model PC3 were slightly higher than those estimated with model MV. In general, the reduced-rank model substantially decreased the number of parameters in the analyses without reducing the goodness-of-fit. © 2013 Elsevier B.V.

Abstract:

Phenotypic data from female Canchim beef cattle were used to obtain estimates of genetic parameters for reproduction and growth traits using a linear animal mixed model. In addition, relationships among animal estimated breeding values (EBVs) for these traits were explored using principal component analysis. The traits studied in female Canchim cattle were age at first calving (AFC), age at second calving (ASC), calving interval (CI), and bodyweight at 420 days of age (BW420). The heritability estimates for AFC, ASC, CI and BW420 were 0.03±0.01, 0.07±0.01, 0.06±0.02, and 0.24±0.02, respectively. The genetic correlations for AFC with ASC, AFC with CI, AFC with BW420, ASC with CI, ASC with BW420, and CI with BW420 were 0.87±0.07, 0.23±0.02, -0.15±0.01, 0.67±0.13, -0.07±0.13, and 0.02±0.14, respectively. Standardised EBVs for AFC, ASC and CI exhibited a high association with the first principal component, whereas the standardised EBV for BW420 was closely associated with the second principal component. The heritability estimates for AFC, ASC and CI suggest that these traits would respond slowly to selection. However, selection response could be enhanced by constructing selection indices based on the principal components. © CSIRO 2013.

Abstract:

This paper presents a novel time-domain approach for Structural Health Monitoring (SHM) systems based on the Electromechanical Impedance (EMI) principle and Principal Component Coefficients (PCC), also known as loadings. Unlike typical applications of EMI to SHM, which are based on computing the Frequency Response Function (FRF), the procedure in this work is based on the EMI principle but all analysis is conducted directly in the time domain. To this end, the PCC are computed from the time response of PZT (Lead Zirconate Titanate) transducers bonded to the monitored structure, which act as actuator and sensor at the same time. The procedure is carried out by exciting the PZT transducers with a wide-band chirp signal and recording their time responses. The PCC are obtained in both healthy and damaged conditions and used to compute statistical indices. Tests were carried out on an aircraft aluminum plate, and the results demonstrated the effectiveness of the proposed method, making it an excellent approach for SHM applications. Finally, the results using EMI signals in both frequency and time responses are obtained and compared. © The Society for Experimental Mechanics 2014.