855 resultados para Principle Component Analysis (PCA)
Resumo:
This paper describes a chemotaxonomic analysis of a database of triterpenoid compounds from the Celastraceae family using principal component analysis (PCA). The numbers of occurrences of thirty types of triterpene skeleton in different tribes of the family were used as variables. The study shows that PCA applied to chemical data can contribute to an intrafamilial classification of Celastraceae, once some questionable taxa affinity was observed, from chemotaxonomic inferences about genera and they are in agreement with the phylogeny previously proposed. The inclusion of Hippocrateaceae within Celastraceae is supported by the triterpene chemistry.
Resumo:
This paper characterizes humic substances (HS) extracted from soil samples collected in the Rio Negro basin in the state of Amazonas, Brazil, particularly investigating their reduction capabilities towards Hg(II) in order to elucidate potential mercury cycling/volatilization in this environment. For this reason, a multimethod approach was used, consisting of both instrumental methods (elemental analysis, EPR, solid-state NMR, FIA combined with cold-vapor AAS of Hg(0)) and statistical methods such as principal component analysis (PCA) and a central composite factorial planning method. The HS under study were divided into groups, complexing and reducing ones, owing to different distribution of their functionalities. The main functionalities (cor)related with reduction of Hg(II) were phenolic, carboxylic and amide groups, while the groups related with complexation of Hg(II) were ethers, hydroxyls, aldehydes and ketones. The HS extracted from floodable regions of the Rio Negro basin presented a greater capacity to retain (to complex, to adsorb physically and/or chemically) Hg(II), while nonfloodable regions showed a greater capacity to reduce Hg(II), indicating that HS extracted from different types of regions contribute in different ways to the biogeochemical mercury cycle in the basin of the mid-Rio Negro, AM, Brazil. (c) 2007 Published by Elsevier B.V.
Resumo:
This paper characterizes humic substances (HS) extracted from soil samples collected in the Rio Negro basin in the state of Amazonas, Brazil, particularly investigating their reduction capabilities towards Hg(II) in order to elucidate potential mercury cycling/volatilization in this environment. For this reason, a multimethod approach was used, consisting of both instrumental methods (elemental analysis, EPR, solid-state NMR, FIA combined with cold-vapor AAS of Hg(0)) and statistical methods such as principal component analysis (PCA) and a central composite factorial planning method. The HS under study were divided into groups, complexing and reducing ones, owing to different distribution of their functionalities. The main functionalities (cor)related with reduction of Hg(II) were phenolic, carboxylic and amide groups, while the groups related with complexation of Hg(II) were ethers, hydroxyls, aldehydes and ketones. The HS extracted from floodable regions of the Rio Negro basin presented a greater capacity to retain (to complex, to adsorb physically and/or chemically) Hg(II), while nonfloodable regions showed a greater capacity to reduce Hg(II), indicating that HS extracted from different types of regions contribute in different ways to the biogeochemical mercury cycle in the basin of the mid-Rio Negro, AM, Brazil. (c) 2007 Published by Elsevier B.V.
Resumo:
As one of the newest members in the field of articial immune systems (AIS), the Dendritic Cell Algorithm (DCA) is based on behavioural models of natural dendritic cells (DCs). Unlike other AIS, the DCA does not rely on training data, instead domain or expert knowledge is required to predetermine the mapping between input signals from a particular instance to the three categories used by the DCA. This data preprocessing phase has received the criticism of having manually over-fitted the data to the algorithm, which is undesirable. Therefore, in this paper we have attempted to ascertain if it is possible to use principal component analysis (PCA) techniques to automatically categorise input data while still generating useful and accurate classication results. The integrated system is tested with a biometrics dataset for the stress recognition of automobile drivers. The experimental results have shown the application of PCA to the DCA for the purpose of automated data preprocessing is successful.
Resumo:
Objective To explore the characteristics of regional distribution of cancer deaths in Shandong Province with the principle components analysis. Methods The principle components analysis with co-variance matrix for age-adjusted mortality rates and percentages of 20 types of cancer in 22 counties (cities) were carried out using SAS Software. Results Over 90% of the total information could be reflected by the top 3 principle components and the first principle component alone represented more than half of the overall regional variances. The first component mainly reflected the area differences of esophageal cancer. The second component mainly reflected the area differences of lung cancer, stomach cancer and liver cancer. The value of the first principal component scores showed a clear trend that the west areas possessed higher values and the east the lower values. Based on the top two components,the 22 counties (cities) could be divided into several geographical clusters. Conclusion The overall difference of regional distribution of cancers in Shandong is dominated by several major cancers including esophageal cancer, lung cancer, stomach cancer and liver cancer. Among them,esophageal cancer makes the largest contribution. If the range of counties (cities) analyzed could be further widened, the characteristics of regional distribution of cancer mortality would be better examined. Abstract in Chinese 目的 利用主成分分析探讨山东省恶性肿瘤死亡的地区分布特征. 方法 利用SAS软件对山东省22个县市区2004~2006午的20种恶性肿瘤标化死亡率和构成比分别进行协方差矩阵主成分分析. 结果 前3个主成分就反映了总体差异90%以上的信息,其中仅第1主成分就提供了总体差异一半以上的信息.第1主成分主要反映了食管癌的地区差异,第2主成分主要反映肺癌的地区差异,兼顾胃癌和肝癌.各地区第1主成分得分呈现西高东低的趋势,根据第1和第2主成分可以将调查地区分为若干类别,表现为明显的地理聚集性. 结论 山东省各地区恶性肿瘤死亡的总体差异主要取决于少数高发肿瘤,包括食管癌、肺癌、胃癌、肝癌等,其中以食管癌地位最为突出.如能进一步扩大分析范围,可更好地查明恶性肿瘤死亡的地区特征.
Resumo:
Some statistical procedures already available in literature are employed in developing the water quality index, WQI. The nature of complexity and interdependency that occur in physical and chemical processes of water could be easier explained if statistical approaches were applied to water quality indexing. The most popular statistical method used in developing WQI is the principal component analysis (PCA). In literature, the WQI development based on the classical PCA mostly used water quality data that have been transformed and normalized. Outliers may be considered in or eliminated from the analysis. However, the classical mean and sample covariance matrix used in classical PCA methodology is not reliable if the outliers exist in the data. Since the presence of outliers may affect the computation of the principal component, robust principal component analysis, RPCA should be used. Focusing in Langat River, the RPCA-WQI was introduced for the first time in this study to re-calculate the DOE-WQI. Results show that the RPCA-WQI is capable to capture similar distribution in the existing DOE-WQI.
Resumo:
The transient changes in resistances of Cr0.8Fe0.2NbO4 thick film sensors towards specified concentrations of H-2, NH3, acetonitrile, acetone, alcohol, cyclohexane and petroleum gas at different operating temperatures were recorded. The analyte-specific characteristics such as slopes of the response and retrace curves, area under the curve and sensitivity deduced from the transient curve of the respective analyte gas have been used to construct a data matrix. Principal component analysis (PCA) was applied to this data and the score plot was obtained. Distinguishing one reducing gas from the other is demonstrated based on this approach, which otherwise is not possible by measuring relative changes in conductivity. This methodology is extended for three Cr0.8Fe0.2NbO4 thick film sensor array operated at different temperatures. (C) 2015 Elsevier B.V. All rights reserved.
Resumo:
Gene microarray technology is highly effective in screening for differential gene expression and has hence become a popular tool in the molecular investigation of cancer. When applied to tumours, molecular characteristics may be correlated with clinical features such as response to chemotherapy. Exploitation of the huge amount of data generated by microarrays is difficult, however, and constitutes a major challenge in the advancement of this methodology. Independent component analysis (ICA), a modern statistical method, allows us to better understand data in such complex and noisy measurement environments. The technique has the potential to significantly increase the quality of the resulting data and improve the biological validity of subsequent analysis. We performed microarray experiments on 31 postmenopausal endometrial biopsies, comprising 11 benign and 20 malignant samples. We compared ICA to the established methods of principal component analysis (PCA), Cyber-T, and SAM. We show that ICA generated patterns that clearly characterized the malignant samples studied, in contrast to PCA. Moreover, ICA improved the biological validity of the genes identified as differentially expressed in endometrial carcinoma, compared to those found by Cyber-T and SAM. In particular, several genes involved in lipid metabolism that are differentially expressed in endometrial carcinoma were only found using this method. This report highlights the potential of ICA in the analysis of microarray data.
Resumo:
A central question in Neuroscience is that of how the nervous system generates the spatiotemporal commands needed to realize complex gestures, such as handwriting. A key postulate is that the central nervous system (CNS) builds up complex movements from a set of simpler motor primitives or control modules. In this study we examined the control modules underlying the generation of muscle activations when performing different types of movement: discrete, point-to-point movements in eight different directions and continuous figure-eight movements in both the normal, upright orientation and rotated 90 degrees. To test for the effects of biomechanical constraints, movements were performed in the frontal-parallel or sagittal planes, corresponding to two different nominal flexion/abduction postures of the shoulder. In all cases we measured limb kinematics and surface electromyographic activity (EMB) signals for seven different muscles acting around the shoulder. We first performed principal component analysis (PCA) of the EMG signals on a movement-by-movement basis. We found a surprisingly consistent pattern of muscle groupings across movement types and movement planes, although we could detect systematic differences between the PCs derived from movements performed in each sholder posture and between the principal components associated with the different orientations of the figure. Unexpectedly we found no systematic differences between the figute eights and the point-to-point movements. The first three principal components could be associated with a general co-contraction of all seven muscles plus two patterns of reciprocal activatoin. From these results, we surmise that both "discrete-rhythmic movements" such as the figure eight, and discrete point-to-point movement may be constructed from three different fundamental modules, one regulating the impedance of the limb over the time span of the movement and two others operating to generate movement, one aligned with the vertical and the other aligned with the horizontal.
Resumo:
Copyright © (2014) by the International Machine Learning Society (IMLS) All rights reserved. Classical methods such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are ubiquitous in statistics. However, these techniques are only able to reveal linear re-lationships in data. Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale. In a separate strand of recent research, randomized methods have been proposed to construct features that help reveal nonlinear patterns in data. For basic tasks such as regression or classification, random features exhibit little or no loss in performance, while achieving drastic savings in computational requirements. In this paper we leverage randomness to design scalable new variants of nonlinear PCA and CCA; our ideas extend to key multivariate analysis tools such as spectral clustering or LDA. We demonstrate our algorithms through experiments on real- world data, on which we compare against the state-of-the-art. A simple R implementation of the presented algorithms is provided.
Resumo:
Heart disease is one of the main factor causing death in the developed countries. Over several decades, variety of electronic and computer technology have been developed to assist clinical practices for cardiac performance monitoring and heart disease diagnosis. Among these methods, Ballistocardiography (BCG) has an interesting feature that no electrodes are needed to be attached to the body during the measurement. Thus, it is provides a potential application to asses the patients heart condition in the home. In this paper, a comparison is made for two neural networks based BCG signal classification models. One system uses a principal component analysis (PCA) method, and the other a discrete wavelet transform, to reduce the input dimensionality. It is indicated that the combined wavelet transform and neural network has a more reliable performance than the combined PCA and neural network system. Moreover, the wavelet transform requires no prior knowledge of the statistical distribution of data samples and the computation complexity and training time are reduced.
Resumo:
In this paper, we present a Statistical Shape Model for Human Figure Segmentation in gait sequences. Point Distribution Models (PDM) generally use Principal Component analysis (PCA) to describe the main directions of variation in the training set. However, PCA assumes a number of restrictions on the data that do not always hold. In this work, we explore the potential of Independent Component Analysis (ICA) as an alternative shape decomposition to the PDM-based Human Figure Segmentation. The shape model obtained enables accurate estimation of human figures despite segmentation errors in the input silhouettes and has really good convergence qualities.
Resumo:
This paper emerged from work supported by EPSRC grant GR/S84354/01 and proposes a method of determining principal curves, using spline functions, in principal component analysis (PCA) for the representation of non-linear behaviour in process monitoring. Although principal curves are well established, they are difficult to implement in practice if a large number of variables are analysed. The significant contribution of this paper is that the proposed method has minimal complexity, assuming simple spline geometry, thus enabling efficient computation. The paper provides a foundation for further work where multiple curves may be required to represent underlying non-linear information in complex data.