34 resultados para probabilistic principal component analysis (probabilistic PCA)
Resumo:
Biological experiments often produce enormous amount of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use an example of human leukocyte antigen (HLA) supertype classification to demonstrate the usage of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that the methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.
Resumo:
Principal component analysis (PCA) is well recognized in dimensionality reduction, and kernel PCA (KPCA) has also been proposed in statistical data analysis. However, KPCA fails to detect the nonlinear structure of data well when outliers exist. To reduce this problem, this paper presents a novel algorithm, named iterative robust KPCA (IRKPCA). IRKPCA works well in dealing with outliers, and can be carried out in an iterative manner, which makes it suitable to process incremental input data. As in the traditional robust PCA (RPCA), a binary field is employed for characterizing the outlier process, and the optimization problem is formulated as maximizing marginal distribution of a Gibbs distribution. In this paper, this optimization problem is solved by stochastic gradient descent techniques. In IRKPCA, the outlier process is in a high-dimensional feature space, and therefore kernel trick is used. IRKPCA can be regarded as a kernelized version of RPCA and a robust form of kernel Hebbian algorithm. Experimental results on synthetic data demonstrate the effectiveness of IRKPCA. © 2010 Taylor & Francis.
Resumo:
Background: Identifying biological markers to aid diagnosis of bipolar disorder (BD) is critically important. To be considered a possible biological marker, neural patterns in BD should be discriminant from those in healthy individuals (HI). We examined patterns of neuromagnetic responses revealed by magnetoencephalography (MEG) during implicit emotion-processing using emotional (happy, fearful, sad) and neutral facial expressions, in sixteen BD and sixteen age- and gender-matched healthy individuals. Methods: Neuromagnetic data were recorded using a 306-channel whole-head MEG ELEKTA Neuromag System, and preprocessed using Signal Space Separation as implemented in MaxFilter (ELEKTA). Custom Matlab programs removed EOG and ECG signals from filtered MEG data, and computed means of epoched data (0-250ms, 250-500ms, 500-750ms). A generalized linear model with three factors (individual, emotion intensity and time) compared BD and HI. A principal component analysis of normalized mean channel data in selected brain regions identified principal components that explained 95% of data variation. These components were used in a quadratic support vector machine (SVM) pattern classifier. SVM classifier performance was assessed using the leave-one-out approach. Results: BD and HI showed significantly different patterns of activation for 0-250ms within both left occipital and temporal regions, specifically for neutral facial expressions. PCA analysis revealed significant differences between BD and HI for mild fearful, happy, and sad facial expressions within 250-500ms. SVM quadratic classifier showed greatest accuracy (84%) and sensitivity (92%) for neutral faces, in left occipital regions within 500-750ms. Conclusions: MEG responses may be used in the search for disease specific neural markers.
Resumo:
Using a Markov switching unobserved component model we decompose the term premium of the North American CDX index into a permanent and a stationary component. We establish that the inversion of the CDX term premium is induced by sudden changes in the unobserved stationary component, which represents the evolution of the fundamentals underpinning the probability of default in the economy. We find evidence that the monetary policy response from the Fed during the crisis period was effective in reducing the volatility of the term premium. We also show that equity returns make a substantial contribution to the term premium over the entire sample period.