3 resultados para Principal componente analysis
em DigitalCommons@The Texas Medical Center
Resumo:
Background and Objective. Ever since the human development index was published in 1990 by the United Nations Development Programme (UNDP), many researchers started searching and corporative studying for more effective methods to measure the human development. Published in 1999, Lai’s “Temporal analysis of human development indicators: principal component approach” provided a valuable statistical way on human developmental analysis. This study presented in the thesis is the extension of Lai’s 1999 research. ^ Methods. I used the weighted principal component method on the human development indicators to measure and analyze the progress of human development in about 180 countries around the world from the year 1999 to 2010. The association of the main principal component obtained from the study and the human development index reported by the UNDP was estimated by the Spearman’s rank correlation coefficient. The main principal component was then further applied to quantify the temporal changes of the human development of selected countries by the proposed Z-test. ^ Results. The weighted means of all three human development indicators, health, knowledge, and standard of living, were increased from 1999 to 2010. The weighted standard deviation for GDP per capita was also increased across years indicated the rising inequality of standard of living among countries. The ranking of low development countries by the main principal component (MPC) is very similar to that by the human development index (HDI). Considerable discrepancy between MPC and HDI ranking was found among high development countries with high GDP per capita shifted to higher ranks. The Spearman’s rank correlation coefficient between the main principal component and the human development index were all around 0.99. All the above results were very close to outcomes in Lai’s 1999 report. The Z test result on temporal analysis of main principal components from 1999 to 2010 on Qatar was statistically significant, but not on other selected countries, such as Brazil, Russia, India, China, and U.S.A.^ Conclusion. To synthesize the multi-dimensional measurement of human development into a single index, the weighted principal component method provides a good model by using the statistical tool on a comprehensive ranking and measurement. Since the weighted main principle component index is more objective because of using population of nations as weight, more effective when the analysis is across time and space, and more flexible when the countries reported to the system has been changed year after year. Thus, in conclusion, the index generated by using weighted main principle component has some advantage over the human development index created in UNDP reports.^
Resumo:
Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots. ^
Resumo:
Pathway based genome wide association study evolves from pathway analysis for microarray gene expression and is under rapid development as a complementary for single-SNP based genome wide association study. However, it faces new challenges, such as the summarization of SNP statistics to pathway statistics. The current study applies the ridge regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compared this method to the other two widely used methods, the minimal-p-value (minP) approach of assigning the best test statistics of all SNPs in each pathway as the statistics of the pathway and the principal component analysis (PCA) method of utilizing PCA to calculate the principal components of each pathway. Comparison of the three methods using simulated datasets consisting of 500 cases, 500 controls and100 SNPs demonstrated that KSIR method outperformed the other two methods in terms of causal pathway ranking and the statistical power. PCA method showed similar performance as the minP method. KSIR method also showed a better performance over the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset consisting of 1762 cases, 3773 controls as the discovery cohort and 591 cases, 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. Results from the current study provided a reference for further methodology development and identified novel pathways that may be of importance to the development of ulcerative colitis.^