5 results for principal component regression

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

Background and Objective. Ever since the United Nations Development Programme (UNDP) published the human development index in 1990, many researchers have searched for, and comparatively studied, more effective methods of measuring human development. Published in 1999, Lai’s “Temporal analysis of human development indicators: principal component approach” provided a valuable statistical approach to the analysis of human development. The study presented in this thesis extends Lai’s 1999 research.

Methods. I applied the weighted principal component method to the human development indicators to measure and analyze the progress of human development in about 180 countries from 1999 to 2010. The association between the main principal component obtained in the study and the human development index reported by the UNDP was estimated with Spearman’s rank correlation coefficient. The main principal component was then used to quantify temporal changes in the human development of selected countries with the proposed Z-test.

Results. The weighted means of all three human development indicators (health, knowledge, and standard of living) increased from 1999 to 2010. The weighted standard deviation of GDP per capita also increased over the years, indicating rising inequality in standard of living among countries. The ranking of low-development countries by the main principal component (MPC) is very similar to that by the human development index (HDI). Considerable discrepancy between the MPC and HDI rankings was found among high-development countries, with countries that have high GDP per capita shifted to higher ranks. The Spearman’s rank correlation coefficients between the main principal component and the human development index were all around 0.99. All of the above results were very close to the outcomes in Lai’s 1999 report. The Z-test of the temporal change in the main principal component from 1999 to 2010 was statistically significant for Qatar but not for the other selected countries, such as Brazil, Russia, India, China, and the U.S.A.

Conclusion. To synthesize the multi-dimensional measurement of human development into a single index, the weighted principal component method provides a good model for comprehensive ranking and measurement. The weighted main principal component index is more objective because it uses national populations as weights, more effective when the analysis spans time and space, and more flexible when the set of countries reporting to the system changes from year to year. In conclusion, the index generated with the weighted main principal component has some advantages over the human development index published in UNDP reports.
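The two computational steps named in this abstract, a population-weighted main principal component and its Spearman rank correlation with a reference index, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the thesis's exact weighting scheme or its Z-test, so the sketch standardizes each indicator with population-weighted means and standard deviations, takes the leading eigenvector of the weighted correlation matrix by power iteration, and omits the Z-test. All country values, populations, and the reference index are made-up toy numbers, and every function name is hypothetical.

```python
import math

def weighted_main_component(rows, weights, iterations=500):
    """Return (loadings, scores) for the first principal component of the
    weighted correlation matrix. Assumes no indicator is constant."""
    total = sum(weights)
    n_vars = len(rows[0])
    # Population-weighted mean and standard deviation of each indicator.
    means = [sum(w * r[j] for r, w in zip(rows, weights)) / total
             for j in range(n_vars)]
    sds = [math.sqrt(sum(w * (r[j] - means[j]) ** 2
                         for r, w in zip(rows, weights)) / total)
           for j in range(n_vars)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(n_vars)] for r in rows]
    # Weighted correlation matrix of the standardized indicators.
    corr = [[sum(w * zr[a] * zr[b] for zr, w in zip(z, weights)) / total
             for b in range(n_vars)] for a in range(n_vars)]
    # Power iteration converges to the leading eigenvector (unit length).
    vec = [1.0] * n_vars
    for _ in range(iterations):
        nxt = [sum(corr[a][b] * vec[b] for b in range(n_vars))
               for a in range(n_vars)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    scores = [sum(zr[j] * vec[j] for j in range(n_vars)) for zr in z]
    return vec, scores

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                           sum((b - my) ** 2 for b in y))

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data (hypothetical): life expectancy, years of schooling, log GDP per capita.
indicators = [
    [55.0, 4.0, 7.0],
    [62.0, 6.5, 7.9],
    [70.0, 8.0, 8.8],
    [76.0, 11.0, 9.9],
    [81.0, 12.5, 10.7],
]
populations = [30.0, 120.0, 45.0, 10.0, 60.0]   # weights, in millions
reference_index = [0.40, 0.52, 0.66, 0.80, 0.91]  # stand-in for a published index

loadings, scores = weighted_main_component(indicators, populations)
print("loadings:", loadings)
print("Spearman vs. reference:", spearman(scores, reference_index))
```

Power iteration is used here only to keep the sketch dependency-free; a numerical library's eigendecomposition would be the usual choice for real data.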

Relevance:

100.00%

Publisher:

Abstract:

Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard-of-care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This matters because tumor shrinkage is an important cancer treatment endpoint that is correlated with the probability of disease progression and with overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted, and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross-validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages with .

The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of the individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering. The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D-CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared with end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT-derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.
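The reproducibility filter described in this abstract (keep features whose test-retest concordance correlation coefficient exceeds 0.90 on each machine, then intersect across machines) can be sketched as below. The sketch assumes Lin's concordance correlation coefficient with population (divide-by-n) variances; the abstract does not say which estimator was used. The feature names, machine labels, and measurements are hypothetical toy values, and the hierarchical clustering step is not shown.

```python
def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.
    Unlike Pearson's r, it also penalizes a shift in mean or scale."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (vx + vy + (mx - my) ** 2)

def reproducible_features(test, retest, threshold=0.90):
    """Names of features whose test-retest CCC exceeds the threshold."""
    return {name for name in test
            if concordance_cc(test[name], retest[name]) > threshold}

# Hypothetical test-retest feature values for four patients on two machines.
machine_a_test = {
    "volume": [10.0, 20.0, 30.0, 40.0],
    "entropy": [1.0, 2.0, 3.0, 4.0],
    "skewness": [0.1, 0.2, 0.3, 0.4],
}
machine_a_retest = {
    "volume": [10.5, 19.5, 30.5, 39.5],
    "entropy": [2.0, 1.0, 4.0, 3.0],      # poorly reproduced on machine A
    "skewness": [0.1, 0.2, 0.3, 0.4],
}
machine_b_test = {
    "volume": [12.0, 22.0, 31.0, 41.0],
    "entropy": [1.1, 2.1, 2.9, 4.2],
    "skewness": [0.1, 0.2, 0.3, 0.4],
}
machine_b_retest = {
    "volume": [12.0, 22.0, 31.0, 41.0],
    "entropy": [1.1, 2.1, 2.9, 4.2],
    "skewness": [0.4, 0.3, 0.2, 0.1],     # poorly reproduced on machine B
}

per_machine = [
    reproducible_features(machine_a_test, machine_a_retest),
    reproducible_features(machine_b_test, machine_b_retest),
]
# Multi-machine reproducible set: intersection of the per-machine sets.
shared = set.intersection(*per_machine)
print(sorted(shared))
```

Taking the intersection is deliberately conservative: a feature that fails the threshold on any single machine is dropped from the multi-machine set.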

Relevance:

90.00%

Publisher:

Abstract:

Background. It is important to understand the association between diet and risk of pancreatic cancer in order to better understand the etiology of pancreatic cancer.

Objectives. Describe the dietary patterns of cases of adenocarcinoma of the pancreas and of non-cancer controls, and evaluate the odds of having a healthy eating pattern among cases and non-cancer controls.

Design and Methods. An ongoing hospital-based case-control study was conducted in Houston, Texas, from 2000 to 2008 with 678 pancreatic adenocarcinoma cases and 724 controls. Participants completed a food frequency questionnaire and a risk factor questionnaire. Dietary patterns were derived by principal component analysis, and associations between dietary patterns and pancreatic cancer risk were assessed using unconditional logistic regression.

Results. Two dietary patterns were derived: fruit-vegetable and high fat-meat. There were no statistically significant associations between the fruit-vegetable pattern and pancreatic cancer. An inverse association was seen between the high fat-meat pattern and pancreatic cancer risk when comparing those in the upper intake quintile with those in the lowest quintile after adjusting for demographic and risk factor variables (OR=0.67, p=0.03). In sex-stratified analysis adjusted for demographic and risk factor variables, females scoring in the upper intake quintile of the fruit-vegetable pattern had a 49% lower risk of pancreatic cancer than females scoring in the lowest quintile (OR=0.51, p=0.03). An inverse relationship was also seen for the high fat-meat pattern when comparing females in the upper intake quintile with females in the lowest quintile (OR=0.50, p=0.03). In males, neither dietary pattern was significantly associated with pancreatic cancer.

Conclusions. The current findings for the fruit-vegetable pattern are similar to those of previous studies and support the hypothesis that there is an inverse association between a “healthy” diet (composed of fruits, vegetables, and whole grains) and risk of pancreatic cancer (in females only). However, the inverse relationship between the high fat-meat pattern and risk of pancreatic cancer is contrary to other results. Further research on dietary patterns and pancreatic cancer risk may lead to a better understanding of the etiology of pancreatic cancer.
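To make the odds-ratio arithmetic in this abstract concrete, the sketch below computes an unadjusted odds ratio and a Woolf (log-scale) 95% confidence interval from a 2x2 table, and shows how an OR of 0.51 is read as "49% lower". It is a simplification: the study reports adjusted estimates from unconditional logistic regression, which this sketch does not reproduce. The counts are hypothetical and are not taken from the study, and the function names are illustrative.

```python
import math

def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Unadjusted odds ratio from a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)

def woolf_ci(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls,
             z=1.96):
    """Approximate 95% confidence interval computed on the log-odds scale."""
    log_or = math.log(odds_ratio(exposed_cases, exposed_controls,
                                 unexposed_cases, unexposed_controls))
    se = math.sqrt(1 / exposed_cases + 1 / exposed_controls +
                   1 / unexposed_cases + 1 / unexposed_controls)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

def percent_lower_odds(or_value):
    """Express an odds ratio below 1 as a percentage reduction in odds."""
    return (1 - or_value) * 100

# Hypothetical counts: highest vs. lowest quintile of a dietary pattern score.
toy_or = odds_ratio(40, 80, 60, 60)
low, high = woolf_ci(40, 80, 60, 60)
print(f"OR = {toy_or:.2f}, 95% CI = ({low:.2f}, {high:.2f})")

# The abstract's OR of 0.51 corresponds to a 49% reduction.
print(round(percent_lower_odds(0.51)))
```

Strictly, the percentage describes odds rather than risk; the two are close when the outcome is rare in the source population.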

Relevance:

90.00%

Publisher:

Abstract:

Longitudinal principal components analyses of a combination of four subcutaneous skinfolds (biceps, triceps, subscapular, and suprailiac) were performed using data from the London Longitudinal Growth Study. The main objectives were to discover at what age during growth sex differences in body fat distribution occur and to see whether there is continuity in body fatness and body fat distribution from childhood into adult status (18 years). The analyses were done for four age sectors (3 months to 3 years, 3 to 8 years, 8 to 18 years, and 3 to 18 years). Longitudinal principal component one (LPC1) for each age interval in both sexes represents the population mean fat curve. Component two (LPC2) is a velocity-of-fatness component. Component three (LPC3) in the 3-month-to-3-year age sector represents the infant fat wave in both sexes. In the next two age sectors, component three in males represents peaks and shifts in fat growth (change in velocity), while in females it represents body fat distribution. Component four (LPC4) in the same two age sectors reverses between the sexes the patterns seen for component three, i.e., in males it is body fat distribution and in females velocity shifts. Components five and above represent more complicated patterns of change (multiple increases and decreases across the age interval). In both sexes there is strong tracking in fatness from middle childhood to adolescence. In males only, there is also low to moderate tracking of infant fat with middle to late childhood fat. These data are strongly supported in the literature. Several factors are known to predict adult fatness, among the most important being previous levels of fatness (at earlier ages) and the age at rebound. In addition, we found that the velocity of fat change in middle childhood was highly predictive of later fatness (r ≈ -0.7), even more so than age at rebound (r ≈ -0.5). In contrast to fatness (LPC1), body fat distribution (LPC3-LPC4) did not track well, even though significant components of body fat distribution occur at each age. Tracking of body fat distribution was higher in females than in males. Sex differences in body fat distribution are nonexistent, except that some sex differences are evident in the peripheral-to-central ratios after age 14 years.

Relevance:

90.00%

Publisher:

Abstract:

Pathway-based genome-wide association studies evolved from pathway analysis of microarray gene expression data and are under rapid development as a complement to single-SNP-based genome-wide association studies. However, they face new challenges, such as the summarization of SNP statistics into pathway statistics. The current study applies ridge-regularized Kernel Sliced Inverse Regression (KSIR) to achieve dimension reduction and compares this method with two other widely used methods: the minimal-p-value (minP) approach, which assigns the best test statistic among all SNPs in a pathway as the statistic of that pathway, and the principal component analysis (PCA) method, which uses PCA to calculate the principal components of each pathway. A comparison of the three methods on simulated datasets consisting of 500 cases, 500 controls, and 100 SNPs demonstrated that the KSIR method outperformed the other two in terms of causal pathway ranking and statistical power. The PCA method performed similarly to the minP method. The KSIR method also performed better than the other two methods in analyzing a real dataset, the WTCCC Ulcerative Colitis dataset, which consists of 1762 cases and 3773 controls as the discovery cohort and 591 cases and 1639 controls as the replication cohort. Several immune and non-immune pathways relevant to ulcerative colitis were identified by these methods. The results of the current study provide a reference for further methodology development and identify novel pathways that may be important in the development of ulcerative colitis.
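Of the three summarization methods compared in this abstract, the minP approach is the simplest to illustrate: each pathway takes the smallest single-SNP p-value among its member SNPs, and pathways are then ranked by that value. The sketch below assumes per-SNP p-values are already available; the SNP identifiers, pathway names, and p-values are hypothetical, and no permutation step or correction for pathway size is included because the abstract does not describe one.

```python
def min_p_pathway_stats(snp_pvalues, pathways):
    """Map each pathway to the smallest p-value among its member SNPs.
    SNPs without a p-value are ignored; pathways with none are skipped."""
    stats = {}
    for name, snps in pathways.items():
        member_p = [snp_pvalues[s] for s in snps if s in snp_pvalues]
        if member_p:
            stats[name] = min(member_p)
    return stats

def rank_pathways(stats):
    """Pathway names ordered from strongest to weakest signal."""
    return sorted(stats, key=stats.get)

# Hypothetical single-SNP association p-values.
snp_pvalues = {"snp1": 0.30, "snp2": 0.002, "snp3": 0.04, "snp4": 0.55}

# Hypothetical pathway membership; snp9 has no p-value and is ignored.
pathways = {
    "pathway_A": ["snp1", "snp2"],
    "pathway_B": ["snp3", "snp4", "snp9"],
}

pathway_stats = min_p_pathway_stats(snp_pvalues, pathways)
ranking = rank_pathways(pathway_stats)
print(pathway_stats)
print(ranking)
```

Because a minimum over more SNPs tends to be smaller by chance alone, larger pathways are favored unless the statistic is calibrated, which is one reason such summaries are usually compared with alternatives like the PCA and KSIR methods named above.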