5 resultados para Mean-variance analysis
em DigitalCommons@The Texas Medical Center
Resumo:
I have undertaken measurements of the genetic (or inherited) and nongenetic (or noninherited) components of the variability of metastasis formation and tumor diameter doubling time in more than 100 metastatic lines from each of three murine tumors (sarcoma SANH, sarcoma SA4020, and hepatocarcinoma HCA-I) syngeneic to C3Hf/Kam mice. These lines were isolated twice from lung metastases and analysed immediately thereafter to obtain the variance to spontaneous lung metastasis and tumor diameter doubling time. Additional studies utilized cells obtained from within 4 passages of isolation. Under the assumption that no genetic differences in metastasis formation or diameter doubling time existed among the cells of a given line, the variance within a line would estimate nongenetic variation. The variability derived from differences between lines would represent genetic origin. The estimates of the genetic contribution to the variation of metastasis and tumor diameter doubling time were significantly greater than zero, but only in the metastatic lines of tumor SANH was genetic variation the major source of metastatic variability (contributing 53% of the variability). In the tumor cell lines of SA4020 and HCA-I, however, the contribution of nongenetic factors predominated over genetic factors in the variability of the number of metastasis and tumor diameter doubling time. A number of other parameters examined, such as DNA content, karyotype, and selection and variance analysis with passage in vivo, indicated that genetic differences existed within the cell lines and that these differences were probably created by genetic instability. The mean metastatic propensity of the lines may have increased somewhat during their isolation and isotransplantation, but the variance was only slightly affected, if at all. Analysis of the DNA profiles of the metastatic lines of SA4020 and HCA-I revealed differences between these lines and their primary parent tumors, but not among the SANH lines and their parent tumor. Furthermore, there was a direct correlation between the extent of genetic influence on metastasis formation and the ability of the tumor cells to develop resistance to cisplatinum. Thus although nongenetic factors might predominate in contributing to metastasis formation, it is probably genetic variation and genetic instability that cause the progression of tumor cells to a more metastatic phenotype and leads to the emergence of drug resistance. ^
Resumo:
The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. ^ For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power. ^
Resumo:
Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots. ^
Resumo:
The relative influence of race, income, education, and Food Stamp Program participation/nonparticipation on the food and nutrient intake of 102 fecund women ages 18-45 years in a Florida urban clinic population was assessed using the technique of multiple regression analysis. Study subgroups were defined by race and Food Stamp Program participation status. Education was found to have the greatest influence on food and nutrient intake. Race was the next most influential factor followed in order by Food Stamp Program participation and income. The combined effect of the four independent variables explained no more than 19 percent of the variance for any of the food and nutrient intake variables. This would indicate that a more complex model of influences is needed if variations in food and nutrient intake are to be fully explained.^ A socioeconomic questionnaire was administered to investigate other factors of influence. The influence of the mother, frequency and type of restaurant dining, and perceptions of food intake and weight were found to be factors deserving further study.^ Dietary data were collected using the 24-hour recall and food frequency checklist. Descriptive dietary findings indicated that iron and calcium were nutrients where adequacy was of concern for all study subgroups. White Food Stamp Program participants had the greatest number of mean nutrient intake values falling below the 1980 Recommended Dietary Allowances (RDAs). When Food Stamp Program participants were contrasted to nonparticipants, mean intakes of six nutrients (kilocalories, calcium, iron, vitamin A, thiamin, and riboflavin) were below the 1980 RDA compared to five mean nutrient intakes (kilocalories, calcium, iron, thiamin and riboflavin) for the nonparticipants. Use of the Index of Nutritional Quality (INQ), however, revealed that the quality of the diet of Food Stamp Program participants per 1000 kilocalories was adequate with exception of calcium and iron. Intakes of these nutrients were also not adequate on a 1000 kilocalorie basis for the nonparticipant group. When mean nutrient intakes of the groups were compared using Student's t-test oleicacid intake was the only significant difference found. Being a nonparticipant in the Food Stamp Program was found to be associated with more frequent consumption of cookies, sweet rolls, doughnuts, and honey. The findings of this study contradict the negative image of the Food Stamp Program participant and emphasize the importance of education. ^
Resumo:
The electroencephalogram (EEG) is a physiological time series that measures electrical activity at different locations in the brain, and plays an important role in epilepsy research. Exploring the variance and/or volatility may yield insights for seizure prediction, seizure detection and seizure propagation/dynamics.^ Maximal Overlap Discrete Wavelet Transforms (MODWTs) and ARMA-GARCH models were used to determine variance and volatility characteristics of 66 channels for different states of an epileptic EEG – sleep, awake, sleep-to-awake and seizure. The wavelet variances, changes in wavelet variances and volatility half-lives for the four states were compared for possible differences between seizure and non-seizure channels.^ The half-lives of two of the three seizure channels were found to be shorter than all of the non-seizure channels, based on 95% CIs for the pre-seizure and awake signals. No discernible patterns were found the wavelet variances of the change points for the different signals. ^