Digital Library

15 results for Linear multivariate methods

in DigitalCommons@The Texas Medical Center

Multivariate methods for correlated effects sizes

Relevance:

100.00%

Abstract:

Current statistical methods for estimation of parametric effect sizes from a series of experiments are generally restricted to univariate comparisons of standardized mean differences between two treatments. Multivariate methods are presented for the case in which effect size is a vector of standardized multivariate mean differences and the number of treatment groups is two or more. The proposed methods employ a vector of independent sample means for each response variable that leads to a covariance structure which depends only on correlations among the $p$ responses on each subject. Using weighted least squares theory and the assumption that the observations are from normally distributed populations, multivariate hypotheses analogous to common hypotheses used for testing effect sizes were formulated and tested for treatment effects which are correlated through a common control group, through multiple response variables observed on each subject, or both conditions.

The asymptotic multivariate distribution for correlated effect sizes is obtained by extending univariate methods for estimating effect sizes which are correlated through common control groups. The joint distributions of vectors of effect sizes (from $p$ responses on each subject) from one treatment and one control group and from several treatment groups sharing a common control group are derived. Methods are given for estimation of linear combinations of effect sizes when certain homogeneity conditions are met, and for estimation of vectors of effect sizes and confidence intervals from $p$ responses on each subject. Computational illustrations are provided using data from studies of effects of electric field exposure on small laboratory animals.
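The univariate starting point mentioned in this abstract, effect sizes correlated through a common control group, can be illustrated with a short sketch. It assumes the widely used large-sample approximation for standardized mean differences that share one control group; this is an illustration only, not the dissertation's multivariate derivation. The function name and arguments are hypothetical.

```python
def common_control_covariance(d, n_treat, n_control):
    """Approximate covariance matrix of standardized mean differences
    d[j] = (mean_j - mean_control) / s_pooled that share one control group.

    Assumed large-sample approximation (not taken from the abstract):
      var(d_j)      ~ 1/n_j + 1/n_control + d_j**2 / (2 * n_total)
      cov(d_i, d_j) ~         1/n_control + d_i * d_j / (2 * n_total)
    """
    n_total = n_control + sum(n_treat)
    k = len(d)
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # Term shared by every pair: it comes from the common control group.
            shared = 1.0 / n_control + d[i] * d[j] / (2.0 * n_total)
            # Only the diagonal gets the treatment group's own sampling term.
            cov[i][j] = shared + (1.0 / n_treat[i] if i == j else 0.0)
    return cov
```

The off-diagonal entries are nonzero even when the treatment groups are independent of each other, which is why analyses that ignore the shared control group misstate the precision of combined estimates.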

Airliner cabin air quality exposure assessment

Relevance:

80.00%

Abstract:

The airliner cabin environment and its effects on occupant health have not been fully characterized. This dissertation is: (1) A review of airliner environmental control systems (ECSs) that modulate the ventilation, temperature, relative humidity (RH), and barometric pressure (PB) of the cabin environment---variables related to occupant comfort and health. (2) A review and assessment of the methods and findings of key cabin air quality (CAQ) investigations. Several significant deficiencies impede the drawing of inferences about CAQ, e.g., lack of detail about investigative methods, differences in methods between investigations, limited assessment of CAQ variables, small sample sizes, and technological deficiencies of data collection. (3) A comprehensive evaluation of the methods used in the subsequent NIOSH-FAA Airliner CAQ Exposure Assessment Feasibility Study (STUDY) in which this author participated. A number of problems were identified which limit the usefulness of the data. (4) An analysis of the reliable 10-flight STUDY data. Univariate and multivariate methods applied to CO2 (a surrogate for air contaminants), temperature, RH, and PB, in association with percent passenger load, ventilation system, flight duration, airliner body type, and measurement location within the cabin, revealed that neither the measured values nor their variability exceeded established health-based exposure limits. Regression analyses suggest CO2, temperature, and RH were affected by percent passenger load. In-flight measurements of CO2 and RH were relatively independent of ventilation system type or flight duration. Cabin temperature was associated with percent passenger load, ventilation system type, and flight duration. (5) A synthesis of the implications of the airliner ECS and cabin O2 environment on occupant health. A model was developed to predict consequences of the 8,000 ft airliner cabin pressure altitude limit and the resulting model-estimated PO2 on cardiopulmonary status.
Based on the PB, altitude, and environmental data derived from the 10 STUDY flights, the predicted PaO2 of adults with COPD, or elderly adults with or without COPD, breathing ambient cabin air could be < 55 mm Hg (SaO2 < 88%). Reduction in cabin PB found in the STUDY flights could aggravate various medical conditions and require the use of in-flight supplemental O2.
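The link between cabin pressure altitude and available oxygen can be sketched as follows. The abstract does not describe the dissertation's model, so this sketch makes two assumptions of its own: the standard-atmosphere barometric formula for PB, and the usual expression for inspired PO2 in humidified air (water vapor pressure of 47 mm Hg at body temperature). It yields inspired PO2 only, not the arterial PaO2 values reported above. All names are hypothetical.

```python
FT_TO_M = 0.3048          # feet to metres
PA_PER_MMHG = 133.322     # pascals per mm Hg

def barometric_pressure_mmhg(altitude_ft):
    """Standard-atmosphere pressure (troposphere) at a given pressure altitude."""
    h = altitude_ft * FT_TO_M
    pascals = 101325.0 * (1.0 - 2.25577e-5 * h) ** 5.25588
    return pascals / PA_PER_MMHG

def inspired_po2_mmhg(pb_mmhg, fio2=0.2093, ph2o=47.0):
    """Partial pressure of O2 in humidified inspired air: FiO2 * (PB - PH2O)."""
    return fio2 * (pb_mmhg - ph2o)

# Compare sea level with the 8,000 ft cabin pressure altitude limit.
for altitude in (0, 8000):
    pb = barometric_pressure_mmhg(altitude)
    print(altitude, round(pb, 1), round(inspired_po2_mmhg(pb), 1))
```

Predicting arterial oxygenation for a specific patient group would need further physiological inputs that the abstract does not give.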

An epidemiologic evaluation of a worksite based intervention

Relevance:

80.00%

Abstract:

The healthcare industry spends billions on worker injury and employee turnover. Hospitals and healthcare settings have one of the highest rates of lost days due to injuries. The occupational hazards for healthcare workers can be classified into biological, chemical, ergonomic, physical, organizational, and psychosocial. Therefore, interventions addressing a range of occupational health risks are needed to prevent injuries and reduce turnover and costs.

The Sacred Vocation Program (SVP) seeks to change the content of work, i.e., the meaningfulness of work, to improve work environments. The SVP intervenes at both the individual and organizational level. First, the SVP attempts to connect healthcare workers with meaning from their work through a series of 5 self-discovery group sessions. In a sixth session the graduates take an oath recommitting them to do their work as a vocation. Once motivated to connect with meaning in their work, a representative employee group meets in a second set of five meetings. This representative group suggests organizational changes to create a culture that supports employees in their calling. The employees present their plan in the twelfth session to management, beginning a new phase in the existing dialogue between employees and management.

The SVP was implemented in a large Dallas hospital (almost 1000 licensed beds). The Baylor University Medical Center (BUMC) Pastoral Care department invited front-line caregivers (primarily Patient Care Assistants, PCAs, or Patient Care Technicians, PCTs) to participate in the SVP. Participants completed SVP questionnaires at the beginning and following SVP implementation. Following implementation, employer records were collected on injury, absence and turnover to further evaluate the program's effectiveness on metrics that are meaningful to managers in assessing organizational performance.
This provided an opportunity to perform an epidemiological evaluation of the intervention using the two sources of information: employee self-reports and employer administrative data.

The ability to evaluate the effectiveness of the SVP on program outcomes could be limited by the strength of the measures used. An ordinal CFA performed on baseline SVP questionnaire measurements examined the construct validity and reliability of the SVP scales. Scales whose item-factor structure was confirmed in ordinal CFA were evaluated for their psychometric properties (i.e., reliability, mean, ceiling and floor effects). CFA supported the construct validity of six of the proposed scales: blocks to spirituality, meaning at work, work satisfaction, affective commitment, collaborative communication, and MHI-5. Five of the six confirmed scales had acceptable measures of reliability (all but MHI-5 had α>0.7). All six scales had a high percentage (>30%) of the scores at the ceiling. These findings supported the use of these items in the evaluation of change, although strong ceiling effects may hinder discerning change.

Next, the confirmed SVP scales were used to evaluate whether the intervention improved program constructs. To evaluate the SVP, a one-group pretest-posttest design compared participants’ self-reports before and after the intervention. It was hypothesized that measurements of reduced blocks to spirituality (α = 0.76), meaning at work (α = 0.86), collaborative communication (α = 0.67) and SVP job tasks (α = 0.97) would improve following SVP implementation. The SVP job tasks scale was included even though it was not included in the ordinal CFA analysis due to a limited sample and high inter-item correlation. Changes in scaled measurements were assessed using multilevel linear regression methods.
All post-intervention measurements increased (increases <0.28 points), but only reduced blocks to spirituality was statistically significant (0.22 points on a scale from 1 to 7, p < 0.05) after adjustment for covariates. Stratifying on high-participation units (greater intervention intensity) strengthened the effects, but they were not statistically significant. The findings provide preliminary support for the hypothesis that meaning in work can be improved and, importantly, lend greater credence to any observed improvements in the outcomes. (Abstract shortened by UMI.)
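The two psychometric summaries reported in this abstract, the reliability coefficient α and the share of scores at the ceiling, can be sketched as below. This assumes that α is Cronbach's alpha computed from raw item scores and that "ceiling" means the maximum possible scale score; the abstract does not state how either was computed, and the function names are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)

def ceiling_fraction(scores, max_score):
    """Fraction of respondents whose score equals the scale maximum."""
    return sum(1 for s in scores if s == max_score) / len(scores)
```

A scale can be reliable and still insensitive to change: when more than 30% of baseline scores already sit at the maximum, those respondents cannot register improvement.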

BAYESIAN STATISTICAL METHODS IN GENE-ENVIRONMENT AND GENE-GENE INTERACTION STUDIES

Relevance:

40.00%

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies only explain a small portion of variations in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics and genomics research and have demonstrated superiority over some conventional approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their functionality and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for the hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both of the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in the studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model that does not impose this hierarchical constraint and observe their superior performances in most of the considered situations. The proposed models are implemented in the real data analysis of gene and environment interactions in the cases of lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models enjoy the properties of being allowed to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of the performances on the parameter estimation and variable selection in most cases. 
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with the existing models including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
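The strong, weak and independent hierarchy rules compared in this abstract can be stated as a small predicate. This is only a sketch of the inclusion rule as the abstract words it; it is not the Bayesian mixture prior itself, and the function and argument names are hypothetical.

```python
def interaction_allowed(main_a_included, main_b_included, rule):
    """Decide whether an interaction A x B may enter the model.

    strong:      both main effects must be in the model
    weak:        at least one main effect must be in the model
    independent: no hierarchical constraint is imposed
    """
    if rule == "strong":
        return main_a_included and main_b_included
    if rule == "weak":
        return main_a_included or main_b_included
    if rule == "independent":
        return True
    raise ValueError(f"unknown hierarchy rule: {rule!r}")

# Example: gene G is in the model, exposure E is not.
for rule in ("strong", "weak", "independent"):
    print(rule, interaction_allowed(True, False, rule))
```

Under the stricter rules fewer interaction terms are eligible, which is consistent with the abstract's claim that irrelevant interactions are removed more efficiently.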

A comparison of Markov and generalized linear models of hospital mortality

Relevance:

30.00%

Abstract:

This paper reports a comparison of three modeling strategies for the analysis of hospital mortality in a sample of general medicine inpatients in a Department of Veterans Affairs medical center. Logistic regression, a Markov chain model, and longitudinal logistic regression were evaluated on predictive performance as measured by the c-index and on accuracy of expected numbers of deaths compared to observed. The logistic regression used patient information collected at admission; the Markov model comprised two absorbing states for discharge and death and three transient states reflecting increasing severity of illness as measured by laboratory data collected during the hospital stay; longitudinal regression employed Generalized Estimating Equations (GEE) to model covariance structure for the repeated binary outcome. Results showed that the logistic regression predicted hospital mortality as well as the alternative methods but was limited in scope of application. The Markov chain provides insights into how day-to-day changes of illness severity lead to discharge or death. The longitudinal logistic regression showed that increasing illness trajectory is associated with hospital mortality. The conclusion is reached that for standard applications in modeling hospital mortality, logistic regression is adequate, but for new challenges facing health services research today, alternative methods are equally predictive, practical, and can provide new insights.
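The c-index used above to compare predictive performance can be sketched for a binary outcome as the proportion of (death, survivor) pairs in which the patient who died received the higher predicted risk, with ties counted as one half. The function name and the toy data are hypothetical; the paper's own computation is not described in the abstract.

```python
def c_index(predicted_risk, died):
    """Concordance index for a binary outcome (equivalent to the ROC area).

    predicted_risk: model-predicted probabilities of death
    died:           1 if the patient died in hospital, 0 otherwise
    """
    concordant = 0.0
    pairs = 0
    for p_i, y_i in zip(predicted_risk, died):
        for p_j, y_j in zip(predicted_risk, died):
            if y_i == 1 and y_j == 0:      # one death paired with one survivor
                pairs += 1
                if p_i > p_j:
                    concordant += 1.0
                elif p_i == p_j:
                    concordant += 0.5      # tied predictions count as half
    if pairs == 0:
        raise ValueError("need at least one death and one survivor")
    return concordant / pairs

print(c_index([0.9, 0.4, 0.35, 0.1], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of deaths from survivors.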

Genetic alterations associated with prostate cancer progression

Relevance:

30.00%

Abstract:

Prostate cancer is the second most commonly diagnosed cancer among men in the United States. In this study, evidence is presented to support the hypothesis that specific chromosomal aberrations (involving one or more chromosomal regions) are associated with prostate cancer progression from organ-confined to locally advanced tumors and that some aberrations seen in high frequency in metastatic tumors may also be present in a subset of primary tumors. To determine the appropriate approach to address this hypothesis, I have established a modified CGH protocol by microdissection and DOP-PCR for use in detecting chromosomal changes in clinical prostate tumor specimens that is more sensitive and accurate than conventional CGH methods. I have successfully performed the improved CGH protocol to screen for genetic changes of 24 organ confined (pT2) and 21 locally advanced (pT3b) clinical prostate cancer specimens without metastases (N0M0). Comparisons of tumors by stage or Gleason scores following contingency table analysis showed that seven regions of the genome differed significantly between pT2 and pT3b tumors or between low and high Gleason tumors suggesting that these regions may be important in local prostate cancer progression. These included losses on 6p21–25, 6q24–27, 8p, 10q25–26, 15q22–26, and 18cen–q12 as well as gain of 3p13–q13. Multivariate analyses showed that loss of 8p (step1) and loss of 6q25–26 (or 6p21–25 or 10q25–26) (step 2) were predictive of pathologic stage or Gleason groups with 80% accuracy. Additional 5–7 steps in the multivariate model increased the predictive value to 91–95%. Comparison of the CGH data from the primary prostate tumors of this study with those obtained from published literature on metastases and recurrent tumors showed that the clinically more aggressive stage pT3b tumors shared more abnormalities in high frequency with metastases and recurrent tumors than less aggressive stage pT2 tumors. 
Furthermore, loss of 11cen–q22 was shared only between the primary tumors and metastases while gain of Xcen–q13 and loss of 18cen–q12 were in common between primary and recurrent tumors. These analyses suggest that the multistage model of prostate cancer progression is not linear and that some early primary tumors may be predisposed to metastasize or evolve into recurrent tumors due to the presence of specific genetic alterations.

Bayesian generalized linear models for meta-analysis of diagnostic tests

Relevance:

30.00%

Abstract:

With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variance of sensitivity, specificity and correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model where the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the 'Bayesian inference using Gibbs sampling' implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from Bayesian bivariate models are not as good as those obtained from frequentist estimation regardless of which prior distribution was used for the covariance matrix. 
The Bayesian multinomial model consistently underestimated the sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of the following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction.

Analysis of prognostic factors for the development of metastases and survival in renal cell carcinoma patients

Relevance:

30.00%

Abstract:

Introduction and objective. A number of prognostic factors have been reported for predicting survival in patients with renal cell carcinoma. Yet few studies have analyzed the effects of those factors at different stages of the disease process. In this study, different stages of disease progression starting from nephrectomy to metastasis, from metastasis to death, and from evaluation to death were evaluated.

Methods. In this retrospective follow-up study, records of 97 deceased renal cell carcinoma (RCC) patients were reviewed between September 2006 and October 2006. Patients with TNM Stage IV disease before nephrectomy or with cancer diagnoses other than RCC were excluded, leaving 64 records for analysis. Patient TNM staging, Fuhrman Grade, age, tumor size, tumor volume, histology and patient gender were analyzed in relation to time to metastases. Time from nephrectomy to metastasis, TNM staging, Fuhrman Grade, age, tumor size, tumor volume, histology and patient gender were tested for significance in relation to time from metastases to death. Finally, laboratory values at time of evaluation, Eastern Cooperative Oncology Group performance status (ECOG), UCLA Integrated Staging System (UISS), time from nephrectomy to metastasis, TNM staging, Fuhrman Grade, age, tumor size, tumor volume, histology and patient gender were tested for significance in relation to time from evaluation to death. Linear regression and Cox Proportional Hazard (univariate and multivariate) were used for testing significance. The Kaplan-Meier Log-Rank test was used to detect any significance between groups at various endpoints.

Results. Compared to negative lymph nodes at time of nephrectomy, a single positive lymph node was associated with significantly shorter time to metastasis (p<0.0001). Compared to other histological types, clear cell histology was significantly associated with metastasis-free survival (p=0.003).
Clear cell histology compared to other types (p=0.0002 univariate, p=0.038 multivariate) and time to metastasis with log conversion (p=0.028) significantly affected time from metastasis to death. Metastasis-free intervals greater than one year and greater than two years, compared to metastasis before one and two years, were associated with statistically significant survival benefit (p=0.004 and p=0.0318). Time from evaluation to death was affected by a metastasis-free interval greater than one year (p=0.0459), alcohol consumption (p=0.044), LDH (p=0.006), ECOG performance status (p<0.001), and hemoglobin level (p=0.0092). The UISS risk stratified the patient population in a statistically significant manner for survival (p=0.001). No other factors were found to be significant.

Conclusion. Clear cell histology is predictive for both time to metastasis and metastasis to death. Nodal status at time of nephrectomy may predict risk of metastasis. The time interval to metastasis significantly predicts time from metastasis to death and time from evaluation to death. ECOG performance status and hemoglobin level predict survival outcome at evaluation. Finally, UISS appropriately stratifies risk in our population.
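The survival comparisons above rest on Kaplan-Meier estimates. A minimal sketch of the product-limit estimator is given below, assuming times in any consistent unit and an event indicator of 1 for an observed event and 0 for censoring. The function name and the toy data are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # censored subjects leave the risk set as well
    return curve

print(kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0]))
```

Comparing two such curves, for example by nodal status, would then call for a log-rank test, which this sketch does not implement.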

A multivariate frailty model for disease recurrences and survival

Relevance:

30.00%

Abstract:

A multivariate frailty hazard model is developed for joint modeling of three correlated time-to-event outcomes: (1) local recurrence, (2) distant recurrence, and (3) overall survival. The term frailty is introduced to model population heterogeneity. The dependence is modeled by conditioning on a shared frailty that is included in the three hazard functions. Independent variables can be included in the model as covariates. Markov chain Monte Carlo methods are used to estimate the posterior distributions of model parameters. The algorithm used in the present application is the hybrid Metropolis-Hastings algorithm, which simultaneously updates all parameters using evaluations of the gradient of the log posterior density. The performance of this approach is examined in simulation studies using Exponential and Weibull distributions. We apply the proposed methods to a study of patients with soft tissue sarcoma, which motivated this research. Our results indicate that patients who received chemotherapy had better overall survival, with a hazard ratio of 0.242 (95% CI: 0.094 - 0.564), and lower risk of distant recurrence, with a hazard ratio of 0.636 (95% CI: 0.487 - 0.860), but no significant improvement in local recurrence, with a hazard ratio of 0.799 (95% CI: 0.575 - 1.054). The advantages and limitations of the proposed models, and future research directions, are discussed.

Statistical and methodological challenges for disaster preparedness and medical needs assessment in Rio Grande Valley of Texas

Relevance:

30.00%

Abstract:

In recent years, disaster preparedness through assessment of medical and special needs persons (MSNP) has taken center stage in the public eye as a result of frequent natural disasters such as hurricanes, storm surge or tsunami due to climate change and increased human activity on our planet. Statistical methods for complex survey design and analysis have likewise gained significance as a consequence. However, many challenges remain in generalizing such assessments to the target population for policy-level advocacy and implementation.

Objective. This study discusses the use of some of the statistical methods for disaster preparedness and medical needs assessment to support local and state governments in policy-level decision making and logistic support, so as to avoid loss of life and property in future calamities.

Methods. In order to obtain precise and unbiased estimates for Medical Special Needs Persons (MSNP) and disaster preparedness for evacuation in the Rio Grande Valley (RGV) of Texas, a stratified and cluster-randomized multi-stage sampling design was implemented. US School of Public Health, Brownsville surveyed 3088 households in three counties, namely Cameron, Hidalgo, and Willacy. Multiple statistical methods were implemented and estimates were obtained taking into account the probability of selection and clustering effects. The statistical methods for data analysis discussed were Multivariate Linear Regression (MLR), Survey Linear Regression (Svy-Reg), Generalized Estimation Equation (GEE) and Multilevel Mixed Models (MLM), all with and without sampling weights.

Results. The estimated population for RGV was 1,146,796. It was 51.5% female, 90% Hispanic, 73% married, and 56% unemployed, and 37% had personal transport. 40% of people had attained education up to elementary school, another 42% had reached high school, and only 18% had gone to college. Median household income was less than $15,000/year. MSNP were estimated to be 44,196 (3.98%) [95% CI: 39,029; 51,123].
All statistical models are in concordance, with MSNP estimates ranging from 44,000 to 48,000. MSNP estimates by statistical method are: MLR (47,707; 95% CI: 42,462; 52,999), MLR with weights (45,882; 95% CI: 39,792; 51,972), Bootstrap Regression (47,730; 95% CI: 41,629; 53,785), GEE (47,649; 95% CI: 41,629; 53,670), GEE with weights (45,076; 95% CI: 39,029; 51,123), Svy-Reg (44,196; 95% CI: 40,004; 48,390) and MLM (46,513; 95% CI: 39,869; 53,157).

Conclusion. RGV is a flood zone, most susceptible to hurricanes and other natural disasters. People in the region are mostly Hispanic and under-educated, with the lowest income levels in the U.S. In any disaster, people are largely incapacitated, with only 37% having personal transport to take care of MSNP. Local and state government intervention in terms of planning, preparation and support for evacuation is necessary in any such disaster to avoid loss of precious human life.

Key words: Complex Surveys, statistical methods, multilevel models, cluster randomized, sampling weights, raking, survey regression, generalized estimation equations (GEE), random effects, Intracluster correlation coefficient (ICC).

THE USE OF MULTIVARIATE ANALYSIS OF COVARIANCE IN EVALUATING THE COMPARABILITY OF LABORATORY DETERMINATIONS IN COOPERATIVE STUDIES

Relevance:

30.00%

Abstract:

The role of clinical chemistry has traditionally been to evaluate acutely ill or hospitalized patients. Traditional statistical methods have serious drawbacks in that they use univariate techniques. To demonstrate alternative methodology, a multivariate analysis of covariance model was developed and applied to the data from the Cooperative Study of Sickle Cell Disease (CSSCD).

The purpose of developing the model for the laboratory data from the CSSCD was to evaluate the comparability of the results from the different clinics. Several variables were incorporated into the model in order to control for possible differences among the clinics that might confound any real laboratory differences.

Differences for LDH, alkaline phosphatase and SGOT were identified which will necessitate adjustments by clinic whenever these data are used. In addition, aberrant clinic values for LDH, creatinine and BUN were also identified.

The use of any statistical technique including multivariate analysis without thoughtful consideration may lead to spurious conclusions that may not be corrected for some time, if ever. However, the advantages of multivariate analysis far outweigh its potential problems. If its use increases as it should, the applicability to the analysis of laboratory data in prospective patient monitoring, quality control programs, and interpretation of data from cooperative studies could well have a major impact on the health and well being of a large number of individuals.

Detecting genetic and nutritional lung cancer risk factors related to folate metabolism using Bayesian generalized linear models

Relevance:

30.00%

Abstract:

Complex diseases, such as cancer, are caused by various genetic and environmental factors, and their interactions. Joint analysis of these factors and their interactions would increase the power to detect risk factors but is statistically challenging. Bayesian generalized linear modeling with Student-t prior distributions on coefficients is a novel method to simultaneously analyze genetic factors, environmental factors, and interactions. I performed simulation studies using three different disease models and demonstrated that the variable selection performance of Bayesian generalized linear models is comparable to that of Bayesian stochastic search variable selection, an improved method for variable selection when compared to standard methods. I further evaluated the variable selection performance of Bayesian generalized linear models using different numbers of candidate covariates and different sample sizes, and provided a guideline for the sample size required to achieve high power of variable selection using Bayesian generalized linear models, considering different scales of the number of candidate covariates.

Polymorphisms in folate metabolism genes and nutritional factors have been previously associated with lung cancer risk. In this study, I simultaneously analyzed 115 tag SNPs in folate metabolism genes, 14 nutritional factors, and all possible genetic-nutritional interactions from 1239 lung cancer cases and 1692 controls using Bayesian generalized linear models stratified by never, former, and current smoking status. SNPs in MTRR were significantly associated with lung cancer risk across never, former, and current smokers. In never smokers, three SNPs in TYMS and three gene-nutrient interactions, including an interaction between SHMT1 and vitamin B12, an interaction between MTRR and total fat intake, and an interaction between MTR and alcohol use, were also identified as associated with lung cancer risk. These lung cancer risk factors are worthy of further investigation.

New methods for quantification and analysis of quantitative real-time polymerase chain reaction data

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantitation method that has been widely used in the biological and biomedical fields. The methods currently used for PCR data analysis, including the threshold cycle (CT) method and linear and non-linear model fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence is usually inaccurate and can therefore distort results. Here, we propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each pair of consecutive PCR cycles, we subtracted the fluorescence in the former cycle from that in the latter cycle, transforming the n-cycle raw data into n-1 cycle data. Linear regression was then applied to the natural logarithm of the transformed data. Finally, the amplification efficiency and the initial number of DNA molecules were calculated for each PCR run. To evaluate this new method, we compared its accuracy and precision with those of the original linear regression method under three background corrections: the mean of cycles 1-3, the mean of cycles 3-7, and the minimum. Three criteria (threshold identification, max R2, and max slope) were employed to search for target data points. Because PCR data are time series data, we also applied linear mixed models. Collectively, when the threshold identification criterion was applied and the linear mixed model was adopted, the taking-difference linear regression method was superior, as it gave an accurate estimate of the initial DNA amount and a reasonable estimate of the PCR amplification efficiency. When the max R2 and max slope criteria were used, the original linear regression method gave an accurate estimate of the initial DNA amount. Overall, the taking-difference linear regression method avoids the error introduced by subtracting an unknown background and is thus theoretically more accurate and reliable. The method is easy to perform, and the taking-difference strategy can be extended to all current methods for qPCR data analysis.^
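The taking-difference idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, beyond what the abstract states, that fluorescence in the exponential phase follows F_k = B + F0 * E^k, where B is an unknown constant background, F0 is the initial signal, and E is the per-cycle amplification factor. Under that assumption the difference of consecutive cycles, F_(k+1) - F_k = F0 * (E - 1) * E^k, no longer contains B, so regressing its natural logarithm on k gives a slope of ln(E) and an intercept of ln(F0 * (E - 1)). The function name and arguments are hypothetical, and the selection of target data points (threshold identification, max R2, max slope) is not covered.

```python
import math


def taking_difference_fit(fluorescence, first_cycle=1):
    """Estimate (E, F0) from raw fluorescence without background subtraction.

    Assumed model (not taken from the source): F_k = B + F0 * E**k.
    `fluorescence` holds readings for consecutive cycles, starting at
    cycle number `first_cycle`.
    """
    # n cycles of raw data become n-1 differences (later minus former).
    diffs = [later - former
             for former, later in zip(fluorescence, fluorescence[1:])]
    if len(diffs) < 2:
        raise ValueError("need at least three consecutive cycles")
    if any(d <= 0 for d in diffs):
        raise ValueError("differences must be positive to take logarithms")

    cycles = [first_cycle + i for i in range(len(diffs))]
    log_diffs = [math.log(d) for d in diffs]

    # Ordinary least squares of ln(difference) on cycle number.
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(log_diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(cycles, log_diffs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    amplification = math.exp(slope)                     # E
    initial = math.exp(intercept) / (amplification - 1)  # F0
    return amplification, initial


# Synthetic noise-free run: background 5.0, F0 = 1e-3, E = 1.9, cycles 1..10.
readings = [5.0 + 1e-3 * 1.9 ** k for k in range(1, 11)]
est_e, est_f0 = taking_difference_fit(readings)
```

On noise-free synthetic data like this, the estimates should match the simulated E and F0 closely even though the background was never estimated. Real runs contain noise and a plateau phase, so the choice of cycles passed to the function matters.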

Application of the general linear model to assess the effect of missing data on bone marrow mononuclear cell therapy with infusion timing and follow-up MRI

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In most clinical trials, missing data present a statistical problem in evaluating a treatment's efficacy. Many methods are commonly used to handle missing data; however, these methods leave room for bias to enter the study. This thesis was a secondary analysis of data taken from TIME, a phase 2 randomized clinical trial conducted to evaluate the safety and effect of the administration timing of bone marrow mononuclear cells (BMMNC) for subjects with acute myocardial infarction (AMI).^ We evaluated the effect of missing data by comparing the variance inflation factor (VIF) of the effect of therapy between all subjects and only subjects with complete data. Through the general linear model, an unbiased solution was derived for the VIF of the treatment's efficacy, using the weighted least squares method to incorporate missing data. Two groups were identified from the TIME data: 1) all subjects and 2) subjects with complete data (baseline and follow-up measurements). After the general solution for the VIF was found, it was migrated to Excel 2010 to evaluate data from TIME. The resulting numerical values from the two groups were compared to assess the effect of missing data.^ The VIF values from the TIME study were considerably lower in the group that included subjects with missing data. By design, we varied the correlation factor in order to evaluate the VIFs of both groups. As the correlation factor increased, the VIF values increased at a faster rate in the group with only complete data. Furthermore, while varying the correlation factor, we also varied the number of subjects with missing data to see how missing data affect the VIF. When the number of subjects with only baseline data was increased, the VIF values rose markedly faster in the group with only complete data, while the group that included missing data saw a steady and consistent increase in the VIF. The same was seen when we varied the number of subjects with follow-up data only. In essence, the VIFs increased steadily when missing data were not ignored. When missing data were ignored, as in our comparison group, the VIF values increased sharply as the correlation increased.^

Robust effect sizes and their confidence intervals for group difference between trajectories in hierarchical linear growth model

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The hierarchical linear growth model (HLGM), a flexible and powerful analytic method, has played an increasingly important role in psychology, public health, and the medical sciences in recent decades. Researchers who use HLGM are mostly interested in the treatment effect on individual trajectories, which can be indicated by the cross-level interaction effects. However, the statistical hypothesis test for the cross-level interaction effect in HLGM shows only whether there is a significant group difference in the average rate of change, rate of acceleration, or higher polynomial effect; it fails to convey information about the magnitude of the difference between the group trajectories at a specific time point. Reporting and interpreting effect sizes have therefore received increased emphasis in HLGM in recent years, owing to the limitations of statistical hypothesis testing and the growing criticism of it. Nevertheless, most researchers fail to report these model-implied effect sizes for comparing group trajectories, and their corresponding confidence intervals, in HLGM analysis. The reasons are the lack of appropriate and standard functions for estimating effect sizes associated with the model-implied difference between group trajectories in HLGM, and the lack of computing packages in popular statistical software to calculate them automatically. ^ The present project is the first to establish appropriate computing functions to assess the standardized difference between group trajectories in HLGM. We proposed two functions to estimate effect sizes for the model-based difference between group trajectories at a specific time, and we also suggested robust effect sizes to reduce the bias of the estimated effect sizes. We then applied the proposed functions to estimate the population effect sizes (d) and robust effect sizes (du) for the cross-level interaction in HLGM using three simulated datasets. We also compared three methods of constructing confidence intervals around d and du and recommended the best one for application. Finally, we constructed 95% confidence intervals with the suitable method for the effect sizes obtained from the three simulated datasets. ^ The effect sizes between group trajectories for the three simulated longitudinal datasets indicated that even when the statistical hypothesis test shows no significant difference between group trajectories, the effect sizes between these trajectories can still be large at some time points. Effect sizes between group trajectories in HLGM analysis therefore provide additional and meaningful information for assessing the group effect on individual trajectories. In addition, we compared three methods of constructing 95% confidence intervals around the corresponding effect sizes, which address the uncertainty of the effect sizes as estimates of the population parameter. We suggest the noncentral t-distribution based method when its assumptions hold, and the bootstrap bias-corrected and accelerated method when the assumptions are not met.^