28 results for Generalized Logistic Model

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that flag individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that reduce cancer risk and mortality. In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model will be flexible enough to handle not only haplotypes of any size, but also a variety of covariates. I will then apply this method to three cancer-related data sets (Hodgkin Disease, Glioma, and Lung Cancer). I hypothesize that estimation of the association between haplotypes and disease improves substantially when haplotypes are inferred with a Bayesian method that uses prior information from known genetic sources. Analyses based on haplotypes inferred using information from publicly available genetic sources generally show increased odds ratios and smaller p-values in all three data sets (Hodgkin, Glioma, and Lung). For instance, the haplotype TC inferred by the Bayesian Joint Logistic Model (BJLM) had a substantially higher estimated effect size (OR = 12.16, 95% CI = 2.47-90.1 vs. OR = 9.24, 95% CI = 1.81-47.2) and a more significant p-value (0.00044 vs. 0.008) for Hodgkin Disease than under a traditional logistic regression approach. Also, the effect sizes of haplotypes modeled with recessive genetic effects were higher (and had more significant p-values) when analyzed with the BJLM. Full genetic models with haplotype information developed with the BJLM resulted in significantly higher discriminatory power and a significantly higher Net Reclassification Index compared to those developed with haplo.stats for lung cancer.
Future work could incorporate the 1000 Genomes Project, which offers a larger selection of SNPs that can be added to the information from known genetic sources. Other future analyses include testing non-binary outcomes, such as the levels of biomarkers that are present in lung cancer (NNK), and extending this analysis to full GWAS studies.
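The Net Reclassification Index mentioned in this abstract can be illustrated with a short sketch. It assumes the common categorical definition (net proportion of events moved to a higher risk category plus net proportion of non-events moved to a lower one); the abstract does not say which variant was used, and the function name and toy data below are hypothetical.

```python
def net_reclassification_index(old_category, new_category, event):
    """Categorical NRI comparing risk categories from an old and a new model.

    old_category, new_category: integer risk categories (higher = riskier)
    event: 1 if the subject had the outcome, 0 otherwise
    """
    n_event = sum(event)
    n_nonevent = len(event) - n_event
    up_event = down_event = up_nonevent = down_nonevent = 0
    for old, new, e in zip(old_category, new_category, event):
        if new > old:            # moved to a higher risk category
            if e:
                up_event += 1
            else:
                up_nonevent += 1
        elif new < old:          # moved to a lower risk category
            if e:
                down_event += 1
            else:
                down_nonevent += 1
    # Upward moves are "correct" for events, downward moves for non-events.
    return ((up_event - down_event) / n_event
            + (down_nonevent - up_nonevent) / n_nonevent)


# Toy data for illustration only (not from the dissertation).
old = [0, 0, 1, 1, 0, 1]
new = [1, 0, 1, 0, 0, 1]
event = [1, 1, 1, 0, 0, 0]
nri = net_reclassification_index(old, new, event)
```

A positive value indicates that the new model reclassifies subjects in the right direction more often than in the wrong one.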

Relevance:

90.00%

Publisher:

Abstract:

Ordinal logistic regression models are used to analyze a dependent variable with multiple outcomes that can be ranked, but they have been underutilized. In this methodological study, we describe four logistic regression models for analyzing an ordinal response variable: the multinomial logistic model, the adjacent-category logit model, the proportional odds model, and the continuation-ratio model. We illustrate and compare the fit of these models using data from the survey designed by the University of Texas School of Public Health research project PCCaSO (Promoting Colon Cancer Screening in people 50 and Over) to study patients' confidence in completing colorectal cancer screening (CRCS). The purpose of this study is twofold: first, to provide a synthesized review of models for analyzing data with an ordinal response, and second, to evaluate their usefulness in epidemiological research, with particular emphasis on model formulation, interpretation of model coefficients, and their implications. The four models used in this study are (1) the multinomial logistic model, (2) the adjacent-category logistic model [9], (3) the continuation-ratio logistic model [10], and (4) the proportional odds logistic model [11]. We recommend that the analyst perform (1) goodness-of-fit tests and (2) a sensitivity analysis by fitting and comparing different models.
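Of the four models listed, the proportional odds model is perhaps the easiest to sketch. The example below assumes the usual parameterization P(Y <= j | x) = logistic(alpha_j - beta . x); the cutpoints and coefficient are hypothetical values chosen for illustration, not estimates from the PCCaSO survey.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(cutpoints, beta, x):
    """Category probabilities under P(Y <= j | x) = logistic(alpha_j - beta . x).

    cutpoints: increasing thresholds alpha_1 < ... < alpha_{J-1}
    beta, x:   coefficient and covariate vectors of equal length
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cumulative = [logistic(a - eta) for a in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


def cumulative_odds(cutpoints, beta, x):
    """Odds of Y <= j versus Y > j at each cutpoint."""
    eta = sum(b * xi for b, xi in zip(beta, x))
    return [math.exp(a - eta) for a in cutpoints]


# Hypothetical values: three ordered categories, one covariate.
cutpoints = [-0.5, 1.0]
beta = [0.8]
p0 = proportional_odds_probs(cutpoints, beta, [0.0])
p1 = proportional_odds_probs(cutpoints, beta, [1.0])
```

The defining property is that the ratio of cumulative odds between two covariate values is the same at every cutpoint, which is what a goodness-of-fit comparison against the less constrained models would probe.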

Relevance:

90.00%

Publisher:

Abstract:

Unlike infections occurring during periods of chemotherapy-induced neutropenia, postoperative infections in patients with solid malignancy remain largely understudied. The purpose of this population-based study was to evaluate the clinical and economic burden of serious postoperative infection (SPI), i.e., bacteremia/sepsis, pneumonia, and wound infection, following resection of common solid tumors, as well as the relationship between hospital surgical volume and SPI. From the Texas Discharge Data Research File, we identified all Texas residents who underwent resection of cancer of the lung, esophagus, stomach, pancreas, colon, or rectum between 2002 and 2006. From their billing records, we identified ICD-9 codes indicating SPI and also subsequent SPI-related readmissions occurring within 30 days of surgery. Random-effects logistic regression was used to calculate the impact of SPI on mortality, as well as the association between surgical volume and SPI, adjusting for case-mix, hospital characteristics, and clustering of multiple surgical admissions within the same patient and patients within the same hospital. Excess bed days and costs were calculated by subtracting values for patients without infections from those for patients with infections, with both estimated from a multilevel mixed-effects generalized linear model with a gamma distribution and log link. Serious postoperative infection occurred following 9.4% of the 37,582 eligible tumor resections and was independently associated with an 11-fold increase in the odds of in-hospital mortality (95% Confidence Interval [95% CI], 6.7-18.5, P < 0.001). Patients with SPI required 6.3 additional hospital days (95% CI, 6.1-6.5) at an incremental cost of $16,396 (95% CI, $15,927–$16,875). There was a significant trend toward lower overall rates of SPI with higher surgical volume (P = 0.037).
Due to the substantial morbidity, mortality, and excess costs associated with SPI following solid tumor resections, and given that under current reimbursement practices most of this heavy burden is borne by acute care providers, it is imperative for hospitals to identify more effective prophylactic measures so that these potentially preventable infections and their associated expenditures can be averted. Additional volume-outcomes research is also needed to identify infection prevention processes that can be transferred from higher- to lower-volume providers.

Relevance:

90.00%

Publisher:

Abstract:

Generalized linear Poisson and logistic regression models were used to examine the relationship between temperature and precipitation and cases of Saint Louis encephalitis virus spread in the Houston metropolitan area. The models were investigated with and without repeated measures, with a first-order autoregressive (AR1) correlation structure used for the repeated measures model. The two types of Poisson regression models, with and without correlation structure, showed that a unit increase in temperature measured in degrees Fahrenheit increases the occurrence of the virus 1.7-fold and a unit increase in precipitation measured in inches increases the occurrence of the virus 1.5-fold. Logistic regression did not show these covariates to be significant predictors of encephalitis activity in Houston for either correlation structure. This discrepancy for the logistic model could be attributed to the small data set. Keywords: Saint Louis Encephalitis; Generalized Linear Model; Poisson; Logistic; First Order Autoregressive; Temperature; Precipitation.
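To make the reported multipliers concrete, the sketch below assumes they are exponentiated coefficients from a log-link Poisson model, which the abstract implies but does not state outright. The function and variable names are illustrative.

```python
import math


def rate_ratio(beta, delta=1.0):
    """Multiplicative change in the expected count for a `delta`-unit covariate increase."""
    return math.exp(beta * delta)


# Coefficients implied by the reported per-unit ratios (1.7 and 1.5),
# under the assumption that the ratios equal exp(coefficient).
beta_temperature = math.log(1.7)
beta_precipitation = math.log(1.5)

# Under a log link the effect compounds multiplicatively:
# a two-degree increase corresponds to 1.7 ** 2, not 2 * 1.7.
two_degree_ratio = rate_ratio(beta_temperature, delta=2.0)
```

This is why a per-unit ratio from a Poisson model should not be scaled linearly when describing larger changes in temperature or precipitation.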

Relevance:

90.00%

Publisher:

Abstract:

A Bayesian approach to estimation of the regression coefficients of a multinomial logit model with ordinal scale response categories is presented. A Monte Carlo method is used to construct the posterior distribution of the link function. The link function is treated as an arbitrary scalar function. Then the Gauss-Markov theorem is used to determine a function of the link which produces a random vector of coefficients. The posterior distribution of the random vector of coefficients is used to estimate the regression coefficients. The method described is referred to as a Bayesian generalized least squares (BGLS) analysis. Two cases involving multinomial logit models are described. Case I involves a cumulative logit model and Case II involves a proportional-odds model. All inferences about the coefficients for both cases are described in terms of the posterior distribution of the regression coefficients. The results from the BGLS method are compared to maximum likelihood estimates of the regression coefficients. The BGLS method avoids the nonlinear problems encountered when estimating the regression coefficients of a generalized linear model. The method is not complex or computationally intensive. The BGLS method offers several advantages over other Bayesian approaches.

Relevance:

90.00%

Publisher:

Abstract:

The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke.
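The Newton-Raphson procedure for maximum likelihood estimation can be sketched in a simplified setting. The example below assumes a binary response with an intercept and a single covariate, not the polychotomous model of the dissertation; the same update (coefficients plus inverse information times score) applies there with more parameters. The function name and toy data are hypothetical.

```python
import math


def fit_logistic_newton(x, y, iterations=25, tol=1e-10):
    """Newton-Raphson MLE for logit P(y = 1) = b0 + b1 * x (binary y)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score vector (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system (information) * step = score in closed form.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data: 1 of 4 events when x = 0, 3 of 4 events when x = 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic_newton(x, y)
```

With a single binary covariate the fitted probabilities reproduce the observed proportions, so the estimates can be checked by hand: b0 is the log odds in the x = 0 group and b1 is the log odds ratio.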

Relevance:

90.00%

Publisher:

Abstract:

The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit. The study found that the $C_g^{*}$ statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a $\chi^2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns. The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential, since the statistic did not perform as desired when there was moderate to severe collinearity among covariates. Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal-size groups by separating ties should be avoided.
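A minimal sketch of the grouped statistic follows, assuming the standard form: the sum over groups of (observed - expected)^2 / (expected * (1 - expected / n_g)). The grouping rule here, which extends a group rather than splitting tied probabilities, is one plausible reading of the advice above and not necessarily either of the two tie-handling methods studied.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow statistic from outcomes y (0/1) and fitted probabilities p.

    Returns (statistic, number_of_groups_used). The usual reference
    distribution is chi-square with (groups used - 2) degrees of freedom.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    statistic = 0.0
    used = 0
    start = 0
    for g in range(1, groups + 1):
        end = max(start, g * n // groups)
        # Extend the group so tied probabilities are never split.
        while 0 < end < n and pairs[end][0] == pairs[end - 1][0]:
            end += 1
        if end <= start:
            continue  # this group was absorbed by the previous one
        chunk = pairs[start:end]
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        denominator = expected * (1.0 - expected / n_g)
        if denominator > 0:
            statistic += (observed - expected) ** 2 / denominator
        used += 1
        start = end
    return statistic, used


# Toy data for illustration only: two groups of two observations each.
stat, used = hosmer_lemeshow([0, 1, 1, 1], [0.2, 0.2, 0.8, 0.8], groups=2)
```

Because ties are kept together, groups can end up unequal in size, and fewer groups than requested may be formed when there are few distinct fitted probabilities.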

Relevance:

80.00%

Publisher:

Abstract:

The main objective of this study was to develop and validate a computer-based statistical algorithm, based on a multivariable logistic model, that can be translated into a simple scoring system to ascertain stroke cases using hospital admission medical records data. This algorithm, the Risk Index Score (RISc), was developed using data collected prospectively by the Brain Attack Surveillance in Corpus Christi (BASIC) project. The validity of the RISc was evaluated by estimating the concordance of scoring system stroke ascertainment with stroke ascertainment accomplished by physician review of hospital admission records. The goal of this study was to develop a rapid, simple, efficient, and accurate method to ascertain the incidence of stroke from routine hospital admission records for epidemiologic investigations. (Abstract shortened by UMI.)

Relevance:

80.00%

Publisher:

Abstract:

The study objectives were to determine risk factors for preterm labor (PTL) in Colorado Springs, CO, with emphasis on altitude and psychosocial factors, and to develop a model that identifies women at high risk for PTL. Three hundred and thirty patients with PTL were matched to 460 control patients without PTL using insurance category as an indirect measure of social class. Data were gathered by patient interview and review of medical records. Seven risk groups were compared: (1) altitude change and travel; (2) psychosocial ((a) child, sexual, spouse, alcohol and drug abuse; (b) neuroses and psychoses; (c) serious accidents and injuries; (d) broken home (maternal parental separation); (e) assault (physical and sexual); and (f) stress (emotional, domestic, occupational, financial and general)); (3) demographic; (4) maternal physical condition; (5) prenatal care; (6) behavioral risks; and (7) medical factors. Analysis was by logistic regression. Results demonstrated altitude change before or after conception and travel during pregnancy to be non-significant, even after adjustment for potential confounding variables. Five significant psychosocial risk factors were determined: maternal sexual abuse (p = 0.006), physical assault (p = 0.025), nervous breakdown (p = 0.011), past occupational injury (p = 0.016), and occupational stress (p = 0.028). Considering all seven risk groups in the logistic regression, we chose a logistic model with 11 risk factors. Two risk factors were psychosocial (maternal spouse abuse and past occupational injury), 1 was pertinent to maternal physical condition (≤130 lbs. pre-pregnancy weight), 1 to prenatal care (≤10 prenatal care visits), 2 to behavioral risks (>15 cigarettes per day and ≤30 lbs. weight gain), and 5 were medical factors (abnormal genital culture, previous PTB, primiparity, vaginal bleeding and vaginal discharge). We conclude that altitude change is not a risk factor for PTL and that selected psychosocial factors are significant risk factors for PTL.

Relevance:

80.00%

Publisher:

Abstract:

A case-control study was conducted to examine the relationship between preterm birth and occupational physical activity among U.S. Army enlisted gravidas from 1981 to 1984. The study includes 604 cases (37 or fewer weeks of gestation) and 6,070 controls (more than 37 weeks of gestation) treated at U.S. Army medical treatment facilities worldwide. Occupational physical activity was measured using existing physical demand ratings of military occupational specialties. A statistically significant trend of preterm birth with increasing physical demand level was found (p = 0.0056). The relative risk point estimates for the two highest physical demand categories were statistically significant, RRs = 1.69 (p = 0.02) and 1.75 (p = 0.01), respectively. Six of eleven additional variables were also statistically significant predictors of preterm birth: age (less than 20), race (non-white), marital status (single, never married), paygrade (E1 - E3), length of military service (less than 2 years), and aptitude score (less than 100). Multivariate analyses using the logistic model resulted in three statistically significant risk factors for preterm birth: occupational physical demand, lower paygrade, and non-white race. Controlling for race and paygrade, the two highest physical demand categories were again statistically significant, with relative risk point estimates of 1.56 and 1.70, respectively. The population attributable risk for military occupational physical demand was 26%, adjusted for paygrade and race; 17.5% of the preterm births were attributable to the two highest physical demand categories.
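The population attributable risk reported above depends on both the relative risks and the share of the population in each exposure category. The sketch below uses Levin's formula for several exposure levels as an assumption; the abstract does not say which formula or adjustment method was used, and the exposure proportions are hypothetical placeholders, not study data.

```python
def attributable_fraction(exposure_props, relative_risks):
    """Levin's population attributable fraction for several exposure levels.

    exposure_props: population proportion in each exposed category
    relative_risks: relative risk of each category versus the unexposed reference
    """
    excess = sum(p * (rr - 1.0) for p, rr in zip(exposure_props, relative_risks))
    return excess / (1.0 + excess)


# Adjusted relative risks reported for the two highest demand categories
# (1.56 and 1.70); the proportions 0.15 and 0.20 are hypothetical.
paf_top_two = attributable_fraction([0.15, 0.20], [1.56, 1.70])
```

The result changes with the assumed proportions, so this sketch illustrates the calculation only and should not be expected to reproduce the 17.5% figure.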

Relevance:

80.00%

Publisher:

Abstract:

The relationship between serum cholesterol and cancer incidence was investigated in the population of the Hypertension Detection and Follow-up Program (HDFP). The HDFP was a multi-center trial designed to test the effectiveness of a stepped program of medication in reducing mortality associated with hypertension. Over 10,000 participants, ages 30-69, were followed with clinic and home visits for a minimum of five years. Cancer incidence was ascertained from existing study documents, which included hospitalization records, autopsy reports and death certificates. During the five years of follow-up, 286 new cancer cases were documented. The distribution of sites and total number of cases were similar to those predicted using rates from the Third National Cancer Survey. A non-fasting baseline serum cholesterol level was available for most participants. Age-, sex-, and race-specific five-year cancer incidence rates were computed for each cholesterol quartile. Rates were also computed by smoking status, education status, and percent ideal weight quartiles. In addition, these and other factors were investigated with the use of the multiple logistic model. For all cancers combined, a significant inverse relationship existed between baseline serum cholesterol levels and cancer incidence. Previously documented associations between smoking, education and cancer were also demonstrated but did not account for the relationship between serum cholesterol and cancer. The relationship was more evident in males than females, but this was felt to reflect the different distribution of specific cancer sites in the two sexes. The inverse relationship existed for all specific sites investigated (except breast), although statistical significance was reached only for prostate carcinoma. Analyses after exclusion of cases diagnosed during the first two years of follow-up still yielded an inverse relationship. Life table analysis indicated that competing risks during the period of follow-up did not account for the existence of an inverse relationship. It is concluded that a weak inverse relationship does exist between serum cholesterol and cancer incidence for many but not all cancer sites. This relationship is not due to confounding by other known cancer risk factors, competing risks or persons entering the study with undiagnosed cancer. Not enough information is available at the present time to determine whether this relationship is causal, and further research is suggested.

Relevance:

80.00%

Publisher:

Abstract:

Background: Past and recent evidence shows that radionuclides in drinking water may be a public health concern. Developmental thresholds for birth defects with respect to chronic low-level domestic radiation exposures, such as through drinking water, have not been definitively established, and there is a strong need to address this gap. In this study we examined the geographic distribution of orofacial cleft birth defects in and around the uranium mining district counties of South Texas (Atascosa, Bee, Brooks, Calhoun, Duval, Goliad, Hidalgo, Jim Hogg, Jim Wells, Karnes, Kleberg, Live Oak, McMullen, Nueces, San Patricio, Refugio, Starr, Victoria, Webb, and Zavala) from 1999 to 2007. The probable association of cleft birth defect rates by ZIP codes classified according to uranium and radium concentrations in drinking water supplies was evaluated. Similar associations between orofacial cleft birth defects and radium/radon in drinking water were reported earlier by Cech and co-investigators in another part of the Gulf Coast region (Harris County, Texas) [50, 55]. Since substantial uranium mining activity existed and still exists in South Texas, contamination of drinking water sources with radiation and its relation to birth defects is a ground for concern. Methods: Residential addresses of orofacial cleft birth defect cases, as well as live births within the twenty counties during 1999-2007, were geocoded and mapped. Prevalence rates were calculated by ZIP code and mapped accordingly. Locations of drinking water supplies were also geocoded and mapped. ZIP codes were stratified as having high combined uranium (≥30 μg/L) vs. low combined uranium (<30 μg/L). Likewise, ZIP codes having the radium isotope Ra-226 in drinking water were stratified as having elevated radium (≥3 pCi/L) vs. low radium (<3 pCi/L).
A linear regression was performed using the Stata generalized linear model (GLM) program to evaluate the probable association between cleft birth defect rates by ZIP code and concentrations of uranium and radium in the domestic water supply. These rates were further adjusted for potentially confounding variables such as maternal age, education, occupation, and ethnicity. Results: This study showed higher rates of cleft births in ZIP codes classified as having high combined uranium versus ZIP codes having low combined uranium. The model was further improved by adding radium stratified as explained above. Adjustment for maternal age and ethnicity did not substantially affect the statistical significance of uranium or radium concentrations in household water supplies. Conclusion: Although this study lacks individual exposure levels, the findings suggest a significant association between elevated uranium and radium concentrations in tap water and high orofacial birth defect rates by ZIP code. Future case-control studies that can measure individual exposure levels and adjust for contending risk factors could result in a better understanding of the exposure-disease association.

Relevance:

80.00%

Publisher:

Abstract:

My dissertation focuses mainly on Bayesian adaptive designs for phase I and phase II clinical trials. It includes three specific topics: (1) proposing a novel two-dimensional dose-finding algorithm for biological agents, (2) developing Bayesian adaptive screening designs to provide more efficient and ethical clinical trials, and (3) incorporating missing late-onset responses to make an early stopping decision. Treating patients with novel biological agents is becoming a leading trend in oncology. Unlike cytotoxic agents, for which toxicity and efficacy monotonically increase with dose, biological agents may exhibit non-monotonic patterns in their dose-response relationships. Using a trial with two biological agents as an example, we propose a phase I/II trial design to identify the biologically optimal dose combination (BODC), which is defined as the dose combination of the two agents with the highest efficacy and tolerable toxicity. A change-point model is used to reflect the fact that the dose-toxicity surface of the combinational agents may plateau at higher dose levels, and a flexible logistic model is proposed to accommodate the possible non-monotonic pattern for the dose-efficacy relationship. During the trial, we continuously update the posterior estimates of toxicity and efficacy and assign patients to the most appropriate dose combination. We propose a novel dose-finding algorithm to encourage sufficient exploration of untried dose combinations in the two-dimensional space. Extensive simulation studies show that the proposed design has desirable operating characteristics in identifying the BODC under various patterns of dose-toxicity and dose-efficacy relationships. Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. 
To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Simulation studies show that the proposed design substantially outperforms the conventional multi-arm balanced factorial trial design. The proposed design yields a significantly higher probability of selecting the best treatment while at the same time allocating substantially more patients to efficacious treatments. The proposed design is most appropriate for trials that combine multiple agents and screen for the efficacious combination to be further investigated. The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while at the same time providing higher power to identify the best treatment at the end of the trial. Phase II trials are usually single-arm trials conducted to test the efficacy of experimental agents and to decide whether the agents are promising enough to be sent to phase III trials. Interim monitoring is employed to stop the trial early for futility, to avoid assigning an unacceptable number of patients to inferior treatments. We propose a Bayesian single-arm phase II design with continuous monitoring for estimating the response rate of the experimental drug.
To address the issue of late-onset responses, we use a piece-wise exponential model to estimate the hazard function of time to response data and handle the missing responses using the multiple imputation approach. We evaluate the operating characteristics of the proposed method through extensive simulation studies. We show that the proposed method reduces the total length of the trial duration and yields desirable operating characteristics for different physician-specified lower bounds of response rate with different true response rates.

Relevance:

80.00%

Publisher:

Abstract:

Background. Kidney disease is a growing public health phenomenon in the U.S. and in the world. Downstream interventions, namely dialysis and renal transplants covered by Medicare's renal disease entitlement policy in those who are 65 years and over, have been expensive and are not foolproof. The shortage of kidney donors in the U.S. has grown in the last two decades. Therefore, study of upstream events in kidney disease development and progression is justified to prevent the rising prevalence of kidney disease. Previous studies have documented the biological route by which obesity can progress and accelerate kidney disease, but health services literature quantifying the effects of overweight and obesity on economic outcomes in the context of renal disease was lacking. Objectives. The specific aims of this study were (1) to determine the likelihood of overweight and obesity in renal disease and in three specific adult renal disease sub-populations (hypertensive, diabetic, and both hypertensive and diabetic); (2) to determine the incremental health service use and spending in overweight and obese renal disease populations; and (3) to determine who financed the cost of healthcare for renal disease in overweight and obese adult populations less than 65 years of age. Methods. This study was a retrospective cross-sectional study of renal disease cases pooled for years 2002 to 2009 from the Medical Expenditure Panel Survey. The likelihood of overweight and obesity was estimated using the chi-square test. Negative binomial regression and a generalized gamma model with log link were used to estimate healthcare utilization and healthcare expenditures for six health event categories. Payments by self/family, public and private insurance were described for overweight and obese kidney disease sub-populations. Results. The likelihood of overweight and obesity was 0.29 and 0.46, respectively, among renal disease cases, and obesity was common in the hypertensive and diabetic renal disease populations.
Among the obese renal disease population, negative binomial regression estimates of healthcare utilization per person per year, as compared to normal weight renal disease persons, were significant for office-based provider visits and agency home health visits (p=0.001 and p=0.005, respectively). Among the overweight kidney disease population, health service use was significant for inpatient hospital discharges (p=0.027). Over years 2002 to 2009, overweight and obese renal disease sub-populations had 53% and 63% higher inpatient facility and doctor expenditures as compared to the normal weight renal disease population, and these results were statistically significant (p=0.007; p=0.026). The overweight renal disease population had significant total expenses per person per year for office-based and outpatient associated care. Overweight and obese renal disease persons paid less out of pocket overall compared to the normal weight renal disease population. Medicare and Medicaid had the highest mean annual payments for obese renal disease persons, while mean annual payments were highest for private insurance among the normal weight renal disease population. Conclusion. Overweight and obesity were common in those with acute and chronic kidney disease and resulted in higher healthcare spending and increased utilization of office-based providers, hospital inpatient departments and agency home healthcare. Healthcare for overweight and obese renal disease persons younger than 65 years of age was financed more by private and public insurance and less by out-of-pocket payments. With the increasing epidemic of obesity in the U.S. and the aging of the baby boomer population, the findings of the present study have implications for public health and for greater dissemination of healthcare resources to prevent, manage and delay the onset of overweight and obesity that can progress and accelerate the course of kidney disease.

Relevance:

80.00%

Publisher:

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies only explain a small portion of variations in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize some additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to the increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in the genetics/genomics researches and demonstrating superiority over some regular approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert its functionalities and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) Deriving the Bayesian variable selection framework for the hierarchical gene-environment and gene-gene interactions; (2) Developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing historical data. 
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or at least one, respectively, of the main effects of the interacting factors must be present for the interaction to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the considered situations. The proposed models are applied in real data analyses of gene and environment interactions in lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models have the advantage of being able to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of parameter estimation and variable selection in most cases.
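The difference between the 'strong', 'weak' and 'independent' rules described above can be illustrated with a small sketch. This is only the deterministic eligibility rule, not the dissertation's Bayesian mixture prior; the function name `allowed_interactions` and the factor names are hypothetical and chosen for illustration.

```python
from itertools import combinations


def allowed_interactions(main_included, rule="strong"):
    """Return the pairwise interactions eligible for inclusion.

    main_included: dict mapping a factor name to True/False, i.e. whether
    that factor's main effect is currently in the model (hypothetical input).
    rule: "strong" (both main effects required), "weak" (at least one
    required) or "independent" (no hierarchical constraint).
    """
    names = sorted(main_included)
    allowed = []
    for a, b in combinations(names, 2):
        n_present = int(main_included[a]) + int(main_included[b])
        if rule == "strong":
            ok = n_present == 2
        elif rule == "weak":
            ok = n_present >= 1
        elif rule == "independent":
            ok = True
        else:
            raise ValueError(f"unknown rule: {rule!r}")
        if ok:
            allowed.append((a, b))
    return allowed


# Illustrative state: two main effects in the model, two excluded.
state = {"G1": True, "G2": False, "G3": False, "E": True}
strong = allowed_interactions(state, "strong")
weak = allowed_interactions(state, "weak")
independent = allowed_interactions(state, "independent")
```

With this illustrative state, the strong rule admits only the E by G1 interaction, the weak rule admits every pair except G2 by G3, and the independent rule admits all six pairs, which shows how the hierarchical constraints shrink the space of candidate interactions.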
Our proposed models impose hierarchical constraints that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies that investigate causal factors from a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
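The orthogonality property mentioned above for NOIA can be checked numerically for a single biallelic locus. The sketch below uses the one-locus statistical coding as commonly stated in the NOIA literature; the exact scaling of the dominance term is an assumption here, and the abstract itself does not give these formulas. The function names and the genotype frequencies are hypothetical.

```python
def noia_statistical_coding(p11, p12, p22):
    """Additive and dominance codes for genotypes carrying 0, 1, 2 copies
    of the reference allele, given genotype frequencies (assumed coding)."""
    if abs(p11 + p12 + p22 - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    mean_count = p12 + 2.0 * p22               # expected allele count
    additive = [n - mean_count for n in (0.0, 1.0, 2.0)]
    v = p11 + p22 - (p11 - p22) ** 2           # scaling term (assumed form)
    dominance = [-2.0 * p12 * p22 / v,
                 4.0 * p11 * p22 / v,
                 -2.0 * p11 * p12 / v]
    return additive, dominance


def weighted_inner(freqs, x, y):
    """Frequency-weighted inner product, i.e. E[x * y] over genotypes."""
    return sum(p * xi * yi for p, xi, yi in zip(freqs, x, y))


# Genotype frequencies deliberately chosen to violate HWE.
freqs = [0.5, 0.2, 0.3]
additive, dominance = noia_statistical_coding(*freqs)
ones = [1.0, 1.0, 1.0]
mean_additive = weighted_inner(freqs, additive, ones)
mean_dominance = weighted_inner(freqs, dominance, ones)
cross_term = weighted_inner(freqs, additive, dominance)
```

Both codes have zero frequency-weighted mean and a zero cross term even though the frequencies are not in HWE, which is the sense in which the additive and dominance effect estimates are orthogonal under this coding.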
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.