7 results for Empirical Methods in NLP
in DigitalCommons@The Texas Medical Center
Abstract:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess the causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics and genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions (epistasis), and gene-by-environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious, and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify, respectively, that both or at least one of the main effects of the interacting factors must be present for the interaction to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the considered situations. The proposed models are applied to real data on gene-environment interactions from lung cancer and cutaneous melanoma case-control studies. Bayesian statistical models have the advantage of allowing useful prior information to be incorporated in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of parameter estimation and variable selection in most cases.
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while still identifying the reported associations. This is practically appealing for studies investigating causal factors from a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging) that were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
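The 'strong', 'weak', and 'independent' inclusion rules described in this abstract can be illustrated with a small sketch. It shows only the logical constraint on which pairwise interactions are eligible given the main effects currently in a model; it is not the dissertation's Bayesian mixture model or its sampler. The function names and the factor labels (G1, G2, E1) are hypothetical and chosen for illustration.

```python
from itertools import combinations


def interaction_allowed(included_main, a, b, hierarchy="strong"):
    """Return True if the a-by-b interaction is eligible for the model.

    included_main: set of factor names whose main effects are in the model.
    hierarchy: "strong" (both main effects required), "weak" (at least one
    required), or "independent" (no constraint).
    """
    a_in = a in included_main
    b_in = b in included_main
    if hierarchy == "strong":
        return a_in and b_in
    if hierarchy == "weak":
        return a_in or b_in
    if hierarchy == "independent":
        return True
    raise ValueError("unknown hierarchy: " + repr(hierarchy))


def candidate_interactions(factors, included_main, hierarchy="strong"):
    """List every pairwise interaction that satisfies the chosen constraint."""
    return [
        (a, b)
        for a, b in combinations(factors, 2)
        if interaction_allowed(included_main, a, b, hierarchy)
    ]


if __name__ == "__main__":
    factors = ["G1", "G2", "E1"]   # hypothetical genetic/environmental factors
    included = {"G1", "E1"}        # main effects currently in the model
    for rule in ("strong", "weak", "independent"):
        print(rule, candidate_interactions(factors, included, rule))
```

With two of three main effects included, the strong rule admits only the interaction between those two factors, while the weak and independent rules admit all three pairs. This is how the hierarchical constraint shrinks the space of interactions to be searched.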
Abstract:
Li-Fraumeni syndrome (LFS) is characterized by a variety of neoplasms occurring at a young age with an apparent autosomal dominant transmission. Individuals in pedigrees with LFS have a high incidence of second malignancies. Recently LFS has been found to be associated with germline mutations of a tumor-suppressor gene, p53. Because LFS is rare and indeed not a clear-cut disease, it is not known whether all cases of LFS are attributable to p53 germline mutations or what role p53 plays in cancer occurrence in such cancer syndrome families. In the present study, DNA from constitutive cells of two hundred and thirty-three family members from ten extended pedigrees was screened for p53 mutations. Six of the ten LFS families had germline mutations at the p53 locus, including point and deletion mutations. In these six families, 55 of 146 members were carriers of p53 mutations. All but one of the mutations occurred in exons 5 to 8 (i.e., the "hot spot" region) of the p53 gene. The age-specific penetrance of cancer was estimated after the genotype for each family member at risk was determined. The penetrance was 0.15, 0.29, 0.35, 0.77, and 0.91 by ages 20, 30, 40, 50, and 60, respectively, in male carriers, and 0.19, 0.44, 0.76, and 0.90 by ages 20, 30, 40, and 50, respectively, in female carriers. These results indicated that one cannot escape from tumorigenesis if one inherits a p53 mutant allele; at least ninety percent of p53 carriers will develop cancer by the age of 60. To evaluate the possible bias due to the unexamined blood relatives in LFS families, I performed a simulation analysis in which a p53 genotype was assigned to each unexamined person based on that person's cancer status and liability to cancer. The results showed that the penetrance estimates were not biased by the unexamined relatives. I also determined the sex-, site-, and age-specific penetrance of breast cancer in female carriers and lung cancer in male carriers.
The penetrance of breast cancer in female carriers was 0.81 by age 45; the penetrance of lung cancer in male carriers was 0.78 by age 60, indicating that p53 plays a key role in tumorigenesis in common cancers.
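The abstract does not say how the age-specific penetrance was computed. One common way to obtain cumulative penetrance by age from carriers with incomplete follow-up is one minus the Kaplan-Meier survival estimate, sketched below. This is an assumption for illustration, not the author's stated method, and the function name and toy ages are hypothetical rather than data from the study.

```python
def km_penetrance(ages, affected, eval_ages):
    """Cumulative penetrance by age as 1 minus the Kaplan-Meier estimate.

    ages: age at cancer onset (if affected) or at last observation (if not).
    affected: parallel list of booleans, True if cancer occurred at that age.
    eval_ages: ages at which to report cumulative penetrance.
    Returns a dict mapping each evaluation age to its estimated penetrance.
    """
    # Distinct ages at which at least one cancer onset was observed.
    event_ages = sorted({a for a, d in zip(ages, affected) if d})

    survival = 1.0
    curve = []  # (age, cancer-free probability just after that age)
    for t in event_ages:
        at_risk = sum(1 for a in ages if a >= t)
        events = sum(1 for a, d in zip(ages, affected) if d and a == t)
        survival *= 1.0 - events / at_risk
        curve.append((t, survival))

    result = {}
    for x in eval_ages:
        s = 1.0
        for t, surv in curve:
            if t <= x:
                s = surv  # last survival step at or before age x
        result[x] = 1.0 - s
    return result


if __name__ == "__main__":
    # Hypothetical carriers: three cancer onsets, two unaffected at last contact.
    ages = [25, 35, 45, 50, 60]
    affected = [True, False, True, True, False]
    print(km_penetrance(ages, affected, [20, 30, 40, 50, 60]))
```

Carriers who are unaffected at last observation still count in the at-risk denominator up to that age, which is why this estimate differs from a simple proportion of affected carriers.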
Abstract:
OBJECTIVE: To systematically review published literature to examine the complications associated with the use of misoprostol and compare these complications to those associated with other forms of abortion induction. DATA SOURCES: Studies were identified through searches of medical literature databases including Medline (Ovid), PubMed (NLM), LILACS, sciELO, and AIM (AFRO), and review of references of relevant articles. STUDY SELECTION AND METHODS: A descriptive systematic review was conducted that included studies reported in English and published before December 2012. Eligibility criteria included: misoprostol (with or without other methods) and any other method of abortion in a developing country, as well as quantitative data on the complications of each method. The following information was extracted from each study: author/year, country/city, study design/study sample, age range, setting of data collection, sample size, the method of abortion induction, the number of cases for each method, and the percentage of complications with each method. RESULTS: A total of 4 studies were identified (all in Latin America) describing post-abortion complications of misoprostol and other methods in countries where abortion is generally considered unsafe and/or illegal. The four studies reported on a range of complications including: bleeding, infection, incomplete abortion, intense pelvic pain, uterine perforation, headache, diarrhea, nausea, mechanical lesions, and systemic collapse. The most prevalent complications of misoprostol-induced abortion reported were: bleeding (7-82%), incomplete abortion (33-70%), and infection (0.8-67%). The prevalence of these complications reported for other abortion methods included: bleeding (16-25%), incomplete abortion (15-82%), and infection (13-50%). CONCLUSION: The literature identified by this systematic review is inadequate for determining the complications of misoprostol used in unsafe settings.
Abortion is considered an illicit behavior in these countries, which makes it difficult to investigate the details needed to conduct a study on abortion complications. Given the differences between the reviewed studies as well as a variety of study limitations, it is not possible to draw firm conclusions about the rates of specific abortion-related complications.
Abstract:
The research project is an extension of a series of administrative science and health care research projects evaluating the influence of external context, organizational strategy, and organizational structure upon organizational success or performance. The research will rely on the assumption that there is not one single best approach to the management of organizations (the contingency theory). As organizational effectiveness is dependent on an appropriate mix of factors, organizations may be equally effective based on differing combinations of factors. The external context of the organization is expected to influence internal organizational strategy and structure, and in turn the internal measures affect performance (discriminant theory). The research considers the relationship of external context and organization performance. The unit of study for the research will be the health maintenance organization (HMO): an organization that accepts, in exchange for a fixed, advance capitation payment, contractual responsibility to assure the delivery of a stated range of health services to a voluntarily enrolled population. With the current Federal resurgence of interest in the HMO as a major component in the health care system, attention must be directed at maximizing development of HMOs from the limited resources available. Increased skills are needed in both Federal and private evaluation of HMO feasibility in order to prevent resource investment in projects that will fail while concurrently identifying potentially successful projects that will not be considered using current standards. The research considers 192 factors measuring contextual milieu (social, educational, economic, legal, demographic, health, and technological factors). Through intercorrelation and principal components data reduction techniques, these were reduced to 12 variables.
Two measures of HMO performance were identified: (1) HMO status (operational or defunct), and (2) a principal components factor score considering eight measures of performance. The relationship between HMO context and performance was analysed using correlation and stepwise multiple regression methods. In each case it was concluded that the external contextual variables are not predictive of the success or failure of the HMOs studied. This suggests that the performance of an HMO may rely on internal organizational factors. These findings have policy implications, as contextual measures are used as a major determinant in HMO feasibility analysis and as a factor in the allocation of limited Federal funds.
Abstract:
Studies have shown that rare genetic variants have stronger effects in predisposing to common diseases, and several statistical methods have been developed for association studies involving rare variants. In order to better understand how these statistical methods perform, we seek to compare two recently developed rare variant statistical methods (VT and C-alpha) on 10,000 simulated re-sequencing data sets with disease status and the corresponding 10,000 simulated null data sets. The SLC1A1 gene has been suggested to be associated with diastolic blood pressure (DBP) in previous studies. In the current study, we applied the VT and C-alpha methods to the empirical re-sequencing data for the SLC1A1 gene from 300 whites and 200 blacks. We found that the VT method obtains higher power and performs better than the C-alpha method with the simulated data we used. The type I error rates were well controlled for both methods. In addition, both the VT and C-alpha methods suggested no statistical evidence for an association between the SLC1A1 gene and DBP. Overall, our findings provide an important comparison of the two statistical methods for future reference, as well as preliminary, pioneering findings on the association between the SLC1A1 gene and blood pressure.
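For readers unfamiliar with C-alpha, the following sketch computes its statistic in the commonly published form for a binary case/control status: for each variant, the squared deviation of the case allele count from its binomial expectation, minus the binomial variance, summed over variants. The abstract does not give the implementation used in the study, so the permutation p-value, the function names, and the toy genotype matrix are assumptions for illustration only.

```python
import random


def c_alpha_statistic(genotypes, is_case):
    """C-alpha statistic T = sum_j [(y_j - n_j*p0)^2 - n_j*p0*(1 - p0)].

    genotypes: list of rows, one per individual; each row holds minor-allele
    counts (0, 1, or 2) per variant.
    is_case: parallel list of booleans (True for cases).
    p0 is the proportion of cases among all individuals.
    """
    p0 = sum(is_case) / len(is_case)
    t = 0.0
    for j in range(len(genotypes[0])):
        n_j = sum(row[j] for row in genotypes)  # copies in everyone
        y_j = sum(row[j] for row, c in zip(genotypes, is_case) if c)  # copies in cases
        t += (y_j - n_j * p0) ** 2 - n_j * p0 * (1 - p0)
    return t


def permutation_p_value(genotypes, is_case, n_perm=1000, seed=0):
    """One-sided p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = c_alpha_statistic(genotypes, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if c_alpha_statistic(genotypes, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: variant 0 seen only in cases, variant 1 only in controls.
    genotypes = [[1, 0], [1, 0], [0, 1], [0, 1]]
    is_case = [True, True, False, False]
    print(c_alpha_statistic(genotypes, is_case))
    print(permutation_p_value(genotypes, is_case, n_perm=200))
```

Because the deviations are squared, variants enriched in cases and variants enriched in controls both increase the statistic. This is the property that distinguishes C-alpha from burden-style tests, which assume that all variants act in the same direction.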