9 results for rainfall-runoff empirical statistical model

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have localized many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions (epistasis) and gene-by-environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and the 'weak hierarchical' models, which specify that both or one of the main effects of the interacting factors, respectively, must be present for the interaction to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the situations considered. The proposed models are applied to real data on gene and environment interactions from lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models have the advantage of being able to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of parameter estimation and variable selection in most cases.
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
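The strong and weak hierarchy rules described in the abstract above can be illustrated with a small sketch. This is not the dissertation's Bayesian mixture model; it only shows which pairwise interactions are eligible under each rule. The factor names and the function `allowed_interactions` are hypothetical.

```python
from itertools import combinations


def allowed_interactions(included_main, factors, rule):
    """Return the pairwise interactions eligible for inclusion under a rule.

    rule: 'strong' (both main effects present), 'weak' (at least one
    present), or 'independent' (no hierarchical constraint).
    """
    included = set(included_main)
    eligible = []
    for a, b in combinations(factors, 2):
        n_present = (a in included) + (b in included)
        if rule == "strong":
            ok = n_present == 2
        elif rule == "weak":
            ok = n_present >= 1
        elif rule == "independent":
            ok = True
        else:
            raise ValueError(f"unknown rule: {rule}")
        if ok:
            eligible.append((a, b))
    return eligible


# Hypothetical factors: two genes and two environmental exposures.
factors = ["G1", "G2", "E1", "E2"]
included_main = ["G1", "E1"]  # main effects currently in the model

strong = allowed_interactions(included_main, factors, "strong")
weak = allowed_interactions(included_main, factors, "weak")
independent = allowed_interactions(included_main, factors, "independent")
```

With these inputs the strong rule admits only the G1-E1 interaction, the weak rule admits every pair except G2-E2, and the independent rule admits all six pairs, which is why the hierarchical rules shrink the space of candidate interactions.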

Relevance:

100.00%

Publisher:

Abstract:

The purpose of this study was to analyze the implementation of national family planning policy in the United States, which was embedded in four separate statutes during the period of study, Fiscal Years 1976-81. The design of the study utilized a modification of the Sabatier and Mazmanian framework for policy analysis, which defined implementation as the carrying out of statutory policy. The study was divided into two phases. The first part of the study compared the implementation of family planning policy by each of the pertinent statutes. The second part of the study identified factors that were associated with implementation of federal family planning policy within the context of block grants.
Implementation was measured here by federal dollars spent for family planning, adjusted for the size of the respective state target populations. Expenditure data were collected from the Alan Guttmacher Institute and from each of the federal agencies having administrative authority for the four pertinent statutes. Data from the former were used for most of the analysis because they were more complete and more reliable.
The first phase of the study tested the hypothesis that the coherence of a statute is directly related to effective implementation. Equity in the distribution of funds to the states was used to operationalize effective implementation. To a large extent, the results of the analysis supported the hypothesis. In addition to their theoretical significance, these findings were also significant for policymakers insofar as they demonstrated the effectiveness of categorical legislation in implementing desired health policy.
Given the current and historically intermittent emphasis on more state and less federal decision-making in health and human services, the second phase of the study focused on state-level factors that were associated with expenditures of social service block grant funds for family planning.
Using the Sabatier-Mazmanian implementation model as a framework, many factors were tested. Those factors showing the strongest conceptual and statistical relationship to the dependent variable were used to construct a statistical model. Using multivariable regression analysis, this model was applied cross-sectionally to each of the years of the study. The most striking finding here was that the dominant determinants of state spending varied for each year of the study (Fiscal Years 1976-1981). The significance of these results was that they provided empirical support for current implementation theory, showing that the dominant determinants of implementation vary greatly over time.

Relevance:

100.00%

Publisher:

Abstract:

The objective of this dissertation was to design and implement strategies for assessment of exposures to organic chemicals used in the production of a styrene-butadiene polymer at the Texas Plastics Company (TPC). Linear statistical retrospective exposure models, univariate and multivariate, were developed based on the validation of historical industrial hygiene monitoring data collected by industrial hygienists at TPC, and additional current industrial hygiene monitoring data collected for the purposes of this study. The current monitoring data served several purposes. First, it provided information on current exposure data, in the form of unbiased estimates of mean exposure to organic chemicals for each job title included. Second, it provided information on homogeneity of exposure within each job title, through the use of a carefully designed sampling scheme which addressed variability of exposure both between and within job titles. Third, it permitted the investigation of how well current exposure data can serve as an evaluation tool for retrospective exposure estimation. Finally, this dissertation investigated the simultaneous evaluation of exposure to several chemicals, as well as the use of values below detection limits in a multivariate linear statistical model of exposures.

Relevance:

100.00%

Publisher:

Abstract:

The importance of race as a factor in mental health status has been a topic of controversy. This study reviews the history of research in this area and examines racial variances in the relationship between selected socio-demographic variables and general well-being. The study also examines the appropriateness of an additive versus an interactive statistical model for this investigation.
The sample consists of 6,913 persons who completed the General Well-Being Schedules as administered in the detailed component of the first National Health and Nutrition Examination Survey (NHANES I) conducted by the National Center for Health Statistics between April 1971 and October 1975. The sampling design is a multistage probability sample of clusters of persons in area-based segments. Of the 6,913 persons, 873 are Black.
Unlike other recent community-based mental health studies, this study revealed significant differences between the general well-being of Blacks and Whites. Blacks continued to exhibit significantly lower levels of well-being even after adjustments were made for income, education, marital status, sex, age and place of residence. Statistical interaction was found between race and sex, with Black females reporting lower levels of well-being than either Black or White males or their White female counterparts.
The study includes a detailed review of the NHANES I sample design. It is shown that selected aspects of the design make it difficult to render appropriate national comparisons of Black-White differences. As a result, conclusions pertaining to these differences based on NHANES I may be of questionable validity.
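As a rough illustration of the additive-versus-interactive distinction in the abstract above, the two model forms can be written as follows. The symbols ($W$ for the well-being score, $R$ and $S$ for race and sex indicators, $\mathbf{x}$ for the remaining covariates) are notation assumed here, not taken from the study.

```latex
\begin{align*}
\text{Additive:} \quad & W = \beta_0 + \beta_1 R + \beta_2 S + \boldsymbol{\gamma}^{\top}\mathbf{x} + \varepsilon \\
\text{Interactive:} \quad & W = \beta_0 + \beta_1 R + \beta_2 S + \beta_3 (R \times S) + \boldsymbol{\gamma}^{\top}\mathbf{x} + \varepsilon
\end{align*}
```

Under this notation, a nonzero $\beta_3$ corresponds to the kind of race-by-sex interaction the study reports.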

Relevance:

100.00%

Publisher:

Abstract:

The efficacy of waste stabilization lagoons for the treatment of five priority pollutants and two widely used commercial compounds was evaluated in laboratory model ponds. Three ponds were designed to simulate a primary anaerobic lagoon, a secondary facultative lagoon, and a tertiary aerobic lagoon. Biodegradation, volatilization, and sorption losses were quantified for bis(2-chloroethyl) ether, benzene, toluene, naphthalene, phenanthrene, ethylene glycol, and ethylene glycol monoethyl ether. A statistical model using a log normal transformation indicated biodegradation of bis(2-chloroethyl) ether followed first-order kinetics. Additionally, multiple regression analysis indicated biochemical oxygen demand was the water quality variable most highly correlated with bis(2-chloroethyl) ether effluent concentration.
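The abstract does not give the exact form of the statistical model, but one common way to check first-order kinetics is to regress the logarithm of concentration on time: if C(t) = C0·exp(−k·t), then ln C is linear in t with slope −k. The sketch below assumes that approach and uses synthetic, noise-free data; the function name `fit_first_order` and all numbers are illustrative only.

```python
import math


def fit_first_order(times, concentrations):
    """Least-squares fit of ln(C) = ln(C0) - k*t; returns (k, C0)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)


# Synthetic data (not from the study): C0 = 10, k = 0.2 per day.
times = [0, 1, 2, 4, 8]
conc = [10.0 * math.exp(-0.2 * t) for t in times]

k, c0 = fit_first_order(times, conc)
half_life = math.log(2) / k  # time for the concentration to halve
```

With real measurements, a roughly straight line on the log scale is what supports the first-order description; curvature would suggest a different rate law.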

Relevance:

40.00%

Publisher:

Abstract:

Genetic anticipation is defined as a decrease in age of onset or increase in severity as the disorder is transmitted through subsequent generations. Anticipation has been noted in the literature for over a century. Recently, anticipation in several diseases, including Huntington's Disease, Myotonic Dystrophy and Fragile X Syndrome, was shown to be caused by expansion of triplet repeats. Anticipation effects have also been observed in numerous mental disorders (e.g. Schizophrenia, Bipolar Disorder), cancers (Li-Fraumeni Syndrome, Leukemia) and other complex diseases.
Several statistical methods have been applied to determine whether anticipation is a true phenomenon in a particular disorder, including standard statistical tests and newly developed affected parent/affected child pair methods. These methods have been shown to be inappropriate for assessing anticipation for a variety of reasons, including familial correlation and low power. Therefore, we have developed family-based likelihood modeling approaches to model the underlying transmission of the disease gene and penetrance function and hence detect anticipation. These methods can be applied in extended families, thus improving the power to detect anticipation compared with existing methods based only upon parents and children. The first method we have proposed is based on the regressive logistic hazard model. This approach models anticipation by a generational covariate. The second method allows alleles to mutate as they are transmitted from parents to offspring and is appropriate for modeling the known triplet repeat diseases in which the disease alleles can become more deleterious as they are transmitted across generations.
To evaluate the new methods, we performed extensive simulation studies, with data simulated under different conditions, to assess the effectiveness of the algorithms in detecting genetic anticipation.
Results from analysis by the first method yielded empirical power greater than 87% based on the 5% type I error critical value identified in each simulation depending on the method of data generation and current age criteria. Analysis by the second method was not possible due to the current formulation of the software. The application of this method to Huntington's Disease and Li-Fraumeni Syndrome data sets revealed evidence for a generation effect in both cases.
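The idea of computing empirical power against a simulation-derived 5% critical value can be sketched as follows. This does not reproduce the family-based likelihood methods; the test statistics here are hypothetical normal draws, and the function names are assumptions made for illustration.

```python
import random


def empirical_critical_value(null_stats, alpha=0.05):
    """Upper-alpha empirical quantile of statistics simulated under the null."""
    ordered = sorted(null_stats)
    index = min(len(ordered) - 1, int((1 - alpha) * len(ordered)))
    return ordered[index]


def empirical_power(alt_stats, critical_value):
    """Fraction of alternative-hypothesis statistics exceeding the cutoff."""
    return sum(s > critical_value for s in alt_stats) / len(alt_stats)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical statistics: standard normal under the null,
# shifted by 3 under the alternative.
null_stats = [rng.gauss(0.0, 1.0) for _ in range(5000)]
alt_stats = [rng.gauss(3.0, 1.0) for _ in range(5000)]

crit = empirical_critical_value(null_stats)
type_one_rate = sum(s > crit for s in null_stats) / len(null_stats)
power = empirical_power(alt_stats, crit)
```

Deriving the cutoff from null simulations, rather than from an asymptotic distribution, keeps the type I error near the nominal level even when the statistic's null distribution is not known in closed form.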

Relevance:

40.00%

Publisher:

Abstract:

Objective. To measure the demand for primary care and its associated factors by building and estimating a demand model of primary care in urban settings.
Data source. Secondary data from the 2005 California Health Interview Survey (CHIS 2005), a population-based random-digit-dial telephone survey conducted by the UCLA Center for Health Policy Research in collaboration with the California Department of Health Services and the Public Health Institute between July 2005 and April 2006.
Study design. A literature review was done to specify the demand model by identifying relevant predictors and indicators. CHIS 2005 data were used for demand estimation.
Analytical methods. Probit regression was used to estimate the use/non-use equation, and negative binomial regression was applied to the utilization equation, whose dependent variable is a non-negative integer.
Results. The model included two equations: the use/non-use equation explained the probability of making a doctor visit in the past twelve months, and the utilization equation estimated the demand for primary care conditional on at least one visit. Among independent variables, wage rate and income did not affect primary care demand, whereas age had a negative effect on demand. People with college and graduate educational levels were associated with 1.03 (p < 0.05) and 1.58 (p < 0.01) more visits, respectively, compared with those with no formal education. Insurance was significantly and positively related to the demand for primary care (p < 0.01). Need-for-care variables exhibited positive effects on demand (p < 0.01). Existence of chronic disease was associated with 0.63 more visits, disability status was associated with 1.05 more visits, and people with poor health status had 4.24 more visits than those with excellent health status.
Conclusions. The average probability of visiting a doctor in the past twelve months was 85% and the average number of visits was 3.45.
The study emphasized the importance of need variables in explaining healthcare utilization, as well as the impact of insurance, employment and education on demand. The two-equation model of decision-making, estimated with probit and negative binomial regression, was a useful approach to demand estimation for primary care in urban settings.
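How the two equations combine into a prediction can be sketched as below. The coefficients and covariates are invented for illustration and are not the study's estimates; the sketch also assumes a log-link mean for the count equation and ignores any zero-truncation adjustment, since the abstract does not say how the conditional count model was specified.

```python
import math


def probit_probability(x, beta):
    """P(use) = Phi(x . beta), with Phi the standard normal CDF."""
    z = sum(xi * bi for xi, bi in zip(x, beta))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_mean_visits(x, gamma):
    """Log-link count mean exp(x . gamma), as in negative binomial regression."""
    return math.exp(sum(xi * gi for xi, gi in zip(x, gamma)))


# Hypothetical covariates: [intercept, insured (0/1), chronic disease (0/1)]
beta = [0.5, 0.6, 0.4]   # made-up probit coefficients (use/non-use equation)
gamma = [0.8, 0.2, 0.3]  # made-up count coefficients (utilization equation)
person = [1.0, 1.0, 0.0]  # insured, no chronic disease

p_use = probit_probability(person, beta)
visits_if_user = conditional_mean_visits(person, gamma)
expected_visits = p_use * visits_if_user  # unconditional expected visits
```

Separating the two parts lets a covariate such as insurance affect the decision to seek care at all differently from how it affects the number of visits among users.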

Relevance:

40.00%

Publisher:

Abstract:

Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations.
Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models, with higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies demonstrating that overall misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model.
Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification.
Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
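The correlation between a correct and a misspecified form can be computed directly when both have simple mathematical forms. The sketch below assumes, purely for illustration, that the true relationship is quadratic in a predictor that is uniform on [0, 1], and compares it with a linear term and a median-split categorical term. None of these choices come from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Midpoint grid approximating x ~ Uniform(0, 1) (an assumed distribution).
n = 10000
xs = [(i + 0.5) / n for i in range(n)]

correct = [x ** 2 for x in xs]               # assumed true functional form
linear = xs                                  # misspecified: linear term
categorical = [float(x >= 0.5) for x in xs]  # misspecified: median split

rho_linear = pearson(correct, linear)
rho_categorical = pearson(correct, categorical)
```

For this particular setup the linear correlation has the closed form sqrt(15)/4 (about 0.968), which the grid calculation approximates closely; the median split gives a lower correlation with the assumed true form.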

Relevance:

40.00%

Publisher:

Abstract:

The use of coal for fuel in place of oil and natural gas has been increasing in the United States. Typically, users store their reserves of coal outdoors in large piles and rainfall on the coal creates runoffs which may contain materials hazardous to the environment and the public's health. To study this hazard, rainfall on model coal piles was simulated, using deionized water and four coals of varying sulfur content. The simulated surface runoffs were collected during 9 rainfall simulations spaced 15 days apart. The runoffs were analyzed for 13 standard water quality parameters, extracted with organic solvents and then analyzed with capillary column GC/MS, and the extracts were tested for mutagenicity with the Ames Salmonella microsomal assay and for clastogenicity with Chinese hamster ovary cells.
The runoffs from the high-sulfur coals and the lignite exhibited extremes of pH (acidity), specific conductance, chemical oxygen demand, and total suspended solids; the low-sulfur coal runoffs did not exhibit these extremes. Without treatment, effluents from these high-sulfur coals and lignite would not comply with federal water quality guidelines.
Most extracts of the simulated surface runoffs contained at least 10 organic compounds including polycyclic aromatic hydrocarbons, their methyl and ethyl homologs, olefins, paraffins, and some terpenes. The concentrations of these compounds were generally less than 50 µg/L in most extracts.
Some of the extracts were weakly mutagenic and affected both a DNA-repair proficient and deficient Salmonella strain. The addition of S9 decreased the effect significantly. Extracts of runoffs from the low-sulfur coal were not mutagenic.
All extracts were clastogenic. Extracts of runoffs from the high-sulfur coals were both clastogenic and cytotoxic; those from the low-sulfur coal and the lignite were less clastogenic and not cytotoxic. Clastogenicity occurred with and without S9 activation.
Chromosomal lesions included gaps, breaks and exchanges. These data suggest a relationship between the sulfur content of a coal, its mutagenicity and also its clastogenicity.
The runoffs from actual coal piles should be investigated for possible genotoxic effects in view of the data presented in this study.