12 results for Model selection

in DigitalCommons@The Texas Medical Center


Relevance:

60.00%

Publisher:

Abstract:

When conducting a randomized comparative clinical trial, ethical, scientific, or economic considerations often motivate the use of interim decision rules after successive groups of patients have been treated. These decisions may pertain to the comparative efficacy or safety of the treatments under study, cost considerations, the desire to accelerate the drug evaluation process, or the likelihood of therapeutic benefit for future patients. At the time of each interim decision, an important question is whether patient enrollment should continue or be terminated, either due to a high probability that one treatment is superior to the other or a low probability that the experimental treatment will ultimately prove superior. The use of frequentist group sequential decision rules has become routine in the conduct of phase III clinical trials. In this dissertation, we present a new Bayesian decision-theoretic approach to designing a randomized group sequential clinical trial, focusing on two-arm trials with time-to-failure outcomes. Forward simulation is used to obtain optimal decision boundaries for each of a set of possible models. At each interim analysis, we use Bayesian model selection to adaptively choose the model having the largest posterior probability of being correct, and we then make the interim decision based on the boundaries that are optimal under the chosen model. We provide a simulation study comparing this method, which we call Bayesian Doubly Optimal Group Sequential (BDOGS), to corresponding frequentist designs using either O'Brien-Fleming (OF) or Pocock boundaries, as obtained from EaSt 2000. Our simulation results show that, over a wide variety of cases, BDOGS either performs at least as well as both OF and Pocock or, on average, yields a much smaller trial.
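The abstract does not spell out the model-selection machinery, so the sketch below illustrates only the interim step of choosing among candidate failure-time models by posterior probability. The exponential/Weibull candidate set, the BIC approximation to the marginal likelihoods, and the simulated data are all assumptions for illustration; the forward-simulated BDOGS boundaries are not reproduced.

```python
import numpy as np
from scipy import stats

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: -2*loglik + k*log(n)."""
    return -2.0 * loglik + n_params * np.log(n_obs)

def posterior_model_probs(times):
    """BIC-approximated posterior probabilities of two candidate
    failure-time models, assuming equal prior model probabilities."""
    n = len(times)
    # Exponential model: the ML estimate of the scale is the sample mean.
    ll_exp = np.sum(stats.expon.logpdf(times, scale=np.mean(times)))
    # Weibull model: ML fit of shape and scale (location fixed at zero).
    shape, _, scale = stats.weibull_min.fit(times, floc=0.0)
    ll_wei = np.sum(stats.weibull_min.logpdf(times, shape, scale=scale))
    bics = np.array([bic(ll_exp, 1, n), bic(ll_wei, 2, n)])
    w = np.exp(-0.5 * (bics - bics.min()))          # BIC model weights
    return dict(zip(["exponential", "weibull"], w / w.sum()))

# Simulated stand-in for the failure times accumulated by an interim look.
times = np.random.default_rng(1).weibull(1.5, size=80) * 12.0
probs = posterior_model_probs(times)
print(probs, "-> use boundaries optimal under", max(probs, key=probs.get))
```

Under the design described above, the interim stop/continue decision would then be made with the boundaries that are optimal under whichever model has the largest posterior probability.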

Relevance:

30.00%

Publisher:

Abstract:

Getting evidence-based sexual health education activities into schools can be a complicated process. Working models that assist our educational system in the selection, implementation, and maintenance of effective school-based adolescent health programs are needed. "Replicating sexual health programs in school-based settings: A model for schools" provides a comprehensive and applied approach that engages all of the important stakeholders within a school district. The results from this study hold much potential to inform Texas and the nation about how a coordinated, practical model can assist school districts in increasing the use of evidence-based programs addressing teen pregnancy prevention and sexual health issues.

Relevance:

30.00%

Publisher:

Abstract:

In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that would help identify individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that could reduce cancer risk and mortality. In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model is flexible enough to handle not only haplotypes of any size but also a variety of covariates. I then apply this method to three cancer-related data sets (Hodgkin Disease, Glioma, and Lung Cancer). I hypothesize that there is substantial improvement in the estimation of association between haplotypes and disease with the use of a Bayesian mathematical method that infers haplotypes using prior information from known genetic sources. Analyses based on haplotypes using information from publicly available genetic sources generally show increased odds ratios and smaller p-values in the Hodgkin, Glioma, and Lung data sets. For instance, the Bayesian Joint Logistic Model (BJLM) inferred haplotype TC had a substantially higher estimated effect size (OR = 12.16, 95% CI = 2.47-90.1 vs. OR = 9.24, 95% CI = 1.81-47.2) and a more significant p-value (0.00044 vs. 0.008) for Hodgkin Disease compared to a traditional logistic regression approach. The effect sizes of haplotypes modeled with recessive genetic effects were also higher, with more significant p-values, when analyzed with the BJLM. Full genetic models with haplotype information developed with the BJLM resulted in significantly higher discriminatory power and a significantly higher Net Reclassification Index than those developed with haplo.stats for lung cancer. Future work could incorporate the 1000 Genomes Project, which offers a larger selection of SNPs that can be incorporated into the information from known genetic sources. Other future analyses include testing non-binary outcomes, such as levels of biomarkers present in lung cancer (NNK), and extending this analysis to full genome-wide association studies.
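As a point of reference, here is a minimal sketch of the traditional logistic regression comparator described above, regressing case status on per-subject copies of a candidate haplotype and reporting the OR, 95% CI, and p-value. The simulated data and effect size are illustrative assumptions; the BJLM itself is not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
hap_tc = rng.binomial(2, 0.15, size=n)        # copies of haplotype "TC" per subject
logit = -2.0 + 0.9 * hap_tc                   # assumed true log-odds model
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # case/control status

X = sm.add_constant(hap_tc.astype(float))
fit = sm.Logit(y, X).fit(disp=0)
or_hat = np.exp(fit.params[1])                # haplotype odds ratio
lo, hi = np.exp(fit.conf_int()[1])            # 95% confidence interval
print(f"OR = {or_hat:.2f}, 95% CI = ({lo:.2f}, {hi:.2f}), p = {fit.pvalues[1]:.4g}")
```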

Relevance:

30.00%

Publisher:

Abstract:

We have developed a novel way to assess the mutagenicity of environmentally important metal carcinogens, such as nickel, by creating a positive selection system based upon the conditional expression of a retroviral transforming gene. The target gene is the v-mos gene in MuSVts110, a murine retrovirus possessing a growth-temperature-dependent defect in expression of the transforming gene due to viral RNA splicing. In normal rat kidney cells infected with MuSVts110 (6m2 cells), splicing of the MuSVts110 RNA to form the mRNA encoding the transforming protein p85^gag-mos is growth-temperature dependent, occurring at 33°C and below but not at 39°C and above. This splicing "defect" is mediated by cis-acting viral sequences. Nickel chloride treatment of 6m2 cells, followed by growth at 39°C, allowed the selection of "revertant" cells that constitutively express p85^gag-mos due to stable changes in the viral RNA splicing phenotype, suggesting that nickel, a carcinogen whose mutagenicity has not been well established, can induce mutations in mammalian genes. We also show, by direct sequencing of PCR-amplified integrated MuSVts110 DNA from a 6m2 nickel-revertant cell line, that the nickel-induced mutation affecting the splicing phenotype is a cis-acting 70-base duplication of a region of the viral DNA surrounding the 3′ splice site. These findings provide the first example of the molecular basis for a nickel-induced DNA lesion and establish the mutagenicity of this potent carcinogen.

Relevance:

30.00%

Publisher:

Abstract:

Theoretical and empirical studies were conducted on the pattern of nucleotide and amino acid substitution in evolution, taking into account the effects of mutation at the nucleotide level and purifying selection at the amino acid level. A theoretical model for predicting the evolutionary change in electrophoretic mobility of a protein was also developed using information on the pattern of amino acid substitution. The specific problems studied and the main results obtained are as follows. (1) Estimation of the pattern of nucleotide substitution in nuclear DNA genomes. The patterns of point mutation and nucleotide substitution among the four nucleotides are inferred from the evolutionary changes of pseudogenes and functional genes, respectively. Both patterns are non-random, the rate of change varying considerably with nucleotide pair, and in both cases transitions occur somewhat more frequently than transversions. In protein evolution, substitution occurs more often between amino acids with similar physico-chemical properties than between dissimilar amino acids. (2) Estimation of the pattern of nucleotide substitution in RNA genomes. The majority of mutations in retroviruses accumulate at the reverse transcription stage. Selection at the amino acid level is very weak, and almost non-existent between synonymous codons. The pattern of mutation is very different from that in DNA genomes. Nevertheless, the pattern of purifying selection at the amino acid level is similar to that in DNA genomes, although selection intensity is much weaker. (3) Evaluation of the determinants of molecular evolutionary rates in protein-coding genes. Based on rates of nucleotide substitution for mammalian genes, the rate of amino acid substitution of a protein is determined by its amino acid composition; the content of glycine, in particular, is shown to correlate strongly and negatively with the rate of substitution. Empirical formulae, called indices of mutability, are developed to predict the rate of molecular evolution of a protein from its amino acid sequence. (4) Studies on the evolutionary patterns of electrophoretic mobility of proteins. A theoretical model was constructed that predicts the electric charge of a protein at any given pH, and its isoelectric point, from data on its primary and quaternary structures. Using this model, the evolutionary change in electrophoretic mobilities of different proteins and the expected amount of electrophoretically hidden genetic variation were studied. In the absence of selection for the pI value, proteins will on average evolve toward a mildly basic pI. (Abstract shortened with permission of author.)
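Item (4) rests on treating each ionizable group with the Henderson-Hasselbalch equation; below is a minimal sketch under that assumption, computing net charge at a given pH and locating the isoelectric point by bisection. The pKa table is an illustrative generic set, not the dissertation's parameterization, and quaternary structure is ignored here.

```python
# Illustrative pKa values for ionizable groups (EMBOSS-like defaults);
# the dissertation's exact parameterization is not given in the abstract.
PKA_POS = {"K": 10.8, "R": 12.5, "H": 6.5}           # basic side chains
PKA_NEG = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}  # acidic side chains
PKA_NTERM, PKA_CTERM = 8.6, 3.6

def net_charge(seq, ph):
    """Net protein charge at a given pH via Henderson-Hasselbalch."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_NTERM))    # N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_CTERM - ph))   # C-terminus
    for res, pka in PKA_POS.items():
        charge += seq.count(res) / (1.0 + 10 ** (ph - pka))
    for res, pka in PKA_NEG.items():
        charge -= seq.count(res) / (1.0 + 10 ** (pka - ph))
    return charge

def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection for the pH at which the net charge crosses zero
    (net charge decreases monotonically with pH)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if net_charge(seq, mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

print(f"pI = {isoelectric_point('MKKLLDEERRHGYC'):.2f}")  # toy sequence
```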

Relevance:

30.00%

Publisher:

Abstract:

Natural selection is one of the major factors in the evolution of all organisms, and detecting its signature has been a central theme in evolutionary genetics. With the availability of microsatellite data, it is of interest to study how natural selection can be detected with microsatellites.

The overall aim of this research is to detect signatures of natural selection using data on genetic variation at microsatellite loci. The null hypothesis to be tested is the neutral mutation theory of molecular evolution, which states that different alleles at a locus have equivalent effects on fitness. Currently used tests of this hypothesis, based on data on genetic polymorphism in natural populations, presume that mutations at the loci follow the infinite allele/site models (IAM, ISM), in the sense that at each site at most one mutation event is recorded and each mutation leads to an allele not seen before in the population. Microsatellite loci, which are abundant in the genome, do not obey these mutation models, since new alleles at such loci can be created by either contraction or expansion of the tandem repeat count of the core motif. Since the current genome map is mainly composed of microsatellite loci, and this class of loci is still the most commonly studied in the context of human genome diversity, this research explores how current procedures for testing the neutral mutation hypothesis should be modified to take into account a generalized model of forward-backward stepwise mutations. In addition, recent literature suggests that the past demographic history of populations, the presence of population substructure, and varying mutation rates across loci all have confounding effects on detecting signatures of natural selection.

The effects of the stepwise mutation model and of these confounding factors on detecting signatures of natural selection are the main results of the research.
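To make the stepwise mutation model concrete, here is a minimal simulation sketch (not taken from the dissertation): Wright-Fisher drift at a single microsatellite locus with symmetric single-step forward-backward mutations. The population size, mutation rate, and starting repeat count are arbitrary illustrative choices.

```python
import numpy as np

def simulate_smm(n_chrom=400, n_generations=2000, mu=5e-4, start=20, seed=3):
    """Wright-Fisher drift at one microsatellite locus with a symmetric
    single-step (forward-backward) stepwise mutation model."""
    rng = np.random.default_rng(seed)
    alleles = np.full(n_chrom, start)                # tandem repeat counts
    for _ in range(n_generations):
        alleles = rng.choice(alleles, size=n_chrom)  # resampling = drift
        hits = rng.random(n_chrom) < mu              # mutation events
        alleles[hits] += rng.choice([-1, 1], size=int(hits.sum()))
    return alleles

a = simulate_smm()
_, counts = np.unique(a, return_counts=True)
freqs = counts / a.size
print(f"{counts.size} alleles, sample homozygosity = {np.sum(freqs**2):.3f}")
```

Unlike the IAM/ISM, mutation here can recreate an existing allele size, which is why test procedures built on those models need modification.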

Relevance:

30.00%

Publisher:

Abstract:

With hundreds of single nucleotide polymorphisms (SNPs) in a candidate gene and millions of SNPs across the genome, selecting an informative subset of SNPs to maximize the ability to detect genotype-phenotype association is of great interest and importance. In addition, with a large number of SNPs, analytic methods are needed that allow investigators to control the false positive rate resulting from large numbers of SNP genotype-phenotype analyses. This dissertation uses simulated data to explore methods for selecting SNPs for genotype-phenotype association studies. I examined the pattern of linkage disequilibrium (LD) across a candidate gene region and used this pattern to aid in localizing a disease-influencing mutation. The results indicate that the r² measure of linkage disequilibrium is preferred over the common D′ measure for use in genotype-phenotype association studies. Using stepwise linear regression, the best predictor of the quantitative trait was usually not the single functional mutation but rather a SNP in high linkage disequilibrium with it. Next, I compared three strategies for selecting SNPs for phenotype association studies: selection based on measures of linkage disequilibrium, selection based on a measure of haplotype diversity, and random selection. The results demonstrate that SNPs selected for maximum haplotype diversity are more informative and yield higher power than randomly selected SNPs or SNPs selected on the basis of low pairwise LD. The data also indicate that for genes with a small contribution to the phenotype, it is more prudent for investigators to increase their sample size than to continuously increase the number of SNPs in order to improve statistical power. When typing large numbers of SNPs, researchers face the challenge of using a statistical method that controls the type I error rate while maintaining adequate power. I show that an empirical genotype-based multi-locus global test, using permutation to obtain the null distribution of the maximum test statistic, maintains the desired overall type I error rate without overly sacrificing statistical power. The results also show that when the penetrance model is simple, the multi-locus global test does as well as or better than haplotype analysis; for more complex models, however, haplotype analyses offer advantages. The results of this dissertation will be of utility to human geneticists designing large-scale multi-locus genotype-phenotype association studies.
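For reference, both LD measures compared above follow directly from two-locus haplotype and allele frequencies. A minimal sketch using the standard definitions, with invented illustrative frequencies:

```python
def ld_measures(p_ab, p_a, p_b):
    """D' and r^2 for two biallelic loci, given the frequency p_ab of
    haplotype AB and the allele frequencies p_a and p_b."""
    d = p_ab - p_a * p_b                   # coefficient of disequilibrium
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2

# Illustrative frequencies: D' and r^2 can disagree substantially.
print(ld_measures(p_ab=0.40, p_a=0.50, p_b=0.60))   # (0.5, 0.1667)
```

The example output shows why the two measures can rank SNP pairs differently: D′ is 0.5 while r², which more directly tracks power to detect association through a marker, is only about 0.17.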

Relevance:

30.00%

Publisher:

Abstract:

In the Practice Change Model, physicians act as key stakeholders: people who have both an investment in the practice and the capacity to influence how the practice performs. This leadership role is critical to the development and change of the practice, and leadership roles and effectiveness are an important factor in quality improvement in primary care practices.

The study involved a comparative case study analysis to identify leadership roles and the relationship between leadership roles and the number and type of quality improvement strategies adopted during a Practice Change Model-based intervention study. The research utilized secondary data from four primary care practices with various leadership styles. The practices are located in the San Antonio region and serve a large Hispanic population. The data were collected by two ABC Project Facilitators from each practice during a 12-month period and include Key Informant Interviews (all staff members), a Multi-method Assessment Process (MAP), and practice facilitation field notes. These data were used to evaluate leadership styles, management within the practice, and the intervention tools that were implemented. The chief steps were (1) to analyze whether leader-member relations contribute to the type of quality improvement strategy or strategies selected, (2) to investigate whether leader-position power contributes to the number and type of strategies selected, and (3) to explore whether task structure varies across the four primary care practices.

The research found that involving more members of the clinic staff in decision-making, building bridges between organizational staff and clinical staff, and task structure are all associated with a direct influence on the number and type of quality improvement strategies implemented in primary care practice.

Although this research investigated the leadership styles of only four practices, it offers future guidance on how to establish priorities and implement the quality improvement strategies that will have the greatest impact on patient care improvement.

Relevance:

30.00%

Publisher:

Abstract:

Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples from 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, Cp and Sp, each combined with an 'all possible subsets' or 'forward selection' search over variables. The estimators of performance utilized include parametric (MSEPm) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.

The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of Cp and Sp. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.

Only the random-split estimator is conditionally (on β) unbiased; however, MSEPm is unbiased on average, and PRESS is nearly so in unselected (fixed-form) models. When subset selection techniques are used, MSEPm and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random-split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.

To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g., PRESS) be used for assessment.
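The recommended leave-one-out assessment need not refit n models: for OLS, the deleted residual satisfies e_(i) = e_i / (1 - h_ii), where h_ii is the leverage. A minimal sketch with simulated stochastic regressors (an illustration, not the dissertation's code):

```python
import numpy as np

def press_statistic(X, y):
    """Leave-one-out PRESS for OLS without refitting, using the
    hat-matrix shortcut: deleted residual e_(i) = e_i / (1 - h_ii)."""
    X1 = np.column_stack([np.ones(len(y)), X])          # add intercept
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    h = np.diag(X1 @ np.linalg.inv(X1.T @ X1) @ X1.T)   # leverages
    return np.sum((resid / (1.0 - h)) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                            # stochastic regressors
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=50)
print(f"PRESS = {press_statistic(X, y):.2f}")
```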

Relevance:

30.00%

Publisher:

Abstract:

Standardization is a common method for adjusting for confounding factors when comparing two or more exposure categories to assess excess risk. An arbitrary choice of standard population in standardization introduces selection bias due to the healthy worker effect. Small samples in specific groups also pose problems for estimating relative risk and assessing statistical significance. As an alternative, statistical models have been proposed to overcome such limitations and obtain adjusted rates. In this dissertation, a multiplicative model is considered to address the issues related to standardized indices, namely the Standardized Mortality Ratio (SMR) and the Comparative Mortality Factor (CMF). The model provides an alternative to the conventional standardization technique. Maximum likelihood estimates of the model parameters are used to construct an index, similar to the SMR, for estimating the relative risk of the exposure groups under comparison. A parametric bootstrap resampling method is used to evaluate the goodness of fit of the model, the behavior of the estimated parameters, and the variability in relative risk on generated samples. The model provides an alternative to both the direct and indirect standardization methods.
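For concreteness, a minimal sketch of the conventional SMR (indirect standardization) together with a parametric-bootstrap interval of the kind described above, assuming Poisson stratum death counts. The strata, rates, and person-years are invented, and the multiplicative model itself is not reproduced:

```python
import numpy as np

def smr_bootstrap(observed, person_years, std_rates, n_boot=10_000, seed=2):
    """SMR = total observed / total expected deaths, with a parametric
    bootstrap 95% CI assuming Poisson stratum death counts."""
    rng = np.random.default_rng(seed)
    expected = float(np.sum(person_years * std_rates))  # indirect standardization
    smr = observed.sum() / expected
    boot = rng.poisson(observed, size=(n_boot, observed.size)).sum(axis=1) / expected
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return smr, (lo, hi)

obs = np.array([12, 30, 25])                 # deaths by age stratum (invented)
pyrs = np.array([4000.0, 9000.0, 5000.0])    # cohort person-years at risk
rates = np.array([0.002, 0.003, 0.005])      # standard-population death rates
smr, ci = smr_bootstrap(obs, pyrs, rates)
print(f"SMR = {smr:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```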

Relevance:

30.00%

Publisher:

Abstract:

When choosing among models to describe categorical data, the necessity to consider interactions makes selection more difficult. With just four variables, considering all interactions, there are 166 different hierarchical models and many more non-hierarchical models. Two procedures have been developed for categorical data which will produce the "best" subset or subsets of each model size, where size refers to the number of effects in the model. Both procedures are patterned after the Leaps and Bounds approach used by Furnival and Wilson for continuous data and do not generally require fitting all models. For hierarchical models, likelihood ratio statistics (G²) are computed using iterative proportional fitting, and "best" is determined by comparing, among models with the same number of effects, Pr(χ²_k ≥ G²_ij), where k is the degrees of freedom for the ith model of size j. To fit non-hierarchical as well as hierarchical models, a weighted least squares procedure has been developed.

The procedures are applied to published occupational data relating to the occurrence of byssinosis, and these results are compared to previously published analyses of the same data. The procedures are also applied to published data on symptoms in psychiatric patients and again compared to previously published analyses.

These procedures will make categorical data analysis more accessible to researchers who are not statisticians. They should also encourage more complex exploratory analyses of epidemiologic data and contribute to the development of new hypotheses for study.
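As a small illustration of the G² statistic underlying the procedure, the sketch below computes G² = 2 Σ O ln(O/E) and its chi-square tail probability for the simplest hierarchical model, independence in a two-way table; the byssinosis-style counts are invented:

```python
import numpy as np
from scipy import stats

def g2_independence(table):
    """Likelihood-ratio statistic G^2 = 2*sum(O*ln(O/E)) and its p-value
    for the independence model in a two-way contingency table."""
    o = np.asarray(table, dtype=float)
    e = np.outer(o.sum(axis=1), o.sum(axis=0)) / o.sum()  # fitted counts
    nz = o > 0                                            # 0*ln(0) taken as 0
    g2 = 2.0 * np.sum(o[nz] * np.log(o[nz] / e[nz]))
    df = (o.shape[0] - 1) * (o.shape[1] - 1)
    return g2, stats.chi2.sf(g2, df)

# Invented 2x3 table, e.g. symptom status by dust exposure level.
g2, p = g2_independence([[20, 35, 45], [10, 40, 80]])
print(f"G2 = {g2:.2f}, p = {p:.4f}")
```

For higher-way tables and models without closed-form fitted counts, the expected counts would come from iterative proportional fitting, as the abstract describes.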

Relevance:

30.00%

Publisher:

Abstract:

The genomic era brought about by recent advances in next-generation sequencing technology makes genome-wide scans for natural selection a reality. Currently, almost all statistical tests and analytical methods for identifying genes under selection are performed on an individual-gene basis. Although these methods have the power to identify genes subject to strong selection, they have limited power for discovering genes targeted by moderate or weak selection forces, which are crucial for understanding the molecular mechanisms of complex phenotypes and diseases. The recent availability and rapid growth of gene network and protein-protein interaction databases open an avenue for enhancing the power of discovering genes under natural selection. The aim of this thesis is to explore and develop normal-mixture-model-based methods for leveraging gene network information to enhance the power of natural selection target gene discovery. The results show that the developed statistical method, which combines the posterior log odds of the standard normal mixture model with the Guilt-By-Association score of the gene network in a naïve Bayes framework, has the power to discover moderately or weakly selected genes that bridge genes under strong selection, and that it advances our understanding of the biology underlying complex diseases and related natural selection phenotypes.
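A minimal sketch of how such a naïve Bayes combination could look, assuming each gene has a selection statistic (here a z-score) and a precomputed Guilt-By-Association log-odds score. The two-component Gaussian mixture, the labeling of the higher-mean component as "selected", and the simulated inputs are illustrative rather than taken from the thesis:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def combined_log_odds(z, gba_log_odds):
    """Naive Bayes combination: posterior log odds of 'under selection'
    from a two-component normal mixture fit to per-gene statistics,
    added to an independent network Guilt-By-Association log-odds."""
    gm = GaussianMixture(n_components=2, random_state=0).fit(z.reshape(-1, 1))
    sel = int(np.argmax(gm.means_.ravel()))   # higher-mean component = "selected"
    post = gm.predict_proba(z.reshape(-1, 1))[:, sel]
    eps = 1e-12                               # guard against log(0)
    return np.log((post + eps) / (1 - post + eps)) + gba_log_odds

rng = np.random.default_rng(5)
z = np.concatenate([rng.normal(0, 1, 900),    # mostly neutral genes
                    rng.normal(2.5, 1, 100)]) # a minority under selection
gba = rng.normal(0, 0.5, size=z.size)         # toy network scores
print(np.sort(combined_log_odds(z, gba))[-5:])  # top-ranked candidates
```

Adding the two log-odds terms encodes the naïve Bayes independence assumption: the network evidence can promote a gene whose marginal statistic alone would fall below a genome-wide threshold.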