6 resultados para Resampling
em DigitalCommons@The Texas Medical Center
Resumo:
Voluntary control of information processing is crucial to allocate resources and prioritize the processes that are most important under a given situation; the algorithms underlying such control, however, are often not clear. We investigated possible algorithms of control for the performance of the majority function, in which participants searched for and identified one of two alternative categories (left or right pointing arrows) as composing the majority in each stimulus set. We manipulated the amount (set size of 1, 3, and 5) and content (ratio of left and right pointing arrows within a set) of the inputs to test competing hypotheses regarding mental operations for information processing. Using a novel measure based on computational load, we found that reaction time was best predicted by a grouping search algorithm as compared to alternative algorithms (i.e., exhaustive or self-terminating search). The grouping search algorithm involves sampling and resampling of the inputs before a decision is reached. These findings highlight the importance of investigating the implications of voluntary control via algorithms of mental operations.
Resumo:
Following up genetic linkage studies to identify the underlying susceptibility gene(s) for complex disease traits is an arduous yet biologically and clinically important task. Complex traits, such as hypertension, are considered polygenic with many genes influencing risk, each with small effects. Chromosome 2 has been consistently identified as a genomic region with genetic linkage evidence suggesting that one or more loci contribute to blood pressure levels and hypertension status. Using combined positional candidate gene methods, the Family Blood Pressure Program has concentrated efforts in investigating this region of chromosome 2 in an effort to identify underlying candidate hypertension susceptibility gene(s). Initial informatics efforts identified the boundaries of the region and the known genes within it. A total of 82 polymorphic sites in eight positional candidate genes were genotyped in a large hypothesis-generating sample consisting of 1640 African Americans, 1339 whites, and 1616 Mexican Americans. To adjust for multiple comparisons, resampling-based false discovery adjustment was applied, extending traditional resampling methods to sibship samples. Following this adjustment for multiple comparisons, SLC4A5, a sodium bicarbonate transporter, was identified as a primary candidate gene for hypertension. Polymorphisms in SLC4A5 were subsequently genotyped and analyzed for validation in two populations of African Americans (N = 461; N = 778) and two of whites (N = 550; N = 967). Again, SNPs within SLC4A5 were significantly associated with blood pressure levels and hypertension status. While not identifying a single causal DNA sequence variation that is significantly associated with blood pressure levels and hypertension status across all samples, the results further implicate SLC4A5 as a candidate hypertension susceptibility gene, validating previous evidence for one or more genes on chromosome 2 that influence hypertension related phenotypes in the population-at-large. The methodology and results reported provide a case study of one approach for following up the results of genetic linkage analyses to identify genes influencing complex traits. ^
Resumo:
Standardization is a common method for adjusting confounding factors when comparing two or more exposure category to assess excess risk. Arbitrary choice of standard population in standardization introduces selection bias due to healthy worker effect. Small sample in specific groups also poses problems in estimating relative risk and the statistical significance is problematic. As an alternative, statistical models were proposed to overcome such limitations and find adjusted rates. In this dissertation, a multiplicative model is considered to address the issues related to standardized index namely: Standardized Mortality Ratio (SMR) and Comparative Mortality Factor (CMF). The model provides an alternative to conventional standardized technique. Maximum likelihood estimates of parameters of the model are used to construct an index similar to the SMR for estimating relative risk of exposure groups under comparison. Parametric Bootstrap resampling method is used to evaluate the goodness of fit of the model, behavior of estimated parameters and variability in relative risk on generated sample. The model provides an alternative to both direct and indirect standardization method. ^
Resumo:
In geographical epidemiology, maps of disease rates and disease risk provide a spatial perspective for researching disease etiology. For rare diseases or when the population base is small, the rate and risk estimates may be unstable. Empirical Bayesian (EB) methods have been used to spatially smooth the estimates by permitting an area estimate to "borrow strength" from its neighbors. Such EB methods include the use of a Gamma model, of a James-Stein estimator, and of a conditional autoregressive (CAR) process. A fully Bayesian analysis of the CAR process is proposed. One advantage of this fully Bayesian analysis is that it can be implemented simply by using repeated sampling from the posterior densities. Use of a Markov chain Monte Carlo technique such as Gibbs sampler was not necessary. Direct resampling from the posterior densities provides exact small sample inferences instead of the approximate asymptotic analyses of maximum likelihood methods (Clayton & Kaldor, 1987). Further, the proposed CAR model provides for covariates to be included in the model. A simulation demonstrates the effect of sample size on the fully Bayesian analysis of the CAR process. The methods are applied to lip cancer data from Scotland, and the results are compared. ^
Resumo:
Logistic regression is one of the most important tools in the analysis of epidemiological and clinical data. Such data often contain missing values for one or more variables. Common practice is to eliminate all individuals for whom any information is missing. This deletion approach does not make efficient use of available information and often introduces bias.^ Two methods were developed to estimate logistic regression coefficients for mixed dichotomous and continuous covariates including partially observed binary covariates. The data were assumed missing at random (MAR). One method (PD) used predictive distribution as weight to calculate the average of the logistic regressions performing on all possible values of missing observations, and the second method (RS) used a variant of resampling technique. Additional seven methods were compared with these two approaches in a simulation study. They are: (1) Analysis based on only the complete cases, (2) Substituting the mean of the observed values for the missing value, (3) An imputation technique based on the proportions of observed data, (4) Regressing the partially observed covariates on the remaining continuous covariates, (5) Regressing the partially observed covariates on the remaining continuous covariates conditional on response variable, (6) Regressing the partially observed covariates on the remaining continuous covariates and response variable, and (7) EM algorithm. Both proposed methods showed smaller standard errors (s.e.) for the coefficient involving the partially observed covariate and for the other coefficients as well. However, both methods, especially PD, are computationally demanding; thus for analysis of large data sets with partially observed covariates, further refinement of these approaches is needed. ^
Resumo:
The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is an obvious carcinogen for lung cancer. Since CBMN (Cytokinesis-blocked micronucleus) has been found to be extremely sensitive to NNK-induced genetic damage, it is a potential important factor to predict the lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by CBMN assay has not been rigorously examined. ^ This research develops a methodology to model the chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Since these chromosomal changes were usually not observed very long due to laboratory cost and time, a resampling technique was applied to generate the Markov chain of the normal and the damaged cell for each individual. A joint likelihood between the resampled Markov chains and the logistic regression model including transition probabilities of this chain as covariates was established. The Maximum likelihood estimation was applied to carry on the statistical test for comparison. The ability of this approach to increase discriminating power to predict lung cancer was compared to a baseline "non-genetic" model. ^ Our method offered an option to understand the association between the dynamic cell information and lung cancer. Our study indicated the extent of DNA damage/non-damage using the CBMN assay provides critical information that impacts public health studies of lung cancer risk. This novel statistical method could simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.^