902 resultados para Incidental parameter bias
Resumo:
Cognitive processes are influenced by underlying affective states, and tests of cognitive bias have recently been developed to assess the valence of affective states in animals. These tests are based on the fact that individuals in a negative affective state interpret ambiguous stimuli more pessimistically than individuals in a more positive state. Using two strains of mice we explored whether unpredictable chronic mild stress (UCMS) can induce a negative judgement bias and whether variation in the expression of stereotypic behaviour is associated with variation in judgement bias. Sixteen female CD-1 and 16 female C57BL/6 mice were trained on a tactile conditional discrimination test with grade of sandpaper as a cue for differential food rewards. Once they had learned the discrimination, half of the mice were subjected to UCMS for three weeks to induce a negative affective state. Although UCMS induced a reduced preference for the higher value reward in the judgement bias test, it did not affect saccharine preference or hypothalamic–pituitary–adrenal (HPA) activity. However, UCMS affected responses to ambiguous (intermediate) cues in the judgement bias test. While control mice showed a graded response to ambiguous cues, UCMS mice of both strains did not discriminate between ambiguous cues and tended to show shorter latencies to the ambiguous cues and the negative reference cue. UCMS also increased bar-mouthing in CD-1, but not in C57BL/6 mice. Furthermore, mice with higher levels of stereotypic behaviour made more optimistic choices in the judgement bias test. However, no such relationship was found for stereotypic bar-mouthing, highlighting the importance of investigating different types of stereotypic behaviour separately.
Resumo:
Behavioural tests to assess affective states are widely used in human research and have recently been extended to animals. These tests assume that affective state influences cognitive processing, and that animals in a negative affective state interpret ambiguous information as expecting a negative outcome (displaying a negative cognitive bias). Most of these tests however, require long discrimination training. The aim of the study was to validate an exploration based cognitive bias test, using two different handling methods, as previous studies have shown that standard tail handling of mice increases physiological and behavioural measures of anxiety compared to cupped handling. Therefore, we hypothesised that tail handled mice would display a negative cognitive bias. We handled 28 female CD-1 mice for 16 weeks using either tail handling or cupped handling. The mice were then trained in an eight arm radial maze, where two adjacent arms predicted a positive outcome (darkness and food), while the two opposite arms predicted a negative outcome (no food, white noise and light). After six days of training, the mice were also given access to the four previously unavailable intermediate ambiguous arms of the radial maze and tested for cognitive bias. We were unable to validate this test, as mice from both handling groups displayed a similar pattern of exploration. Furthermore, we examined whether maze exploration is affected by the expression of stereotypic behaviour in the home cage. Mice with higher levels of stereotypic behaviour spent more time in positive arms and avoided ambiguous arms, displaying a negative cognitive bias. While this test needs further validation, our results indicate that it may allow the assessment of affective state in mice with minimal training— a major confound in current cognitive bias paradigms.
Resumo:
Knowles, Persico, and Todd (2001) develop a model of police search and offender behavior. Their model implies that if police are unprejudiced the rate of guilt should not vary across groups. Using data from Interstate 95 in Maryland, they find equal guilt rates for African-Americans and whites and conclude that the data is not consistent with racial prejudice against African-Americans. This paper generalizes the model of Knowles, Persico, and Todd by accounting for the fact that potential offenders are frequently not observed by the police and by including two different levels of offense severity. The paper shows that for African-American males the data is consistent with prejudice against African-American males, no prejudice, and reverse discrimination depending on the form of equilibria that exists in the economy. Additional analyses based on stratification by type of vehicle and time of day were conducted, but did not shed any light on the form of equilibria that best represents the situation in Maryland during the sample period.
Resumo:
This paper proposes asymptotically optimal tests for unstable parameter process under the feasible circumstance that the researcher has little information about the unstable parameter process and the error distribution, and suggests conditions under which the knowledge of those processes does not provide asymptotic power gains. I first derive a test under known error distribution, which is asymptotically equivalent to LR tests for correctly identified unstable parameter processes under suitable conditions. The conditions are weak enough to cover a wide range of unstable processes such as various types of structural breaks and time varying parameter processes. The test is then extended to semiparametric models in which the underlying distribution in unknown but treated as unknown infinite dimensional nuisance parameter. The semiparametric test is adaptive in the sense that its asymptotic power function is equivalent to the power envelope under known error distribution.
Resumo:
Signatur des Originals: S 36/G10472
Resumo:
Signatur des Originals: S 36/G10473
Resumo:
Monte Carlo simulation has been conducted to investigate parameter estimation and hypothesis testing in some well known adaptive randomization procedures. The four urn models studied are Randomized Play-the-Winner (RPW), Randomized Pôlya Urn (RPU), Birth and Death Urn with Immigration (BDUI), and Drop-the-Loses Urn (DL). Two sequential estimation methods, the sequential maximum likelihood estimation (SMLE) and the doubly adaptive biased coin design (DABC), are simulated at three optimal allocation targets that minimize the expected number of failures under the assumption of constant variance of simple difference (RSIHR), relative risk (ORR), and odds ratio (OOR) respectively. Log likelihood ratio test and three Wald-type tests (simple difference, log of relative risk, log of odds ratio) are compared in different adaptive procedures. ^ Simulation results indicates that although RPW is slightly better in assigning more patients to the superior treatment, the DL method is considerably less variable and the test statistics have better normality. When compared with SMLE, DABC has slightly higher overall response rate with lower variance, but has larger bias and variance in parameter estimation. Additionally, the test statistics in SMLE have better normality and lower type I error rate, and the power of hypothesis testing is more comparable with the equal randomization. Usually, RSIHR has the highest power among the 3 optimal allocation ratios. However, the ORR allocation has better power and lower type I error rate when the log of relative risk is the test statistics. The number of expected failures in ORR is smaller than RSIHR. It is also shown that the simple difference of response rates has the worst normality among all 4 test statistics. The power of hypothesis test is always inflated when simple difference is used. On the other hand, the normality of the log likelihood ratio test statistics is robust against the change of adaptive randomization procedures. ^
Resumo:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C$\sb{\rm p}$ and S$\sb{\rm p}$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$\sb{\rm m}$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.^ The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches but no differences are detected between the performances of C$\sb{\rm p}$ and S$\sb{\rm p}$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.^ Only the random split estimator is conditionally (on $\\beta$) unbiased, however MSEP$\sb{\rm m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$\sb{\rm m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.^ To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment. ^
Resumo:
Additive and multiplicative models of relative risk were used to measure the effect of cancer misclassification and DS86 random errors on lifetime risk projections in the Life Span Study (LSS) of Hiroshima and Nagasaki atomic bomb survivors. The true number of cancer deaths in each stratum of the cancer mortality cross-classification was estimated using sufficient statistics from the EM algorithm. Average survivor doses in the strata were corrected for DS86 random error ($\sigma$ = 0.45) by use of reduction factors. Poisson regression was used to model the corrected and uncorrected mortality rates with covariates for age at-time-of-bombing, age at-time-of-death and gender. Excess risks were in good agreement with risks in RERF Report 11 (Part 2) and the BEIR-V report. Bias due to DS86 random error typically ranged from $-$15% to $-$30% for both sexes, and all sites and models. The total bias, including diagnostic misclassification, of excess risk of nonleukemia for exposure to 1 Sv from age 18 to 65 under the non-constant relative projection model was $-$37.1% for males and $-$23.3% for females. Total excess risks of leukemia under the relative projection model were biased $-$27.1% for males and $-$43.4% for females. Thus, nonleukemia risks for 1 Sv from ages 18 to 85 (DRREF = 2) increased from 1.91%/Sv to 2.68%/Sv among males and from 3.23%/Sv to 4.02%/Sv among females. Leukemia excess risks increased from 0.87%/Sv to 1.10%/Sv among males and from 0.73%/Sv to 1.04%/Sv among females. Bias was dependent on the gender, site, correction method, exposure profile and projection model considered. Future studies that use LSS data for U.S. nuclear workers may be downwardly biased if lifetime risk projections are not adjusted for random and systematic errors. (Supported by U.S. NRC Grant NRC-04-091-02.) ^
Resumo:
This study establishes the extent and relevance of bias of population estimates of prevalence, incidence, and intensity of infection with Schistosoma mansoni caused by the relative sensitivity of stool examination techniques. The population studied was Parcelas de Boqueron in Las Piedras, Puerto Rico, where the Centers for Disease Control, had undertaken a prospective community-based study of infection with S. mansoni in 1972. During each January of the succeeding years stool specimens from this population were processed according to the modified Ritchie concentration (MRC) technique. During January 1979 additional stool specimens were collected from 30 individuals selected on the basis of their mean S. mansoni egg output during previous years. Each specimen was divided into ten 1-gm aliquots and three 42-mg aliquots. The relationship of egg counts obtained with the Kato-Katz (KK) thick smear technique as a function of the mean of ten counts obtained with the MRC technique was established by means of regression analysis. Additionally, the effect of fecal sample size and egg excretion level on technique sensitivity was evaluated during a blind assessment of single stool specimen samples, using both examination methods, from 125 residents with documented S. mansoni infections. The regression equation was: Ln KK = 2.3324 + 0.6319 Ln MRC, and the coefficient of determination (r('2)) was 0.73. The regression equation was then utilized to correct the term "m" for sample size in the expression P ((GREATERTHEQ) 1 egg) = 1 - e('-ms), which estimates the probability P of finding at least one egg as a function of the mean S. mansoni egg output "m" of the population and the effective stool sample size "s" utilized by the coprological technique. This algorithm closely approximated the observed sensitivity of the KK and MRC tests when these were utilized to blindly screen a population of known parasitologic status for infection with S. mansoni. In addition, the algorithm was utilized to adjust the apparent prevalence of infection for the degree of functional sensitivity exhibited by the diagnostic test. This permitted the estimation of true prevalence of infection and, hence, a means for correcting estimates of incidence of infection. ^
Resumo:
My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status. Section 1: The cost of DNA sequencing has reduced dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However it remains elusive how the sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves along with the increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model with a lower false discovery rate. Section 2: Although a number of methods have been proposed in order to accurately analyze differential RNA expression on the gene level, modeling on the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases on the base pair level. Also, we propose four models and compare them with each other. As expected, our beta binomial model with a dynamic overdispersion rate is shown to be superior. Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe an undiscovered bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. Also, we find that this influence is related to the local sequence of the random hexamer that is used in priming. We suggest a model of the inequality between samples and to correct this type of bias. Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume the thresholds are specific for each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called “L-shaped” genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA. In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. And we reveal a new bias in RNA-seq and provide a detailed understanding of the relationship between this new bias and the local sequence. Also we develop a powerful method to dichotomize methylation status and consequently we identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
Resumo:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies only explain a small portion of variations in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize some additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to the increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in the genetics/genomics researches and demonstrating superiority over some regular approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert its functionalities and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) Deriving the Bayesian variable selection framework for the hierarchical gene-environment and gene-gene interactions; (2) Developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing historical data. We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both of the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in the studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model that does not impose this hierarchical constraint and observe their superior performances in most of the considered situations. The proposed models are implemented in the real data analysis of gene and environment interactions in the cases of lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models enjoy the properties of being allowed to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of the performances on the parameter estimation and variable selection in most cases. Our proposed models hold the hierarchical constraints, that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions and successfully identifying the reported associations. This is practically appealing for the study of investigating the causal factors from a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework, by which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for the gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for the gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle the related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for the gene-environment interactions on behalf of the success on balancing the statistical efficiency and bias in a unified model. By extensive simulation studies, we compare the operating characteristics of the proposed models with the existing models including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both of genetic/genomic and clinical studies.