853 resultados para optimism bias
Resumo:
Signatur des Originals: S 36/G10473
Resumo:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, C$\sb{\rm p}$ and S$\sb{\rm p}$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$\sb{\rm m}$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures.^ The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches but no differences are detected between the performances of C$\sb{\rm p}$ and S$\sb{\rm p}$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample.^ Only the random split estimator is conditionally (on $\\beta$) unbiased, however MSEP$\sb{\rm m}$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$\sb{\rm m}$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables.^ To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment. ^
Resumo:
Additive and multiplicative models of relative risk were used to measure the effect of cancer misclassification and DS86 random errors on lifetime risk projections in the Life Span Study (LSS) of Hiroshima and Nagasaki atomic bomb survivors. The true number of cancer deaths in each stratum of the cancer mortality cross-classification was estimated using sufficient statistics from the EM algorithm. Average survivor doses in the strata were corrected for DS86 random error ($\sigma$ = 0.45) by use of reduction factors. Poisson regression was used to model the corrected and uncorrected mortality rates with covariates for age at-time-of-bombing, age at-time-of-death and gender. Excess risks were in good agreement with risks in RERF Report 11 (Part 2) and the BEIR-V report. Bias due to DS86 random error typically ranged from $-$15% to $-$30% for both sexes, and all sites and models. The total bias, including diagnostic misclassification, of excess risk of nonleukemia for exposure to 1 Sv from age 18 to 65 under the non-constant relative projection model was $-$37.1% for males and $-$23.3% for females. Total excess risks of leukemia under the relative projection model were biased $-$27.1% for males and $-$43.4% for females. Thus, nonleukemia risks for 1 Sv from ages 18 to 85 (DRREF = 2) increased from 1.91%/Sv to 2.68%/Sv among males and from 3.23%/Sv to 4.02%/Sv among females. Leukemia excess risks increased from 0.87%/Sv to 1.10%/Sv among males and from 0.73%/Sv to 1.04%/Sv among females. Bias was dependent on the gender, site, correction method, exposure profile and projection model considered. Future studies that use LSS data for U.S. nuclear workers may be downwardly biased if lifetime risk projections are not adjusted for random and systematic errors. (Supported by U.S. NRC Grant NRC-04-091-02.) ^
Resumo:
This study establishes the extent and relevance of bias of population estimates of prevalence, incidence, and intensity of infection with Schistosoma mansoni caused by the relative sensitivity of stool examination techniques. The population studied was Parcelas de Boqueron in Las Piedras, Puerto Rico, where the Centers for Disease Control, had undertaken a prospective community-based study of infection with S. mansoni in 1972. During each January of the succeeding years stool specimens from this population were processed according to the modified Ritchie concentration (MRC) technique. During January 1979 additional stool specimens were collected from 30 individuals selected on the basis of their mean S. mansoni egg output during previous years. Each specimen was divided into ten 1-gm aliquots and three 42-mg aliquots. The relationship of egg counts obtained with the Kato-Katz (KK) thick smear technique as a function of the mean of ten counts obtained with the MRC technique was established by means of regression analysis. Additionally, the effect of fecal sample size and egg excretion level on technique sensitivity was evaluated during a blind assessment of single stool specimen samples, using both examination methods, from 125 residents with documented S. mansoni infections. The regression equation was: Ln KK = 2.3324 + 0.6319 Ln MRC, and the coefficient of determination (r('2)) was 0.73. The regression equation was then utilized to correct the term "m" for sample size in the expression P ((GREATERTHEQ) 1 egg) = 1 - e('-ms), which estimates the probability P of finding at least one egg as a function of the mean S. mansoni egg output "m" of the population and the effective stool sample size "s" utilized by the coprological technique. This algorithm closely approximated the observed sensitivity of the KK and MRC tests when these were utilized to blindly screen a population of known parasitologic status for infection with S. mansoni. In addition, the algorithm was utilized to adjust the apparent prevalence of infection for the degree of functional sensitivity exhibited by the diagnostic test. This permitted the estimation of true prevalence of infection and, hence, a means for correcting estimates of incidence of infection. ^
Resumo:
Of the large clinical trials evaluating screening mammography efficacy, none included women ages 75 and older. Recommendations on an upper age limit at which to discontinue screening are based on indirect evidence and are not consistent. Screening mammography is evaluated using observational data from the SEER-Medicare linked database. Measuring the benefit of screening mammography is difficult due to the impact of lead-time bias, length bias and over-detection. The underlying conceptual model divides the disease into two stages: pre-clinical (T0) and symptomatic (T1) breast cancer. Treating the time in these phases as a pair of dependent bivariate observations, (t0,t1), estimates are derived to describe the distribution of this random vector. To quantify the effect of screening mammography, statistical inference is made about the mammography parameters that correspond to the marginal distribution of the symptomatic phase duration (T1). This shows the hazard ratio of death from breast cancer comparing women with screen-detected tumors to those detected at their symptom onset is 0.36 (0.30, 0.42), indicating a benefit among the screen-detected cases. ^
Resumo:
The operator effect is a well-known methodological bias already quantified in some taphonomic studies. However, the replicability effect, i.e., the use of taphonomic attributes as a replicable scientific method, has not been taken into account to the present. Here, we quantified for the first time this replicability bias using different multivariate statistical techniques, testing if the operator effect is related to the replicability effect. We analyzed the results reported by 15 operators working on the same dataset. Each operator analyzed 30 biological remains (bivalve shells) from five different sites, considering the attributes fragmentation, edge rounding, corrasion, bioerosion and secondary color. The operator effect followed the same pattern reported in previous studies, characterized by a worse correspondence for those attributes having more than two levels of damage categories. However, the effect did not appear to have relation with the replicability effect, because nearly all operators found differences among sites. Despite the binary attribute bioerosion exhibited 83% of correspondence among operators it was the taphonomic attributes that showed the highest dispersion among operators (28%). Therefore, we conclude that binary attributes (despite showing a reduction of the operator effect) diminish replicability, resulting in different interpretations of concordant data. We found that a variance value of nearly 8% among operators, was enough to generate a different taphonomic interpretation, in a Q-mode cluster analysis. The results reported here showed that the statistical method employed influences the level of replicability and comparability of a study and that the availability of results may be a valid alternative to reduce bias.
Resumo:
These data form the basis of an analysis of a prevalent research bias in the field of ocean acidification, notably the ignoring of natural fluctuations and gradients in the experimental design. The data are extracted from published work and own experiments.
Resumo:
Quality assessment is one of the activities performed as part of systematic literature reviews. It is commonly accepted that a good quality experiment is bias free. Bias is considered to be related to internal validity (e.g., how adequately the experiment is planned, executed and analysed). Quality assessment is usually conducted using checklists and quality scales. It has not yet been proven;however, that quality is related to experimental bias. Aim: Identify whether there is a relationship between internal validity and bias in software engineering experiments. Method: We built a quality scale to determine the quality of the studies, which we applied to 28 experiments included in two systematic literature reviews. We proposed an objective indicator of experimental bias, which we applied to the same 28 experiments. Finally, we analysed the correlations between the quality scores and the proposed measure of bias. Results: We failed to find a relationship between the global quality score (resulting from the quality scale) and bias; however, we did identify interesting correlations between bias and some particular aspects of internal validity measured by the instrument. Conclusions: There is an empirically provable relationship between internal validity and bias. It is feasible to apply quality assessment in systematic literature reviews, subject to limits on the internal validity aspects for consideration.
Resumo:
Most empirical disciplines promote the reuse and sharing of datasets, as it leads to greater possibility of replication. While this is increasingly the case in Empirical Software Engineering, some of the most popular bug-fix datasets are now known to be biased. This raises two significants concerns: first, that sample bias may lead to underperforming prediction models, and second, that the external validity of the studies based on biased datasets may be suspect. This issue has raised considerable consternation in the ESE literature in recent years. However, there is a confounding factor of these datasets that has not been examined carefully: size. Biased datasets are sampling only some of the data that could be sampled, and doing so in a biased fashion; but biased samples could be smaller, or larger. Smaller data sets in general provide less reliable bases for estimating models, and thus could lead to inferior model performance. In this setting, we ask the question, what affects performance more? bias, or size? We conduct a detailed, large-scale meta-analysis, using simulated datasets sampled with bias from a high-quality dataset which is relatively free of bias. Our results suggest that size always matters just as much bias direction, and in fact much more than bias direction when considering information-retrieval measures such as AUC and F-score. This indicates that at least for prediction models, even when dealing with sampling bias, simply finding larger samples can sometimes be sufficient. Our analysis also exposes the complexity of the bias issue, and raises further issues to be explored in the future.
Resumo:
Current bias estimation algorithms for air traffic control (ATC) surveillance are focused on radar sensors, but the integration of new sensors (especially automatic dependent surveillance-broadcast and wide area multilateration) demands the extension of traditional procedures. This study describes a generic architecture for bias estimation applicable to multisensor multitarget surveillance systems. It consists on first performing bias estimations using measurements from each target, of a subset of sensors, assumed to be reliable, forming track bias estimations. All track bias estimations are combined to obtain, for each of those sensors, the corresponding sensor bias. Then, sensor bias terms are corrected, to subsequently calculate the target or sensor-target pair specific biases. Once these target-specific biases are corrected, the process is repeated recursively for other sets of less reliable sensors, assuming bias corrected measures from previous iterations are unbiased. This study describes the architecture and outlines the methodology for the estimation and the bias estimation design processes. Then the approach is validated through simulation, and compared with previous methods in the literature. Finally, the study describes the application of the methodology to the design of the bias estimation procedures for a modern ATC surveillance application, specifically for off-line assessment of ATC surveillance performance.
Resumo:
This paper presents an empirical evidence of user bias within a laboratory-oriented evaluation of a Spoken Dialog System. Specifically, we addressed user bias in their satisfaction judgements. We question the reliability of this data for modeling user emotion, focusing on contentment and frustration in a spoken dialog system. This bias is detected through machine learning experiments that were conducted on two datasets, users and annotators, which were then compared in order to assess the reliability of these datasets. The target used was the satisfaction rating and the predictors were conversational/dialog features. Our results indicated that standard classifiers were significantly more successful in discriminating frustration and contentment and the intensities of these emotions (reflected by user satisfaction ratings) from annotator data than from user data. Indirectly, the results showed that conversational features are reliable predictors of the two abovementioned emotions.
Resumo:
Radar technologies have been developed to improve the efficiency when detecting targets. Radar is a system composed by several devices connected and working together. Depending on the type of radar, the improvements are focused on different functionalities of the radar. One of the most important devices composing a radar is the antenna, that sends the radio-frequency signal to the space in order to detect targets. This project is focused on a specific type of radar called phased array radar. This type of radar is characterized by its antenna, which consist on a linear array of radiating elements, in this particular case, eight dipoles working at the frequency band S. The main advantage introduced by the phased array antenna is that using the fundamentals of arrays, the directivity of the antenna can change by shifting the phase of the signal at the input of each radiating element. This can be done using phase shifters. Phase shifter consists on a device which produces a phase shift in the radio-frequency input signal depending on a control DC voltage. Using a phased array antenna allows changing the directivity of the antenna without a mechanical rotating system. The objective of this project is to design the feed network and the bias network of the phased antenna. The feed network consists on a parallel-fed network composed by power dividers that sends the radio-frequency signal from the source to each radiating element of the antenna. The bias network consists on a system that generates the control DC voltages supplied to the phase shifters in order to change the directivity. The architecture of the bias network is composed by a software, implemented in Matlab and run in a laptop which is connected to a micro-controller by a serial communication port. The software calculates the control DC voltages needed to obtain a determined directivity or scan angle. These values are sent by the serial communication port to the micro-controller as data. Then the micro-controller generates the desired control DC voltages and supplies them to the phase shifters. In this project two solutions for bias network are designed. Each one is tested and final conclusions are obtained to determine the advantages and disadvantages. Finally a graphic user interface is developed in order to make the system easy to use. RESUMEN. Las tecnologías empleadas por lo dispositivos radar se han ido desarrollando para mejorar su eficiencia y usabilidad. Un radar es un sistema formado por varios subsistemas conectados entre sí. Por lo que dependiendo del tipo de radar las mejoras se centran en los subsistemas correspondientes. Uno de los elementos más importantes de un radar es la antena. Esta se emplea para enviar la señal de radiofrecuencia al espacio y así poder detectar los posibles obstáculos del entorno. Este proyecto se centra en un tipo específico de radar llamado phased array radar. Este tipo de radar se caracteriza por la antena que es un array de antenas, en concreto para este proyecto se trata de un array lineal de ocho dipolos en la banda de frequencia S. El uso de una antena de tipo phased array supone una ventaja importante. Empleando los fundamentos de radiación aplicado a array de antenas se obtiene que la directividad de la antena puede ser modificada. Esto se consigue aplicando distintos desfasajes a la señal de radiofrecuencia que alimenta a cada elemento del array. Para aplicar los desfasajes se emplea un desplazador de fase, este dispositivo aplica una diferencia de fase a su salida con respecto a la señal de entrada dependiendo de una tensión continua de control. Por tanto el empleo de una antena de tipo phased array supone una gran ventaja puesto que no se necesita un sistema de rotación para cambiar la directividad de la antena. El objetivo principal del proyecto consiste en el diseño de la red de alimentación y la red de polarización de la antena de tipo phased array. La red de alimentación consiste en un circuito pasivo que permite alimentar a cada elemento del array con la misma cantidad de señal. Dicha red estará formada por divisores de potencia pasivos y su configuración será en paralelo. Por otro lado la red de polarización consiste en el diseño de un sistema automático que permite cambiar la directividad de la antena. Este sistema consiste en un programa en Matlab que es ejecutado en un ordenador conectado a un micro-controlador mediante una comunicación serie. El funcionamiento se basa en calcular las tensiones continuas de control, que necesitan los desplazadores de fase, mediante un programa en Matlab y enviarlos como datos al micro-controlador. Dicho micro-controlador genera las tensiones de control deseadas y las proporciona a cada desplazador de fase, obteniendo así la directividad deseada. Debido al amplio abanico de posibilidades, se obtienen dos soluciones que son sometidas a pruebas. Se obtienen las ventajas y desventajas de cada una. Finalmente se implementa una interfaz gráfica de usuario con el objetivo de hacer dicho sistema manejable y entendible para cualquier usuario.