996 results for ROC ANALYSIS
Abstract:
BACKGROUND: Decision curve analysis has been introduced as a method to evaluate prediction models in terms of their clinical consequences if used for a binary classification of subjects into a group who should and into a group who should not be treated. The key concept for this type of evaluation is the "net benefit", a concept borrowed from utility theory. METHODS: We recall the foundations of decision curve analysis and discuss some new aspects. First, we stress the formal distinction between the net benefit for the treated and for the untreated and define the concept of the "overall net benefit". Next, we revisit the important distinction between the concept of accuracy, as typically assessed using the Youden index and a receiver operating characteristic (ROC) analysis, and the concept of utility of a prediction model, as assessed using decision curve analysis. Finally, we provide an explicit implementation of decision curve analysis to be applied in the context of case-control studies. RESULTS: We show that the overall net benefit, which combines the net benefit for the treated and the untreated, is a natural alternative to the benefit achieved by a model, being invariant with respect to the coding of the outcome, and conveying a more comprehensive picture of the situation. Further, within the framework of decision curve analysis, we illustrate the important difference between the accuracy and the utility of a model, demonstrating how poor an accurate model may be in terms of its net benefit. Finally, we show that decision curve analysis can be applied to case-control studies, where an accurate estimate of the true prevalence of a disease cannot be obtained from the data, with a few modifications to the original calculation procedure. CONCLUSIONS: We present several interrelated extensions to decision curve analysis that will both facilitate its interpretation and broaden its potential area of application.
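The net-benefit quantities named in this abstract can be illustrated with a minimal Python sketch. It uses the standard decision-curve formula for the net benefit for the treated and its mirror image for the untreated. The function names, the example counts, and the choice to combine the two parts by simple addition are illustrative assumptions, since the abstract itself gives no formulas.

```python
def net_benefit_treated(tp, fp, n, pt):
    """Net benefit for the treated at threshold probability pt (0 < pt < 1)."""
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_untreated(tn, fn, n, pt):
    """Net benefit for the untreated: true negatives minus weighted false negatives."""
    return tn / n - (fn / n) * ((1.0 - pt) / pt)


def overall_net_benefit(tp, fp, tn, fn, pt):
    # Assumption: "overall" is taken here as the plain sum of the two parts.
    n = tp + fp + tn + fn
    return net_benefit_treated(tp, fp, n, pt) + net_benefit_untreated(tn, fn, n, pt)


# Hypothetical counts for 100 subjects at a threshold probability of 20%.
original = overall_net_benefit(tp=30, fp=20, tn=40, fn=10, pt=0.2)

# Recoding the outcome swaps TP with TN, FP with FN, and pt with 1 - pt.
recoded = overall_net_benefit(tp=40, fp=10, tn=30, fn=20, pt=0.8)
print(original, recoded)
```

Under this additive assumption the two printed values agree up to floating-point rounding, which is one way to see the invariance to outcome coding that the abstract describes.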
Abstract:
OBJECTIVE: To analyze the diagnostic accuracy of two indirect immunofluorescence protocols for canine visceral leishmaniasis. METHODS: Dogs from a seroepidemiological survey conducted in 2003 in an endemic area in the municipalities of Araçatuba and Andradina, in the northwestern region of the state of São Paulo, and from a non-endemic area in the São Paulo metropolitan region, were used to compare two protocols of the indirect immunofluorescence antibody test (RIFI) for leishmaniasis: one using the heterologous antigen Leishmania major (RIFI-BM) and the other using the homologous antigen Leishmania chagasi (RIFI-CH). Accuracy was estimated using two-graph receiver operating characteristic (TG-ROC) analysis. The TG-ROC analysis compared the readings at the 1:20 dilution of the homologous antigen (RIFI-CH), taken as the reference test, with the dilutions of RIFI-BM (heterologous antigen). RESULTS: The 1:20 dilution of the RIFI-CH test showed the best contingency coefficient (0.755) and the strongest association between the two variables studied (chi-square = 124.3), and was therefore taken as the reference dilution in comparisons with the different dilutions of the RIFI-BM test. The best RIFI-BM results were obtained at the 1:40 dilution, with the best contingency coefficient (0.680) and the strongest association (chi-square = 80.8). With the cut-off point changed, as suggested by this analysis, to the 1:40 dilution of RIFI-BM, specificity increased from 57.5% to 97.7%, although the 1:80 dilution gave the best estimate of sensitivity (80.2%) with the new cut-off point. CONCLUSIONS: TG-ROC analysis can provide important information about diagnostic tests, suggest cut-off points that may improve the estimates of a test's sensitivity and specificity, and allow those cut-offs to be assessed in light of the best cost-benefit ratio.
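The core idea of a two-graph ROC analysis, plotting sensitivity and specificity against the cut-off and looking for the point where the two curves meet, can be sketched in a few lines of Python. The function name and the titre values below are hypothetical and are not data from the study; the sketch only shows the selection rule, not the authors' full procedure.

```python
def tg_roc_cutoff(scores_pos, scores_neg, cutoffs):
    """Return (cutoff, sensitivity, specificity) where Se and Sp are closest.

    A sample is called positive when its score is >= cutoff.
    """
    best = None
    for c in cutoffs:
        se = sum(s >= c for s in scores_pos) / len(scores_pos)
        sp = sum(s < c for s in scores_neg) / len(scores_neg)
        gap = abs(se - sp)
        if best is None or gap < best[0]:
            best = (gap, c, se, sp)
    return best[1:]


# Hypothetical reciprocal titres (40 stands for a 1:40 dilution).
reference_positive = [40, 80, 80, 160, 320, 20, 40, 160]
reference_negative = [20, 20, 40, 20, 20, 20, 20, 20]

cutoff, se, sp = tg_roc_cutoff(reference_positive, reference_negative, [20, 40, 80, 160])
print(cutoff, se, sp)
```

With these made-up titres the curves cross at the 1:40 dilution. On real data the crossing may fall between two tested dilutions, in which case the nearest tested cut-off is returned.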
Abstract:
Purpose - The study evaluates the pre- and post-training lesion localisation ability of a group of novice observers. Parallels are drawn with the performance of inexperienced radiographers taking part in preliminary clinical evaluation (PCE) and ‘red-dot’ systems, operating within radiography practice. Materials and methods - Thirty-four novice observers searched 92 images for simulated lesions. Pre-training and post-training evaluations were completed following the free-response receiver operating characteristic (FROC) method. Training consisted of observer performance methodology, the characteristics of the simulated lesions and information on lesion frequency. Jackknife alternative FROC (JAFROC) and highest rating inferred ROC analyses were performed to evaluate performance difference on lesion-based and case-based decisions. The significance level of the test was set at 0.05 to control the probability of Type I error. Results - JAFROC analysis (F(3,33) = 26.34, p < 0.0001) and highest-rating inferred ROC analysis (F(3,33) = 10.65, p = 0.0026) revealed a statistically significant difference in lesion detection performance. The JAFROC figure-of-merit was 0.563 (95% CI 0.512, 0.614) pre-training and 0.677 (95% CI 0.639, 0.715) post-training. Highest rating inferred ROC figure-of-merit was 0.728 (95% CI 0.701, 0.755) pre-training and 0.772 (95% CI 0.750, 0.793) post-training. Conclusions - This study has demonstrated that novice observer performance can improve significantly. This study design may have relevance in the assessment of inexperienced radiographers taking part in PCE or commenting scheme for trauma.
Abstract:
Doctoral thesis in Industrial and Systems Engineering
Abstract:
Objective: This paper presents a detailed study of fractal-based methods for texture characterization of mammographic mass lesions and architectural distortion. The purpose of this study is to explore the use of fractal and lacunarity analysis for the characterization and classification of both tumor lesions and normal breast parenchyma in mammography. Materials and methods: We conducted comparative evaluations of five popular fractal dimension estimation methods for the characterization of the texture of mass lesions and architectural distortion. We applied the concept of lacunarity to the description of the spatial distribution of the pixel intensities in mammographic images. These methods were tested with a set of 57 breast masses and 60 normal breast parenchyma (dataset1), and with another set of 19 architectural distortions and 41 normal breast parenchyma (dataset2). Support vector machines (SVM) were used as a pattern classification method for tumor classification. Results: Experimental results showed that the fractal dimension of regions of interest (ROIs) depicting mass lesions and architectural distortion was statistically significantly lower than that of normal breast parenchyma for all five methods. Receiver operating characteristic (ROC) analysis showed that the fractional Brownian motion (FBM) method generated the highest area under the ROC curve (Az = 0.839 for dataset1 and 0.828 for dataset2) among the five methods for both datasets. Lacunarity analysis showed that the ROIs depicting mass lesions and architectural distortion had higher lacunarities than those of ROIs depicting normal breast parenchyma. The combination of FBM fractal dimension and lacunarity yielded a higher Az value (0.903 and 0.875, respectively) than either feature alone for both datasets. 
The application of the SVM improved the performance of the fractal-based features in differentiating tumor lesions from normal breast parenchyma by generating a higher Az value. Conclusion: The FBM texture model is the most appropriate model for characterizing mammographic images because the self-affinity assumption of the method is a better approximation. Lacunarity is an effective counterpart measure of the fractal dimension in texture feature extraction in mammographic images. The classification results obtained in this work suggest that the SVM is an effective method with great potential for classification in mammographic image analysis.
Abstract:
Objective Arterial lactate, base excess (BE), lactate clearance, and Sequential Organ Failure Assessment (SOFA) score have been shown to correlate with outcome in severely injured patients. The goal of the present study was to separately assess their predictive value in patients suffering from traumatic brain injury (TBI) as opposed to patients suffering from injuries not related to the brain. Materials and methods A total of 724 adult trauma patients with an Injury Severity Score (ISS) ≥ 16 were grouped into patients without TBI (non-TBI), patients with isolated TBI (isolated TBI), and patients with a combination of TBI and non-TBI injuries (combined injuries). The predictive value of the above parameters was then analyzed using both uni- and multivariate analyses. Results The mean age of the patients was 39 years (77 % males), with a mean ISS of 32 (range 16–75). Mortality ranged from 14 % (non-TBI) to 24 % (combined injuries). Admission and serial lactate/BE values were higher in non-survivors of all groups (all p < 0.01), but not in patients with isolated TBI. Admission SOFA scores were highest in non-survivors of all groups (p = 0.023); subsequently septic patients also showed elevated SOFA scores (p < 0.01), except those with isolated TBI. In this group, SOFA score was the only parameter which showed significant differences between survivors and non-survivors. Receiver operating characteristic (ROC) analysis revealed lactate to be the best overall predictor for increased mortality and further septic complications, irrespective of the leading injury. Conclusion Lactate showed the best performance in predicting sepsis or death in all trauma patients except those with isolated TBI, and the differences were greatest in patients with substantial bleeding. 
Following isolated TBI, SOFA score was the only parameter which could differentiate survivors from non-survivors on admission, although the SOFA score, too, was not an independent predictor of death following multivariate analysis.
Abstract:
Traditionally, machine learning algorithms have been evaluated in applications where assumptions can be reliably made about class priors and/or misclassification costs. In this paper, we consider the case of imprecise environments, where little may be known about these factors and they may well vary significantly when the system is applied. Specifically, the use of precision-recall analysis is investigated and compared with better-known performance measures such as error rate and the receiver operating characteristic (ROC). We argue that while ROC analysis is invariant to variations in class priors, this invariance in fact hides an important factor of the evaluation in imprecise environments. Therefore, we develop a generalised precision-recall analysis methodology in which variation due to prior class probabilities is incorporated into a multi-way analysis of variance (ANOVA). The increased sensitivity and reliability of this approach are demonstrated in a remote sensing application.
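The point about class priors can be made concrete with a small Python sketch. A ROC operating point is a pair of rates (true positive rate, false positive rate) that does not change with the class prior, whereas precision does. The function name and the example rates below are illustrative assumptions, not values from the paper.

```python
def precision_from_rates(tpr, fpr, prior):
    """Precision implied by a fixed ROC operating point and a positive-class prior."""
    true_pos = tpr * prior            # expected fraction of true positives
    false_pos = fpr * (1.0 - prior)   # expected fraction of false positives
    return true_pos / (true_pos + false_pos)


# The same ROC point (TPR = 0.9, FPR = 0.1) evaluated under three hypothetical priors.
for prior in (0.5, 0.1, 0.01):
    print(prior, round(precision_from_rates(0.9, 0.1, prior), 3))
```

The ROC point is identical in all three cases, yet precision falls sharply as the positive class becomes rarer, which is the information a ROC-only evaluation leaves out.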
Abstract:
In this study, a new entropy measure known as kernel entropy (KerEnt), which quantifies the irregularity in a series, was applied to nocturnal oxygen saturation (SaO2) recordings. A total of 96 subjects suspected of suffering from sleep apnea-hypopnea syndrome (SAHS) took part in the study: 32 SAHS-negative and 64 SAHS-positive subjects. Their SaO2 signals were separately processed by means of KerEnt. Our results show that a higher degree of irregularity is associated with SAHS-positive subjects. Statistical analysis revealed significant differences between the KerEnt values of SAHS-negative and SAHS-positive groups. The diagnostic utility of this parameter was studied by means of receiver operating characteristic (ROC) analysis. A classification accuracy of 81.25% (81.25% sensitivity and 81.25% specificity) was achieved. Repeated apneas during sleep increase irregularity in SaO2 data. This effect can be measured by KerEnt in order to detect SAHS. This non-linear measure can provide useful information for the development of alternative diagnostic techniques in order to reduce the demand for conventional polysomnography (PSG). © 2011 IEEE.
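The reported figures can be checked with a short piece of arithmetic in Python. The group sizes and the 81.25% rates come from the abstract; the confusion-matrix counts are back-calculated here and are not reported in the source.

```python
n_pos, n_neg = 64, 32              # SAHS-positive and SAHS-negative subjects (from the abstract)
sensitivity = specificity = 0.8125

tp = round(sensitivity * n_pos)    # inferred true positives
tn = round(specificity * n_neg)    # inferred true negatives
accuracy = (tp + tn) / (n_pos + n_neg)

print(tp, tn, accuracy)
```

When sensitivity and specificity are equal, accuracy takes the same value whatever the group sizes, which is why all three reported percentages coincide.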
Abstract:
We develop, implement and study a new Bayesian spatial mixture model (BSMM). The proposed BSMM allows for spatial structure in the binary activation indicators through a latent thresholded Gaussian Markov random field. We develop a Gibbs (MCMC) sampler to perform posterior inference on the model parameters, which then allows us to assess the posterior probabilities of activation for each voxel. One purpose of this article is to compare the HJ model and the BSMM in terms of receiver operating characteristic (ROC) curves. We also consider the accuracy of the spatial mixture model and the BSMM for estimation of the size of the activation region in terms of bias, variance and mean squared error. We perform a simulation study to examine the aforementioned characteristics under a variety of configurations of the spatial mixture model and the BSMM, both as the size of the region changes and as the magnitude of activation changes.
Abstract:
Secondary caries has been reported as the main reason for restoration replacement. The aim of this in vitro study was to evaluate the performance of different methods - visual inspection, laser fluorescence (DIAGNOdent), radiography and tactile examination - for secondary caries detection in primary molars restored with amalgam. Fifty-four primary molars were photographed and 73 suspect sites adjacent to amalgam restorations were selected. Two examiners independently evaluated these sites using all methods. Agreement between examiners was assessed by the Kappa test. To validate the methods, a caries-detector dye was used after restoration removal. The best cut-off points for the sample were found by a Receiver Operating Characteristic (ROC) analysis, and the area under the ROC curve (Az), and the sensitivity, specificity and accuracy of the methods were calculated for enamel (D2) and dentine (D3) thresholds. These parameters were found for each method and then compared by the McNemar test. The tactile examination and visual inspection presented the highest inter-examiner agreement for the D2 and D3 thresholds, respectively. The visual inspection also showed better performance than the other methods for both thresholds (Az = 0.861 and Az = 0.841, respectively). In conclusion, the visual inspection presented the best performance for detecting enamel and dentine secondary caries in primary teeth restored with amalgam.
Abstract:
A warning system for sooty blotch and flyspeck (SBFS) of apple, developed in the southeastern United States, uses cumulative hours of leaf wetness duration (LWD) to predict the timing of the first appearance of signs. In the Upper Midwest United States, however, this warning system has resulted in sporadic disease control failures. The purpose of the present study was to determine whether the warning system's algorithm could be modified to provide more reliable assessment of SBFS risk. Hourly LWD, rainfall, relative humidity (RH), and temperature data were collected from orchards in Iowa, North Carolina, and Wisconsin in 2005 and 2006. Timing of the first appearance of SBFS signs was determined by weekly scouting. Preliminary analysis using scatterplots and boxplots suggested that cumulative hours of RH >= 97% could be a useful predictor of SBFS appearance. Receiver operating characteristic curve analysis was used to compare the predictive performance of cumulative LWD and cumulative hours of RH >= 97%. Cumulative hours of RH >= 97% was a more conservative and accurate predictor than cumulative LWD for 15 site years in the Upper Midwest, but not for four site years in North Carolina. Performance of the SBFS warning system in the Upper Midwest and climatically similar regions may be improved if cumulative hours of RH >= 97% were substituted for cumulative LWD to predict the first appearance of SBFS.
Abstract:
Objective. The purpose of this study was to determine whether the Hopkins Verbal Learning Test (HVLT) could be used as a valid and reliable screening test for mild dementia in older people, and to compare its performance to that of the Mini-Mental State Examination (MMSE). Method. Using a cross-sectional design, we studied three groups of older subjects recruited from a district geriatric psychiatry service: (1) 26 patients with DSM-IV dementia and MMSE scores of 18 or better; (2) 15 patients with psychiatric diagnoses other than dementia; and (3) 15 normal controls. The relationship of each potential cutting point on the HVLT and the MMSE was examined against the independently ascertained DSM-IV diagnoses of dementia using a Receiver Operating Characteristic (ROC) analysis. Results. The subjects consisted of 21 (37.5%) males and 35 (62.5%) females with a mean age of 74.7 (SD 6.1) years and a mean of 8.5 (SD 1.8) years of formal education. ROC analysis indicated that the optimal cutting point for detecting mild dementia in this group of subjects using the HVLT was 18/19 (sensitivity = 0.96, specificity = 0.80) and using the MMSE was 25/26 (sensitivity = 0.88, specificity = 0.93). Conclusions. The HVLT can be recommended as a valid and reliable screening test for mild dementia and as an adjunct in the clinical assessment of older people. The HVLT had better sensitivity than the MMSE in detecting patients with mild dementia, whereas the MMSE had better specificity. Copyright (C) 2000 John Wiley & Sons, Ltd.
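The trade-off between the two reported cutting points can be summarised with the Youden index, J = sensitivity + specificity - 1. The abstract does not say which criterion defined the "optimal" cutting point, so using the Youden index here is an assumption made for illustration; the sensitivity and specificity values are those reported in the abstract.

```python
def youden_index(sensitivity, specificity):
    """Youden's J: 0 for a useless test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


# Reported operating points: (sensitivity, specificity).
reported = {"HVLT 18/19": (0.96, 0.80), "MMSE 25/26": (0.88, 0.93)}

j = {name: youden_index(se, sp) for name, (se, sp) in reported.items()}
for name, value in j.items():
    print(name, round(value, 2))
```

On this single-number summary the two tests are close, which fits the abstract's framing that the HVLT offers better sensitivity and the MMSE better specificity.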
Abstract:
Objective To assess the validity and the reliability of the Portuguese version of the Delirium Rating Scale-Revised-98 (DRS-R-98). Methods The scale was translated into Portuguese and back-translated into English. After assessing its face validity, five diagnostic groups (n = 64; delirium, depression, dementia, schizophrenia and others) were evaluated by two independent researchers blinded to the diagnosis. Diagnosis and severity of delirium as measured by the DRS-R-98 were compared to clinical diagnosis, Mini-Mental State Exam, Confusion Assessment Method, and Clinical Global Impressions scale (CGI). Results Mean and median DRS-R-98 total scores significantly distinguished delirium from the other groups (p < 0.001). Inter-rater reliability (ICC between 0.9 and 1) and internal consistency (alpha = 0.91) were very high. DRS-R-98 severity scores correlated highly with the CGI. Mean DRS-R-98 severity scores during delirium differed significantly (p < 0.01) from the post-treatment values. The area under the curve established by ROC analysis was 0.99 and using the cut-off value of 20 the scale showed sensitivity and specificity of 92.6% and 94.6%, respectively. Conclusion The Portuguese version of the DRS-R-98 is a valid and reliable measure of delirium that distinguishes delirium from other disorders and is sensitive to change in delirium severity, which may be of great value for longitudinal studies. Copyright (c) 2007 John Wiley & Sons, Ltd.
Abstract:
Aim: To demonstrate that the evaluation of erythrocyte dysmorphism by light microscopy with lowering of the condenser lens (LMLC) is useful to identify patients with a haematuria of glomerular or non-glomerular origin. Methods: A comparative double-blind study between phase contrast microscopy (PCM) and LMLC is reported to evaluate the efficacy of these techniques. Urine samples of 39 patients followed up for 9 months were analyzed, and classified as glomerular and non-glomerular haematuria. The different microscopic techniques were compared using receiver operating characteristic (ROC) curve analysis and area under the curve (AUC). Reproducibility was assessed by coefficient of variation (CV). Results: Specific cut-offs were set for each method according to their best rate of specificity and sensitivity: 30% for PCM and 40% for standard LMLC. PCM reached 95% sensitivity and 100% specificity, and LMLC reached 90% sensitivity and 100% specificity. In ROC analysis, AUC for PCM was 0.99 and AUC for LMLC was 0.96. The CV was very similar in the glomerular haematuria group for PCM (35%) and LMLC (35.3%). Conclusion: LMLC proved to be effective in contributing to the direction of investigation of haematuria, toward the nephrological or urological side. This method can substitute for PCM when that equipment is not available.
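The AUC values used to compare the two microscopy techniques have a simple interpretation that a short Python sketch can show: the AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one (the Mann-Whitney formulation). The function name and the percentages below are hypothetical and are not data from the study.

```python
def auc_rank(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical percentages of dysmorphic erythrocytes per urine sample.
glomerular = [55, 70, 42, 28, 80]
non_glomerular = [10, 25, 40, 5, 30]

auc = auc_rank(glomerular, non_glomerular)
print(auc)
```

This pairwise count is quadratic in the number of samples, which is fine for a cohort of a few dozen patients. Larger datasets are usually handled with a rank-based or library implementation.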
Abstract:
Objective. The aim of this study was to investigate the influence of the menstrual cycle and oral contraceptive (OC) intake on the pressure pain threshold (PPT) of masticatory muscles in patients with masticatory myofascial pain (MFP). Study design. The sample was composed of 36 women, divided into 4 groups, according to the presence of MFP and the intake of OC (15 patients had MFP [7 taking OC] and 21 were pain-free controls [8 taking OC]). The algometer-based PPT of masseter and temporalis, and the record of subjective pain by visual analog scale (VAS) were determined during 2 consecutive menstrual cycles at 4 phases (menstrual, follicular, periovulatory, and luteal). The data were analyzed using a 3-way ANOVA for repeated measurements and Kruskal-Wallis, Friedman, and Dunn tests, at a 5% significance level. Results. PPT was significantly lower in MFP patients when compared with controls throughout the experiment (P < .001). The menstrual phases did not influence PPT (P > .05), while the intake of OC seems to raise PPT levels for the left temporalis (P = .01) and right masseter (P = .04). VAS was, in general, higher at the menstrual phase. Conclusions. Different phases of the menstrual cycle have no influence on PPT values, regardless of the presence of a previous condition, such as masticatory myofascial pain, while the intake of OC is associated with decreased levels of reported pain.