5 results for Reliability (Statistics).
in University of Queensland eSpace - Australia
Abstract:
Most of the modern developments with classification trees are aimed at improving their predictive capacity. This article considers a curiously neglected aspect of classification trees, namely the reliability of predictions that come from a given classification tree. In the sense that a node of a tree represents a point in the predictor space in the limit, the aim of this article is the development of localized assessment of the reliability of prediction rules. A classification tree may be used either to provide a probability forecast, where for each node the membership probabilities for each class constitute the prediction, or a true classification, where each new observation is predictively assigned to a unique class. Correspondingly, two types of reliability measure will be derived, namely prediction reliability and classification reliability. We use bootstrapping methods as the main tool to construct these measures. We also provide a suite of graphical displays by which they may be easily appreciated. In addition to providing some estimate of the reliability of specific forecasts of each type, these measures can also be used to guide future data collection to improve the effectiveness of the tree model. The motivating example we give has a binary response, namely the presence or absence of a species of eucalypt, Eucalyptus cloeziana, at a given sampling location in response to a suite of environmental covariates (although the methods are not restricted to binary response data).
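The abstract does not spell out how the bootstrap measures are constructed, so the following is only a simplified stand-in, not the authors' method. It assumes a single fixed node and resamples the binary (presence/absence) labels that fall into it, giving a percentile interval for the node's membership probability. The function name, the number of resamples and the example labels are all hypothetical.

```python
import random


def bootstrap_node_probability(labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the class-1 proportion in one node.

    Assumption: `labels` holds the 0/1 responses of the training
    observations that landed in a single, already-fitted tree node.
    The tree itself is NOT refitted here, which a fuller treatment would do.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n = len(labels)
    estimates = []
    for _ in range(n_boot):
        # Resample the node's observations with replacement.
        sample = [labels[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(sample) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sum(labels) / n, lower, upper


# Hypothetical node: species present at 14 of 20 sampling locations.
node_labels = [1] * 14 + [0] * 6
estimate, lower, upper = bootstrap_node_probability(node_labels)
print(f"node probability {estimate:.2f}, interval [{lower:.2f}, {upper:.2f}]")
```

A wide interval for a node would suggest that forecasts from it are less reliable, which is the kind of signal the abstract proposes using to guide further data collection.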
Abstract:
Accurate monitoring of prevalence and trends in population levels of physical activity (PA) is a fundamental public health need. Test-retest reliability (repeatability) was assessed in population samples for four self-report PA measures: the Active Australia survey (AA, N=356), the short International Physical Activity Questionnaire (IPAQ, N=104), the physical activity items in the Behavioral Risk Factor Surveillance System (BRFSS, N=127) and in the Australian National Health Survey (NHS, N=122). Percent agreement and Kappa statistics were used to assess reliability of classification of activity status as 'active', 'insufficiently active' or 'sedentary'. Intraclass correlations (ICCs) were used to assess agreement on minutes of activity reported for each item of each survey and for total minutes. Percent agreement scores for activity status were very good on all four instruments, ranging from 60% for the NHS to 79% for the IPAQ. Corresponding Kappa statistics ranged from 0.40 (NHS) to 0.52 (AA). For individual items, ICCs were highest for walking (0.45 to 0.78) and vigorous activity (0.22 to 0.64) and lowest for the moderate questions (0.16 to 0.44). All four measures provide acceptable levels of test-retest reliability for assessing both activity status and sedentariness, and moderate reliability for assessing total minutes of activity.
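As an illustration of the two agreement statistics named above, here is a minimal sketch of percent agreement and unweighted Cohen's kappa for a test and a retest classification of the same respondents. The function names and the ten example responses are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(first, second):
    """Share of respondents given the same category on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty ratings of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_1 = Counter(first)
    counts_2 = Counter(second)
    # Chance agreement from the marginal category proportions.
    expected = sum(
        counts_1[c] * counts_2[c] for c in set(counts_1) | set(counts_2)
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical test and retest activity-status classifications.
test = ["active", "active", "insufficient", "sedentary", "active",
        "sedentary", "insufficient", "active", "sedentary", "insufficient"]
retest = ["active", "insufficient", "insufficient", "sedentary", "active",
          "sedentary", "active", "active", "sedentary", "insufficient"]

print(f"percent agreement: {percent_agreement(test, retest):.2f}")
print(f"kappa: {cohen_kappa(test, retest):.2f}")
```

Kappa is lower than raw percent agreement because it discounts the matches expected by chance alone, which is why the abstract reports both.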
Abstract:
OBJECTIVE To determine the ability of pathologists to reproducibly diagnose a newly defined lesion, i.e. the papillary urothelial neoplasm of low malignant potential (PUNLMP), using the published criteria defined by the 1998 World Health Organisation/International Society of Urological Pathology (WHO/ISUP) classification system. In addition, debate remains about the clinical behaviour of these lesions, so the rates of recurrence and progression of PUNLMP lesions were assessed and compared with those of low-grade (LG-PUC) and high-grade (HG-PUC) papillary urothelial carcinomas over a 10-year follow-up. PATIENTS AND METHODS Forty-nine cases of superficial bladder cancer (G1-3 pTa) representing an initial diagnosis of transitional cell carcinoma made in 1990 were identified and re-graded using the 1998 WHO/ISUP classification by two pathologists. Inter-observer agreement was assessed using Cohen's weighted kappa statistics. After reclassification the clinical follow-up was reviewed retrospectively, and episodes of recurrence and progression were recorded. RESULTS The inter-observer agreement was moderate, regardless of whether one (kappa 0.45) or two (kappa 0.60) pathologists were used to grade these lesions. Re-classification identified 12 PUNLMP, 28 LG-PUC and nine HG-PUC. PUNLMP lesions recurred in 25% (3/12) of cases; no progression was documented. Recurrence rates were 75% (21/28) and 67% (6/9) for LG- and HG-PUC, respectively, and progression rates were 4% (1/28) and 22% (2/9). CONCLUSION The 1998 WHO/ISUP classification of urothelial neoplasms can be reproducibly applied by pathologists, with a moderate level of agreement. There is evidence that PUNLMP lesions have a more indolent clinical behaviour than urothelial carcinomas. However, the risk of recurrence and progression remains, and clinical monitoring of these patients is important.
Abstract:
The H I Parkes All Sky Survey (HIPASS) is a blind extragalactic H I 21-cm emission-line survey covering the whole southern sky from declination -90° to +25°. The HIPASS catalogue (HICAT), containing 4315 H I-selected galaxies from the region south of declination +2°, is presented in Meyer et al. (Paper I). This paper describes in detail the completeness and reliability of HICAT, which are calculated from the recovery rate of synthetic sources and follow-up observations, respectively. HICAT is found to be 99 per cent complete at a peak flux of 84 mJy and an integrated flux of 9.4 Jy km s^-1. The overall reliability is 95 per cent, but rises to 99 per cent for sources with peak fluxes >58 mJy or integrated flux >8.2 Jy km s^-1. Expressions are derived for the uncertainties on the most important HICAT parameters: peak flux, integrated flux, velocity width and recessional velocity. The errors on HICAT parameters are dominated by the noise in the HIPASS data, rather than by the parametrization procedure.
Abstract:
Test-retest reliabilities and practice effects of measures from the Rapid Screen of Concussion (RSC), in addition to the Digit Symbol Substitution Test (Digit Symbol), were examined. Twenty-five male participants were tested three times, with each testing session scheduled a week apart. The test-retest reliability estimates for most measures were reasonably good, ranging from .79 to .97. An exception was the delayed word recall test, which had a reliability estimate of .66 for the first retest and .59 for the second retest. Practice effects were evident from Time 1 to Time 2 on the sentence comprehension and delayed recall subtests of the RSC, Digit Symbol and a composite score. There was also a practice effect of the same magnitude from Time 2 to Time 3 on Digit Symbol, delayed recall and the composite score. Statistics on measures for both the first and second retest intervals, with associated practice effects, are presented to enable the calculation of reliable change indices (RCI). The RCI may be used to assess any improvement in cognitive functioning after mild traumatic brain injury.