38 results for Inter-rater reliability

in BORIS: Bern Open Repository and Information System - Bern - Switzerland


Relevance:

100.00%

Abstract:

The Pulmonary Embolism Severity Index (PESI) is a validated clinical prognostic model for patients with acute pulmonary embolism (PE). Our goal was to assess the PESI's inter-rater reliability in patients diagnosed with PE. We prospectively identified consecutive patients diagnosed with PE in the emergency department of a Swiss teaching hospital. For all patients, resident and attending physician raters independently collected the 11 PESI variables. The raters then calculated the PESI total point score and classified patients into one of five PESI risk classes (I-V) and as low (risk classes I/II) versus higher-risk (risk classes III-V). We examined the inter-rater reliability for each of the 11 PESI variables, the PESI total point score, assignment to each of the five PESI risk classes, and classification of patients as low versus higher-risk using kappa (κ) and intra-class correlation coefficients (ICC). Among 48 consecutive patients with an objective diagnosis of PE, reliability coefficients between resident and attending physician raters were > 0.60 for 10 of the 11 variables comprising the PESI. The inter-rater reliability for the PESI total point score (ICC: 0.89, 95% CI: 0.81-0.94), PESI risk class assignment (κ: 0.81, 95% CI: 0.66-0.94), and the classification of patients as low versus higher-risk (κ: 0.92, 95% CI: 0.72-0.98) was near perfect. Our results demonstrate the high reproducibility of the PESI, supporting the use of the PESI for risk stratification of patients with PE.

Relevance:

100.00%

Abstract:

BACKGROUND The abstraction of data from medical records is a widespread practice in epidemiological research. However, studies using this means of data collection rarely report reliability. Within the Transition after Childhood Cancer Study (TaCC), which is based on a medical record abstraction, we conducted a second independent abstraction of data with the aim of assessing a) intra-rater reliability of one rater at two time points; b) the possible learning effects between these two time points compared to a gold standard; and c) inter-rater reliability. METHOD Within the TaCC study we conducted a systematic medical record abstraction in the 9 Swiss clinics with pediatric oncology wards. In a second phase we selected a subsample of medical records in 3 clinics to conduct a second independent abstraction. We then assessed intra-rater reliability at two time points, the learning effect over time (comparing each rater at two time points with a gold standard) and the inter-rater reliability of a selected number of variables. We calculated percentage agreement and Cohen's kappa. FINDINGS For the assessment of the intra-rater reliability we included 154 records (80 for rater 1; 74 for rater 2). For the inter-rater reliability we could include 70 records. Intra-rater reliability was substantial to excellent (Cohen's kappa 0.6-0.8) with an observed percentage agreement of 75%-95%. In all variables learning effects were observed. Inter-rater reliability was substantial to excellent (Cohen's kappa 0.70-0.83) with high agreement ranging from 86% to 100%. CONCLUSIONS Our study showed that data abstracted from medical records are reliable. Investigating intra-rater and inter-rater reliability can give confidence to draw conclusions from the abstracted data and increase data quality by minimizing systematic errors.
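The two statistics named in this abstract, percentage agreement and Cohen's kappa, can be sketched as follows. This is a minimal illustration and not the study's own analysis code: the function names and the example ratings are hypothetical, and the unweighted form of kappa is assumed.

```python
from collections import Counter


def percent_agreement(r1, r2):
    """Share of items on which both raters assigned the same code."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohens_kappa(r1, r2):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the product of each rater's marginal proportions.
    p_e = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes abstracted by two raters for ten records.
rater_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
rater_2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes"]

print(percent_agreement(rater_1, rater_2))
print(round(cohens_kappa(rater_1, rater_2), 3))
```

Kappa is lower than raw agreement here because part of the observed agreement is expected by chance given how often each rater uses each code.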

Relevance:

100.00%

Abstract:

OBJECTIVE: To assess the intra-reader and inter-reader reliabilities of interpreting ultrasonography by several experts using video clips. METHOD: 99 video clips of healthy and rheumatic joints were recorded and delivered to 17 physician sonographers in two rounds. The intra-reader and inter-reader reliabilities of interpreting the ultrasound results were calculated using a dichotomous system (normal/abnormal) and a graded semiquantitative scoring system. RESULTS: The video reading method worked well. 70% of the readers could classify at least 70% of the cases correctly as normal or abnormal. The distribution of readers answering correctly was wide. The most difficult joints to assess were the elbow, wrist, metacarpophalangeal (MCP) and knee joints. The intra-reader and inter-reader agreements on interpreting dynamic ultrasound images as normal or abnormal, as well as detecting and scoring a Doppler signal were moderate to good (kappa = 0.52-0.82). CONCLUSIONS: Dynamic image assessment (video clips) can be used as an alternative method in ultrasonography reliability studies. The intra-reader and inter-reader reliabilities of ultrasonography in dynamic image reading are acceptable, but more definitions and training are needed to improve sonographic reproducibility.

Relevance:

100.00%

Abstract:

The aim of this study was to evaluate the reliability of the cardiothoracic ratio (CTR) in postmortem computed tomography (PMCT) and to assess a CTR threshold for the diagnosis of cardiomegaly based on the weight of the heart at autopsy. PMCT data of 170 deceased human adults were retrospectively evaluated by two blinded radiologists. The CTR was measured on axial computed tomography images and the actual cardiac weight was measured at autopsy. Inter-rater reliability, sensitivity, and specificity were calculated. Receiver operating characteristic curves were calculated to assess enlarged heart weight by CTR. The autopsy definition of cardiomegaly was based on normal values of the Zeek method (within a range of both, one or two SD) and the Smith method (within the given range). Intra-class correlation coefficients demonstrated excellent agreement (0.983) regarding CTR measurements. In 105/170 (62%) cases the CTR in PMCT was >0.5, indicating enlarged heart weight, according to clinical references. The mean heart weight measured at autopsy was 405 ± 105 g. As a result, 114/170 (67%) cases were interpreted as having enlarged heart weights according to the normal values of Zeek within one SD, while 97/170 (57%) were within two SD. 100/170 (59%) were assessed as enlarged according to Smith's normal values. The sensitivity/specificity of the 0.5 cut-off of the CTR for the diagnosis of enlarged heart weight was 78/71% (Zeek one SD), 74/55% (Zeek two SD), and 76/59% (Smith), respectively. The discriminative power between normal heart weight and cardiomegaly was 79, 73, and 74% for the Zeek (1SD/2SD) and Smith methods respectively. Changing the CTR threshold to 0.57 resulted in a minimum specificity of 95% for all three definitions of cardiomegaly. With a CTR threshold of 0.57, cardiomegaly can be identified with a very high specificity. This may be useful if PMCT is used by forensic pathologists as a screening tool for medico-legal autopsies.
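The trade-off described above, where raising the CTR cut-off from 0.5 to 0.57 buys specificity at the cost of sensitivity, can be illustrated with a short sketch. The function name and the eight example cases are hypothetical and are not taken from the study; a case is treated as test-positive when its CTR is strictly above the threshold, which is an assumption.

```python
def sensitivity_specificity(values, has_condition, threshold):
    """Sensitivity and specificity of the rule `value > threshold`."""
    tp = fn = tn = fp = 0
    for value, truth in zip(values, has_condition):
        positive = value > threshold
        if truth and positive:
            tp += 1
        elif truth:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the condition")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical CTR values and autopsy findings (True = enlarged heart weight).
ctr = [0.45, 0.52, 0.58, 0.49, 0.61, 0.55, 0.47, 0.53]
enlarged = [False, True, True, False, True, False, False, True]

for cutoff in (0.50, 0.57):
    sens, spec = sensitivity_specificity(ctr, enlarged, cutoff)
    print(f"threshold {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy data the higher cut-off removes the single false positive but misses two of the four enlarged hearts, which mirrors the direction of the effect reported in the abstract.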

Relevance:

100.00%

Abstract:

OBJECTIVES To test the inter-rater reliability of the RoB tool applied to Physical Therapy (PT) trials by comparing ratings from Cochrane review authors with those of blinded external reviewers. METHODS Randomized controlled trials (RCTs) in PT were identified by searching the Cochrane Database of Systematic Reviews for meta-analysis of PT interventions. RoB assessments were conducted independently by 2 reviewers blinded to the RoB ratings reported in the Cochrane reviews. Data on RoB assessments from Cochrane reviews and other characteristics of reviews and trials were extracted. Consensus assessments between the two reviewers were then compared with the RoB ratings from the Cochrane reviews. Agreement between Cochrane and blinded external reviewers was assessed using weighted kappa (κ). RESULTS In total, 109 trials included in 17 Cochrane reviews were assessed. Inter-rater reliability on the overall RoB assessment between Cochrane review authors and blinded external reviewers was poor (κ = 0.02, 95% CI: -0.06, 0.06). Inter-rater reliability on individual domains of the RoB tool was poor (median κ = 0.19), ranging from κ = -0.04 ("Other bias") to κ = 0.62 ("Sequence generation"). There was also no agreement (κ = -0.29, 95% CI: -0.81, 0.35) in the overall RoB assessment at the meta-analysis level. CONCLUSIONS Risk of bias assessments of RCTs using the RoB tool are not consistent across different research groups. Poor agreement was not only demonstrated at the trial level but also at the meta-analysis level. Results have implications for decision making since different recommendations can be reached depending on the group analyzing the evidence. Improved guidelines to consistently apply the RoB tool and revisions to the tool for different health areas are needed.
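For ordered rating categories such as low, unclear and high risk of bias, weighted kappa gives partial credit to near-misses. The sketch below assumes linear disagreement weights, which the abstract does not specify; the function name, the category labels and the eight example ratings are hypothetical.

```python
def linear_weighted_kappa(r1, r2, categories):
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1)."""
    k = len(categories)
    n = len(r1)
    index = {c: i for i, c in enumerate(categories)}
    # Observed joint proportions of (rater 1 category, rater 2 category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_obs = weighted_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row[i] * col[j]  # expected under independence
    return 1 - weighted_obs / weighted_exp


# Hypothetical risk-of-bias ratings from two review groups for eight trials.
levels = ["low", "unclear", "high"]
group_a = ["low", "low", "unclear", "high", "high", "unclear", "low", "high"]
group_b = ["low", "unclear", "unclear", "high", "unclear", "high", "low", "high"]

print(round(linear_weighted_kappa(group_a, group_b, levels), 3))
```

With quadratic weights, `w` would be `((i - j) / (k - 1)) ** 2`, which penalizes distant disagreements more heavily than adjacent ones.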

Relevance:

100.00%

Abstract:

Background: Virtual patients (VPs) are increasingly used to train clinical reasoning. So far, no validated evaluation instruments for VP design are available. Aims: We examined the validity of an instrument for assessing the perception of VP design by learners. Methods: Three sources of validity evidence were examined: (i) Content was examined based on theory of clinical reasoning and an international VP expert team. (ii) The response process was explored in think-aloud pilot studies with medical students and in content analyses of free text questions accompanying each item of the instrument. (iii) Internal structure was assessed by exploratory factor analysis (EFA) and inter-rater reliability by generalizability analysis. Results: Content analysis was reasonably supported by the theoretical foundation and the VP expert team. The think-aloud studies and analysis of free text comments supported the validity of the instrument. In the EFA, using 2547 student evaluations of a total of 78 VPs, a three-factor model showed a reasonable fit with the data. At least 200 student responses are needed to obtain a reliable evaluation of a VP on all three factors. Conclusion: The instrument has the potential to provide valid information about VP design, provided that many responses per VP are available.

Relevance:

100.00%

Abstract:

Background Abstractor training is a key element in creating valid and reliable data collection procedures. The choice between in-person vs. remote or simultaneous vs. sequential abstractor training has considerable consequences for time and resource utilization. We conducted a web-based (webinar) abstractor training session to standardize training across six individual Cancer Research Network (CRN) sites for a study of breast cancer treatment effects in older women (BOWII). The goals of this manuscript are to describe the training session, its participants and participants' evaluation of webinar technology for abstraction training. Findings A webinar was held for all six sites with the primary purpose of simultaneously training staff and ensuring consistent abstraction across sites. The training session involved sequential review of over 600 data elements outlined in the coding manual in conjunction with the display of data entry fields in the study's electronic data collection system. Post-training evaluation was conducted via Survey Monkey©. Inter-rater reliability measures for abstractors within each site were conducted three months after the commencement of data collection. Ten of the 16 people who participated in the training completed the online survey. Almost all (90%) of the 10 trainees had previous medical record abstraction experience and nearly two-thirds reported over 10 years of experience. Half of the respondents had previously participated in a webinar, among which three had participated in a webinar for training purposes. All rated the knowledge and information delivered through the webinar as useful and reported it adequately prepared them for data collection. Moreover, all participants would recommend this platform for multi-site abstraction training. Consistent with participant-reported training effectiveness, results of data collection inter-rater agreement within sites ranged from 89 to 98%, with a weighted average of 95% agreement across sites. Conclusions Conducting training via web-based technology was an acceptable and effective approach to standardizing medical record review across multiple sites for this group of experienced abstractors. Given the substantial time and cost savings achieved with the webinar, coupled with participants' positive evaluation of the training session, researchers should consider this instructional method as part of training efforts to ensure high quality data collection in multi-site studies.

Relevance:

100.00%

Abstract:

OBJECTIVE: To evaluate the agreement of blood pressure measurements and hypertension scores obtained by use of 3 indirect arterial blood pressure measurement devices in hospitalized dogs. DESIGN: Diagnostic test evaluation. ANIMALS: 29 client-owned dogs. PROCEDURES: 5 to 7 consecutive blood pressure readings were obtained from each dog on each of 3 occasions with a Doppler ultrasonic flow detector, a standard oscillometric device (STO), and a high-definition oscillometric device (HDO). RESULTS: When the individual sets of 5 to 7 readings were evaluated, the coefficient of variation for systolic arterial blood pressure (SAP) exceeded 20% for 0% (Doppler), 11% (STO), and 28% (HDO) of the sets of readings. After readings that exceeded a 20% coefficient of variation were discarded, repeatability was within 25 (Doppler), 37 (STO), and 39 (HDO) mm Hg for SAP. Correlation of mean values among the devices was between 0.47 and 0.63. Compared with Doppler readings, STO underestimated and HDO overestimated SAP. Limits of agreement between mean readings of any 2 devices were wide. With the hypertension scale used to score SAP, the intraclass correlation of scores was 0.48. Linear-weighted inter-rater reliability between scores was 0.40 (Doppler vs STO), 0.38 (Doppler vs HDO), and 0.29 (STO vs HDO). CONCLUSIONS AND CLINICAL RELEVANCE: Results of this study suggested that no meaningful clinical comparison can be made between blood pressure readings obtained from the same dog with different indirect blood pressure measurement devices.
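The screening step in this abstract, discarding sets of readings whose coefficient of variation exceeds 20%, might look like the sketch below. The function names and the example readings are hypothetical, and the sample standard deviation is assumed because the abstract does not say which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    return stdev(readings) / mean(readings) * 100


def keep_stable_sets(sets_of_readings, max_cv=20.0):
    """Keep only the sets whose coefficient of variation is at most max_cv."""
    return [s for s in sets_of_readings if coefficient_of_variation(s) <= max_cv]


# Hypothetical systolic readings (mm Hg): one stable set, one erratic set.
sets_of_readings = [
    [150, 152, 148, 151, 149],
    [120, 180, 140, 200, 110],
]

for readings in sets_of_readings:
    print(readings, round(coefficient_of_variation(readings), 1))
print(len(keep_stable_sets(sets_of_readings)))
```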

Relevance:

100.00%

Abstract:

INTRODUCTION Hemodynamic management in intensive care patients guided by blood pressure and flow measurements often does not sufficiently reveal common hemodynamic problems. Trans-esophageal echocardiography (TEE) allows for direct measurement of cardiac volumes and function. A new miniaturized probe for TEE (mTEE) potentially provides a rapid and simplified approach to monitor cardiac function. The aim of the study was to assess the feasibility of hemodynamic monitoring using mTEE in critically ill patients after a brief operator training period. METHODS In the context of the introduction of mTEE in a large ICU, 14 ICU staff specialists with no previous TEE experience received six hours of training as mTEE operators. The feasibility of mTEE and the quality of the obtained hemodynamic information were assessed. Three standard views were acquired in hemodynamically unstable patients: 1) for assessment of left ventricular function (LV) fractional area change (FAC) was obtained from a trans-gastric mid-esophageal short axis view, 2) right ventricular (RV) size was obtained from mid-esophageal four chamber view, and 3) superior vena cava collapsibility for detection of hypovolemia was assessed from mid-esophageal ascending aortic short axis view. Off-line blinded assessment by an expert cardiologist was considered as a reference. Inter-rater agreement was assessed using Chi-square tests or correlation analysis as appropriate. RESULTS In 55 patients, 148 mTEE examinations were performed. Acquisition of loops in sufficient quality was possible in 110 examinations for trans-gastric mid-esophageal short axis, 118 examinations for mid-esophageal four chamber and 125 examinations for mid-esophageal ascending aortic short axis view. Inter-rater agreement (Kappa) between ICU mTEE operators and the reference was 0.62 for estimates of LV function, 0.65 for RV dilatation, 0.76 for hypovolemia and 0.77 for occurrence of pericardial effusion (all P < 0.0001). There was a significant correlation between the FAC measured by ICU operators and the reference (r = 0.794, P (one-tailed) < 0.0001). CONCLUSIONS Echocardiographic examinations using mTEE after brief bed-side training were feasible and of sufficient quality in a majority of examined ICU patients with good inter-rater reliability between mTEE operators and an expert cardiologist. Further studies are required to assess the impact of hemodynamic monitoring by mTEE on relevant patient outcomes.

Relevance:

100.00%

Abstract:

The "Ardouin Scale of Behavior in Parkinson's Disease" is a new instrument specifically designed for assessing mood and behavior with a view to quantifying changes related to Parkinson's disease, to dopaminergic medication, and to non-motor fluctuations. This study was aimed at analyzing the psychometric attributes of this scale in patients with Parkinson's disease without dementia. In addition to this scale, the following measures were applied: the Unified Parkinson's Disease Rating Scale, the Montgomery and Asberg Depression Rating Scale, the Lille Apathy Rating Scale, the Bech and Rafaelsen Mania Scale, the Positive and Negative Syndrome Scale, the MacElroy Criteria, the Patrick Carnes criteria, the Hospital Anxiety and Depression Scale, and the Mini-International Neuropsychiatric Interview. Patients (n = 260) were recruited at 13 centers across four countries (France, Spain, United Kingdom, and United States). Cronbach's alpha coefficient for domains ranged from 0.69 to 0.78. Regarding test-retest reliability, the kappa coefficient for items was higher than 0.4. For inter-rater reliability, the kappa values were 0.29 to 0.81. Furthermore, most of the items from the Ardouin Scale of Behavior in Parkinson's Disease correlated with the corresponding items of the other scales, depressed mood with the Montgomery and Asberg Depression Rating Scale (ρ = 0.82); anxiety with the Hospital Anxiety and Depression Scale-anxiety (ρ = 0.56); apathy with the Lille Apathy Rating Scale (ρ = 0.60). The Ardouin Scale of Behavior in Parkinson's disease is an acceptable, reproducible, valid, and precise assessment for evaluating changes in behavior in patients with Parkinson's disease without dementia. © 2015 International Parkinson and Movement Disorder Society.

Relevance:

90.00%

Abstract:

BACKGROUND AND PURPOSE Reproducible segmentation of brain tumors on magnetic resonance images is an important clinical need. This study was designed to evaluate the reliability of a novel fully automated segmentation tool for brain tumor image analysis in comparison to manually defined tumor segmentations. METHODS We prospectively evaluated preoperative MR Images from 25 glioblastoma patients. Two independent expert raters performed manual segmentations. Automatic segmentations were performed using the Brain Tumor Image Analysis software (BraTumIA). In order to study the different tumor compartments, the complete tumor volume TV (enhancing part plus non-enhancing part plus necrotic core of the tumor), the TV+ (TV plus edema) and the contrast enhancing tumor volume CETV were identified. We quantified the overlap between manual and automated segmentation by calculation of diameter measurements as well as the Dice coefficients, the positive predictive values, sensitivity, relative volume error and absolute volume error. RESULTS Comparison of automated versus manual extraction of 2-dimensional diameter measurements showed no significant difference (p = 0.29). Comparison of automated versus manual segmentation of volumetric segmentations showed significant differences for TV+ and TV (p<0.05) but no significant differences for CETV (p>0.05) with regard to the Dice overlap coefficients. Spearman's rank correlation coefficients (ρ) of TV+, TV and CETV showed highly significant correlations between automatic and manual segmentations. Tumor localization did not influence the accuracy of segmentation. CONCLUSIONS In summary, we demonstrated that BraTumIA supports radiologists and clinicians by providing accurate measures of cross-sectional diameter-based tumor extensions. The automated volume measurements were comparable to manual tumor delineation for CETV tumor volumes, and outperformed inter-rater variability for overlap and sensitivity.
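Three of the overlap measures listed in this abstract (Dice coefficient, positive predictive value and sensitivity) can be computed from two binary masks as sketched below. The function name and the toy masks are hypothetical, masks are flattened to lists of 0/1 voxels, and the manual segmentation is assumed to be the reference.

```python
def overlap_metrics(manual, automated):
    """Dice, positive predictive value and sensitivity of a binary mask.

    `manual` is treated as the reference; both are flat sequences of 0/1.
    """
    tp = sum(1 for m, a in zip(manual, automated) if m and a)
    fp = sum(1 for m, a in zip(manual, automated) if a and not m)
    fn = sum(1 for m, a in zip(manual, automated) if m and not a)
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("both masks must contain at least one labeled voxel")
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "ppv": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
    }


# Hypothetical flattened masks for eight voxels.
manual_mask = [0, 1, 1, 1, 0, 0, 1, 0]
automated_mask = [0, 1, 1, 0, 0, 1, 1, 0]

print(overlap_metrics(manual_mask, automated_mask))
```

Dice is symmetric in the two masks, whereas positive predictive value and sensitivity swap roles if the automated mask is taken as the reference instead.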

Relevance:

90.00%

Abstract:

OBJECTIVE Visual hallucinations (VHs) are a very personal experience, and it is not clear whether information about them is best provided by informants or patients. Some patients may not share their hallucinatory experiences with caregivers to avoid distress or for fear of being labeled insane, and others do not have informants at all, which limits the use of informant-based questionnaires. The aim of this study was to compare patient and caregiver views about VHs in Parkinson disease (PD), using the North-East Visual Hallucinations Interview (NEVHI). METHODS Fifty-nine PD patient-informant pairs were included. PD patients and informants were interviewed separately about VHs using the NEVHI. Informants were additionally interviewed using the four-item version of the Neuropsychiatric Inventory. Inter-rater reliability and concurrent validity of the different measures were compared. RESULTS VHs were more commonly reported by patients than informants. The inter-rater agreement between NEVHI-patient and NEVHI-informant was moderate for complex VHs (Cohen's kappa = 0.44; 95% confidence interval [CI]: 0.13-0.75; t = 3.43, df = 58, p = 0.001) and feeling of presence (Cohen's kappa = 0.35; 95% CI: 0.00-0.70; t = 2.75, df = 58, p = 0.006), but agreement was poor for illusions (Cohen's kappa = 0.25; 95% CI: -0.07-0.57; t = 2.36, df = 58, p = 0.018) and passage hallucinations (Cohen's kappa = 0.16; 95% CI: -0.04-0.36; t = 2.26, df = 58, p = 0.024). CONCLUSION When assessing VHs in PD patients, it is best to rely on patient information, because not all patients share the details of their hallucinations with their caregivers.

Relevance:

90.00%

Abstract:

PURPOSE Stress urinary incontinence (SUI) affects women of all ages including young athletes, especially those involved in high-impact sports. To date, hardly any studies are available testing pelvic floor muscles (PFM) during sports activities. The aim of this study was the description and reliability test of six PFM electromyography (EMG) variables during three different running speeds. The secondary objective was to evaluate whether there was a speed-dependent difference between the PFM activity variables. METHODS This trial was designed as an exploratory and reliability study including ten young healthy female subjects to characterize PFM pre-activity and reflex activity during running at 7, 9 and 11 km/h. Six variables for each running speed, averaged over ten steps per subject, were presented descriptively, tested regarding their reliability (Friedman, ICC, SEM, MD) and speed difference (Friedman). RESULTS PFM EMG variables varied between 67.6 and 106.1 %EMG, showed no systematic error and were low for SEM and MD using the single value model. Applying the average model over ten steps, ICC (3,k) were >0.75 and SEM and MD about 50% lower than for the single value model. Activity was found to be highest at 11 km/h. CONCLUSION EMG variables showed excellent ICC and very low SEM and MD. Further studies should investigate inter-session reliability and PFM reactivity patterns of SUI patients using the average over ten steps for each variable as it showed very high ICC and very low SEM and MD. Subsequently, longer running distances and other high-impact sports disciplines could be studied.
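The difference between the single value model and the average model mentioned above can be sketched with the conventional two-way ANOVA formulas for ICC(3,1) and ICC(3,k). The abstract does not describe its computation, so this is an assumption; the function name and the data matrix (rows are subjects, columns are repeated measurements) are illustrative only.

```python
def icc3(data):
    """Return (ICC(3,1), ICC(3,k)) for an n-subjects by k-measurements table.

    Uses the two-way mixed, consistency formulas:
    ICC(3,1) = (MSR - MSE) / (MSR + (k - 1) * MSE), ICC(3,k) = (MSR - MSE) / MSR.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


# Illustrative values only: six subjects, four repeated measurements each.
measurements = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

icc_single, icc_average = icc3(measurements)
print(round(icc_single, 2), round(icc_average, 2))
```

Averaging over k measurements shrinks the error term relative to between-subject variance, which is why the average-model coefficient is higher than the single-value one for the same data.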

Relevance:

80.00%

Abstract:

The objective of this study is to compare dental arch relationship following one-stage and three-stage surgical protocols of unilateral cleft lip and palate. Dental casts of 61 children (mean age, 11.2 years; SD, 1.7), consecutively treated in one center with one-stage closure of the complete cleft at 9.2 months (SD, 2.0), were compared with a sample of 97 patients (mean age, 8.7 years; SD, 0.9), consecutively treated with a three-stage protocol including delayed hard palate closure in another center. The dental casts were assigned random numbers to blind their origin. Four raters graded dental arch relationship and palatal morphology using the EUROCRAN index. The strength of agreement of rating was assessed with kappa statistics. Independent t tests were run to compare the EUROCRAN scores between one-stage and three-stage samples, and Fisher's exact tests were performed to evaluate differences of distribution of the EUROCRAN grades. The intra- and inter-rater agreement was moderate to very good. Dental arch relationship in the one-stage sample was less favorable than in three-stage group (mean scores, 2.58 and 1.97 for one-stage and three-stage samples, respectively; p?

Relevance:

80.00%

Abstract:

OBJECTIVE: To assess the reliability of computed tomography (CT) numbers, also known as Hounsfield-units (HU) in the differentiation and identification of forensically relevant materials and to provide instructions to improve the reproducibility of HU measurements in daily forensic practice. MATERIALS AND METHODS: We scanned a phantom containing non-organic materials (glass, rocks and metals) on three different CT scanners with standardized parameters. The t-test was used to assess the influence of the scanner, the size and shape of different types of regions-of-interest (ROI), the composition and shape of the object, and the reader performance on HU measurements. Intra-class correlation coefficient was used to assess intra- and inter-reader reliability. RESULTS: HU values did not change significantly as a function of ROI-shape or -size (p>0.05). Intra-reader reliability reached ICC values >0.929 (p<0.001). Inter-reader reliability was also excellent with an ICC of 0.994 (p<0.001). Four of seven objects yielded significantly different CT numbers at different levels within the object (p<0.05). In 6/7 objects the HU changed significantly from CT scanner to CT scanner (p<0.05). CONCLUSION: Reproducible CT number measurements can be achieved through correct ROI-placement and repeat measurements within the object of interest. However, HU may differ from CT-scanner to CT-scanner. In order to obtain comparable CT numbers we suggest that a dedicated Forensic Reference Phantom be developed.