819 results for rater reliability


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective To evaluate the inter-rater reliability of the modified Oxford Grading Scale and the Peritron manometer. Design All participants were evaluated twice, first by one examiner and 30 days later by a second examiner. Measurements of vaginal squeeze pressure were compared with the results from the palpation test. Participants Nineteen women with a mean age of 23.7 years (range 21 to 28 years). Results Inter-rater reliability for vaginal palpation was fair (kappa = 0.33, 95% confidence interval 0.09 to 0.57). Using the Peritron manometer, the difference between examiners was less than 10 cmH2O in 11 of the 19 (58%) cases. The palpation test did not differentiate between weak, moderate, good and strong muscle contractions. This study found fair inter-rater reliability for the modified Oxford Grading Scale and moderate inter-rater reliability for the Peritron manometer. Conclusions The inter-rater reliability of vaginal squeeze pressure measurement using the Peritron manometer is acceptable and can be used in re-evaluations performed by different examiners in clinical practice. However, for research purposes, the ideal situation would be for a single examiner to assess and re-assess the subject. Vaginal palpation is important in the clinical assessment of correctness of a pelvic floor muscle contraction, but this study does not support the use of the modified Oxford Grading Scale as a reliable and valid method to measure and differentiate pelvic floor muscle strength. (C) 2010 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Pulmonary Embolism Severity Index (PESI) is a validated clinical prognostic model for patients with acute pulmonary embolism (PE). Our goal was to assess the PESI's inter-rater reliability in patients diagnosed with PE. We prospectively identified consecutive patients diagnosed with PE in the emergency department of a Swiss teaching hospital. For all patients, resident and attending physician raters independently collected the 11 PESI variables. The raters then calculated the PESI total point score and classified patients into one of five PESI risk classes (I-V) and as low (risk classes I/II) versus higher-risk (risk classes III-V). We examined the inter-rater reliability for each of the 11 PESI variables, the PESI total point score, assignment to each of the five PESI risk classes, and classification of patients as low versus higher-risk using kappa (κ) and intra-class correlation coefficients (ICC). Among 48 consecutive patients with an objective diagnosis of PE, reliability coefficients between resident and attending physician raters were > 0.60 for 10 of the 11 variables comprising the PESI. The inter-rater reliability for the PESI total point score (ICC: 0.89, 95% CI: 0.81-0.94), PESI risk class assignment (κ: 0.81, 95% CI: 0.66-0.94), and the classification of patients as low versus higher-risk (κ: 0.92, 95% CI: 0.72-0.98) was near perfect. Our results demonstrate the high reproducibility of the PESI, supporting the use of the PESI for risk stratification of patients with PE.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Intraclass correlation (ICC) is an established tool to assess inter-rater reliability. In a seminal paper published in 1979, Shrout and Fleiss considered three statistical models for inter-rater reliability data with a balanced design. In their first two models, an infinite population of raters was considered, whereas in their third model, the raters in the sample were considered to be the whole population of raters. In the present paper, we show that the two distinct estimates of ICC developed for the first two models can both be applied to the third model and we discuss their different interpretations in this context.
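
The three Shrout and Fleiss models correspond to the single-rater estimators usually written ICC(1,1), ICC(2,1) and ICC(3,1). A minimal sketch of how they can be computed from the mean squares of a balanced targets-by-raters table is given below. The function name `icc_single_rater` and the ratings table are illustrative assumptions and are not taken from the abstract.

```python
def icc_single_rater(ratings):
    """Return (ICC(1,1), ICC(2,1), ICC(3,1)) for a balanced table.

    `ratings` is a list of n rows (targets), each with k scores (raters).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares of the two-way layout.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_targets - ss_raters

    # Mean squares: between targets, between raters, within targets, residual.
    bms = ss_targets / (n - 1)
    jms = ss_raters / (k - 1)
    wms = (ss_total - ss_targets) / (n * (k - 1))
    ems = ss_error / ((n - 1) * (k - 1))

    icc1 = (bms - wms) / (bms + (k - 1) * wms)
    icc2 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc3 = (bms - ems) / (bms + (k - 1) * ems)
    return icc1, icc2, icc3


# Illustrative data only: 6 targets (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc1, icc2, icc3 = icc_single_rater(example)
print(round(icc1, 2), round(icc2, 2), round(icc3, 2))
```

In this sketch ICC(3,1) is much larger than the other two because it ignores systematic differences between rater means, which is the kind of difference in interpretation the abstract refers to.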

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The evaluation of children's statements in forensic sexual abuse cases is critically important and must be reliable. Criteria-based content analysis (CBCA) is the main component of the statement validity assessment (SVA), which is the most frequently used approach in this setting. This study investigated the inter-rater reliability (IRR) of CBCA in a forensic context. Three independent raters evaluated the transcripts of 95 statements of sexual abuse. IRR was calculated for each criterion, total score, and overall evaluation. The IRR was variable for the criteria, with several being unsatisfactory. But high IRR was found for the total CBCA scores (Kendall's W = 0.84) and for overall evaluation (Kendall's W = 0.65). Despite some shortcomings, SVA remains a robust method to be used in the comprehensive evaluation of children's statements of sexual abuse in the forensic setting. However, the low IRR of some CBCA criteria could justify some technical improvements.
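
Kendall's W, the statistic reported above, summarises how closely several raters agree on a ranking. A minimal sketch follows, assuming each rater supplies a complete ranking without ties (so no tie correction is applied). The function name `kendalls_w` and the rankings are hypothetical and unrelated to the study data.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m raters ranking n items.

    `rankings` is a list of m lists; each inner list holds the ranks 1..n
    that one rater gave to the n items. Assumes no tied ranks.
    """
    m = len(rankings)
    n = len(rankings[0])
    # Sum of the ranks each item received across raters.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings: 3 raters, 5 statements.
ranks = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 5, 4],
    [1, 3, 2, 4, 5],
]
w = kendalls_w(ranks)
print(round(w, 3))
```

W ranges from 0 (no agreement) to 1 (identical rankings from all raters).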

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: In the Global postural re-education (GPR) evaluation, posture alterations are associated with anterior or posterior muscular chain impairments. Our goal was to assess the reliability of the GPR muscular chain evaluation. Methods: Design: Inter-rater reliability study. Fifty physical therapists (PTs) and two experts trained in GPR assessed the standing posture from photographs of five youths with idiopathic scoliosis using a posture analysis grid with 23 posture indices (PI). The PTs and experts indicated the muscular chain associated with posture alterations. The PTs were also divided into three groups according to their experience in GPR. Experts' results (after consensus) were used to verify agreement between PTs and experts for muscular chain and posture assessments. We used Kappa coefficients (K) and the percentage of agreement (%A) to assess inter-rater reliability and intra-class correlation coefficients (ICC) for determining agreement between PTs and experts. Results: For the muscular chain evaluation, reliability was moderate to substantial for 12 PI for the PTs (%A: 56 to 82; K: 0.42 to 0.76) and perfect for 19 PI for the experts. For posture assessment, reliability was moderate to substantial for 12 PI for the PTs (%A > 60%; K: 0.42 to 0.75) and moderate to perfect for 18 PI for the experts (%A: 80 to 100; K: 0.55 to 1.00). The agreement between PTs and experts was good for most muscular chain evaluations (18 PI; ICC: 0.82 to 0.99) and PI (19 PI; ICC: 0.78 to 1.00). Conclusions: The GPR muscular chain evaluation has good reliability for most posture indices. GPR evaluation should help guide physical therapists in targeting affected muscles for treatment of abnormal posture patterns.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND The abstraction of data from medical records is a widespread practice in epidemiological research. However, studies using this means of data collection rarely report reliability. Within the Transition after Childhood Cancer Study (TaCC) which is based on a medical record abstraction, we conducted a second independent abstraction of data with the aim to assess a) intra-rater reliability of one rater at two time points; b) the possible learning effects between these two time points compared to a gold-standard; and c) inter-rater reliability. METHOD Within the TaCC study we conducted a systematic medical record abstraction in the 9 Swiss clinics with pediatric oncology wards. In a second phase we selected a subsample of medical records in 3 clinics to conduct a second independent abstraction. We then assessed intra-rater reliability at two time points, the learning effect over time (comparing each rater at two time-points with a gold-standard) and the inter-rater reliability of a selected number of variables. We calculated percentage agreement and Cohen's kappa. FINDINGS For the assessment of the intra-rater reliability we included 154 records (80 for rater 1; 74 for rater 2). For the inter-rater reliability we could include 70 records. Intra-rater reliability was substantial to excellent (Cohen's kappa 0.6-0.8) with an observed percentage agreement of 75%-95%. In all variables learning effects were observed. Inter-rater reliability was substantial to excellent (Cohen's kappa 0.70-0.83) with high agreement ranging from 86% to 100%. CONCLUSIONS Our study showed that data abstracted from medical records are reliable. Investigating intra-rater and inter-rater reliability can give confidence to draw conclusions from the abstracted data and increase data quality by minimizing systematic errors.
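
Percentage agreement and Cohen's kappa, the two measures used above, can be sketched as follows for two raters coding the same records. Kappa corrects the observed agreement for the agreement expected by chance from each rater's marginal frequencies. The function names and the coded values are hypothetical and do not come from the TaCC data.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of records on which the two raters gave the same code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa for two raters coding the same records.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from the product of the raters' marginal proportions.
    p_expected = sum(
        counts_a[c] * counts_b[c] for c in set(a) | set(b)
    ) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical binary codes abstracted by two raters from 10 records.
rater_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
rater_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(percent_agreement(rater_1, rater_2), cohens_kappa(rater_1, rater_2))
```

Here the raters agree on 80% of records, but because half of that agreement is expected by chance, kappa is lower than the raw agreement.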

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Rationale and aims 'OTseeker' is an online database of randomized controlled trials (RCTs) and systematic reviews relevant to occupational therapy. RCTs are critically appraised and rated for quality using the 'PEDro' scale. We aimed to investigate the inter-rater reliability of the PEDro scale before and after revising rating guidelines. Methods In study 1, five raters scored 100 RCTs using the original PEDro scale guidelines. In study 2, two raters scored 40 different RCTs using revised guidelines. All RCTs were randomly selected from the OTseeker database. Reliability was calculated using Kappa and intraclass correlation coefficients [ICC (model 2,1)]. Results Inter-rater reliability was 'good to excellent' in the first study (Kappas >= 0.53; ICCs >= 0.71). After revising the rating guidelines, the reliability levels were equivalent to or higher than those previously obtained (Kappas >= 0.53; ICCs >= 0.89), except for the item, 'groups similar at baseline', which still had moderate reliability (Kappa = 0.53). In study 2, two PEDro scale items, which had their definitions revised, 'less than 15% dropout' and 'point measures and variability', showed higher reliability. In both studies, the PEDro items with the lowest reliability were 'groups similar at baseline' (Kappas = 0.53), 'less than 15% dropout' (Kappas

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Predicting risk of adverse healthcare outcomes is important to enable targeted delivery of interventions. The Risk Instrument for Screening in the Community (RISC), designed for use by public health nurses (PHNs), measures the one-year risk of hospitalisation, institutionalisation and death in community-dwelling older adults according to a five-point global risk score: low (score 1 or 2), medium (3) and high (4 or 5). We examined the inter-rater reliability (IRR) of the RISC between student PHNs (n=32) and expert raters using six cases (two each of low, medium and high risk), scored before and after RISC training. Correlations increased for each adverse outcome, statistically significantly for institutionalisation (r=0.72 to 0.80, p=0.04) and hospitalisation (r=0.51 to 0.71, p<0.01), but not death. Training improved accuracy for low-risk but not all high-risk cases. Overall, the RISC showed good IRR, which increased after RISC training. That reliability reduced for some high-risk cases suggests that the training programme requires adjustment to further improve IRR.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The National Institute of Mental Health developed the semi-structured Diagnostic Interview for Genetic Studies (DIGS) for the assessment of major mood and psychotic disorders and their spectrum conditions. The DIGS was translated into French in a collaborative effort of investigators from sites in France and Switzerland. Inter-rater and test-retest reliability of the French version have been established in a clinical sample in Lausanne. Excellent inter-rater reliability was found for schizophrenia, bipolar disorder, major depression, and unipolar schizoaffective disorder while fair inter-rater reliability was demonstrated for bipolar schizoaffective disorder. Using a six-week test-retest interval, reliability for all diagnoses was found to be fair to good with the exception of bipolar schizoaffective disorder. The lower test-retest reliability was the result of a relatively long test-retest interval that favored incomplete symptom recall. In order to increase reliability for lifetime diagnoses in persons not currently affected, best-estimate procedures using additional sources of diagnostic information such as medical records and reports from relatives should supplement DIGS information in family-genetic studies. Within such a procedure, the DIGS appears to be a useful part of data collection for genetic studies on major mood disorders and schizophrenia in French-speaking populations.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The semi-structured diagnostic interview for genetic studies (DIGS) was developed to assess major mood and psychotic disorders and their spectrum manifestations in genetic studies. Our research group developed a French version of the DIGS and tested its inter-rater and test-retest reliability in psychiatric patients. In this article, we present estimates of the reliability of substance use and antisocial personality disorders. High kappa coefficients for inter-rater reliability were found for drug and alcohol as well as antisocial personality diagnoses and slightly lower kappas for test-retest reliability. Combined with evidence of the reliability of major mood and psychotic disorders, these findings support the suitability of the DIGS for studies of familial aggregation and comorbidity of psychiatric disorders including substance use and antisocial personality disorders.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

OBJECTIVE: The aim of this study was to translate the Structured Clinical Interview for Mood Spectrum into Brazilian Portuguese, measuring its reliability, validity, and defining scores for bipolar disorders. METHOD: The questionnaire was translated into Brazilian Portuguese and back-translated into English. The sample consisted of 47 subjects with bipolar disorder, 47 with major depressive disorder (MDD), 18 with schizophrenia and 22 controls. Inter-rater reliability was tested in 20 subjects with bipolar disorder and MDD. Internal consistency was measured using the Kuder-Richardson formula. Forward stepwise discriminant analysis was performed. Scores were compared between groups; manic (M), depressive (D) and total (T) threshold scores were calculated through receiver operating characteristic (ROC) curves. RESULTS: Kuder-Richardson coefficients were between 0.86 and 0.94. Intraclass correlation coefficient was 0.96 (95% CI 0.93-0.97). Subjects with bipolar disorder had higher M and T, and similar D scores, when compared to major depressive disorder (ANOVA, p < 0.001). The sub-domains that best discriminated unipolar and bipolar subjects were manic energy and manic mood. M had the best area under the curve (0.909), and values of M equal to or greater than 30 yielded 91.5% sensitivity and 74.5% specificity. CONCLUSION: Structured Clinical Interview for Mood Spectrum has good reliability and validity. A cutoff score of 30 or higher in the mania sub-domain is appropriate to help make a distinction between subjects with bipolar disorder and those with unipolar depression.
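
The abstract does not say which Kuder-Richardson formula was used. Assuming the common KR-20 variant for dichotomous (0/1) items, with the population variance of total scores in the denominator, internal consistency could be computed along these lines. The function name `kr20` and the response matrix are hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous items.

    `responses` is a list of subjects, each a list of k item scores (0 or 1).
    Uses the population variance (divide by n) of the total scores, which
    is an assumption; some texts use the sample variance instead.
    """
    n = len(responses)
    k = len(responses[0])
    # Sum over items of p * q, where p is the proportion scoring 1.
    sum_pq = 0.0
    for j in range(k):
        p = sum(subject[j] for subject in responses) / n
        sum_pq += p * (1 - p)
    totals = [sum(subject) for subject in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical answers of 5 subjects to 4 yes/no items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(answers), 2))
```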

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Objective To assess the validity and the reliability of the Portuguese version of the Delirium Rating Scale-Revised-98 (DRS-R-98). Methods The scale was translated into Portuguese and back-translated into English. After assessing its face validity, five diagnostic groups (n = 64; delirium, depression, dementia, schizophrenia and others) were evaluated by two independent researchers blinded to the diagnosis. Diagnosis and severity of delirium as measured by the DRS-R-98 were compared to clinical diagnosis, Mini-Mental State Exam, Confusion Assessment Method, and Clinical Global Impressions scale (CGI). Results Mean and median DRS-R-98 total scores significantly distinguished delirium from the other groups (p < 0.001). Inter-rater reliability (ICC between 0.9 and 1) and internal consistency (alpha = 0.91) were very high. DRS-R-98 severity scores correlated highly with the CGI. Mean DRS-R-98 severity scores during delirium differed significantly (p < 0.01) from the post-treatment values. The area under the curve established by ROC analysis was 0.99 and, using the cut-off value of 20, the scale showed sensitivity and specificity of 92.6% and 94.6%, respectively. Conclusion The Portuguese version of the DRS-R-98 is a valid and reliable measure of delirium that distinguishes delirium from other disorders and is sensitive to change in delirium severity, which may be of great value for longitudinal studies. Copyright (c) 2007 John Wiley & Sons, Ltd.
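
Sensitivity and specificity at a cut-off, as reported above, follow directly from counting cases on either side of the threshold. The sketch below assumes a score at or above the cut-off counts as a positive screen, which the abstract does not specify. The function name and the scores are hypothetical and are not the study data.

```python
def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Return (sensitivity, specificity) for a 'score >= cutoff' rule.

    `case_scores` are scores of subjects who have the condition;
    `control_scores` are scores of subjects who do not.
    """
    true_pos = sum(s >= cutoff for s in case_scores)
    true_neg = sum(s < cutoff for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical total scores for subjects with and without delirium.
with_delirium = [25, 30, 18, 22]
without_delirium = [10, 21, 15, 5, 12]
sens, spec = sensitivity_specificity(with_delirium, without_delirium, 20)
print(sens, spec)
```

Repeating the calculation over a range of cut-offs gives the points of the ROC curve from which a threshold is chosen.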

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The aim of this study was to compare the intra- and inter-rater reliability of pressure pain threshold (PPT) and manual palpation (MP) of orofacial structures in symptomatic and symptom-free children for temporomandibular disorders (TMD). Fourteen children reporting pain in masticatory muscles or the temporomandibular joint and 16 symptom-free children were randomly assessed on three different occasions: by rater-1 in the first and third session and by rater-2 in the second session. The trained raters applied algometry and MP as recommended by the Research Diagnostic Criteria for TMD. Intraclass correlation coefficients and the Kappa statistic were used to assess the levels of reliability of PPT and MP, respectively. Excellent intra- and inter-rater reliability levels were observed for PPT values at most of the examined sites for symptom-free children and excellent and moderate reliability levels for children reporting pain. For MP, moderate and poor intra-rater and inter-rater reliability levels were observed for most sites in both groups. Algometry showed higher reliability levels for both groups of children and is recommended for pain assessment in children in association with MP. (C) 2010 Elsevier Ltd. All rights reserved.