843 results for interrater reliability
Abstract:
Background: The final phase of a three-phase study analysing the implementation and impact of the nurse practitioner role in Australia (the Australian Nurse Practitioner Project or AUSPRAC) was undertaken in 2009, requiring nurse telephone interviewers to gather information about health outcomes directly from patients and their treating nurse practitioners. A team of several registered nurses was recruited and trained as telephone interviewers. The aim of this paper is to report on development and evaluation of the training process for telephone interviewers. Methods: The training process involved planning the content and methods to be used in the training session; delivering the session; testing skills and understanding of interviewers post-training; collecting and analysing data to determine the degree to which the training process was successful in meeting objectives; and post-training follow-up. All aspects of the training process were informed by established educational principles. Results: Interrater reliability between interviewers was high for well-validated sections of the survey instrument, resulting in 100% agreement between interviewers. Other sections with unvalidated questions showed lower agreement (between 75% and 90%). Overall the agreement between interviewers was 92%. Each interviewer was also measured against a specifically developed master script or gold standard, and for this each interviewer achieved a percentage of correct answers of 94.7% or better. This equated to a Kappa value of 0.92 or better. Conclusion: The telephone interviewer training process was very effective and achieved high interrater reliability. We argue that the high reliability was due to the use of well-validated instruments and the carefully planned programme based on established educational principles.
There is limited published literature on how to successfully operationalise educational principles and tailor them for specific research studies; this report addresses this knowledge gap.
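The relationship between a raw percentage of agreement and a Kappa value, as reported above, can be illustrated with a minimal sketch. It assumes Cohen's kappa for two raters (the abstract does not say which kappa variant was used), and the function names and ratings are hypothetical, not data from the study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over the categories (Counter returns 0 for unseen categories).
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Assumed convention: both raters used one identical category throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings: one interviewer compared with a master script.
interviewer = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "no"]
master = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no", "no"]

print(percent_agreement(interviewer, master))
print(round(cohen_kappa(interviewer, master), 3))
```

With these made-up ratings the raw agreement is 0.9 while kappa is 0.8, which shows why a chance-corrected value is typically lower than the percentage of matching answers.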
Abstract:
BACKGROUND: Inappropriate prescribing is a well-documented problem in older people. The new screening tools, STOPP (Screening Tool of Older Peoples' Prescriptions) and START (Screening Tool to Alert doctors to Right Treatment) have been formulated to identify potentially inappropriate medications (PIMs) and potential errors of omissions (PEOs) in older patients. Consistent, reliable application of STOPP and START is essential for the screening tools to be used effectively by pharmacists. OBJECTIVE: To determine the interrater reliability among a group of clinical pharmacists in applying the STOPP and START criteria to elderly patients' records. METHODS: Ten pharmacists (5 hospital pharmacists, 5 community pharmacists) were given 20 patient profiles containing details including the patients' age and sex, current medications, current diagnoses, relevant medical histories, biochemical data, and estimated glomerular filtration rate. Each pharmacist applied the STOPP and START criteria to each patient record. The PIMs and PEOs identified by each pharmacist were compared with those of 2 academic pharmacists who were highly familiar with the application of STOPP and START. An interrater reliability analysis using the κ statistic (chance-corrected measure of agreement) was performed to determine consistency between pharmacists. RESULTS: The median κ coefficients for hospital pharmacists and community pharmacists compared with the academic pharmacists for STOPP were 0.89 and 0.88, respectively, while those for START were 0.91 and 0.90, respectively. CONCLUSIONS: Interrater reliability of STOPP and START tools between pharmacists working in different sectors is good. Pharmacists working in both hospitals and in the community can use STOPP and START reliably during their everyday practice to identify PIMs and PEOs in older patients.
Abstract:
Purpose: To examine the interrater reliability of the Alberta Infant Motor Scale (AIMS) in term and preterm infants aged 10 to 16 months from Talca province, Maule Region, Chile. Subjects: 115 infants aged 10 to 16 months were included in the study; 95 term infants were seen at the local Health Centre in Talca City, and 20 preterm infants belonged to the Premature Infants Follow-Up Programme of Talca Regional Hospital. Methods: The motor behaviour of each infant was recorded and later assessed by two trained assessors using the AIMS. The total AIMS score and the prone, supine, sitting, and standing subscale scores were obtained. Interrater reliability was analysed using the intraclass correlation coefficient (ICC), the standard error of measurement (SEM), and 95% limits of agreement. Results: The ICC for the total AIMS scores was greater than 0.94 (p<0.0002) for both term and preterm infants. The SEM of the total scores was less than 3.1 points, higher than that found in other similar studies. The 95% limits of agreement were +5.3 to -4.1 points and +7.7 to -3.9 points in term and preterm infants, respectively, indicating interrater agreement. Conclusion: The AIMS showed adequate interrater reliability when applied to Chilean term and preterm infants aged 10 to 16 months.
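The 95% limits of agreement mentioned above are commonly computed with the Bland-Altman approach: the mean difference between the two assessors plus or minus 1.96 standard deviations of the differences. The sketch below assumes that approach (the abstract does not name the exact method), and the scores are hypothetical, not taken from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(scores_a, scores_b, z=1.96):
    """Return (bias, lower, upper) for paired scores from two assessors."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two paired scores of equal length")
    # Per-infant difference between assessor A and assessor B.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)   # systematic difference between assessors
    sd = stdev(diffs)    # sample SD of the differences (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


# Hypothetical total scores given by two assessors to the same six infants.
assessor_1 = [50, 52, 47, 55, 49, 53]
assessor_2 = [49, 53, 45, 55, 50, 51]

bias, lower, upper = limits_of_agreement(assessor_1, assessor_2)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Narrow limits centred near zero suggest that the two assessors can be used interchangeably; how narrow is narrow enough is a clinical judgment rather than a statistical one.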
Abstract:
INTRODUCTION: Schizophrenia is a chronic mental disorder associated with impairment in social functioning. The most widely used scale to measure social functioning is the GAF (Global Assessment of Functioning), but it has the disadvantage of measuring symptoms and functioning at the same time, as described in its anchors. OBJECTIVES: Translation and cultural adaptation of the Personal and Social Performance (PSP) scale, proposing a final version in Portuguese for use in Brazil. METHODS: We performed five steps: 1) translation; 2) back translation; 3) formal assessment of semantic equivalence; 4) debriefing; 5) analysis by experts. Interrater reliability (intraclass correlation, ICC) between two raters was also measured. RESULTS: The final version was applied by two independent investigators to 18 adults with schizophrenia (DSM-IV-TR). The interrater reliability (ICC) was 0.812 (p < 0.001). CONCLUSION: The translation and adaptation of the PSP had an adequate level of semantic equivalence between the Portuguese version and the original English version. There were no difficulties related to understanding the content expressed in the translated texts and terms. Its application was easy and it showed good interrater reliability. The PSP is a valid instrument for the measurement of personal and social functioning in schizophrenia.
Abstract:
STUDY DESIGN: Controlled laboratory study. OBJECTIVES: To investigate the reliability and concurrent validity of photographic measurements of hallux valgus angle compared to radiographs as the criterion standard. BACKGROUND: Clinical assessment of hallux valgus involves measuring alignment between the first toe and metatarsal on weight-bearing radiographs or visually grading the severity of deformity with categorical scales. Digital photographs offer a noninvasive method of measuring deformity on an exact scale; however, the validity of this technique has not previously been established. METHODS: Thirty-eight subjects (30 female, 8 male) were examined (76 feet, 54 with hallux valgus). Computer software was used to measure hallux valgus angle from digital records of bilateral weight-bearing dorsoplantar foot radiographs and photographs. One examiner measured 76 feet on 2 occasions 2 weeks apart, and a second examiner measured 40 feet on a single occasion. Reliability was investigated by intraclass correlation coefficients and validity by 95% limits of agreement. The Pearson correlation coefficient was also calculated. RESULTS: Intrarater and interrater reliability were very high (intraclass correlation coefficients greater than 0.96) and 95% limits of agreement between photographic and radiographic measurements were acceptable. Measurements from photographs and radiographs were also highly correlated (Pearson r = 0.96). CONCLUSIONS: Digital photographic measurements of hallux valgus angle are reliable and have acceptable validity compared to weight-bearing radiographs. This method provides a convenient and precise tool in assessment of hallux valgus, while avoiding the cost and radiation exposure associated with radiographs.
Abstract:
Objectives: To determine the interobserver reliability of radiologists' interpretations of mobile chest radiographs for nursing home-acquired pneumonia. Design: A cross-sectional reliability study. Setting: Nursing homes and an acute care hospital. Participants: Four radiologists reviewed 40 mobile chest radiographs obtained from residents of nursing homes who met a clinical definition of lower respiratory tract infections. Measurements: Radiologists were asked to interpret radiographs with respect to the film quality; presence, pattern, and extent of an infiltrate; and the presence of a pleural effusion or adenopathy. Interrater reliability was evaluated using the intraclass correlation coefficient derived from a 2-way random effects model. Results: On average the radiologists reported that 6 of the 40 films were of very good or excellent quality and 16 of the 40 were of fair or poor quality. When the finding of an infiltrate was dichotomized (0 = no; 1 = possible, probable, or definite) all 4 radiologists agreed on 21 of the 37 chest radiographs. The intraclass correlation coefficient for the presence or absence of infiltrates was 0.54 (95% confidence interval [CI] 0.38 to 0.69). For the 14 radiographs where infiltrates were observed by all radiologists, intraclass correlation coefficients for the presence of pleural effusions were 0.08 (95% CI -0.10 to 0.41), hilar adenopathy 0.54 (95% CI 0.29 to 0.79), and mediastinal adenopathy 0.49 (95% CI 0.21 to 0.76). Conclusion: The interrater agreement among radiologists for mobile chest radiographs in establishing the presence or absence of an infiltrate can be judged to be "fair." Treatment decisions need to include clinical findings and should not be made based on radiographic findings alone. © 2006 American Medical Directors Association.
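An intraclass correlation coefficient from a 2-way random effects model can be sketched as follows. The abstract does not say which form was used, so this example assumes the single-rater, absolute-agreement form often written ICC(2,1); the function name and the ratings are illustrative and are not the radiologists' data.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per subject, each holding one score
    per rater (same raters, same order, in every row).
    """
    n = len(ratings)                       # number of subjects
    k = len(ratings[0]) if ratings else 0  # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings: 6 subjects (rows) scored by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

For this table the coefficient is about 0.29. Because the absolute-agreement form penalises systematic differences between raters, raters who rank subjects similarly but score at different levels still produce a low value.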
Abstract:
Background: Neuropsychiatric symptoms (NPS) affect almost all patients with dementia and are a major focus of study and treatment. Accurate assessment of NPS through valid, sensitive and reliable measures is crucial. Although current NPS measures have many strengths, they also have some limitations (e.g. acquisition of data is limited to informants or caregivers as respondents, limited depth of items specific to moderate dementia). Therefore, we developed a revised version of the NPI, known as the NPI-C. The NPI-C includes expanded domains and items, and a clinician-rating methodology. This study evaluated the reliability and convergent validity of the NPI-C at ten international sites (seven languages). Methods: Face validity for 78 new items was obtained through a Delphi panel. A total of 128 dyads (caregivers/patients) from three severity categories of dementia (mild = 58, moderate = 49, severe = 21) were interviewed separately by two trained raters using two rating methods: the original NPI interview and a clinician-rated method. Rater 1 also administered four additional, established measures: the Apathy Evaluation Scale, the Brief Psychiatric Rating Scale, the Cohen-Mansfield Agitation Index, and the Cornell Scale for Depression in Dementia. Intraclass correlations were used to determine inter-rater reliability. Pearson correlations between the four relevant NPI-C domains and their corresponding outside measures were used for convergent validity. Results: Inter-rater reliability was strong for most items. Convergent validity was moderate (apathy and agitation) to strong (hallucinations and delusions; agitation and aberrant vocalization; and depression) for clinician ratings in NPI-C domains. Conclusion: Overall, the NPI-C shows promise as a versatile tool which can accurately measure NPS and which uses a uniform scale system to facilitate data comparisons across studies. Copyright © 2010 International Psychogeriatric Association.
Abstract:
ABSTRACT Background: Patients with dementia may be unable to describe their symptoms, and caregivers frequently suffer emotional burden that can interfere with judgment of the patient's behavior. The Neuropsychiatric Inventory-Clinician rating scale (NPI-C) was therefore developed as a comprehensive and versatile instrument to assess and accurately measure neuropsychiatric symptoms (NPS) in dementia, thereby using information from caregiver and patient interviews, and any other relevant available data. The present study is a follow-up to the original, cross-national NPI-C validation, evaluating the reliability and concurrent validity of the NPI-C in quantifying psychopathological symptoms in dementia in a large Brazilian cohort. Methods: Two blinded raters evaluated 312 participants (156 patient-knowledgeable informant dyads) using the NPI-C for a total of 624 observations in five Brazilian centers. Inter-rater reliability was determined through intraclass correlation coefficients for the NPI-C domains and the traditional NPI. Convergent validity included correlations of specific domains of the NPI-C with the Brief Psychiatric Rating Scale (BPRS), the Cohen-Mansfield Agitation Index (CMAI), the Cornell Scale for Depression in Dementia (CSDD), and the Apathy Inventory (AI). Results: Inter-rater reliability was strong for all NPI-C domains. There were high correlations between NPI-C/delusions and BPRS, NPI-C/apathy-indifference with the AI, NPI-C/depression-dysphoria with the CSDD, NPI-C/agitation with the CMAI, and NPI-C/aggression with the CMAI. There was moderate correlation between the NPI-C/aberrant vocalizations and CMAI and the NPI-C/hallucinations with the BPRS. Conclusion: The NPI-C is a comprehensive tool that provides accurate measurement of NPS in dementia with high concurrent validity and inter-rater reliability in the Brazilian setting. In addition to universal assessment, the NPI-C can be completed by individual domains. 
© International Psychogeriatric Association 2013.
Abstract:
This project develops K(bin), a relatively simple, binomial-based statistic for assessing interrater agreement in which expected agreement is calculated a priori from the number of raters involved in the study and the number of categories on the rating tool. The statistic is logical in interpretation, easily calculated, stable for small sample sizes, and has application over a wide range of possible combinations, from the simplest case of two raters using a binomial scale to multiple raters using a multiple-level scale. Tables of expected agreement values and tables of critical values for K(bin), which include power to detect three levels of the population parameter K for n from 2 to 30 and observed agreement ≥ .70, calculated at alpha = .05, .025, and .01, are included. An example is also included which describes the use of the tables for planning and evaluating an interrater reliability study using the statistic K(bin).
Abstract:
Background and Purpose. Arm lymphedema following breast cancer surgery is a continuing problem. In this study, we assessed the reliability and validity of circumferential measurements and water displacement for measuring upper-limb volume. Subjects. Participants included subjects who had had breast cancer surgery, including axillary dissection (19 with and 22 without a diagnosis of arm lymphedema), and 25 control subjects. Methods. Two raters measured each subject by using circumferential tape measurements at specified distances from the fingertips and in relation to anatomic landmarks and by using water displacement. Interrater reliability was calculated by analysis of variance and multilevel modeling. Volumes from circumferential measurements were compared with those from water displacement by use of means and correlation coefficients, respectively. The standard error of measurement, minimum detectable change (MDC), and limits of agreement (LOA) for volumes also were calculated. Results. Arm volumes obtained with these methods had high reliability. Compared with volumes from water displacement, volumes from circumferential measurements had high validity, although these volumes were slightly larger. Expected differences between subjects with and without clinical lymphedema following breast cancer were found. The MDC of volumes or the error associated with a single measure for data based on anatomic landmarks was lower than that based on distance from fingertips. The mean LOA with water displacement were lower for data based on anatomic landmarks than for data based on distance from fingertips. Discussion and Conclusion. Volumes calculated from anatomic landmarks are reliable, valid, and more accurate than those obtained from circumferential measurements based on distance from fingertips.
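The standard error of measurement and the minimum detectable change named above are often derived from a reliability coefficient with two textbook formulas: SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM. The sketch below assumes those formulas (the abstract does not state the ones the authors used), and the input values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be >= 0 and icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def minimum_detectable_change(sem, z=1.96):
    """Smallest change exceeding measurement error at the chosen confidence.

    sqrt(2) accounts for error being present in both of the two
    measurements that are compared.
    """
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs: arm volumes with SD = 300 mL and ICC = 0.99.
sem = standard_error_of_measurement(sd=300.0, icc=0.99)
mdc = minimum_detectable_change(sem)
print(f"SEM = {sem:.1f} mL, MDC95 = {mdc:.1f} mL")
```

Under these assumptions a measured change in arm volume smaller than the MDC cannot be distinguished from measurement error, which is why a lower MDC for one measurement method is reported as an advantage.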
Abstract:
Objective: To establish the concurrent validity and the interrater and test-retest reliability of the Modified Elderly Mobility Scale (MEMS). Methods: Ninety elderly patients were scored on the MEMS. To establish concurrent validity, 75 patients' MEMS scores were compared to Functional Independence Measure (FIM) scores using Spearman's correlation. Videotaped patient performances were used to establish interrater and test-retest reliability using percentage absolute agreement and intraclass correlation coefficients (ICCs). Results: The total MEMS score demonstrated a significant association with the motor (r = 0.725) and total FIM scores (r = 0.718). Absolute agreement for interrater reliability was greater than 93% for all test items, with 97 and 98% for the two new measures, respectively. Test-retest reliability demonstrated similarly high levels of absolute agreement and had ICCs ranging from 0.870 to 1.0. Conclusions: The MEMS is a quick, valid and reliable test of motor function of elderly patients with a spread of functional levels.
Abstract:
Objective: To demonstrate properties of the International Classification of the External Cause of Injury (ICECI) as a tool for use in injury prevention research. Methods: The Childhood Injury Prevention Study (CHIPS) is a prospective longitudinal follow up study of a cohort of 871 children 5–12 years of age, with a nested case crossover component. The ICECI is the latest tool in the International Classification of Diseases (ICD) family and has been designed to improve the precision of coding injury events. The details of all injury events recorded in the study, as well as all measured injury related exposures, were coded using the ICECI. This paper reports a substudy on the utility and practicability of using the ICECI in the CHIPS to record exposures. Interrater reliability was quantified for a sample of injured participants using the Kappa statistic to measure concordance between codes independently coded by two research staff. Results: There were 767 diaries collected at baseline and event details from 563 injuries and exposure details from injury crossover periods. There were no event, location, or activity details which could not be coded using the ICECI. Kappa statistics for concordance between raters within each of the dimensions ranged from 0.31 to 0.93 for the injury events and 0.94 and 0.97 for activity and location in the control periods. Discussion: This study represents the first detailed account of the properties of the ICECI revealed by its use in a primary analytic epidemiological study of injury prevention. The results of this study provide considerable support for the ICECI and its further use.
Abstract:
Although there are widely accepted and utilized models and frameworks for nondirective counseling (NDC), there are few tools or instruments designed to help determine whether a specific episode of counseling is consistent with the stated model or framework. The Counseling Progress and Depth Rating Instrument (CPDRI) was developed to evaluate counselor integrity in the use of Egan's skilled helper model in online counseling. The instrument was found to have sound internal consistency, good interrater reliability, and good face and convergent validity. The CPDRI is, therefore, proposed as a useful tool to facilitate investigation of the degree to which counselors adhere to and apply a widely used approach to NDC.
Abstract:
Context: The Ober and Thomas tests are subjective and involve a "negative" or "positive" assessment, making them difficult to apply within the paradigm of evidence-based medicine. No authors have combined the subjective clinical assessment with an objective measurement for these special tests. Objective: To compare the subjective assessment of iliotibial band and iliopsoas flexibility with the objective measurement of a digital inclinometer, to establish normative values, and to provide an evidence-based critical criterion for determining tissue tightness. Design: Cross-sectional study. Setting: Clinical research laboratory. Patients or Other Participants: Three hundred recreational athletes (125 men, 175 women; 250 in injured group, 50 in control group). Main Outcome Measure(s): Iliotibial band and iliopsoas muscle flexibility were determined subjectively using the modified Ober and Thomas tests, respectively. Using a digital inclinometer, we objectively measured limb position. Interrater reliability for the subjective assessment was compared between 2 clinicians for a random sample of 100 injured participants, who were classified subjectively as either negative or positive for iliotibial band and iliopsoas tightness. Percentage of agreement indicated interrater reliability for the subjective assessment. Results: For iliotibial band flexibility, the average inclinometer angle was -24.59 degrees ± 7.27 degrees. A total of 432 limbs were subjectively assessed as negative (-27.13 degrees ± 5.53 degrees) and 168 as positive (-16.29 degrees ± 6.87 degrees). For iliopsoas flexibility, the average inclinometer angle was -10.60 degrees ± 9.61 degrees. A total of 392 limbs were subjectively assessed as negative (-15.51 degrees ± 5.82 degrees) and 208 as positive (0.34 degrees ± 7.00 degrees). The critical criteria for iliotibial band and iliopsoas flexibility were determined to be -23.16 degrees and -9.69 degrees, respectively. Between-clinician agreement was very good, ranging from 95.0% to 97.6% for the Thomas and Ober tests, respectively. Conclusions: Subjective assessments and instrumented measurements were combined to establish normative values and critical criteria for tissue flexibility for the modified Ober and Thomas tests.