79 results for reliability of results
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
INTRODUCTION Even though arthroplasty of the ankle joint is considered to be an established procedure, only about 1,300 endoprostheses are implanted in Germany annually. Arthrodeses of the ankle joint are performed almost three times more often. This may be due to the availability of the procedure - more than twice as many providers perform arthrodesis - as well as the postulated high frequency of revision procedures of arthroplasties in the literature. In those publications, however, there is often no clear differentiation between revision surgery with exchange of components, subsequent interventions due to complications and subsequent surgery not associated with complications. The German Orthopaedic Foot and Ankle Association's (D. A. F.) registry for total ankle replacement collects data pertaining to perioperative complications as well as cause, nature and extent of the subsequent interventions, and postoperative patient satisfaction. MATERIAL AND METHODS The D. A. F.'s total ankle replacement register is a nation-wide, voluntary registry. After giving written informed consent, the patients can be added to the database by participating providers. Data are collected during hospital stay for surgical treatment, during routine follow-up inspections and in the context of revision surgery. The information can be submitted in paper-based or online formats. The survey instruments are available as minimum data sets or scientific questionnaires which include patient-reported outcome measures (PROMs). The pseudonymous clinical data are collected and evaluated at the Institute for Evaluative Research in Medicine, University of Bern/Switzerland (IEFM). The patient-related data remain on the register's module server in North Rhine-Westphalia, Germany. The registry's methodology as well as the results of the revisions and patient satisfaction for 115 patients with a two year follow-up period are presented. 
Statistical analyses are performed with SAS™ (Version 9.4, SAS Institute, Inc., Cary, NC, USA). RESULTS About 2½ years after the register was launched, 621 datasets on primary implantations, 1,427 on follow-ups and 121 on re-operations are available. 49 % of the patients received their implants because of post-traumatic osteoarthritis, 27 % because of primary osteoarthritis, and 15 % suffered from a rheumatic disease. More than 90 % of the primary interventions proceeded without complications. Subsequent interventions were recorded for 84 patients, which corresponds to a rate of 13.5 % with respect to the primary implantations. It should be noted that these secondary procedures also include two-stage procedures not due to a complication. "True revisions" are interventions with exchange of components due to mechanical complications and/or infection and were present in 7.6 % of patients. 415 of the patients commented on their satisfaction with the operative result during the last follow-up: 89.9 % of patients rated their outcome as excellent or good, 9.4 % as moderate and only 0.7 % (3 patients) as poor. In these three cases, component loosening or symptomatic osteoarthritis of the subtalar joint (USG) was present. Two-year follow-up data using the American Orthopaedic Foot and Ankle Society Ankle and Hindfoot Scale (AOFAS-AHS) are already available for 115 patients. The median AOFAS-AHS score increased from 33 points preoperatively to more than 80 points three to six months postoperatively. This increase remained nearly constant over the entire two-year follow-up period. CONCLUSION Covering less than 10 % of the approximately 240 providers in Germany and approximately 12 % of the annually implanted total ankle replacements, the D. A. F. register is still far from being a national registry. 
Nevertheless, geographical coverage and inclusion of "high-volume" (more than 100 total ankle replacements a year) and "low-volume" surgeons (fewer than 5 total ankle replacements a year) make the register representative of Germany. The registry data show that the number of subsequent interventions, and in particular of "true revision" procedures, is markedly lower than the 20 % often postulated in the literature. In addition, a high level of patient satisfaction over the short and medium term is recorded. From the perspective of the authors, these results indicate that total ankle arthroplasty, given a correct indication and appropriate selection of patients, is not inferior to ankle arthrodesis in terms of patient satisfaction and function. First valid survival rates can be expected about 10 years after the register's start.
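The rates quoted in this abstract can be checked directly from the reported counts. The following is a minimal Python sketch; the function and variable names are illustrative and are not part of the registry's SAS analysis.

```python
# Counts as reported in the abstract above.
primary_implantations = 621
patients_with_subsequent_intervention = 84
satisfaction_responses = 415
poor_outcomes = 3


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Subsequent interventions relative to primary implantations.
subsequent_rate = percentage(patients_with_subsequent_intervention,
                             primary_implantations)
# Share of patients rating their outcome as poor.
poor_rate = percentage(poor_outcomes, satisfaction_responses)

print(subsequent_rate, poor_rate)  # 13.5 0.7
```

Both values agree with the 13.5 % and 0.7 % stated in the abstract.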
Abstract:
Recent studies have shown that the nociceptive withdrawal reflex threshold (NWR-T) and the electrical pain threshold (EP-T) are reliable measures in pain-free populations. However, it is necessary to investigate the reliability of these measures in patients with chronic pain in order to translate these techniques from laboratory to clinic. The aims of this study were to determine the test-retest reliability of the NWR-T and EP-T after single and repeated (temporal summation) electrical stimulation in a group of patients with chronic low back pain, and to investigate the association between the NWR-T and the EP-T. To this end, 25 patients with chronic pain participated in three identical sessions, separated by 1 week on average, in which the NWR-T and the EP-T to single and repeated stimulation were measured. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC), the coefficient of variation (CV), and Bland-Altman analysis. The association between the thresholds was assessed using the coefficient of determination (r²). The results showed good-to-excellent reliability for both NWR-T and EP-T in all cases, with average ICC values ranging from 0.76 to 0.90 and average CV values ranging from 12.0 to 17.7%. The association between thresholds was better after repeated stimulation than after single stimulation, with average r² values of 0.83 and 0.56, respectively. In conclusion, the NWR-T and the EP-T are reliable tools for assessing the sensitivity of spinal nociceptive pathways in patients with chronic pain.
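The abstract does not state which form of the ICC was used. As an illustration only, the sketch below computes the two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1)) from a subjects-by-sessions table. The function name and the toy data are assumptions, not the study's data or code.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject, one column per session.

    Assumed form: two-way random effects, absolute agreement, single measure.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions
    all_values = [v for row in scores for v in row]
    grand = mean(all_values)
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in all_values)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy thresholds (hypothetical): identical sessions give perfect agreement.
identical = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
# A constant offset between sessions lowers absolute agreement.
shifted = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]

icc_identical = icc_2_1(identical)
icc_shifted = icc_2_1(shifted)
```

The second toy table shows why the choice of ICC form matters: a systematic shift between sessions leaves the ranking of subjects untouched but still reduces an absolute-agreement ICC.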
Abstract:
The Pulmonary Embolism Severity Index (PESI) is a validated clinical prognostic model for patients with acute pulmonary embolism (PE). Our goal was to assess the PESI's inter-rater reliability in patients diagnosed with PE. We prospectively identified consecutive patients diagnosed with PE in the emergency department of a Swiss teaching hospital. For all patients, resident and attending physician raters independently collected the 11 PESI variables. The raters then calculated the PESI total point score and classified patients into one of five PESI risk classes (I-V) and as low (risk classes I/II) versus higher-risk (risk classes III-V). We examined the inter-rater reliability for each of the 11 PESI variables, the PESI total point score, assignment to each of the five PESI risk classes, and classification of patients as low versus higher-risk using kappa (κ) and intra-class correlation coefficients (ICC). Among 48 consecutive patients with an objective diagnosis of PE, reliability coefficients between resident and attending physician raters were > 0.60 for 10 of the 11 variables comprising the PESI. The inter-rater reliability for the PESI total point score (ICC: 0.89, 95% CI: 0.81-0.94), PESI risk class assignment (κ: 0.81, 95% CI: 0.66-0.94), and the classification of patients as low versus higher-risk (κ: 0.92, 95% CI: 0.72-0.98) was near perfect. Our results demonstrate the high reproducibility of the PESI, supporting the use of the PESI for risk stratification of patients with PE.
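For readers unfamiliar with the statistic, an unweighted Cohen's kappa for two raters can be sketched as follows. The abstract does not say whether weighted or unweighted kappa was used for the five risk classes, so the unweighted form, the function name and the toy ratings are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same patients."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set")
    n = len(rater_a)
    # Observed proportion of identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical low vs higher-risk classifications for eight patients.
resident = ["low", "low", "high", "high", "low", "high", "low", "high"]
attending = ["low", "low", "high", "high", "low", "high", "high", "high"]

kappa = cohens_kappa(resident, attending)
print(kappa)  # 0.75
```

In this toy example the raters agree on 7 of 8 patients (0.875), chance agreement is 0.5, and kappa is therefore 0.75.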
Abstract:
Objectives To examine the extent of multiplicity of data in trial reports and to assess the impact of multiplicity on meta-analysis results. Design Empirical study on a cohort of Cochrane systematic reviews. Data sources All Cochrane systematic reviews published from issue 3 in 2006 to issue 2 in 2007 that presented a result as a standardised mean difference (SMD). We retrieved trial reports contributing to the first SMD result in each review, and downloaded review protocols. We used these SMDs to identify a specific outcome for each meta-analysis from its protocol. Review methods Reviews were eligible if SMD results were based on two to ten randomised trials and if protocols described the outcome. We excluded reviews if they only presented results of subgroup analyses. Based on review protocols and index outcomes, two observers independently extracted the data necessary to calculate SMDs from the original trial reports for any intervention group, time point, or outcome measure compatible with the protocol. From the extracted data, we used Monte Carlo simulations to calculate all possible SMDs for every meta-analysis. Results We identified 19 eligible meta-analyses (including 83 trials). Published review protocols often lacked information about which data to choose. Twenty-four (29%) trials reported data for multiple intervention groups, 30 (36%) reported data for multiple time points, and 29 (35%) reported the index outcome measured on multiple scales. In 18 meta-analyses, we found multiplicity of data in at least one trial report; the median difference between the smallest and largest SMD results within a meta-analysis was 0.40 standard deviation units (range 0.04 to 0.91). Conclusions Multiplicity of data can affect the findings of systematic reviews and meta-analyses. To reduce the risk of bias, reviews and meta-analyses should comply with prespecified protocols that clearly identify time points, intervention groups, and scales of interest.
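How multiplicity changes a trial's contribution can be illustrated with a small calculation. The sketch below computes a standardised mean difference as the difference in means divided by the pooled standard deviation, without a small-sample correction. That choice, the function names and the numbers for the two time points are hypothetical and do not come from the reviewed trials.

```python
from math import sqrt


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def smd(mean1, sd1, n1, mean2, sd2, n2):
    """Standardised mean difference (no small-sample correction)."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Hypothetical trial reporting the index outcome at two time points.
smd_first_time_point = smd(10.0, 2.0, 20, 12.0, 2.0, 20)
smd_second_time_point = smd(11.0, 2.0, 20, 12.0, 2.0, 20)

# Difference between the smallest and largest SMD this trial could contribute.
spread = abs(smd_first_time_point - smd_second_time_point)
print(smd_first_time_point, smd_second_time_point, spread)  # -1.0 -0.5 0.5
```

Depending on which time point a reviewer extracts, this single hypothetical trial contributes an SMD of -1.0 or -0.5, which is the kind of discretion a prespecified protocol removes.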
Abstract:
In the training of healthcare professionals, one of the advantages of communication training with simulated patients (SPs) is the SP's ability to provide direct feedback to students after a simulated clinical encounter. The quality of SP feedback must be monitored, especially because it is well known that feedback can have a profound effect on student performance. Due to the current lack of valid and reliable instruments to assess the quality of SP feedback, our study examined the validity and reliability of one potential instrument, the 'modified Quality of Simulated Patient Feedback Form' (mQSF). Methods Content validity of the mQSF was assessed by inviting experts in the area of simulated clinical encounters to rate the importance of the mQSF items. Moreover, generalizability theory was used to examine the reliability of the mQSF. Our data came from videotapes of clinical encounters between six simulated patients and six students and the ensuing feedback from the SPs to the students. Ten faculty members judged the SP feedback according to the items on the mQSF. Three weeks later, this procedure was repeated with the same faculty members and recordings. Results All but two items of the mQSF received importance ratings of > 2.5 on a four-point rating scale. A generalizability coefficient of 0.77 was established with two judges observing one encounter. Conclusions The findings for content validity and reliability with two judges suggest that the mQSF is a valid and reliable instrument to assess the quality of feedback provided by simulated patients.
Abstract:
OBJECTIVE: To assess the intra-reader and inter-reader reliabilities of interpreting ultrasonography by several experts using video clips. METHOD: 99 video clips of healthy and rheumatic joints were recorded and delivered to 17 physician sonographers in two rounds. The intra-reader and inter-reader reliabilities of interpreting the ultrasound results were calculated using a dichotomous system (normal/abnormal) and a graded semiquantitative scoring system. RESULTS: The video reading method worked well. 70% of the readers could classify at least 70% of the cases correctly as normal or abnormal. The distribution of readers answering correctly was wide. The most difficult joints to assess were the elbow, wrist, metacarpophalangeal (MCP) and knee joints. The intra-reader and inter-reader agreements on interpreting dynamic ultrasound images as normal or abnormal, as well as detecting and scoring a Doppler signal were moderate to good (kappa = 0.52-0.82). CONCLUSIONS: Dynamic image assessment (video clips) can be used as an alternative method in ultrasonography reliability studies. The intra-reader and inter-reader reliabilities of ultrasonography in dynamic image reading are acceptable, but more definitions and training are needed to improve sonographic reproducibility.
Abstract:
OBJECT: Ultrasound may be a reliable but simpler alternative to intraoperative MR imaging (iMR imaging) for tumor resection control. However, its reliability in the detection of tumor remnants has not been definitively proven. The aim of the study was to compare high-field iMR imaging (1.5 T) and high-resolution 2D ultrasound in terms of tumor resection control. METHODS: A prospective comparative study of 26 consecutive patients was performed. The following parameters were compared: the existence of tumor remnants after presumed radical removal and the quality of the images. Tumor remnants were categorized as detectable with both imaging modalities or visible with only 1 modality. RESULTS: Tumor remnants were detected in 21 cases (80.8%) with iMR imaging. All large remnants were demonstrated with both modalities, and their image quality was good. Two-dimensional ultrasound was not as effective in detecting remnants < 1 cm. Two remnants detected with iMR imaging were missed by ultrasound. In 2 cases suspicious signals visible only on ultrasound images were misinterpreted as remnants but turned out to be a blood clot and peritumoral parenchyma. The average time for acquisition of an ultrasound image was 2 minutes, whereas that for an iMR image was approximately 10 minutes. Neither modality resulted in any procedure-related complications or morbidity. CONCLUSIONS: Intraoperative MR imaging is more precise in detecting small tumor remnants than 2D ultrasound. Nevertheless, the latter may be used as a less expensive and less time-consuming alternative that provides almost real-time feedback. Its accuracy is highest in cases of more confined, deeply located remnants. In cases of more superficially located remnants, its role is more limited.
Abstract:
Background: The design of Virtual Patients (VPs) is essential. So far, no validated evaluation instruments for VP design have been published. Summary of work: We examined three sources of validity evidence for an instrument, to be filled out by students, that measures the quality of VPs with a special emphasis on fostering clinical reasoning: (1) Content was examined based on theory of clinical reasoning and an international VP expert team. (2) Response process was explored in think-aloud pilot studies with students and content analysis of free text questions accompanying each item of the instrument. (3) Internal structure was assessed by confirmatory factor analysis (CFA) using 2547 student evaluations, and reliability was examined using generalizability analysis. Summary of results: Content analysis was supported by the theory underlying Gruppen and Frohna’s clinical reasoning model, on which the instrument is based, and by an international VP expert team. The pilot study and analysis of free text comments supported the validity of the instrument. The CFA indicated that a three-factor model comprising 6 items showed a good fit with the data. Alpha coefficients per factor were 0.74-0.82. The findings of the generalizability studies indicated that 40-200 student responses are needed in order to obtain reliable data on one VP. Conclusions: The described instrument has the potential to provide faculty with reliable and valid information about VP design. Take-home messages: We present a short instrument which can be of help in evaluating the design of VPs.
Abstract:
BACKGROUND: Cardiac output (CO) measurement with lithium dilution (COLD) has not been fully validated in sheep using precise ultrasonic flow probe technology (COUFP). Sheep provide important cardiovascular research models and the use of COLD has become more popular in experimental settings. METHODS: Ultrasonic transit-time perivascular flow probes were surgically implanted on the pulmonary artery of 13 sheep. Paired COLD readings were taken at six time points, before and after implantation of a left ventricular assist device (LVAD), and compared with COUFP recorded just after lithium injection. RESULTS: The mean COLD was 5.7 litre min⁻¹ (range 3.8-9.6 litre min⁻¹) and mean COUFP 5.9 litre min⁻¹ (range 4.0-9.2 litre min⁻¹). The bias (standard deviation) was 0.3 (1.0) litre min⁻¹ [5.1 (16.9)%] and limits of agreement (LOA) were -1.7 to 2.3 litre min⁻¹ (-28.8 to 39.0%) with a percentage error (PE) of 34.4%. Data to assess trending [rate (95% confidence intervals)] included a 78 (62-93)% concordance rate in the four-quadrant plot (n=27). In the half moon polar plot (n=19), the mean polar angle was +5°, the radial LOA were -49 to +35° and 68 (47-89)% of data points fell within 22.5° of the mean polar angle. Both tests indicated moderate to poor trending ability. CONCLUSION: COLD is not precise when evaluated against COUFP in sheep based on the statistical criteria set, but the results are comparable with previously published animal studies.
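The reported limits of agreement follow from the usual Bland-Altman convention of bias ± 1.96 standard deviations of the paired differences. The sketch below reproduces them from the bias and standard deviation given in the abstract; the function names and the toy paired readings are assumptions, and the study's own analysis code is not available here.

```python
from statistics import mean, stdev


def bias_and_sd(method_a, method_b):
    """Mean and sample standard deviation of the paired differences."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    return mean(diffs), stdev(diffs)


def limits_of_agreement(bias, sd, z=1.96):
    """Bland-Altman 95% limits of agreement: bias +/- z * sd."""
    return bias - z * sd, bias + z * sd


# Bias and SD as reported in the abstract (litre per minute).
lower, upper = limits_of_agreement(0.3, 1.0)
print(round(lower, 1), round(upper, 1))  # -1.7 2.3

# Toy paired readings (hypothetical) to show the first step.
toy_bias, toy_sd = bias_and_sd([5.0, 6.0, 7.0], [4.0, 6.0, 8.0])
```

The rounded result matches the -1.7 to 2.3 litre min⁻¹ quoted in the abstract.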
Abstract:
INTRODUCTION AND HYPOTHESIS The prevalence of female stress urinary incontinence is high, and young adults are also affected, including athletes, especially those involved in "high-impact" sports. To date there have been almost no studies testing pelvic floor muscle (PFM) activity during dynamic functional whole body movements. The aim of this study was the description and reliability test of PFM activity and time variables during running. METHODS A prospective cross-sectional study including ten healthy female subjects was designed with the focus on the intra-session test-retest reliability of PFM activity and time variables during running derived from electromyography (EMG) and accelerometry. RESULTS Thirteen variables were identified based on ten steps of each subject: Six EMG variables showed good reliability (ICC 0.906-0.942) and seven time variables did not show good reliability (ICC 0.113-0.731). Time variables (e.g. time difference between heel strike and maximal acceleration of vaginal accelerator) showed low reliability. However, relevant PFM EMG variables during running (e.g., pre-activation, minimal and maximal activity) could be identified and showed good reliability. CONCLUSION Further adaptations regarding measurement methods should be tested to gain better control of the kinetics and kinematics of the EMG probe and accelerometers. To our knowledge this is the first study to test the reliability of PFM activity and time variables during dynamic functional whole body movements. More knowledge of PFM activity and time variables may help to provide a deeper insight into physical strain with high force impacts and important functional reflexive contraction patterns of PFM to maintain or to restore continence.
Abstract:
BACKGROUND The Cochrane risk of bias (RoB) tool has been widely embraced by the systematic review community, but several studies have reported that its reliability is low. We aim to investigate whether training of raters, including objective and standardized instructions on how to assess risk of bias, can improve the reliability of this tool. We describe the methods that will be used in this investigation and present an intensive standardized training package for risk of bias assessment that could be used by contributors to the Cochrane Collaboration and other reviewers. METHODS/DESIGN This is a pilot study. We will first perform a systematic literature review to identify randomized clinical trials (RCTs) that will be used for risk of bias assessment. Using the identified RCTs, we will then do a randomized experiment, where raters will be allocated to two different training schemes: minimal training and intensive standardized training. We will calculate the chance-corrected weighted Kappa with 95% confidence intervals to quantify within- and between-group Kappa agreement for each of the domains of the risk of bias tool. To calculate between-group Kappa agreement, we will use risk of bias assessments from pairs of raters after resolution of disagreements. Between-group Kappa agreement will quantify the agreement between the risk of bias assessment of raters in the training groups and the risk of bias assessment of experienced raters. To compare agreement of raters under different training conditions, we will calculate differences between Kappa values with 95% confidence intervals. DISCUSSION This study will investigate whether the reliability of the risk of bias tool can be improved by training raters using standardized instructions for risk of bias assessment. One group of inexperienced raters will receive intensive training on risk of bias assessment and the other will receive minimal training. 
By including a control group with minimal training, we will attempt to mimic what many review authors commonly have to do, that is, conduct risk of bias assessment in RCTs without much formal training or standardized instructions. If our results indicate that intensive standardized training improves the reliability of the RoB tool, our study is likely to help improve the quality of risk of bias assessments, which are a central component of evidence synthesis.
Abstract:
Measured rates of intrinsic clearance determined using cryopreserved trout hepatocytes can be extrapolated to the whole animal as a means of improving modeled bioaccumulation predictions for fish. To date, however, the intra- and interlaboratory reliability of this procedure has not been determined. In the present study, three laboratories determined in vitro intrinsic clearance of six reference compounds (benzo[a]pyrene, 4-nonylphenol, di-tert-butyl phenol, fenthion, methoxychlor and o-terphenyl) by conducting substrate depletion experiments with cryopreserved trout hepatocytes from a single source. O-terphenyl was excluded from the final analysis due to non-first-order depletion kinetics and significant loss from denatured controls. For the other five compounds, intralaboratory variability (% CV) in measured in vitro intrinsic clearance values ranged from 4.1 to 30%, while interlaboratory variability ranged from 27 to 61%. Predicted bioconcentration factors based on in vitro clearance values exhibited a reduced level of interlaboratory variability (5.3-38% CV). The results of this study demonstrate that cryopreserved trout hepatocytes can be used to reliably obtain in vitro intrinsic clearance of xenobiotics, which provides support for the application of this in vitro method in a weight-of-evidence approach to chemical bioaccumulation assessment.
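The variability figures above are coefficients of variation, that is, the standard deviation expressed as a percentage of the mean. A minimal sketch, assuming the sample standard deviation is used; the function name and the replicate values are hypothetical and are not the study's measurements.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical replicate clearance values from one laboratory.
replicates = [8.0, 10.0, 12.0]
cv = cv_percent(replicates)
print(cv)  # 20.0
```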
Abstract:
Background: Virtual patients (VPs) are increasingly used to train clinical reasoning. So far, no validated evaluation instruments for VP design are available. Aims: We examined the validity of an instrument for assessing the perception of VP design by learners. Methods: Three sources of validity evidence were examined: (i) Content was examined based on theory of clinical reasoning and an international VP expert team. (ii) The response process was explored in think-aloud pilot studies with medical students and in content analyses of free text questions accompanying each item of the instrument. (iii) Internal structure was assessed by exploratory factor analysis (EFA) and inter-rater reliability by generalizability analysis. Results: Content analysis was reasonably supported by the theoretical foundation and the VP expert team. The think-aloud studies and analysis of free text comments supported the validity of the instrument. In the EFA, using 2547 student evaluations of a total of 78 VPs, a three-factor model showed a reasonable fit with the data. At least 200 student responses are needed to obtain a reliable evaluation of a VP on all three factors. Conclusion: The instrument has the potential to provide valid information about VP design, provided that many responses per VP are available.
Abstract:
BACKGROUND AND OBJECTIVES Reliability is an essential condition for using quantitative sensory tests (QSTs) in research and clinical practice, but information on reliability in patients with chronic pain is sparse. The aim of this study was to evaluate the reliability of different QSTs in patients with chronic low back pain. METHODS Eighty-nine patients with chronic low back pain participated in 2 identical experimental sessions, separated by at least 7 days. The following parameters were recorded: pressure pain detection and tolerance thresholds at the toe, electrical pain thresholds to single and repeated stimulation, heat pain detection and tolerance thresholds at the arm and leg, cold pain detection threshold at the arm and leg, and conditioned pain modulation using the cold pressor test. Reliability was analyzed using the coefficient of variation, the coefficient of repeatability, and the intraclass correlation coefficient. It was judged as acceptable or not based primarily on the analysis of the coefficient of repeatability. RESULTS The reliability of most tests was acceptable. Exceptions were cold pain detection thresholds at the leg and arm. CONCLUSIONS Most QST measurements have acceptable reliability in patients with chronic low back pain.
Abstract:
OBJECTIVE To assess the reliability of the cervical vertebrae maturation method (CVM). BACKGROUND Skeletal maturity estimation can influence the manner and time of orthodontic treatment. The CVM method evaluates skeletal growth on the basis of the changes in the morphology of cervical vertebrae C2, C3, C4 during growth. These vertebrae are visible on a lateral cephalogram, so the method does not require an additional radiograph. METHODS In this website-based study, 10 orthodontists with long clinical experience (3 routinely using the method - "Routine user - RU" and 7 with less experience in the CVM method - "Non-Routine user - nonRU") twice rated cervical vertebrae maturation with the CVM method on 50 cropped scans of lateral cephalograms of children of circumpubertal age (for boys: 11.5 to 15.5 years; for girls: 10 to 14 years). Kappa statistics (with lower limits of 95% confidence intervals (CI)) and the proportion of complete agreement on staging were used to evaluate intra- and inter-assessor agreement. RESULTS The mean weighted kappa for intra-assessor agreement was 0.44 (range: 0.30-0.64; range of lower limits of 95% CI: 0.12-0.48) and for inter-assessor agreement was 0.28 (range: -0.01-0.58; range of lower limits of 95% CI: -0.14-0.42). The mean proportion of identical scores assigned by the same assessor was 55.2 % (range: 44-74 %) and for different pairs of assessors was 42 % (range: 16-68 %). CONCLUSIONS The reliability of the CVM method is questionable, and if orthodontic treatment is to be initiated relative to maximum growth, the use of additional biologic indicators should be considered (Tab. 4, Fig. 1, Ref. 24).