30 results for Reliability in automation
Abstract:
Objectives To examine the extent of multiplicity of data in trial reports and to assess the impact of multiplicity on meta-analysis results. Design Empirical study on a cohort of Cochrane systematic reviews. Data sources All Cochrane systematic reviews published from issue 3 in 2006 to issue 2 in 2007 that presented a result as a standardised mean difference (SMD). We retrieved trial reports contributing to the first SMD result in each review, and downloaded review protocols. We used these SMDs to identify a specific outcome for each meta-analysis from its protocol. Review methods Reviews were eligible if SMD results were based on two to ten randomised trials and if protocols described the outcome. We excluded reviews if they only presented results of subgroup analyses. Based on review protocols and index outcomes, two observers independently extracted the data necessary to calculate SMDs from the original trial reports for any intervention group, time point, or outcome measure compatible with the protocol. From the extracted data, we used Monte Carlo simulations to calculate all possible SMDs for every meta-analysis. Results We identified 19 eligible meta-analyses (including 83 trials). Published review protocols often lacked information about which data to choose. Twenty-four (29%) trials reported data for multiple intervention groups, 30 (36%) reported data for multiple time points, and 29 (35%) reported the index outcome measured on multiple scales. In 18 meta-analyses, we found multiplicity of data in at least one trial report; the median difference between the smallest and largest SMD results within a meta-analysis was 0.40 standard deviation units (range 0.04 to 0.91). Conclusions Multiplicity of data can affect the findings of systematic reviews and meta-analyses. To reduce the risk of bias, reviews and meta-analyses should comply with prespecified protocols that clearly identify time points, intervention groups, and scales of interest.
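As a rough illustration of why multiplicity matters, the sketch below computes an SMD from summary statistics for two candidate outcome scales in one trial and reports the spread between them. All numbers and the names `smd`, `scale_a`, and `scale_b` are hypothetical and not taken from the study. The formula is the unadjusted pooled-SD version (Cohen's d); Cochrane reviews typically also apply a small-sample correction (Hedges' g), which is omitted here.

```python
import math

def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Cohen's d) using the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

# Hypothetical trial that reports the same outcome on two different scales.
candidates = {
    "scale_a": smd(10.0, 4.0, 30, 12.0, 4.0, 30),    # -0.5
    "scale_b": smd(20.0, 10.0, 30, 22.0, 10.0, 30),  # -0.2
}

# Spread between the smallest and largest SMD a reviewer could have extracted.
spread = max(candidates.values()) - min(candidates.values())
print(candidates, round(spread, 2))
```

With a protocol that does not say which scale to use, either value is defensible, which is the kind of choice the abstract describes.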
Abstract:
The aim of this study was to refine a multi-dimensional scale based on physiological and behavioural parameters, known as the post abdominal surgery pain assessment scale (PASPAS), to quantify pain after laparotomy in horses. After a short introduction, eight observers used the scale to assess eight horses at multiple time points after laparotomy. In addition, a single observer was used to test the correlation of each parameter with the total pain index in 34 patients, and the effect of general anaesthesia on PASPAS was investigated in a control group of eight horses. Inter-observer variability was low (coefficient of variation 0.3), which indicated good reliability of PASPAS. The correlation of individual parameters with the total pain index differed between parameters. PASPAS, which was not influenced by general anaesthesia, was a useful tool to evaluate pain in horses after abdominal surgery and may also be useful to investigate analgesic protocols or for teaching purposes.
Abstract:
OBJECTIVE: To assess the intra-reader and inter-reader reliabilities of interpreting ultrasonography by several experts using video clips. METHOD: 99 video clips of healthy and rheumatic joints were recorded and delivered to 17 physician sonographers in two rounds. The intra-reader and inter-reader reliabilities of interpreting the ultrasound results were calculated using a dichotomous system (normal/abnormal) and a graded semiquantitative scoring system. RESULTS: The video reading method worked well. 70% of the readers could classify at least 70% of the cases correctly as normal or abnormal. The distribution of readers answering correctly was wide. The most difficult joints to assess were the elbow, wrist, metacarpophalangeal (MCP) and knee joints. The intra-reader and inter-reader agreements on interpreting dynamic ultrasound images as normal or abnormal, as well as detecting and scoring a Doppler signal were moderate to good (kappa = 0.52-0.82). CONCLUSIONS: Dynamic image assessment (video clips) can be used as an alternative method in ultrasonography reliability studies. The intra-reader and inter-reader reliabilities of ultrasonography in dynamic image reading are acceptable, but more definitions and training are needed to improve sonographic reproducibility.
Abstract:
BACKGROUND: High intercoder reliability (ICR) is required in qualitative content analysis for assuring quality when more than one coder is involved in data analysis. The literature is short of standardized procedures for ICR procedures in qualitative content analysis. OBJECTIVE: To illustrate how ICR assessment can be used to improve codings in qualitative content analysis. METHODS: Key steps of the procedure are presented, drawing on data from a qualitative study on patients' perspectives on low back pain. RESULTS: First, a coding scheme was developed using a comprehensive inductive and deductive approach. Second, 10 transcripts were coded independently by two researchers, and ICR was calculated. A resulting kappa value of .67 can be regarded as satisfactory to solid. Moreover, varying agreement rates helped to identify problems in the coding scheme. Low agreement rates, for instance, indicated that respective codes were defined too broadly and would need clarification. In a third step, the results of the analysis were used to improve the coding scheme, leading to consistent and high-quality results. DISCUSSION: The quantitative approach of ICR assessment is a viable instrument for quality assurance in qualitative content analysis. Kappa values and close inspection of agreement rates help to estimate and increase quality of codings. This approach facilitates good practice in coding and enhances credibility of analysis, especially when large samples are interviewed, different coders are involved, and quantitative results are presented.
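The kappa statistic mentioned above can be sketched as follows for two coders who each assign one code per text segment. This is a minimal illustration of Cohen's kappa, not the authors' procedure; the function name `cohens_kappa` and the example codes ("pain", "coping") are assumptions made for the example.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders assigning one nominal code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal proportions, summed over codes.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / n**2
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)

# Hypothetical codings of ten segments; the coders disagree on two of them.
coder_a = ["pain"] * 5 + ["coping"] * 5
coder_b = ["pain"] * 4 + ["coping"] + ["coping"] * 4 + ["pain"]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))  # 0.6
```

Computing agreement per code rather than only overall is what lets low-agreement codes stand out as candidates for a tighter definition.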
Abstract:
The aim of this study was to assess the ability to extract surgically relevant information from plain radiographs in trimalleolar fractures and to compare this with the information gathered from computed tomography (CT).
Abstract:
BACKGROUND: Cardiac output (CO) measurement with lithium dilution (COLD) has not been fully validated in sheep using precise ultrasonic flow probe technology (COUFP). Sheep serve as important cardiovascular research models and the use of COLD has become more popular in experimental settings. METHODS: Ultrasonic transit-time perivascular flow probes were surgically implanted on the pulmonary artery of 13 sheep. Paired COLD readings were taken at six time points, before and after implantation of a left ventricular assist device (LVAD) and compared with COUFP recorded just after lithium injection. RESULTS: The mean COLD was 5.7 litre min⁻¹ (range 3.8-9.6 litre min⁻¹) and mean COUFP 5.9 litre min⁻¹ (range 4.0-9.2 litre min⁻¹). The bias (standard deviation) was 0.3 (1.0) litre min⁻¹ [5.1 (16.9)%] and limits of agreement (LOA) were -1.7 to 2.3 litre min⁻¹ (-28.8 to 39.0%) with a percentage error (PE) of 34.4%. Data to assess trending [rate (95% confidence intervals)] included a 78 (62-93)% concordance rate in the four-quadrant plot (n=27). In the half moon polar plot (n=19), the mean polar angle was +5°, the radial LOA were -49 to +35° and 68 (47-89)% of data points fell within 22.5° of the mean polar angle. Both tests indicated moderate to poor trending ability. CONCLUSION: COLD is not precise when evaluated against COUFP in sheep based on the statistical criteria set, but the results are comparable with previously published animal studies.
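The agreement figures in this abstract follow from the usual Bland-Altman arithmetic, sketched below using only the reported summary values. The function names are hypothetical, and the choice of the mean COLD (5.7) as the denominator for the percentage error is an assumption: it reproduces the reported 34.4%, but the abstract does not state which mean was used.

```python
def limits_of_agreement(bias, sd_diff, z=1.96):
    """95% limits of agreement: bias +/- z * SD of the paired differences."""
    return bias - z * sd_diff, bias + z * sd_diff

def percentage_error(sd_diff, mean_co, z=1.96):
    """Percentage error: z * SD of the differences relative to mean cardiac output."""
    return 100.0 * z * sd_diff / mean_co

# Values reported in the abstract (litre per minute).
lower, upper = limits_of_agreement(bias=0.3, sd_diff=1.0)
pe = percentage_error(sd_diff=1.0, mean_co=5.7)  # denominator is an assumption

print(round(lower, 1), round(upper, 1), round(pe, 1))  # -1.7 2.3 34.4
```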
Abstract:
The aim of this study was to evaluate the reliability of the cardiothoracic ratio (CTR) in postmortem computed tomography (PMCT) and to assess a CTR threshold for the diagnosis of cardiomegaly based on the weight of the heart at autopsy. PMCT data of 170 deceased human adults were retrospectively evaluated by two blinded radiologists. The CTR was measured on axial computed tomography images and the actual cardiac weight was weighed at autopsy. Inter-rater reliability, sensitivity, and specificity were calculated. Receiver operating characteristic curves were calculated to assess enlarged heart weight by CTR. The autopsy definition of cardiomegaly was based on normal values of the Zeek method (within a range of both, one or two SD) and the Smith method (within the given range). Intra-class correlation coefficients demonstrated excellent agreements (0.983) regarding CTR measurements. In 105/170 (62 %) cases the CTR in PMCT was >0.5, indicating enlarged heart weight, according to clinical references. The mean heart weight measured in autopsy was 405 ± 105 g. As a result, 114/170 (67 %) cases were interpreted as having enlarged heart weights according to the normal values of Zeek within one SD, while 97/170 (57 %) were within two SD. 100/170 (59 %) were assessed as enlarged according to Smith's normal values. The sensitivity/specificity of the 0.5 cut-off of the CTR for the diagnosis of enlarged heart weight was 78/71 % (Zeek one SD), 74/55 % (Zeek two SD), and 76/59 % (Smith), respectively. The discriminative power between normal heart weight and cardiomegaly was 79, 73, and 74 % for the Zeek (1SD/2SD) and Smith methods respectively. Changing the CTR threshold to 0.57 resulted in a minimum specificity of 95 % for all three definitions of cardiomegaly. With a CTR threshold of 0.57, cardiomegaly can be identified with a very high specificity. This may be useful if PMCT is used by forensic pathologists as a screening tool for medico-legal autopsies.
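The sensitivity and specificity for the 0.5 cut-off can be checked with a small sketch. The 2×2 counts below are not reported in the abstract; they are back-calculated here from the reported totals (105 of 170 with CTR > 0.5, 114 of 170 enlarged by Zeek within one SD) and the reported 78/71 %, so they should be read as an inference rather than published data.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the cells of a 2x2 table."""
    return tp / (tp + fn), tn / (tn + fp)

# Inferred counts (assumption): CTR > 0.5 versus enlarged heart weight, Zeek one SD.
tp, fn, fp, tn = 89, 25, 16, 40

assert tp + fp == 105            # cases with CTR > 0.5
assert tp + fn == 114            # cases enlarged by Zeek within one SD
assert tp + fn + fp + tn == 170  # all cases

sens, spec = sensitivity_specificity(tp, fn, fp, tn)
print(round(sens * 100), round(spec * 100))  # 78 71
```

Raising the threshold to 0.57 moves cases from the positive to the negative row, which raises specificity at the cost of sensitivity.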
Abstract:
OBJECTIVES To test the inter-rater reliability of the RoB tool applied to Physical Therapy (PT) trials by comparing ratings from Cochrane review authors with those of blinded external reviewers. METHODS Randomized controlled trials (RCTs) in PT were identified by searching the Cochrane Database of Systematic Reviews for meta-analyses of PT interventions. RoB assessments were conducted independently by 2 reviewers blinded to the RoB ratings reported in the Cochrane reviews. Data on RoB assessments from Cochrane reviews and other characteristics of reviews and trials were extracted. Consensus assessments between the two reviewers were then compared with the RoB ratings from the Cochrane reviews. Agreement between Cochrane and blinded external reviewers was assessed using weighted kappa (κ). RESULTS In total, 109 trials included in 17 Cochrane reviews were assessed. Inter-rater reliability on the overall RoB assessment between Cochrane review authors and blinded external reviewers was poor (κ = 0.02, 95% CI: -0.06 to 0.06). Inter-rater reliability on individual domains of the RoB tool was poor (median κ = 0.19), ranging from κ = -0.04 ("Other bias") to κ = 0.62 ("Sequence generation"). There was also no agreement (κ = -0.29, 95% CI: -0.81 to 0.35) in the overall RoB assessment at the meta-analysis level. CONCLUSIONS Risk of bias assessments of RCTs using the RoB tool are not consistent across different research groups. Poor agreement was demonstrated not only at the trial level but also at the meta-analysis level. Results have implications for decision making since different recommendations can be reached depending on the group analyzing the evidence. Improved guidelines to consistently apply the RoB tool and revisions to the tool for different health areas are needed.
Abstract:
Resting-state functional connectivity (FC) fMRI (rs-fcMRI) offers an appealing approach to mapping the brain's intrinsic functional organization. Blood oxygen level dependent (BOLD) and arterial spin labeling (ASL) are the two main rs-fcMRI approaches to assess alterations in brain networks associated with individual differences, behavior and psychopathology. While the BOLD signal is stronger with a higher temporal resolution, ASL provides quantitative, direct measures of the physiology and metabolism of specific networks. This study systematically investigated the similarity and reliability of resting brain networks (RBNs) in BOLD and ASL. A 2×2×2 factorial design was employed where each subject underwent repeated BOLD and ASL rs-fcMRI scans on two occasions on two MRI scanners respectively. Both independent and joint FC analyses revealed common RBNs in ASL and BOLD rs-fcMRI with a moderate to high level of spatial overlap, verified by Dice Similarity Coefficients. Test-retest analyses indicated more reliable spatial network patterns in BOLD (average modal Intraclass Correlation Coefficients: 0.905±0.033 between-sessions; 0.885±0.052 between-scanners) than ASL (0.545±0.048; 0.575±0.059). Nevertheless, ASL provided highly reproducible (0.955±0.021; 0.970±0.011) network-specific CBF measurements. Moreover, we observed positive correlations between regional CBF and FC in core areas of all RBNs indicating a relationship between network connectivity and its baseline metabolism. Taken together, the combination of ASL and BOLD rs-fcMRI provides a powerful tool for characterizing the spatiotemporal and quantitative properties of RBNs. These findings pave the way for future BOLD and ASL rs-fcMRI studies in clinical populations that are carried out across time and scanners.
Abstract:
Glycogen is a major substrate in energy metabolism and particularly important to prevent hypoglycemia in pathologies of glucose homeostasis such as type 1 diabetes mellitus (T1DM). ¹³C-MRS is increasingly used to determine glycogen in skeletal muscle and liver non-invasively; however, the low signal-to-noise ratio leads to long acquisition times, particularly when glycogen levels are determined before and after interventions. In order to ease the requirements for the subjects and to avoid systematic effects of the lengthy examination, we evaluated if a standardized preparation period would allow us to shift the baseline (pre-intervention) experiments to a preceding day. Based on natural abundance ¹³C-MRS on a clinical 3 T MR system the present study investigated the test-retest reliability of glycogen measurements in patients with T1DM and matched controls (n = 10 each group) in quadriceps muscle and liver. Prior to the MR examination, participants followed a standardized diet and avoided strenuous exercise for two days. The average coefficient of variation (CV) of myocellular glycogen levels was 9.7% in patients with T1DM compared with 6.6% in controls after a 2 week period, while hepatic glycogen variability was 13.3% in patients with T1DM and 14.6% in controls. For comparison, a single-session test-retest variability in four healthy volunteers resulted in 9.5% for skeletal muscle and 14.3% for liver. Glycogen levels in muscle and liver were not statistically different between test and retest, except for hepatic glycogen, which decreased in T1DM patients in the retest examination, but without an increase of the group distribution. Since the CVs of glycogen levels determined in a "single session" versus "within weeks" are comparable, we conclude that the major source of uncertainty is the methodological error and that physiological variations can be minimized by a pre-study standardization. For hepatic glycogen examinations, familiarization sessions (MR and potentially strenuous interventions) are recommended. Copyright © 2016 John Wiley & Sons, Ltd.
Abstract:
Measuring the ratio of heterophils and lymphocytes (H/L) in response to different stressors is a standard tool for assessing long-term stress in laying hens but detailed information on the reliability of measurements, measurement techniques and methods, and absolute cell counts is often lacking. Laying hens offered different sites of the nest boxes at different ages were compared in a two-treatment crossover experiment to provide detailed information on the procedure for measuring and the difficulties in the interpretation of H/L ratios in commercial conditions. H/L ratios were pen-specific and depended on the age and aviary system. There was no effect of nest position. Heterophils and lymphocytes were not correlated within individuals. Absolute cell counts differed in the number of heterophils and lymphocytes and H/L ratios, whereas absolute leucocyte counts between individuals were similar. The reliability of the method using relative cell counts was good, yielding a correlation coefficient between double counts of r > 0.9. It was concluded that population-based reference values may not be sensitive enough to detect individual stress reactions, that the H/L ratio may not be a useful indicator of stress under commercial conditions because of confounding factors, and that other, non-invasive, measurements should be adopted.