948 results for Data reliability
Abstract:
In the training of healthcare professionals, one of the advantages of communication training with simulated patients (SPs) is the SP's ability to provide direct feedback to students after a simulated clinical encounter. The quality of SP feedback must be monitored, especially because it is well known that feedback can have a profound effect on student performance. Due to the current lack of valid and reliable instruments to assess the quality of SP feedback, our study examined the validity and reliability of one potential instrument, the 'modified Quality of Simulated Patient Feedback Form' (mQSF). Methods Content validity of the mQSF was assessed by inviting experts in the area of simulated clinical encounters to rate the importance of the mQSF items. Moreover, generalizability theory was used to examine the reliability of the mQSF. Our data came from videotapes of clinical encounters between six simulated patients and six students and the ensuing feedback from the SPs to the students. Ten faculty members judged the SP feedback according to the items on the mQSF. Three weeks later, this procedure was repeated with the same faculty members and recordings. Results All but two items of the mQSF received importance ratings of > 2.5 on a four-point rating scale. A generalizability coefficient of 0.77 was established with two judges observing one encounter. Conclusions The findings for content validity and reliability with two judges suggest that the mQSF is a valid and reliable instrument to assess the quality of feedback provided by simulated patients.
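The two-judge reliability reported above follows directly from the generalizability-theory decomposition: the G coefficient is the universe-score variance divided by itself plus the error variance shrunk by the number of judges. A minimal sketch, using hypothetical variance components chosen so that two judges yield G ≈ 0.77 (the study's actual variance components are not reported here):

```python
def g_coefficient(var_object, var_interaction_error, n_raters):
    """Relative generalizability coefficient for the mean over n_raters.

    var_object: variance attributable to the objects of measurement
        (here, differences in SP feedback quality)
    var_interaction_error: rater-by-object interaction plus residual variance
    """
    return var_object / (var_object + var_interaction_error / n_raters)

# Hypothetical variance components: two judges give G ≈ 0.77
g_two_judges = g_coefficient(0.67, 0.40, 2)
```

Adding judges shrinks the error term, so the coefficient rises toward 1; the design question in such a study is how few judges still keep G at an acceptable level.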
Abstract:
This article gives an overview of the methods used in the low-level analysis of gene expression data generated using DNA microarrays. This type of experiment makes it possible to determine relative levels of nucleic acid abundance in a set of tissues or cell populations for thousands of transcripts or loci simultaneously. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. This includes the design of probes, the experimental design, the image analysis of scanned microarray images, the normalization of fluorescence intensities, the assessment of the quality of microarray data and the incorporation of quality information in subsequent analyses, the combination of information across arrays and across sets of experiments, the discovery and recognition of patterns in expression at the single-gene and multiple-gene levels, and the assessment of the significance of these findings, given the substantial noise and hence random features in the data. For all of these components, access to a flexible and efficient statistical computing environment is essential.
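One of the low-level steps mentioned, normalization of fluorescence intensities, is commonly done by centering each array's log-ratios on zero, under the assumption that most transcripts are not differentially expressed. A minimal sketch with illustrative values (not from any real dataset):

```python
import math

def median_normalize(intensity_ratios):
    """Center log2 intensity ratios so the array's median log-ratio is zero.

    Assumes the bulk of transcripts are unchanged, so the distribution of
    log-ratios should sit at zero after normalization.
    """
    logs = [math.log2(r) for r in intensity_ratios]
    s = sorted(logs)
    n = len(s)
    med = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    return [x - med for x in logs]

# One strongly up-regulated transcript among near-unchanged ones
ratios = [0.9, 1.1, 1.0, 4.0, 1.05]
normalized = median_normalize(ratios)
```

After centering, the genuinely up-regulated transcript still stands out while small array-wide intensity shifts are removed.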
Abstract:
Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about DNA copy number. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about genomic alterations from array-CGH data. As increasing amounts of array-CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify copy-number gains and losses based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on a hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are also detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme, and breast cancer are analyzed, and comparisons are made with some widely used algorithms to illustrate the reliability and success of the technique.
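The authors' full Metropolis-within-Gibbs sampler is beyond a short sketch, but the core hidden-Markov idea — that a probe's copy-number state depends on its neighbors along the chromosome — can be illustrated with exact forward-backward posteriors for a toy three-state model. All parameters here (state means, SD, stay probability) are hypothetical, not the paper's:

```python
import math

STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.6, "neutral": 0.0, "gain": 0.6}  # hypothetical log2-ratio means
SD = 0.2        # hypothetical emission noise
STAY = 0.95     # probability of staying in the same copy-number state

def emit(state, x):
    """Gaussian emission density (normalizing constant cancels in posteriors)."""
    return math.exp(-0.5 * ((x - MEANS[state]) / SD) ** 2)

def trans(a, b):
    return STAY if a == b else (1 - STAY) / 2

def posteriors(obs):
    """Forward-backward: P(state | all observed log-ratios) at each probe."""
    n = len(obs)
    fwd = [{s: (1 / 3) * emit(s, obs[0]) for s in STATES}]
    for t in range(1, n):
        fwd.append({s: emit(s, obs[t]) * sum(fwd[-1][a] * trans(a, s)
                                             for a in STATES) for s in STATES})
    bwd = [{s: 1.0 for s in STATES}]
    for t in range(n - 2, -1, -1):
        bwd.insert(0, {s: sum(trans(s, b) * emit(b, obs[t + 1]) * bwd[0][b]
                              for b in STATES) for s in STATES})
    post = []
    for t in range(n):
        unnorm = {s: fwd[t][s] * bwd[t][s] for s in STATES}
        z = sum(unnorm.values())
        post.append({s: v / z for s, v in unnorm.items()})
    return post

# Toy probe sequence: neutral, a gained region, then a single low probe
probes = [0.02, -0.05, 0.61, 0.55, 0.64, 0.01, -0.58]
p = posteriors(probes)
```

The Markov prior is what separates this from thresholding each probe independently: an isolated extreme probe earns a weaker posterior than the same value inside a run of similar neighbors.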
Abstract:
OBJECT: In this study, 1H magnetic resonance (MR) spectroscopy was prospectively tested as a reliable method for presurgical grading of neuroepithelial brain tumors. METHODS: Using a database of tumor spectra obtained in patients with histologically confirmed diagnoses, 94 consecutive untreated patients were studied using single-voxel 1H spectroscopy (point-resolved spectroscopy; TE 135 msec, TR 1500 msec). A total of 90 tumor spectra obtained in patients with diagnostic 1H MR spectroscopy examinations were analyzed using commercially available software (MRUI/VARPRO) and classified using linear discriminant analysis as World Health Organization (WHO) Grade I/II, WHO Grade III, or WHO Grade IV lesions. In all cases, the classification results were matched with histopathological diagnoses that were made according to the WHO classification criteria after serial stereotactic biopsy procedures or open surgery. Histopathological studies revealed 30 Grade I/II tumors, 29 Grade III tumors, and 31 Grade IV tumors. The reliability of the histological diagnoses was validated considering a minimum postsurgical follow-up period of 12 months (range 12-37 months). Classifications based on spectroscopic data yielded 31 tumors in Grade I/II, 32 in Grade III, and 27 in Grade IV. Incorrect classifications included two Grade II tumors, one of which was identified as Grade III and one as Grade IV; two Grade III tumors identified as Grade II; two Grade III lesions identified as Grade IV; and six Grade IV tumors identified as Grade III. Furthermore, one glioblastoma (WHO Grade IV) was classified as WHO Grade I/II. This represents an overall success rate of 86%, and a 95% success rate in differentiating low-grade from high-grade tumors. CONCLUSIONS: The authors conclude that in vivo 1H MR spectroscopy is a reliable technique for grading neuroepithelial brain tumors.
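The reported success rates can be checked by reconstructing the confusion matrix from the misclassifications listed above; the row sums match the stated histological counts (30, 29, 31) and the column sums match the spectroscopic counts (31, 32, 27). A sketch:

```python
# Rows: histopathological (true) grade; columns: spectroscopic classification.
# Counts reconstructed from the misclassifications listed in the abstract.
confusion = [
    [28, 1, 1],   # 30 Grade I/II tumors
    [2, 25, 2],   # 29 Grade III tumors
    [1, 6, 24],   # 31 Grade IV tumors
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(3))
overall = 100 * correct / total  # overall success rate

# Collapse to low-grade (I/II) vs high-grade (III and IV)
low_correct = confusion[0][0]
high_correct = sum(confusion[i][j] for i in (1, 2) for j in (1, 2))
binary = 100 * (low_correct + high_correct) / total
```

The overall rate comes out at 77/90 ≈ 86%, matching the abstract; the binary low-vs-high rate comes out at about 94.4%, and the reported 95% presumably reflects rounding.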
Abstract:
BACKGROUND: High intercoder reliability (ICR) is required in qualitative content analysis for assuring quality when more than one coder is involved in data analysis. The literature offers few standardized procedures for ICR assessment in qualitative content analysis. OBJECTIVE: To illustrate how ICR assessment can be used to improve coding in qualitative content analysis. METHODS: Key steps of the procedure are presented, drawing on data from a qualitative study on patients' perspectives on low back pain. RESULTS: First, a coding scheme was developed using a comprehensive inductive and deductive approach. Second, 10 transcripts were coded independently by two researchers, and ICR was calculated. The resulting kappa value of .67 can be regarded as satisfactory to solid. Moreover, varying agreement rates helped to identify problems in the coding scheme. Low agreement rates, for instance, indicated that the respective codes were defined too broadly and needed clarification. In a third step, the results of the analysis were used to improve the coding scheme, leading to consistent and high-quality results. DISCUSSION: The quantitative approach of ICR assessment is a viable instrument for quality assurance in qualitative content analysis. Kappa values and close inspection of agreement rates help to estimate and increase the quality of coding. This approach facilitates good practice in coding and enhances the credibility of the analysis, especially when large samples are interviewed, different coders are involved, and quantitative results are presented.
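Cohen's kappa, the ICR statistic used above, corrects the raw agreement rate for the agreement expected by chance given each coder's code frequencies. A minimal sketch with hypothetical codes:

```python
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders' nominal codes on the same segments."""
    assert len(codes_a) == len(codes_b)
    n = len(codes_a)
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    # Chance agreement: product of each coder's marginal code probabilities
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical codings of six transcript segments by two researchers
coder_a = ["pain", "cause", "pain", "coping", "pain", "cause"]
coder_b = ["pain", "cause", "coping", "coping", "pain", "pain"]
kappa = cohens_kappa(coder_a, coder_b)
```

Here raw agreement is 4/6 ≈ 0.67, but kappa is only about 0.48, because some of that agreement is expected by chance.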
Abstract:
Methodological evaluation of the proteomic analysis of cardiovascular-tissue material has been performed, with a special emphasis on establishing examinations that allow reliable quantitative analysis of silver-stained readouts. Reliability, reproducibility, robustness, and linearity were addressed and clarified. In addition, several types of normalization procedures were evaluated and new approaches are proposed. It has been found that the silver-stained readout offers a convenient approach for quantitation if a linear range for gel loading is defined. In addition, a broad range of a 10-fold input (loading 20-200 microg per gel) fulfills the linearity criteria, although at the lowest input (20 microg) a portion of protein species will remain undetected. The method is reliable and reproducible within a range of 65-200 microg input. The normalization procedure using the sum of all spot intensities from a silver-stained 2D pattern has been shown to be less reliable than other approaches, namely normalization by the median or by the interquartile range. A special refinement, normalization through virtual segmentation of the pattern and calculation of a normalization factor for each stratum, provides highly satisfactory results. The presented results not only provide evidence for the usefulness of silver-stained gels for quantitative evaluation, but are also directly applicable to the research endeavor of monitoring alterations in cardiovascular pathophysiology.
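The median- and interquartile-range-based normalization found to be more reliable than total-intensity normalization can be sketched as follows (a generic robust-scaling illustration, not the authors' exact stratum-wise procedure):

```python
def median_iqr_normalize(spot_intensities):
    """Scale spot intensities by the median and interquartile range,
    rather than by the sum of all spots (which is sensitive to a few
    very intense spots)."""
    s = sorted(spot_intensities)
    n = len(s)

    def q(p):
        # simple linear-interpolation quantile
        idx = p * (n - 1)
        lo = int(idx)
        frac = idx - lo
        return s[lo] + frac * (s[min(lo + 1, n - 1)] - s[lo])

    med = q(0.5)
    iqr = q(0.75) - q(0.25)
    return [(x - med) / iqr for x in spot_intensities]
```

Because the median and IQR ignore the extreme tails, a handful of saturated spots barely shifts the normalization factor, which is the practical argument for preferring them over the spot-intensity sum.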
Abstract:
To interconnect a wireless sensor network (WSN) to the Internet, we propose to use TCP/IP as the standard protocol for all network entities. We present a cross-layer designed communication architecture, which contains a MAC protocol, IP, a new protocol called the Hop-to-Hop Reliability (H2HR) protocol, and the TCP Support for Sensor Nodes (TSS) protocol. The MAC protocol implements the MAC layer of beacon-less personal area networks (PANs) as defined in IEEE 802.15.4. H2HR implements hop-to-hop reliability mechanisms; two acknowledgement mechanisms, explicit and implicit ACKs, are supported. TSS optimizes the use of TCP in WSNs by implementing local retransmission of TCP data packets, local TCP ACK regeneration, aggressive TCP ACK recovery, and congestion and flow control algorithms. We show that H2HR increases the performance of UDP, TCP, and RMST in WSNs significantly: throughput is increased and the packet loss ratio is decreased. As a result, WSNs can be operated and managed using TCP/IP.
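The explicit-ACK mode of hop-to-hop reliability boils down to per-hop retransmission until the next hop confirms receipt or a retry budget is exhausted. A hypothetical simplification (implicit ACKs, i.e. overhearing the next hop forwarding the frame, are not modeled here):

```python
def send_with_h2h_reliability(link_delivers, max_retries=3):
    """Sketch of explicit-ACK hop-to-hop reliability: retransmit a frame to
    the next hop until it is acknowledged or the retry budget is used up.

    link_delivers: iterator of booleans simulating whether each transmission
                   attempt (and its ACK) succeeds on the lossy link.
    Returns (delivered, attempts).
    """
    attempts = 0
    for delivered in link_delivers:
        attempts += 1
        if delivered:
            return True, attempts   # explicit ACK received from next hop
        if attempts > max_retries:
            break                   # give up; end-to-end layer must recover
    return False, attempts

# A frame lost twice, then delivered on the third attempt
ok, tries = send_with_h2h_reliability(iter([False, False, True]))
```

Recovering losses one hop at a time is why H2HR helps end-to-end protocols: a retransmission crosses one lossy link instead of the whole multi-hop path.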
Abstract:
Background: The design of Virtual Patients (VPs) is essential. To date, no validated evaluation instruments for VP design have been published. Summary of work: We examined three sources of validity evidence for an instrument, to be filled out by students, aimed at measuring the quality of VPs with a special emphasis on fostering clinical reasoning: (1) Content was examined based on the theory of clinical reasoning and by an international VP expert team. (2) Response process was explored in think-aloud pilot studies with students and through content analysis of the free-text questions accompanying each item of the instrument. (3) Internal structure was assessed by confirmatory factor analysis (CFA) using 2547 student evaluations, and reliability was examined using generalizability analysis. Summary of results: Content validity was supported by the theory underlying Gruppen and Frohna's clinical reasoning model, on which the instrument is based, and by an international VP expert team. The pilot study and the analysis of free-text comments supported the validity of the instrument. The CFA indicated that a three-factor model comprising six items showed a good fit with the data. Alpha coefficients per factor were 0.74-0.82. The findings of the generalizability studies indicated that 40-200 student responses are needed to obtain reliable data on one VP. Conclusions: The described instrument has the potential to provide faculty with reliable and valid information about VP design. Take-home messages: We present a short instrument which can help in evaluating the design of VPs.
Abstract:
High-throughput assays, such as the yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This tremendously increases the need for reliable methods to systematically and automatically suggest protein functions and the relationships between them. With the available PPI data, it is now possible to study functions and relationships in the context of a large-scale network. To date, several network-based schemes have been proposed to effectively annotate protein functions on a large scale. However, due to the noise inherent in high-throughput data generation, new methods and algorithms are needed to increase the reliability of functional annotations. Previous work on a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins and hence suggest their functions. One advantage of that work is that the algorithm is not sensitive to noise (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods, which we applied to a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work of Samanta and Liang, the algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins.
Further comparison of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins, with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and performed pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. In particular, we performed a detailed analysis of a subcluster enriched in the transforming growth factor β signaling pathway (P < 10^-50), which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigation. Our study gives clear insight into the common-neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.
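The "unusually large number of shared neighbors" statistic at the heart of the common-neighbor scheme is naturally scored with a hypergeometric tail probability: how likely two proteins of given degrees would share at least that many neighbors by chance. A sketch under that assumption (the published statistic may differ in detail):

```python
from math import comb

def shared_neighbor_pvalue(n_proteins, deg1, deg2, shared):
    """P(at least `shared` common neighbors by chance) for two proteins with
    degrees deg1 and deg2 in a network of n_proteins, via the hypergeometric
    tail, in the spirit of common-neighbor association scoring."""
    def pmf(m):
        # P(exactly m of protein 2's neighbors fall among protein 1's)
        return (comb(deg1, m) * comb(n_proteins - deg1, deg2 - m)
                / comb(n_proteins, deg2))
    return sum(pmf(m) for m in range(shared, min(deg1, deg2) + 1))
```

A small p-value flags a functionally associated pair; note that for a fixed number of shared neighbors the p-value grows with the proteins' degrees, which is one way to see why hub proteins need special handling.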
Abstract:
BACKGROUND: Cardiac output (CO) measurement with lithium dilution (COLD) has not been fully validated in sheep against precise ultrasonic flow probe technology (COUFP). Sheep are important cardiovascular research models, and the use of COLD has become more popular in experimental settings. METHODS: Ultrasonic transit-time perivascular flow probes were surgically implanted on the pulmonary artery of 13 sheep. Paired COLD readings were taken at six time points, before and after implantation of a left ventricular assist device (LVAD), and compared with COUFP recorded just after lithium injection. RESULTS: The mean COLD was 5.7 litre min(-1) (range 3.8-9.6 litre min(-1)) and the mean COUFP 5.9 litre min(-1) (range 4.0-9.2 litre min(-1)). The bias (standard deviation) was 0.3 (1.0) litre min(-1) [5.1 (16.9)%] and the limits of agreement (LOA) were -1.7 to 2.3 litre min(-1) (-28.8 to 39.0%), with a percentage error (PE) of 34.4%. Data to assess trending [rate (95% confidence intervals)] included a 78 (62-93)% concordance rate in the four-quadrant plot (n=27). In the half-moon polar plot (n=19), the mean polar angle was +5°, the radial LOA were -49 to +35°, and 68 (47-89)% of data points fell within 22.5° of the mean polar angle. Both tests indicated moderate to poor trending ability. CONCLUSION: COLD is not precise when evaluated against COUFP in sheep based on the statistical criteria set, but the results are comparable with previously published animal studies.
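The bias, limits of agreement, and percentage error reported above follow the standard Bland-Altman recipe, with percentage error taken as 1.96 times the SD of the paired differences over the mean CO. A sketch on made-up paired readings (not the study data):

```python
import math

def bland_altman(method_a, method_b):
    """Bias, 95% limits of agreement, and percentage error for paired
    cardiac-output readings from two measurement methods."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)          # limits of agreement
    mean_co = sum(method_a + method_b) / (2 * n)
    pct_error = 100 * 1.96 * sd / mean_co               # percentage error
    return bias, loa, pct_error

# Illustrative paired readings in litre/min
bias, loa, pct_error = bland_altman([5.0, 6.0, 7.0], [4.0, 6.0, 8.0])
```

A common rule of thumb judges two CO methods interchangeable when the percentage error stays below about 30%, which is why the study's 34.4% reads as "not precise".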
Abstract:
The aim of this study was to evaluate the reliability of the cardiothoracic ratio (CTR) in postmortem computed tomography (PMCT) and to assess a CTR threshold for the diagnosis of cardiomegaly based on the weight of the heart at autopsy. PMCT data of 170 deceased human adults were retrospectively evaluated by two blinded radiologists. The CTR was measured on axial computed tomography images and the actual cardiac weight was measured at autopsy. Inter-rater reliability, sensitivity, and specificity were calculated. Receiver operating characteristic curves were calculated to assess enlarged heart weight by CTR. The autopsy definition of cardiomegaly was based on the normal values of the Zeek method (within one or two SD) and the Smith method (within the given range). Intra-class correlation coefficients demonstrated excellent agreement (0.983) for the CTR measurements. In 105/170 (62 %) cases the CTR in PMCT was >0.5, indicating enlarged heart weight according to clinical references. The mean heart weight measured at autopsy was 405 ± 105 g. As a result, 114/170 (67 %) cases were interpreted as having enlarged heart weights according to the normal values of Zeek within one SD, while 97/170 (57 %) were within two SD. 100/170 (59 %) were assessed as enlarged according to Smith's normal values. The sensitivity/specificity of the 0.5 cut-off of the CTR for the diagnosis of enlarged heart weight was 78/71 % (Zeek one SD), 74/55 % (Zeek two SD), and 76/59 % (Smith), respectively. The discriminative power between normal heart weight and cardiomegaly was 79, 73, and 74 % for the Zeek (1SD/2SD) and Smith methods, respectively. Changing the CTR threshold to 0.57 resulted in a minimum specificity of 95 % for all three definitions of cardiomegaly. With a CTR threshold of 0.57, cardiomegaly can be identified with a very high specificity. This may be useful if PMCT is used by forensic pathologists as a screening tool for medico-legal autopsies.
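Sensitivity and specificity at a given CTR cut-off are simple counts against the autopsy reference standard, and raising the threshold from 0.5 to 0.57 trades sensitivity for specificity, as the abstract reports. A sketch on illustrative data (not the study's):

```python
def sens_spec(ctrs, enlarged, threshold):
    """Sensitivity and specificity of a CTR cut-off against the autopsy
    reference standard (enlarged = True means cardiomegaly at autopsy)."""
    tp = sum(1 for c, e in zip(ctrs, enlarged) if e and c > threshold)
    fn = sum(1 for c, e in zip(ctrs, enlarged) if e and c <= threshold)
    tn = sum(1 for c, e in zip(ctrs, enlarged) if not e and c <= threshold)
    fp = sum(1 for c, e in zip(ctrs, enlarged) if not e and c > threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative cases: CTR values with autopsy-confirmed cardiomegaly flags
ctrs = [0.45, 0.52, 0.58, 0.61, 0.48, 0.55]
enlarged = [False, True, True, True, False, False]
sens_05, spec_05 = sens_spec(ctrs, enlarged, 0.50)
sens_057, spec_057 = sens_spec(ctrs, enlarged, 0.57)
```

Sweeping the threshold and plotting sensitivity against 1 - specificity yields the ROC curves the study used to pick 0.57 as a high-specificity screening cut-off.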
Abstract:
OBJECTIVES To test the inter-rater reliability of the RoB tool applied to Physical Therapy (PT) trials by comparing ratings from Cochrane review authors with those of blinded external reviewers. METHODS Randomized controlled trials (RCTs) in PT were identified by searching the Cochrane Database of Systematic Reviews for meta-analyses of PT interventions. RoB assessments were conducted independently by two reviewers blinded to the RoB ratings reported in the Cochrane reviews. Data on RoB assessments from the Cochrane reviews and other characteristics of the reviews and trials were extracted. Consensus assessments between the two reviewers were then compared with the RoB ratings from the Cochrane reviews. Agreement between Cochrane and blinded external reviewers was assessed using weighted kappa (κ). RESULTS In total, 109 trials included in 17 Cochrane reviews were assessed. Inter-rater reliability on the overall RoB assessment between Cochrane review authors and blinded external reviewers was poor (κ = 0.02, 95% CI: -0.06 to 0.06). Inter-rater reliability on individual domains of the RoB tool was poor (median κ = 0.19), ranging from κ = -0.04 ("Other bias") to κ = 0.62 ("Sequence generation"). There was also no agreement (κ = -0.29, 95% CI: -0.81 to 0.35) in the overall RoB assessment at the meta-analysis level. CONCLUSIONS Risk of bias assessments of RCTs using the RoB tool are not consistent across different research groups. Poor agreement was demonstrated not only at the trial level but also at the meta-analysis level. The results have implications for decision making, since different recommendations can be reached depending on the group analyzing the evidence. Improved guidelines to consistently apply the RoB tool, and revisions to the tool for different health areas, are needed.
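Weighted kappa, used above, differs from plain kappa by penalizing disagreements in proportion to their distance on the ordinal scale (e.g. low vs high risk of bias counts more heavily than low vs unclear). A minimal sketch with linear weights and hypothetical ratings:

```python
def weighted_kappa(ratings_a, ratings_b, categories):
    """Linearly weighted kappa for ordinal ratings on a shared category order
    (e.g. low / unclear / high risk of bias)."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    # Disagreement weights grow with ordinal distance
    w = [[abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        obs[idx[a]][idx[b]] += 1 / n
    pa = [sum(row) for row in obs]                          # rater A marginals
    pb = [sum(obs[i][j] for i in range(k)) for j in range(k)]  # rater B marginals
    d_obs = sum(w[i][j] * obs[i][j] for i in range(k) for j in range(k))
    d_exp = sum(w[i][j] * pa[i] * pb[j] for i in range(k) for j in range(k))
    return 1 - d_obs / d_exp
```

With only near-miss disagreements (low vs unclear), weighted kappa stays noticeably higher than with the same number of low-vs-high disagreements, which is exactly the behavior wanted for ordinal RoB judgements.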
Abstract:
Background: Virtual patients (VPs) are increasingly used to train clinical reasoning. So far, no validated evaluation instruments for VP design are available. Aims: We examined the validity of an instrument for assessing the perception of VP design by learners. Methods: Three sources of validity evidence were examined: (i) Content was examined based on theory of clinical reasoning and an international VP expert team. (ii) The response process was explored in think-aloud pilot studies with medical students and in content analyses of free text questions accompanying each item of the instrument. (iii) Internal structure was assessed by exploratory factor analysis (EFA) and inter-rater reliability by generalizability analysis. Results: Content analysis was reasonably supported by the theoretical foundation and the VP expert team. The think-aloud studies and analysis of free text comments supported the validity of the instrument. In the EFA, using 2547 student evaluations of a total of 78 VPs, a three-factor model showed a reasonable fit with the data. At least 200 student responses are needed to obtain a reliable evaluation of a VP on all three factors. Conclusion: The instrument has the potential to provide valid information about VP design, provided that many responses per VP are available.
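Why on the order of 200 responses are needed can be illustrated with the Spearman-Brown prophecy formula, which projects the reliability of a mean over n responses from a single-response reliability (the value used below is hypothetical, not taken from the study's G-study):

```python
def spearman_brown(single_response_reliability, n_responses):
    """Projected reliability of the mean over n_responses parallel responses
    (Spearman-Brown prophecy formula)."""
    r = single_response_reliability
    return n_responses * r / (1 + (n_responses - 1) * r)

# With a hypothetical single-response reliability of 0.02, averaging over
# 200 student responses pushes the reliability of the mean to about 0.80.
reliability_200 = spearman_brown(0.02, 200)
```

With r = 0.02 per response, 40 responses give roughly 0.45 while 200 give roughly 0.80, consistent with the order of magnitude reported: individual student ratings are very noisy, but their mean becomes dependable at scale.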
Abstract:
BACKGROUND The abstraction of data from medical records is a widespread practice in epidemiological research. However, studies using this means of data collection rarely report reliability. Within the Transition after Childhood Cancer Study (TaCC), which is based on a medical record abstraction, we conducted a second independent abstraction of data with the aim of assessing a) the intra-rater reliability of one rater at two time points; b) the possible learning effects between these two time points compared to a gold standard; and c) the inter-rater reliability. METHOD Within the TaCC study we conducted a systematic medical record abstraction in the 9 Swiss clinics with pediatric oncology wards. In a second phase we selected a subsample of medical records in 3 clinics to conduct a second independent abstraction. We then assessed intra-rater reliability at two time points, the learning effect over time (comparing each rater at two time points with a gold standard), and the inter-rater reliability of a selected number of variables. We calculated percentage agreement and Cohen's kappa. FINDINGS For the assessment of intra-rater reliability we included 154 records (80 for rater 1; 74 for rater 2). For inter-rater reliability we could include 70 records. Intra-rater reliability was substantial to excellent (Cohen's kappa 0.6-0.8) with an observed percentage agreement of 75%-95%. Learning effects were observed for all variables. Inter-rater reliability was substantial to excellent (Cohen's kappa 0.70-0.83) with high agreement ranging from 86% to 100%. CONCLUSIONS Our study showed that data abstracted from medical records are reliable. Investigating intra-rater and inter-rater reliability can give confidence to draw conclusions from the abstracted data and increase data quality by minimizing systematic errors.
Abstract:
BACKGROUND The aim of this study was to evaluate the accuracy of linear measurements on three imaging modalities: lateral cephalograms from a cephalometric machine with a 3 m source-to-mid-sagittal-plane distance (SMD), lateral cephalograms from a machine with a 1.5 m SMD, and 3D models from cone-beam computed tomography (CBCT) data. METHODS Twenty-one dry human skulls were used. Lateral cephalograms were taken using two cephalometric devices: one with a 3 m SMD and one with a 1.5 m SMD. CBCT scans were taken with a 3D Accuitomo® 170, and 3D surface models were created in Maxilim® software. Thirteen linear measurements were completed twice by two observers with a 4-week interval. Direct physical measurements with a digital calliper were defined as the gold standard. Statistical analysis was performed. RESULTS Nasion-Point A was significantly different from the gold standard in all methods. More statistically significant differences were found for the measurements on the 3 m SMD cephalograms than for the other methods. Intra- and inter-observer agreement based on 3D measurements was slightly better than for the other methods. LIMITATIONS Dry human skulls without soft tissues were used. Therefore, the results have to be interpreted with caution, as they do not fully represent clinical conditions. CONCLUSIONS 3D measurements resulted in better observer agreement. The accuracy of the measurements based on CBCT and the 1.5 m SMD cephalograms was better than that of the 3 m SMD cephalograms. These findings demonstrate the accuracy and reliability of linear 3D measurements based on CBCT data compared with 2D techniques. Future studies should focus on the implementation of 3D cephalometry in clinical practice.