984 results for Tests accuracy
Abstract:
PURPOSE. Scanning laser tomography with the Heidelberg retina tomograph (HRT; Heidelberg Engineering, Heidelberg, Germany) has been proposed as a useful diagnostic test for glaucoma. This study was conducted to evaluate the quality of reporting of published studies using the HRT for diagnosing glaucoma. METHODS. A validated Medline and hand search of English-language articles reporting on measures of diagnostic accuracy of the HRT for glaucoma was performed. Two reviewers selected and appraised the papers independently. The Standards for Reporting of Diagnostic Accuracy (STARD) checklist was used to evaluate the quality of each publication. RESULTS. A total of 29 articles were included. Interobserver rating agreement was observed in 83% of items (κ = 0.76). The number of STARD items properly reported ranged from 5 to 18. Less than a third of studies (7/29) explicitly reported more than half of the STARD items. Descriptions of key aspects of the methodology were frequently missing. For example, the design of the study (prospective or retrospective) was reported in 6 of 29 studies, and details of participant sampling (e.g., consecutive or random selection) were described in 5 of 29 publications. The commonest description of diagnostic accuracy was sensitivity and specificity (25/29) followed by area under the ROC curve (13/29), with 9 of 29 publications reporting both. CONCLUSIONS. The quality of reporting of diagnostic accuracy tests for glaucoma with HRT is suboptimal. The STARD initiative may be a useful tool for appraising the strengths and weaknesses of diagnostic accuracy studies. Copyright © Association for Research in Vision and Ophthalmology.
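The κ value reported above is a standard Cohen's kappa for agreement between the two reviewers. As an illustration only, a minimal sketch of that calculation; the item-level ratings below are hypothetical, not the study's data.

```python
# A minimal sketch of Cohen's kappa for two reviewers rating STARD items as
# reported / not reported. The ratings below are illustrative, not study data.
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters over the same items (binary or nominal)."""
    a = np.asarray(rater_a)
    b = np.asarray(rater_b)
    categories = np.union1d(a, b)
    p_observed = np.mean(a == b)  # proportion of observed agreement
    p_expected = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical item-level judgements (1 = item reported, 0 = not reported)
reviewer_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1]
reviewer_2 = [1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1]
print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```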
Abstract:
Aim: To evaluate the quality of reporting of all diagnostic studies published in five major ophthalmic journals in the year 2002 using the Standards for Reporting of Diagnostic Accuracy (STARD) initiative parameters. Methods: Manual searching was used to identify diagnostic studies published in 2002 in five leading ophthalmic journals: the American Journal of Ophthalmology (AJO), Archives of Ophthalmology (Archives), British Journal of Ophthalmology (BJO), Investigative Ophthalmology and Visual Science (IOVS), and Ophthalmology. The STARD checklist of 25 items and flow chart was used to evaluate the quality of each publication. Results: A total of 16 publications were included (AJO = 5, Archives = 1, BJO = 2, IOVS = 2, and Ophthalmology = 6). More than half of the studies (n = 9) were related to glaucoma diagnosis. Other specialties included retina (n = 4), cornea (n = 2), and neuro-ophthalmology (n = 1). The most common description of diagnostic accuracy was sensitivity and specificity values, published in 13 articles. The number of fully reported items in evaluated studies ranged from eight to 19. Seven studies reported more than 50% of the STARD items. Conclusions: The current standards of reporting of diagnostic accuracy tests are highly variable. The STARD initiative may be a useful tool for appraising the strengths and weaknesses of diagnostic accuracy studies.
Abstract:
Purpose: There is an urgent need to develop diagnostic tests to improve the detection of pathogens causing life-threatening infection (sepsis). SeptiFast is a CE-marked multi-pathogen real-time PCR system capable of detecting DNA sequences of bacteria and fungi present in blood samples within a few hours. We report here a systematic review and meta-analysis of diagnostic accuracy studies of SeptiFast in the setting of suspected sepsis.
Methods: A comprehensive search strategy was developed to identify studies that compared SeptiFast with blood culture in suspected sepsis. Methodological quality was assessed using QUADAS. Heterogeneity of studies was investigated using a coupled forest plot of sensitivity and specificity and a scatter plot in receiver operating characteristic (ROC) space. A bivariate model was used to estimate summary sensitivity and specificity.
Results: From 41 phase III diagnostic accuracy studies, summary sensitivity and specificity for SeptiFast compared with blood culture were 0.68 (95% CI 0.63–0.73) and 0.86 (95% CI 0.84–0.89), respectively. Study quality was judged to be variable, with important deficiencies overall in design and reporting that could impact on derived diagnostic accuracy metrics.
Conclusions: SeptiFast appears to have higher specificity than sensitivity, but deficiencies in study quality are likely to render this body of work unreliable. Based on the evidence presented here, it remains difficult to make firm recommendations about the likely clinical utility of SeptiFast in the setting of suspected sepsis.
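Before any bivariate summary model is fitted, each study contributes a sensitivity/specificity pair derived from its 2x2 table against blood culture. Below is a minimal sketch of that per-study step with hypothetical counts; a full bivariate random-effects meta-analysis would normally be fitted with dedicated software.

```python
# A per-study sketch (not the review's actual code): sensitivity and specificity
# from each study's 2x2 table of SeptiFast versus blood culture. Counts are
# hypothetical.
studies = {
    # study id: (true positives, false negatives, false positives, true negatives)
    "study_A": (30, 12, 20, 150),
    "study_B": (18, 10, 9, 120),
    "study_C": (45, 22, 31, 260),
}

for name, (tp, fn, fp, tn) in studies.items():
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(f"{name}: sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}, "
          f"ROC-space point = ({1 - specificity:.2f}, {sensitivity:.2f})")
```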
Abstract:
Background: Diagnosis of meningococcal disease relies on recognition of clinical signs and symptoms that are notoriously non-specific, variable, and often absent in the early stages of the disease. Loop-mediated isothermal amplification (LAMP) has previously been shown to be fast and effective for the molecular detection of meningococcal DNA in clinical specimens. We aimed to assess the diagnostic accuracy of meningococcal LAMP as a near-patient test in the emergency department.
Methods: For this observational cohort study of diagnostic accuracy, children aged 0-13 years presenting to the emergency department of the Royal Belfast Hospital for Sick Children (Belfast, UK) with suspected meningococcal disease were eligible for inclusion. Patients underwent a standard meningococcal pack of investigations testing for meningococcal disease. Respiratory (nasopharyngeal swab) and blood specimens were collected from patients and tested with near-patient meningococcal LAMP and the results were compared with those obtained by reference laboratory tests (culture and PCR of blood and cerebrospinal fluid).
Findings: Between Nov 1, 2009, and Jan 31, 2012, 161 eligible children presenting at the hospital underwent the meningococcal pack of investigations and were tested for meningococcal disease, of whom 148 consented and were enrolled in the study. Combined testing of respiratory and blood specimens with use of LAMP was accurate (sensitivity 89% [95% CI 72-96], specificity 100% [97-100], positive predictive value 100% [85-100], negative predictive value 98% [93-99]) and diagnostically useful (positive likelihood ratio 213 [95% CI 13-infinity] and negative likelihood ratio 0.11 [0.04-0.32]). The median time required for near-patient testing from sample to result was 1 h 26 min (IQR 1 h 20 min-1 h 32 min).
Interpretation: Meningococcal LAMP is straightforward enough for use in any hospital with basic laboratory facilities, and near-patient testing with this method is both feasible and effective. By contrast with existing UK National Institute for Health and Care Excellence (NICE) guidelines, we showed that molecular testing of non-invasive respiratory specimens from children is diagnostically accurate and clinically useful.
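The likelihood ratios quoted in the Findings follow directly from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A small sketch using the abstract's point estimates; with an observed specificity of exactly 100% the positive likelihood ratio is unbounded, and the finite value of 213 reported in the abstract presumably reflects a correction or rounding in the underlying counts.

```python
# Likelihood ratios recomputed from the reported point estimates (illustration
# only, not a reanalysis of the study data).
sensitivity = 0.89   # combined respiratory + blood LAMP testing
specificity = 1.00   # reported point estimate

# With specificity exactly 1.0, LR+ = sens / (1 - spec) is unbounded, which is
# why the abstract's upper confidence limit for LR+ is infinity.
lr_positive = float("inf") if specificity == 1.0 else sensitivity / (1 - specificity)
lr_negative = (1 - sensitivity) / specificity   # = 0.11, matching the abstract

print(f"LR+ = {lr_positive}, LR- = {lr_negative:.2f}")
```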
Abstract:
This paper reports on the accuracy of new test methods developed to measure the air and water permeability of high-performance concretes (HPCs). Five representative HPC mixtures and one normal concrete (NC) mixture were tested to estimate both the repeatability and the reliability of the proposed methods. Repeatability acceptance was adjudged using values of the signal-to-noise ratio (SNR) and discrimination ratio (DR), and reliability was investigated by comparing against standard laboratory-based test methods (i.e., the RILEM gas permeability test and the BS EN water penetration test). With SNR and DR values satisfying recommended criteria, it was concluded that test repeatability error has no significant influence on results. In addition, the research confirmed strong positive relationships between the proposed test methods and existing standard permeability assessment techniques. Based on these findings, the proposed test methods show strong potential to become recognized as international methods for determining the permeability of HPCs.
Abstract:
Recently there has been increasing interest in the development of new methods that use Pareto optimality to deal with multi-objective criteria (for example, accuracy and architectural complexity). Once one has learned a model with such a method, the problem is then how to compare it with the state of the art. In machine learning, algorithms are typically evaluated by comparing their performance on different data sets by means of statistical tests. Unfortunately, the standard tests used for this purpose are not able to jointly consider several performance measures. The aim of this paper is to resolve this issue by developing statistical procedures that can account for multiple competing measures at the same time. In particular, we develop two tests: a frequentist procedure based on the generalized likelihood-ratio test and a Bayesian procedure based on a multinomial-Dirichlet conjugate model. We further extend them by discovering conditional independences among measures to reduce the number of parameters in these models, since the number of cases studied in such comparisons is usually small. Real data from a comparison among general-purpose classifiers are used to show a practical application of our tests.
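As a rough illustration of the Bayesian side of such a comparison, the sketch below applies a multinomial-Dirichlet conjugate model to hypothetical counts of comparison outcomes across data sets; the outcome categories, counts, and prior are assumptions made for illustration, not the paper's actual model specification.

```python
# A hedged sketch of a multinomial-Dirichlet conjugate analysis in the spirit of
# the Bayesian procedure described; categories and counts are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical counts over data sets: method A wins on both measures,
# method B wins on both, or the outcome is mixed (Pareto-incomparable).
counts = np.array([11, 4, 5])          # [A dominates, B dominates, mixed]
prior = np.ones_like(counts)           # uniform Dirichlet(1, 1, 1) prior

posterior_samples = rng.dirichlet(counts + prior, size=100_000)
p_a_dominates = np.mean(posterior_samples[:, 0] > posterior_samples[:, 1])
print(f"P(method A dominates more often than B) = {p_a_dominates:.3f}")
```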
Abstract:
This paper presents a study investigating the accuracy of two standardized individual achievement tests, the Wechsler Individual Achievement Test (WIAT) and the Peabody Individual Achievement Test-Revised (PIAT-R). The study compares the students' scores and includes students' opinions of the tests.
Abstract:
Simulations of the global atmosphere for weather and climate forecasting require fast and accurate solutions, and so operational models use high-order finite differences on regular structured grids. This precludes the use of local refinement; techniques allowing local refinement are either expensive (e.g., high-order finite element techniques) or have reduced accuracy at changes in resolution (e.g., unstructured finite volume with linear differencing). We present solutions of the shallow-water equations for westerly flow over a mid-latitude mountain from a finite-volume model written using OpenFOAM. A second/third-order accurate differencing scheme is applied on arbitrarily unstructured meshes made up of various shapes and refinement patterns. The results are as accurate as those of equivalent-resolution spectral methods. Using lower-order differencing reduces accuracy at a refinement pattern, which allows errors from refinement of the mountain to accumulate and reduces the global accuracy over a 15-day simulation. We have therefore introduced a scheme which fits a 2D cubic polynomial approximately on a stencil around each cell. Using this scheme means that refinement of the mountain improves the accuracy after a 15-day simulation. This is a more severe test of local mesh refinement for global simulations than has been presented previously, but a realistic one if these techniques are to be used operationally. These efficient, high-order schemes may make it possible for local mesh refinement to be used by weather and climate forecast models.
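The approximate cubic fit can be illustrated by a least-squares fit of all 2D monomials up to total degree three over a stencil of cell-centre values. The sketch below makes that least-squares interpretation explicit as an assumption and uses an arbitrary smooth test field rather than anything from the shallow-water model.

```python
# A hedged sketch of fitting a 2D cubic polynomial to values on a stencil of
# cell centres by least squares; stencil geometry and field are illustrative.
import numpy as np

def cubic_2d_design(x, y):
    """Design matrix of all 2D monomials up to total degree 3 (10 terms)."""
    cols = [x**i * y**j for i in range(4) for j in range(4 - i)]
    return np.column_stack(cols)

# Hypothetical stencil of cell-centre coordinates around the cell of interest
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=16)
y = rng.uniform(-1, 1, size=16)
field = np.sin(x) * np.cos(y)                    # sample field to reconstruct

coeffs, *_ = np.linalg.lstsq(cubic_2d_design(x, y), field, rcond=None)

# Evaluate the fitted polynomial at the cell centre (0, 0)
fitted = cubic_2d_design(np.array([0.0]), np.array([0.0])) @ coeffs
print("reconstructed value at (0,0):", fitted[0])
print("exact value at (0,0):        ", np.sin(0.0) * np.cos(0.0))
```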
Abstract:
This article addresses the question of how far working memory may affect second language (L2) learners' improvement in spoken language during a period of immersion. Research is presented testing the hypothesis that individual differences in working memory (WM) capacity are associated with individual variation in improvements in oral production of questions in English. Thirty-two Chinese adult speakers of English were tested, before and after a year's postgraduate study in the United Kingdom, to measure grammatical accuracy and fluency using a question elicitation task, and to measure WM using a battery of first language (L1) and L2 WM tests. Story recall in L1 (Mandarin) was significantly associated with individuals' improvement in oral grammatical measures (p < .05). However, there was no significant mean improvement across the cohort in grammatical accuracy, although there was for fluency. The findings suggest that WM may aid certain aspects of individuals' L2 oral proficiency during academic immersion through postgraduate study. They also indicate that academic immersion in itself can lead to improvements in oral proficiency, independent of WM capacity, but there is no general guarantee of significant grammatical change. Further research to clarify the opportunities for input and interaction available in academic immersion settings is called for.
Abstract:
In order to examine metacognitive accuracy (i.e., the relationship between metacognitive judgment and memory performance), researchers often rely on by-participant analysis, in which metacognitive accuracy (e.g., resolution, as measured by the gamma coefficient or signal detection measures) is computed for each participant and the computed values are entered into group-level statistical tests such as the t-test. In the current work, we argue that the by-participant analysis, regardless of the accuracy measure used, would produce a substantial inflation of Type-1 error rates when a random item effect is present. A mixed-effects model is proposed as a way to address the issue effectively, and our simulation studies examining Type-1 error rates indeed showed superior performance of the mixed-effects model analysis compared with the conventional by-participant analysis. We also present real-data applications to illustrate further strengths of the mixed-effects model analysis. Our findings imply that caution is needed when using the by-participant analysis, and we recommend the mixed-effects model analysis.
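For concreteness, the sketch below reproduces the conventional by-participant pipeline that is critiqued here (a Goodman-Kruskal gamma per participant, then a group-level one-sample t-test) on simulated data; the recommended alternative, a mixed-effects model with a random item effect, would typically require a dedicated GLMM fit and is not shown.

```python
# A sketch of the conventional by-participant analysis: Goodman-Kruskal gamma
# per participant, then a one-sample t-test at the group level. Data simulated.
import numpy as np
from scipy import stats

def goodman_kruskal_gamma(judgments, accuracy):
    """Gamma = (concordant - discordant) / (concordant + discordant) pairs."""
    concordant = discordant = 0
    n = len(judgments)
    for i in range(n):
        for j in range(i + 1, n):
            s = (judgments[i] - judgments[j]) * (accuracy[i] - accuracy[j])
            concordant += s > 0
            discordant += s < 0
    return (concordant - discordant) / (concordant + discordant)

rng = np.random.default_rng(2)
gammas = []
for _ in range(30):                                  # 30 simulated participants
    judgments = rng.integers(1, 7, size=40)          # judgments on a 1-6 scale
    accuracy = (rng.random(40) < 0.5).astype(int)    # recall outcome per item
    gammas.append(goodman_kruskal_gamma(judgments, accuracy))

t_stat, p_value = stats.ttest_1samp(gammas, popmean=0.0)
print(f"mean gamma = {np.mean(gammas):.3f}, t = {t_stat:.2f}, p = {p_value:.3f}")
```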
Abstract:
We present an analysis of the accuracy of the method introduced by Lockwood et al. (1994) for the determination of the magnetopause reconnection rate from the dispersion of precipitating ions in the ionospheric cusp region. Tests are made by applying the method to synthesised data. The simulated cusp ion precipitation data are produced by an analytic model of the evolution of newly-opened field lines, along which magnetosheath ions are firstly injected across the magnetopause and then dispersed as they propagate into the ionosphere. The rate at which these newly opened field lines are generated by reconnection can be varied. The derived reconnection rate estimates are then compared with the input variation to the model and the accuracy of the method assessed. Results are presented for steady-state reconnection, for continuous reconnection showing a sine-wave variation in rate and for reconnection which only occurs in square wave pulses. It is found that the method always yields the total flux reconnected (per unit length of the open-closed field-line boundary) to within an accuracy of better than 5%, but that pulses tend to be smoothed so that the peak reconnection rate within the pulse is underestimated and the pulse length is overestimated. This smoothing is reduced if the separation between energy channels of the instrument is reduced; however this also acts to increase the experimental uncertainty in the estimates, an effect which can be countered by improving the time resolution of the observations. The limited time resolution of the data is shown to set a minimum reconnection rate below which the method gives spurious short-period oscillations about the true value. Various examples of reconnection rate variations derived from cusp observations are discussed in the light of this analysis.
Abstract:
Genome-wide association studies (GWAS) have been widely used in the genetic dissection of complex traits. However, common methods are all based on a fixed-SNP-effect mixed linear model (MLM) and single-marker analysis, such as efficient mixed model analysis (EMMA). These methods require Bonferroni correction for multiple tests, which is often too conservative when the number of markers is extremely large. To address this concern, we proposed a random-SNP-effect MLM (RMLM) and a multi-locus RMLM (MRMLM) for GWAS. The RMLM simply treats the SNP effect as random, but it allows a modified Bonferroni correction to be used to calculate the threshold p-value for significance tests. The MRMLM is a multi-locus model including markers selected from the RMLM method with a less stringent selection criterion. Due to its multi-locus nature, no multiple-test correction is needed. Simulation studies show that the MRMLM is more powerful in QTN detection and more accurate in QTN effect estimation than the RMLM, which in turn is more powerful and accurate than the EMMA. To demonstrate the new methods, we analyzed six flowering-time-related traits in Arabidopsis thaliana and detected more genes than previously reported using the EMMA. Therefore, the MRMLM provides an alternative for multi-locus GWAS.
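For contrast with the proposed RMLM/MRMLM, the sketch below runs the kind of naive single-marker scan with a standard Bonferroni threshold that is argued to be overly conservative; unlike EMMA it omits the kinship-based random polygenic effect, and all genotypes and phenotypes are simulated.

```python
# A hedged sketch of a single-marker scan with a standard Bonferroni threshold,
# i.e. a fixed-SNP-effect, single-marker baseline (without the kinship random
# effect used by EMMA). Genotypes and phenotypes are simulated toy data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_individuals, n_markers = 200, 5_000
genotypes = rng.integers(0, 3, size=(n_individuals, n_markers)).astype(float)
phenotype = 0.8 * genotypes[:, 10] + rng.normal(size=n_individuals)  # one true QTN

p_values = np.empty(n_markers)
for m in range(n_markers):
    slope, intercept, r, p, se = stats.linregress(genotypes[:, m], phenotype)
    p_values[m] = p

bonferroni_threshold = 0.05 / n_markers          # standard correction
hits = np.flatnonzero(p_values < bonferroni_threshold)
print(f"Bonferroni threshold = {bonferroni_threshold:.1e}; significant markers: {hits}")
```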
Abstract:
Sensitivity and specificity are measures that allow us to evaluate the performance of a diagnostic test. In practice, it is common to have situations where a proportion of selected individuals cannot have the true disease state verified, since verification may be an invasive procedure, as occurs with biopsy. This happens, as a special case, in the diagnosis of prostate cancer, or in any other situation in which verification carries risk, is impracticable or unethical, or has a high cost. In such cases, it is common to use diagnostic tests based only on the information from verified individuals. This procedure can lead to biased results, known as workup bias. In this paper, we introduce a Bayesian approach to estimate the sensitivity and specificity of two diagnostic tests considering both verified and unverified individuals, a result that generalizes the usual situation based on only one diagnostic test.
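A minimal sketch of the conjugate Beta-Binomial building block that underlies such Bayesian estimates, applied to a single test using only verified individuals (the naive setting that the approach above generalizes); counts and priors are illustrative.

```python
# Conjugate Beta-Binomial posteriors for sensitivity and specificity of a single
# test among disease-verified individuals. Counts and priors are illustrative.
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical 2x2 counts among verified individuals
tp, fn, fp, tn = 42, 8, 15, 135

# Beta(1, 1) priors; posteriors are Beta(prior_a + successes, prior_b + failures)
sens_samples = rng.beta(1 + tp, 1 + fn, size=100_000)
spec_samples = rng.beta(1 + tn, 1 + fp, size=100_000)

for name, draws in [("sensitivity", sens_samples), ("specificity", spec_samples)]:
    lo, hi = np.percentile(draws, [2.5, 97.5])
    print(f"{name}: posterior mean = {draws.mean():.3f}, 95% CrI = ({lo:.3f}, {hi:.3f})")
```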
Abstract:
When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity and specificity, as well as of positive and negative predictive values, on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.
Abstract:
Likelihood ratio tests can be substantially size-distorted in small- and moderate-sized samples. In this paper, we apply Skovgaard's [Skovgaard, I.M., 2001. Likelihood asymptotics. Scandinavian Journal of Statistics 28, 3-32] adjusted likelihood ratio statistic to exponential family nonlinear models. We show that the adjustment term has a simple compact form that can be easily implemented from standard statistical software. The adjusted statistic is approximately distributed as χ² with a high degree of accuracy. It is applicable in wide generality since it allows both the parameter of interest and the nuisance parameter to be vector-valued. Unlike the modified profile likelihood ratio statistic obtained from Cox and Reid [Cox, D.R., Reid, N., 1987. Parameter orthogonality and approximate conditional inference. Journal of the Royal Statistical Society B 49, 1-39], the adjusted statistic proposed here does not require an orthogonal parameterization. Numerical comparison of likelihood-based tests of varying dispersion favors the test we propose and a Bartlett-corrected version of the modified profile likelihood ratio test recently obtained by Cysneiros and Ferrari [Cysneiros, A.H.M.A., Ferrari, S.L.P., 2006. An improved likelihood ratio test for varying dispersion in exponential family nonlinear models. Statistics and Probability Letters 76 (3), 255-265]. (C) 2008 Elsevier B.V. All rights reserved.
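For reference, the unadjusted statistic that Skovgaard's correction refines is the ordinary likelihood ratio statistic referred to a chi-squared distribution, as in the sketch below; the maximized log-likelihood values and degrees of freedom are hypothetical stand-ins for a varying-dispersion test.

```python
# The ordinary (unadjusted) likelihood ratio test:
# LR = 2 * (loglik_full - loglik_reduced), referred to chi-squared with df equal
# to the number of restrictions. Values below are hypothetical.
from scipy import stats

loglik_full = -102.4      # hypothetical maximized log-likelihood, full model
loglik_reduced = -105.9   # hypothetical value with dispersion covariates dropped
df = 2                    # number of restrictions under the null

lr_statistic = 2.0 * (loglik_full - loglik_reduced)
p_value = stats.chi2.sf(lr_statistic, df)
print(f"LR = {lr_statistic:.2f}, p = {p_value:.4f}")
```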