958 results for reliability measurement
Abstract:
The purpose of this paper is to describe the development and to test the reliability of a new method called INTERMED, for health service needs assessment. The INTERMED integrates the biopsychosocial aspects of disease and the relationship between patient and health care system in a comprehensive scheme and reflects an operationalized conceptual approach to case mix or case complexity. The method was developed to enhance interdisciplinary communication between (para-) medical specialists and to provide a method to describe case complexity for clinical, scientific, and educational purposes. First, a feasibility study (N = 21 patients) was conducted which included double scoring and discussion of the results. This led to a version of the instrument on which two interrater reliability studies were performed. In study 1, the INTERMED was double scored for 14 patients admitted to an internal ward by a psychiatrist and an internist on the basis of a joint interview conducted by both. In study 2, on the basis of medical charts, two clinicians separately double scored the INTERMED in 16 patients referred to the outpatient psychiatric consultation service. Averaged over both studies, in 94.2% of all ratings there was no important difference between the raters (an important difference being one of more than 1 point). As a research interview, it takes about 20 minutes; as part of the whole process of history taking it takes about 15 minutes. In both studies, improvements were suggested by the results. Analyses of study 1 revealed that on most items there was considerable agreement; some items were improved. Also, the reference point for the prognoses was changed so that it reflected both short- and long-term prognoses. Analyses of study 2 showed that in this setting, less agreement between the raters was obtained because the raters were less experienced and the scoring procedure was more susceptible to differences.
Some improvements, mainly of the anchor points, were specified which may further enhance interrater reliability. The INTERMED proves to be a reliable method for classifying patients' care needs, especially when used by experienced raters scoring by patient interview. It can be a useful tool in assessing patients' care needs, as well as the level of needed adjustment between general and mental health service delivery. The INTERMED is easily applicable in the clinical setting at low time-costs.
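The agreement figure reported above (the share of ratings in which two raters differ by no more than 1 point) can be illustrated with a minimal Python sketch. The function name, the tolerance parameter, and the example scores are assumptions made for illustration; the abstract does not describe how the authors computed their figure.

```python
def agreement_within(ratings_a, ratings_b, tolerance=1):
    """Percentage of paired ratings whose absolute difference is <= tolerance.

    ratings_a, ratings_b: equal-length sequences of item scores from two raters
    (hypothetical data layout, not taken from the INTERMED paper).
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty rating lists of equal length")
    close = sum(1 for a, b in zip(ratings_a, ratings_b) if abs(a - b) <= tolerance)
    return 100.0 * close / len(ratings_a)


# Made-up scores for four items: the differences are 0, 1, 0 and 2 points.
rater_1 = [0, 1, 2, 3]
rater_2 = [0, 2, 2, 1]
print(agreement_within(rater_1, rater_2))
```

Such a tolerance-based percentage does not correct for chance agreement, which is one reason chance-corrected statistics are often reported alongside it.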
Abstract:
In recent decades, a growing body of literature has been devoted to the criticism of GDP as an indicator of societal wealth. A relevant question is: what are the prospects for building, on the existing knowledge and consensus, alternative measures of prosperity? A starting point may be to connect the well-being research agenda with the sustainability one. However, there is no doubt that there is a lot of complexity and fuzziness inherent in multidimensional concepts such as sustainability and well-being. This article analyses the theoretical foundations and the empirical validity of some multidimensional technical tools that can be used for well-being evaluation and assessment. Of course, one should not forget that policy conclusions derived through any mathematical model depend also on the conceptual framework used, i.e. which representation of reality (and thus which societal values and interests) has been considered.
Abstract:
BACKGROUND: The Foot and Ankle Ability Measure (FAAM) is a self-reported questionnaire for patients with foot and ankle disorders available in English, German, and Persian. This study plans to translate the FAAM from English to French (FAAM-F) and assess the validity and reliability of this new version. METHODS: The FAAM-F Activities of Daily Living (ADL) and sports subscales were completed by 105 French-speaking patients (average age 50.5 years) presenting various chronic foot and ankle disorders. Convergent and divergent validity was assessed by Pearson's correlation coefficients between the FAAM-F subscales and the SF-36 scales: Physical Functioning (PF), Physical Component Summary (PCS), Mental Health (MH) and Mental Component Summary (MCS). Internal consistency was calculated by Cronbach's Alpha (CA). To assess test re-test reliability, 22 patients filled out the questionnaire a second time to estimate minimal detectable changes (MDC) and intraclass correlation coefficients (ICC). RESULTS: Correlations for FAAM-F ADL subscale were 0.85 with PF, 0.81 with PCS, 0.26 with MH, 0.37 with MCS. Correlations for FAAM-F Sports subscale were 0.72 with PF, 0.72 with PCS, 0.21 with MH, 0.29 with MCS. CA estimates were 0.97 for both subscales. Respectively for the ADL and Sports subscales, ICC were 0.97 and 0.94, errors for a single measure were 8 and 10 points at 95% confidence and the MDC values at 95% confidence were 7 and 18 points. CONCLUSION: The FAAM-F is valid and reliable for the self-assessment of physical function in French-speaking patients with a wide range of chronic foot and ankle disorders.
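For readers unfamiliar with how a minimal detectable change relates to the ICC, the following Python sketch uses the commonly cited formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM. The abstract does not state which formulas or standard deviations the authors used, so both the formulas and the example values below are assumptions for illustration only.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from a score standard deviation and a test-retest ICC (assumed formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence (assumed formula)."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Hypothetical values: a subscale with SD = 10 points and ICC = 0.96.
print(round(standard_error_of_measurement(10.0, 0.96), 2))
print(round(mdc95(10.0, 0.96), 2))
```

The sketch shows why a high ICC translates into a small MDC: as the ICC approaches 1, the SEM term shrinks towards zero.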
Abstract:
Background: The COSMIN checklist (COnsensus-based Standards for the selection of health status Measurement INstruments) was developed in an international Delphi study to evaluate the methodological quality of studies on measurement properties of health-related patient reported outcomes (HR-PROs). In this paper, we explain our choices for the design requirements and preferred statistical methods for which no evidence is available in the literature or on which the Delphi panel members had substantial discussion. Methods: The issues described in this paper are a reflection of the Delphi process in which 43 panel members participated. Results: The topics discussed are internal consistency (relevance for reflective and formative models, and distinction with unidimensionality), content validity (judging relevance and comprehensiveness), hypotheses testing as an aspect of construct validity (specificity of hypotheses), criterion validity (relevance for PROs), and responsiveness (concept and relation to validity, and (in)appropriate measures). Conclusions: We expect that this paper will contribute to a better understanding of the rationale behind the items, thereby enhancing the acceptance and use of the COSMIN checklist.
Abstract:
Background: Although labour market flexibility has resulted in an expansion of precarious employment in industrialized countries, to date there is limited empirical evidence about its health consequences. The Employment Precariousness Scale (EPRES) is a newly developed, theory-based, multidimensional questionnaire specifically devised for epidemiological studies among waged and salaried workers. Objective: To assess acceptability, reliability and construct validity of EPRES in a sample of waged and salaried workers in Spain. Methods: Cross-sectional study, using a sub-sample of 6,968 temporary and permanent workers from a population-based survey carried out in 2004-2005. The survey questionnaire was interviewer administered and included the six EPRES subscales, measures of the psychosocial work environment (COPSOQ ISTAS21), and perceived general and mental health (SF-36). Results: A high response rate to all EPRES items indicated good acceptability; Cronbach's alpha coefficients, over 0.70 for all subscales and the global score, demonstrated good internal consistency reliability; exploratory factor analysis using principal axis analysis and varimax rotation confirmed the six-subscale structure and the theoretical allocation of all items. Patterns across known groups and correlation coefficients with psychosocial work environment measures and perceived health demonstrated the expected relations, providing evidence of construct validity. Conclusions: Our results provide evidence in support of the psychometric properties of EPRES, which appears to be a promising tool for the measurement of employment precariousness in public health research.
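Cronbach's alpha, the internal consistency statistic reported above, can be sketched in a few lines of Python using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The data layout and the example scores are hypothetical; this is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    items[i][j] is the score of respondent j on item i (hypothetical layout).
    Uses sample variances throughout.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    if len({len(item) for item in items}) != 1:
        raise ValueError("every item needs the same number of respondents")
    # Total score per respondent, summed across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Made-up scores: three items answered by five respondents.
example_items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(example_items), 3))
```

Values above 0.70, as reported for the EPRES subscales, are conventionally read as acceptable internal consistency.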
Abstract:
Introduction: Measures of the degree of lumbar spinal stenosis (LSS), such as the antero-posterior (AP) diameter of the canal and the dural sac cross-sectional area, vary and do not correlate with symptoms or results of surgery. We created a grading system, comprising seven categories, based on the morphology of the dural sac and its contents as seen on T2 axial images. The categories take into account the ratio of rootlet/CSF content. Grade A indicates no significant compression; grade D is equivalent to a total myelographic block. We compared this classification with commonly used criteria of severity of stenosis. Methods: Fifty T2 axial MRI images taken at disc level from 27 symptomatic LSS patients undergoing decompressive surgery were classified twice by two radiologists and three spinal surgeons working at different institutions and countries. Dural sac cross-sectional surface area and AP diameter of the canal were measured both at disc and pedicle level from DICOM images using OsiriX software. Intra- and inter-observer reliability were assessed using Cohen's and Fleiss' kappa statistics and the t test. Results: For the morphological grading, the average intra- and inter-observer kappas were 0.76 and 0.69, respectively, for physicians working in the country where the study originated. Combining all observers, the kappa values were 0.57 ± 0.19 and 0.44 ± 0.19, respectively. AP diameter and dural sac cross-sectional area measurements showed no statistically significant differences between observers. No correlation between morphological grading and AP diameter or dural sac cross-sectional area was observed in 13 (26%) and 8 cases (16%), respectively. Discussion: The proposed morphological grading relies on the identification of the dural sac and CSF, which are better seen on full MRI series. This was not available to the external observers, which might explain the lower overall kappa values.
Since no specific measurement tools are needed, the grading suits everyday clinical practice and favours communication of the degree of stenosis between practising physicians. The absence of a strict correlation with the dural sac surface suggests that measuring the surface alone might be insufficient in defining LSS, as it is essentially a mismatch between the spinal canal and its contents. This grading is now adopted in our unit, and further studies concentrating on the relation between morphology, clinical symptoms and surgical results are underway.
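The kappa values reported above measure agreement corrected for chance. A minimal Python sketch of Cohen's kappa for two observers grading the same images follows; the grade labels and example gradings are invented for illustration, and the study's own computation (which also used Fleiss' kappa for several observers) is not described in the abstract.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same cases."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two non-empty label lists of equal length")
    # Observed proportion of cases on which the raters agree.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected by chance from each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical gradings of four images by two observers.
observer_1 = ["A", "A", "B", "B"]
observer_2 = ["A", "A", "B", "A"]
print(cohen_kappa(observer_1, observer_2))
```

In this made-up example the observers agree on three of four images (75%), but half of that agreement would be expected by chance given their label frequencies, which is why kappa is lower than the raw agreement.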
Abstract:
Little attention has been paid so far to the influence of the chemical nature of the substance when measuring δ¹⁵N by elemental analysis (EA)-isotope ratio mass spectrometry (IRMS). Although the bulk nitrogen isotope analysis of organic material is not to be questioned, literature from different disciplines using IRMS provides hints that the quantitative conversion of nitrate into nitrogen presents difficulties. We observed abnormal series of δ¹⁵N values of laboratory standards and nitrates. These unexpected results were shown to be related to the tailing of the nitrogen peak of nitrate-containing compounds. A series of experiments were set up to investigate the cause of this phenomenon, using ammonium nitrate (NH4NO3) and potassium nitrate (KNO3) samples, two organic laboratory standards, as well as the international secondary reference materials IAEA-N1 and IAEA-N2 (two ammonium sulphates, (NH4)2SO4) and IAEA-NO-3 (a potassium nitrate). In experiment 1, we used graphite and vanadium pentoxide (V2O5) as additives to observe if they could enhance the decomposition (combustion) of nitrates. In experiment 2, we tested another elemental analyser configuration including an additional section of reduced copper in order to see whether or not the tailing could originate from an incomplete reduction process. Finally, we modified several parameters of the method and observed their influence on the peak shape, δ¹⁵N value and nitrogen content in weight percent of nitrogen of the target substances. We found the best results using mere thermal decomposition in helium, under exclusion of any oxygen. We show that the analytical procedure used for organic samples should not be used for nitrates because of their different chemical nature. We present the best performance given one set of sample introduction parameters for the analysis of nitrates, as well as for the ammonium sulphate IAEA-N1 and IAEA-N2 reference materials.
We discuss these results considering the thermochemistry of the substances and the analytical technique itself. The results emphasise the difference in chemical nature of inorganic and organic samples, which necessarily involves distinct thermochemistry when analysed by EA-IRMS. Therefore, they should not be processed using the same analytical procedure. This clearly impacts on the way international secondary reference materials should be used for the calibration of organic laboratory standards.
Abstract:
Background: Choosing an adequate measurement instrument depends on the proposed use of the instrument, the concept to be measured, the measurement properties (e.g. internal consistency, reproducibility, content and construct validity, responsiveness, and interpretability), the requirements, the burden for subjects, and costs of the available instruments. As far as measurement properties are concerned, there are no sufficiently specific standards for the evaluation of measurement properties of instruments to measure health status, and also no explicit criteria for what constitutes good measurement properties. In this paper we describe the protocol for the COSMIN study, the objective of which is to develop a checklist that contains COnsensus-based Standards for the selection of health Measurement INstruments, including explicit criteria for satisfying these standards. We will focus on evaluative health related patient-reported outcomes (HR-PROs), i.e. patient-reported health measurement instruments used in a longitudinal design as an outcome measure, excluding health care related PROs, such as satisfaction with care or adherence. The COSMIN standards will be made available in the form of an easily applicable checklist. Method: An international Delphi study will be performed to reach consensus on which and how measurement properties should be assessed, and on criteria for good measurement properties. Two sources of input will be used for the Delphi study: (1) a systematic review of properties, standards and criteria of measurement properties found in systematic reviews of measurement instruments, and (2) an additional literature search of methodological articles presenting a comprehensive checklist of standards and criteria. The Delphi study will consist of four (written) Delphi rounds, with approximately 30 expert panel members with different backgrounds in clinical medicine, biostatistics, psychology, and epidemiology.
The final checklist will subsequently be field-tested by assessing the inter-rater reproducibility of the checklist. Discussion: Since the study will mainly be anonymous, problems that are commonly encountered in face-to-face group meetings, such as the dominance of certain persons in the communication process, will be avoided. By performing a Delphi study and involving many experts, the likelihood that the checklist will have sufficient credibility to be accepted and implemented will increase.
Abstract:
The concentrations of 3-beta-hydroxybutyrate (3HB) in blood and two liver samples were retrospectively examined in a series of medicolegal autopsies. These cases included diabetic ketoacidosis, nondiabetic individuals presenting moderate to severe decompositional changes and nondiabetic medicolegal cases free of decompositional changes. 3HB concentrations in liver sample homogenates correlated well with blood values in all examined groups. Additionally, decompositional changes were not associated with increases in blood and liver 3HB levels. These results suggest that 3HB can be reliably measured in liver homogenates when blood is not available at autopsy. Furthermore, they suggest that metabolic disturbances potentially leading or contributing to death may be objectified through liver 3HB determination even in decomposed bodies.
Abstract:
Communication is an indispensable component of animal societies, yet many open questions remain regarding the factors affecting the evolution and reliability of signalling systems. A potentially important factor is the level of genetic relatedness between signallers and receivers. To quantitatively explore the role of relatedness in the evolution of reliable signals, we conducted artificial evolution over 500 generations in a system of foraging robots that can emit and perceive light signals. By devising a quantitative measure of signal reliability, and comparing independently evolving populations differing in within-group relatedness, we show a strong positive correlation between relatedness and reliability. Unrelated robots produced unreliable signals, whereas highly related robots produced signals that reliably indicated the location of the food source and thereby increased performance. Comparisons across populations also revealed that the frequency of signal production, which is often used as a proxy of signal reliability in empirical studies on animal communication, is a poor predictor of signal reliability and, accordingly, is not consistently correlated with group performance. This has important implications for our understanding of signal evolution and the empirical tools that are used to investigate communication.
Abstract:
Objective: To determine methadone plasma trough and peak concentrations in patients presenting opiate withdrawal symptoms after introduction of nevirapine or efavirenz. To describe the disappearance of these symptoms after methadone titration based on plasma concentrations rather than on the symptoms. Methods: Nine patients undergoing highly active antiretroviral therapy (HAART) and either nevirapine or efavirenz treatment were monitored daily for opiate withdrawal in a specialized drug addiction center. Methadone dose was titrated daily, and plasma concentrations were measured. The data are retrospective (case series). Results: Several patients complained of symptoms such as nausea, vomiting, accelerated intestinal transit, or insomnia. Even after methadone titration based on clinical symptoms, patients and health-care providers trained in infectious disease did not classify these as withdrawal symptoms and considered them as the side effects of HAART or anxiety. Methadone plasma trough concentration showed low levels of (R)- and (R,S)-methadone. Further methadone dose adjustment according to plasma level resulted in the disappearance of these withdrawal symptoms. The daily methadone dose was split when the peak/trough (R)-methadone ratio was more than 2. Conclusions: When introducing efavirenz or nevirapine to patients undergoing methadone treatment, withdrawal symptoms should be monitored, especially those such as insomnia, vomiting, or nausea. Methadone plasma trough and peak measurements can be of value in preventing unnecessary side effects of HAART.
Abstract:
We propose a method to evaluate cyclical models which does not require knowledge of the DGP and the exact empirical specification of the aggregate decision rules. We derive robust restrictions in a class of models; use some to identify structural shocks and others to evaluate the model or contrast sub-models. The approach has good size and excellent power properties, even in small samples. We show how to examine the validity of a class of models, sort out the relevance of certain frictions, evaluate the importance of an added feature, and indirectly estimate structural parameters.
Abstract:
When facing age-related cerebral decline, older adults are unequally affected by cognitive impairment, for reasons that remain unknown. To explore underlying mechanisms and find possible solutions to maintain life-space mobility, there is a need for a standardized behavioral test that relates to behaviors in natural environments. The aim of the project described in this paper was therefore to provide a free, reliable, transparent, computer-based instrument capable of detecting age-related changes in visual processing and cortical functions for the purposes of research into human behavior in computational transportation science. After obtaining content validity and exploring psychometric properties of the developed tasks, we derived (Study 1) the scoring method for measuring cerebral decline on 106 older drivers aged ≥70 years attending a driving refresher course organized by the Swiss Automobile Association, to test the instrument's validity against on-road driving performance. We then validated the derived method on a new sample of 182 drivers (Study 2). We then measured the instrument's reliability by having 17 healthy, young volunteers repeat all tests included in the instrument five times (Study 3) and explored the instrument's underlying psychophysical functions on 47 older drivers (Study 4). Finally, we tested the instrument's responsiveness to alcohol and effects on performance on a driving simulator in a randomized, double-blinded, placebo-controlled, crossover, dose-response validation trial including 20 healthy, young volunteers (Study 5). The developed instrument revealed good psychometric properties related to processing speed. It was reliable (ICC = 0.853), showed reasonable association with driving performance (R² = 0.053), and responded to blood alcohol concentrations of 0.5 g/L (p = 0.008). Our results suggest that MedDrive is capable of detecting age-related changes that affect processing speed.
These changes nevertheless do not necessarily affect driving behavior.