26 results for Standards, moderation, assessment, teacher judgement, criteria
Abstract:
Objective: To investigate whether the recently developed (statistically derived) "ASsessment in Ankylosing Spondylitis Working Group" improvement criteria (ASAS-IC) for ankylosing spondylitis (AS) reflect clinically relevant improvement according to the opinion of an expert panel. Methods: The ASAS-IC consist of four domains: physical function, spinal pain, patient global assessment, and inflammation. Scores on these four domains of 55 patients with AS, who had participated in a non-steroidal anti-inflammatory drug efficacy trial, were presented to an international expert panel (consisting of patients with AS and members of the ASAS Working Group) in a three-round Delphi exercise. The number of (non-)responders according to the ASAS-IC was compared with the final consensus of the experts. The most important domains in the opinion of the experts were identified, and also selected with discriminant analysis. A number of provisional criteria sets that best represented the consensus of the experts were defined. Using other datasets, these clinically derived criteria sets as well as the statistically derived ASAS-IC were then tested for discriminative properties and for agreement with the end of trial efficacy by patient and doctor. Results: Forty experts completed the three Delphi rounds. The experts considered twice as many patients to be responders as the ASAS-IC did (42 v 21). Overall agreement between experts and ASAS-IC was 62%. Spinal pain was considered the most important domain by most experts and was also selected as such by discriminant analysis. Provisional criteria sets with an agreement of greater than or equal to 80% compared with the consensus of the experts showed high placebo response rates (27-42%), in contrast with the ASAS-IC with a predefined placebo response rate of 25%. All criteria sets and the ASAS-IC discriminated well between active and placebo treatment (χ² = 36-45; p < 0.001).
Compared with the end of trial efficacy assessment, the provisional criteria sets showed an agreement of 71-82%, sensitivity of 67-83%, and specificity of 81-88%. The ASAS-IC showed an agreement of 70%, sensitivity of 62%, and specificity of 89%. Conclusion: The ASAS-IC are strict in defining response, are highly specific, and consequently show lower sensitivity than the clinically derived criteria sets. However, those patients who are considered as responders by applying the ASAS-IC are acknowledged as such by the expert panel as well as by patients' and doctors' judgments, and are therefore likely to be true responders.
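The agreement, sensitivity, and specificity figures quoted in this abstract are standard quantities derived from a 2x2 table that crosses a criteria set against a reference judgement. The sketch below shows one way to compute them. The function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return agreement, sensitivity and specificity from a 2x2 table.

    tp: responders by both the criteria set and the reference judgement
    fp: responders by the criteria set only
    fn: responders by the reference judgement only
    tn: non-responders by both
    """
    total = tp + fp + fn + tn
    return {
        "agreement": (tp + tn) / total,      # share of patients classified alike
        "sensitivity": tp / (tp + fn),       # reference responders that were detected
        "specificity": tn / (tn + fp),       # reference non-responders that were detected
    }


# Hypothetical counts for illustration only (not data from the abstract).
metrics = diagnostic_metrics(tp=30, fp=5, fn=10, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

A strict criteria set moves patients from the tp and fp cells into fn and tn, which raises specificity and lowers sensitivity. That is the pattern the conclusion describes for the ASAS-IC.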
Abstract:
Constructing quality assessment rubrics can be challenging, especially when they are used for integrated, group-centered, applied learning. We describe a collaborative assessment task in which groups of second-year dentistry students developed a complex concept map. In groups of four, the students were given a written, simulated, medical history of a patient and required to construct a concept map illustrating relevant pathophysiological concepts and pharmacological interventions. This report describes a research project aimed at making educational goals of the task more explicit through investigating student and faculty member understandings of the criteria that might be used to assess the concept map. Information was gathered about the perceptions of students in relation to the learning goals associated with the task. These were compared with faculty member perceptions. The findings were used to develop an assessment rubric intended to be more accessible to learners. The new rubric used the language of both faculty members and students to more clearly represent expectations of each criterion and standard. This assessment rubric will be used in 2005 for the next phase of the project.
Abstract:
Objective: This paper compares four techniques used to assess change in neuropsychological test scores before and after coronary artery bypass graft surgery (CABG), and includes a rationale for the classification of a patient as overall impaired. Methods: A total of 55 patients were tested before and after surgery on the MicroCog neuropsychological test battery. A matched control group underwent the same testing regime to generate test–retest reliabilities and practice effects. Two techniques designed to assess statistical change were used: the Reliable Change Index (RCI), modified for practice, and the Standardised Regression-based (SRB) technique. These were compared against two fixed cutoff techniques (standard deviation and 20% change methods). Results: The incidence of decline across test scores varied markedly depending on which technique was used to describe change. The SRB method identified more patients as declined on most measures. In comparison, the two fixed cutoff techniques displayed relatively reduced sensitivity in the detection of change. Conclusions: Overall change in an individual can be described provided the investigators choose a rational cutoff based on likely spread of scores due to chance. A cutoff value of ≥20% of test scores used provided acceptable probability based on the number of tests commonly encountered. Investigators must also choose a test battery that minimises shared variance among test scores.
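The two statistical change techniques named in this abstract can be sketched as follows. The abstract does not give formulas, so the code assumes formulations that are commonly used: an RCI in which the mean practice effect of the controls is subtracted from the observed change before dividing by the standard error of the difference, and an SRB score in which the retest score predicted from a control-group regression is compared with the observed retest score. All function names and numbers are hypothetical.

```python
import math


def rci_practice(baseline, retest, practice_effect, sd_baseline, reliability):
    """Reliable Change Index adjusted for practice (assumed formulation).

    practice_effect: mean retest gain in the control group
    sd_baseline:     control-group standard deviation at baseline
    reliability:     control-group test-retest reliability (0 to 1)
    """
    sem = sd_baseline * math.sqrt(1.0 - reliability)  # standard error of measurement
    se_diff = math.sqrt(2.0) * sem                    # standard error of the difference
    return (retest - baseline - practice_effect) / se_diff


def fit_srb(control_baseline, control_retest):
    """Least-squares regression of control retest scores on baseline scores.

    Returns (slope, intercept, standard_error_of_estimate).
    """
    n = len(control_baseline)
    mean_x = sum(control_baseline) / n
    mean_y = sum(control_retest) / n
    sxx = sum((x - mean_x) ** 2 for x in control_baseline)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(control_baseline, control_retest))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(control_baseline, control_retest))
    see = math.sqrt(residual_ss / (n - 2))
    return slope, intercept, see


def srb_z(baseline, retest, slope, intercept, see):
    """Standardised difference between observed and predicted retest score."""
    return (retest - (intercept + slope * baseline)) / see


# Hypothetical values for illustration only.
rci = rci_practice(baseline=100, retest=92, practice_effect=2,
                   sd_baseline=10, reliability=0.8)
slope, intercept, see = fit_srb([90, 100, 110, 120], [93, 101, 113, 121])
z = srb_z(baseline=100, retest=92, slope=slope, intercept=intercept, see=see)
print(f"RCI = {rci:.2f}, SRB z = {z:.2f}")
```

Under either method a patient would be flagged as declined when the score falls below a chosen negative threshold. The threshold is a design choice that this sketch leaves open.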
Abstract:
Background: The OARSI Standing Committee for Clinical Trials Response Criteria Initiative had developed two sets of responder criteria to present the results of changes after treatment in three symptomatic domains (pain, function, and patient's global assessment) as a single variable for clinical trials (1). For each domain, a response was defined by both a relative and an absolute change, with different cut-offs with regard to the drug, the route of administration and the OA localization. Objective: To propose a simplified set of responder criteria with a similar cut-off, whatever the drug, the route or the OA localization. Methods: Data driven approach: (1) Two databases were considered: the 'elaboration' database, with which the formal OARSI sets of responder criteria were elaborated, and the 'revisit' database. (2) Six different scenarios were evaluated: the two formal OARSI sets of criteria and four proposed scenarios of simplified sets of criteria. Data from clinical randomized blinded placebo controlled trials were used to evaluate the performances of the two formal scenarios with two different databases ('elaboration' versus 'revisit') and those of the four proposed simplified scenarios within the 'revisit' database. The placebo effect, active effect, treatment effect, and the required sample arm size to obtain the placebo effect and the active treatment effect observed were the performances evaluated for each of the six scenarios. Experts' opinion approach: Results were discussed among the participants of the OMERACT VI meeting, who voted to select the definite OMERACT-OARSI set of criteria (one of the six evaluated scenarios). Results: Data driven approach: Fourteen trials totaling 1886 OA patients and fifteen studies involving 8164 OA patients were evaluated in the 'elaboration' and the 'revisit' databases respectively.
The variability of the performances observed in the 'revisit' database when using the different simplified scenarios was similar to that observed between the two databases ('elaboration' versus 'revisit') when using the formal scenarios. The treatment effect and the required sample arm size were similar for each set of criteria. Experts' opinion approach: According to the experts, these two previous performances were the most important of an optimal set of responder criteria. They chose the set of criteria considering both pain and function as evaluation domains and requiring an absolute change and a relative change from baseline to define a response, with similar cut-offs whatever the drug, the route of administration or the OA localization. Conclusion: This data driven and experts' opinion approach is the basis for proposing an optimal simplified set of responder criteria for OA clinical trials. Other studies, using other sets of OA patients, are required in order to further validate this proposed OMERACT-OARSI set of criteria. (C) 2004 OsteoArthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
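A response defined by both a relative and an absolute change from baseline, as described in this abstract, can be sketched for a single domain as below. The abstract does not report the cut-off values or how the pain and function domains are combined, so the thresholds here are hypothetical placeholders and the sketch covers only the per-domain check. Scores are assumed to lie on a scale where lower is better.

```python
# Hypothetical cut-offs for illustration; the abstract does not state the real ones.
HYPOTHETICAL_MIN_RELATIVE = 0.5   # at least 50% improvement from baseline
HYPOTHETICAL_MIN_ABSOLUTE = 20.0  # at least 20 points on an assumed 0-100 scale


def domain_response(baseline, follow_up,
                    min_relative=HYPOTHETICAL_MIN_RELATIVE,
                    min_absolute=HYPOTHETICAL_MIN_ABSOLUTE):
    """True when improvement meets BOTH the relative and the absolute cut-off."""
    if baseline <= 0:
        return False  # relative change is undefined without a positive baseline
    improvement = baseline - follow_up  # positive when the score got better
    relative = improvement / baseline
    return improvement >= min_absolute and relative >= min_relative


# Three hypothetical patients showing why both conditions matter.
meets_both = domain_response(baseline=60, follow_up=25)      # 35 points, 58%
fails_absolute = domain_response(baseline=30, follow_up=12)  # 18 points, 60%
fails_relative = domain_response(baseline=90, follow_up=65)  # 25 points, 28%
print(meets_both, fails_absolute, fails_relative)
```

Requiring both conditions prevents a patient with a low baseline from counting as a responder on a small absolute change, and a patient with a high baseline from counting on a small proportional change.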
Abstract:
Assessments for assigning the conservation status of threatened species that are based purely on subjective judgements become problematic because assessments can be influenced by hidden assumptions, personal biases and perceptions of risks, making the assessment process difficult to repeat. This can result in inconsistent assessments and misclassifications, which can lead to a lack of confidence in species assessments. It is almost impossible to understand an expert's logic or visualise the underlying reasoning behind the many hidden assumptions used throughout the assessment process. In this paper, we formalise the decision making process of experts, by capturing their logical ordering of information, their assumptions and reasoning, and transferring them into a set of decision rules. We illustrate this through the process used to evaluate the conservation status of species under the NatureServe system (Master, 1991). NatureServe status assessments have been used for over two decades to set conservation priorities for threatened species throughout North America. We develop a conditional point-scoring method, to reflect the current subjective process. In two test comparisons, 77% of species' assessments using the explicit NatureServe method matched the qualitative assessments done subjectively by NatureServe staff. Of those that differed, no rank varied by more than one rank level under the two methods. In general, the explicit NatureServe method tended to be more precautionary than the subjective assessments. The rank differences that emerged from the comparisons may be due, at least in part, to the flexibility of the qualitative system, which allows different factors to be weighted on a species-by-species basis according to expert judgement. The method outlined in this study is the first documented attempt to explicitly define a transparent process for weighting and combining factors under the NatureServe system.
The process of eliciting expert knowledge identifies how information is combined and highlights any inconsistent logic that may not be obvious in subjective decisions. The method provides a repeatable, transparent, and explicit benchmark for feedback, further development, and improvement. (C) 2004 Elsevier SAS. All rights reserved.
Abstract:
Risk-ranking protocols are used widely to classify the conservation status of the world's species. Here we report on the first empirical assessment of their reliability by using a retrospective study of 18 pairs of bird and mammal species (one species extinct and the other extant) with eight different assessors. The performance of individual assessors varied substantially, but performance was improved by incorporating uncertainty in parameter estimates and consensus among the assessors. When this was done, the ranks from the protocols were consistent with the extinction outcome in 70-80% of pairs and there were mismatches in only 10-20% of cases. This performance was similar to the subjective judgements of the assessors after they had estimated the range and population parameters required by the protocols, and better than any single parameter. When used to inform subjective judgement, the protocols therefore offer a means of reducing unpredictable biases that may be associated with expert input and have the advantage of making the logic behind assessments explicit. We conclude that the protocols are useful for forecasting extinctions, although they are prone to some errors that have implications for conservation. Some level of error is to be expected, however, given the influence of chance on extinction. The performance of risk assessment protocols may be improved by providing training in the application of the protocols, incorporating uncertainty in parameter estimates and using consensus among multiple assessors, including some who are experts in the application of the protocols. Continued testing and refinement of the protocols may help to provide better absolute estimates of risk, particularly by re-evaluating how the protocols accommodate missing data.
Abstract:
A rapid increase in the number and size of protected areas has prompted interest in their effectiveness and calls for guarantees that they are providing a good return on investment by maintaining their values. Research reviewed here suggests that many remain under threat and a significant number are already suffering deterioration. One suggestion for encouraging good management is to develop a protected-area certification system; however, this idea remains controversial and has created intense debate. We list a typology of options for guaranteeing good protected-area management, and give examples, including: danger lists; self-reporting systems against individual or standardised criteria; and independent assessment including standardised third-party reporting, use of existing certification systems such as those for forestry and farming, and certification tailored specifically to protected areas. We review the arguments for and against certification and identify some options, such as: development of an accreditation scheme to ensure that assessment systems meet minimum standards; building up experience from projects that are experimenting with certification in protected areas; and initiating certification schemes for specific users such as private protected areas or institutions like the World Heritage Convention.
Abstract:
The design of liquid-retaining structures involves many decisions to be made by the designer based on rules of thumb, heuristics, judgement, codes of practice and previous experience. Structural design problems are often ill structured and there is a need to develop programming environments that can incorporate engineering judgement along with algorithmic tools. Recent developments in artificial intelligence have made it possible to develop an expert system that can provide expert advice to the user in the selection of design criteria and design parameters. This paper introduces the development of an expert system in the design of liquid-retaining structures using blackboard architecture. An expert system shell, Visual Rule Studio, is employed to facilitate the development of this prototype system. It is a coupled system combining symbolic processing with traditional numerical processing. The expert system developed is based on British Standards Code of Practice BS8007. Explanations are made to assist inexperienced designers or civil engineering students to learn how to design liquid-retaining structures effectively and sustainably in their design practices. The use of this expert system in disseminating heuristic knowledge and experience to practitioners and engineering students is discussed.
Abstract:
Objective: Secondary analyses of a previously conducted 1-year randomized controlled trial were performed to assess the application of responder criteria in patients with knee osteoarthritis (OA) using different sets of responder criteria developed by the Osteoarthritis Research Society International (OARSI) (Propositions A and B) for intra-articular drugs and Outcome Measures in Arthritis Clinical Trials (OMERACT)-OARSI (Proposition D). Methods: Two hundred fifty-five patients with knee OA were randomized to appropriate care with hylan G-F 20 (AC + H) or appropriate care without hylan G-F 20 (AC). A patient was defined as a responder at month 12 based on change in Western Ontario and McMaster Universities Osteoarthritis Index pain and function (0-100 normalized scale) and patient global assessment of OA in the study knee (at least one-category improvement in very poor, poor, fair, good and very good). All propositions incorporate both minimum relative and absolute changes. Results: Results demonstrated that statistically significant differences in responders between treatment groups, in favor of hylan G-F 20, were detected for Proposition A (AC + H = 53.5%, AC = 25.2%), Proposition B (AC + H = 56.7%, AC = 32.3%) and Proposition D (AC + H = 66.9%, AC = 42.5%). The highest effectiveness in both treatment groups was observed with Proposition D, whereas Proposition A resulted in the lowest effectiveness in both treatment groups. The treatment group differences always exceeded the required 20% minimum clinically important difference between groups established a priori, and were 28.3%, 24.4% and 24.4% for Propositions A, B and D, respectively. Conclusion: This analysis provides evidence for the capacity of OARSI and OMERACT-OARSI responder criteria to detect clinically important statistically detectable differences between treatment groups. (C) 2004 OsteoArthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
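The between-group differences reported in this abstract follow directly from the responder rates it quotes. The short check below recomputes them and compares each with the 20% minimum clinically important difference stated in the abstract. The variable names are illustrative.

```python
# Responder rates at month 12 as quoted in the abstract: (AC + H, AC), in percent.
responder_rates = {
    "A": (53.5, 25.2),
    "B": (56.7, 32.3),
    "D": (66.9, 42.5),
}

MIN_IMPORTANT_DIFFERENCE = 20.0  # percentage points, established a priori per the abstract

# Difference in responder rate between treatment groups, rounded to one decimal.
differences = {
    proposition: round(with_hylan - without_hylan, 1)
    for proposition, (with_hylan, without_hylan) in responder_rates.items()
}

for proposition, diff in differences.items():
    exceeds = diff > MIN_IMPORTANT_DIFFERENCE
    print(f"Proposition {proposition}: {diff} points, exceeds threshold: {exceeds}")
```

The recomputed values match the 28.3%, 24.4% and 24.4% given in the text for Propositions A, B and D.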
Abstract:
Psoriatic arthritis is a multisystem disorder which, from a measurement standpoint, demands consideration of its cutaneous manifestations and both axial and peripheral musculoskeletal involvement. Measurements of various aspects of impairment, ability/disability, and participation/handicap are feasible using existing measurement techniques, which are for the most part valid, reliable, and responsive. Nevertheless, there remain opportunities for the further development of consensus around core set measures and responder criteria, as well as for instrument development and refinement, standardised assessor training, cross-cultural adaptation of health status questionnaires, electronic data capture, and the introduction of standardised quantitative measurement into routine clinical care.