703 results for Formative Construct Validity
Abstract:
The outcomes of educational assessments undoubtedly have real implications for students, teachers, schools and education in the widest sense. Assessment results are, for example, used to award qualifications that determine future educational or vocational pathways of students. The results obtained by students in assessments are also used to gauge individual teacher quality, to hold schools to account for the standards achieved by their students, and to compare international education systems. Given the current high-stakes nature of educational assessment, it is imperative that the measurement practices involved have stable philosophical foundations. However, this paper casts doubt on the theoretical underpinnings of contemporary educational measurement models. Aspects of Wittgenstein’s later philosophy and Bohr’s philosophy of quantum theory are used to argue that a quantum theoretical rather than a Newtonian model is appropriate for educational measurement, and the associated implications for the concept of validity are elucidated. Whilst it is acknowledged that the transition to a quantum theoretical framework would not lead to the demise of educational assessment, it is argued that, where practical, current high-stakes assessments should be reformed to become as ‘low-stakes’ as possible. The paper also undermines some of the pro high-stakes testing rhetoric that has a tendency to afflict education.
Formative Evaluation of PlayBoard’s Play Advocacy Programme (January 2010 – June 2012). Final Report
Abstract:
This study explored the validity of using critical thinking tests to predict final psychology degree marks over and above that already predicted by traditional admission exams (A-levels). Participants were a longitudinal sample of 109 psychology students from a university in the United Kingdom. The outcome measures were total degree marks and end-of-year marks. The predictor measures were university admission exam results (A-levels), critical thinking test scores (skills and dispositions), and non-verbal intelligence scores. Hierarchical regressions showed that A-levels significantly predicted 10% of the final degree score and that the 11-item measure of ‘Inference skills’ from the California Critical Thinking Skills Test significantly predicted an additional 6% of degree outcome variance. The findings from this study should inform decisions about the precise measurement constructs included in aptitude tests used in the higher education admission process.
Abstract:
This paper addresses the representation of landscape complexity in stated preferences research. It integrates landscape ecology and landscape economics and conducts the landscape analysis in a three-dimensional space to provide ecologically meaningful quantitative landscape indicators that are used as variables for the monetary valuation of landscape in a stated preferences study. Expected heterogeneity in taste intensity across respondents is addressed with a mixed logit model in Willingness to Pay space. The results suggest that the integration of landscape ecology metrics in a stated preferences model provides useful insights for valuing landscape and landscape changes.
Abstract:
The purpose of this paper is to conceptualise and operationalise the concept of supply chain management sustainability practices. Based on a multi-stage procedure involving a literature review, expert Q-sort and pre-test process, pilot test and survey of 156 supply chain directors and managers in Ireland, we develop a multidimensional conceptualisation and measure of social and environmental supply chain management sustainability practices. The research findings show theoretically sound constructs based on four underlying sustainable supply chain management practices: monitoring, implementing systems, new product and process development, and strategy redefinition. A two-factor model, comprising process-based and market-based practices, is then identified as the most reliable.
Abstract:
Clinical clerks learn more than they are taught and not all they learn can be measured. As a result, curriculum leaders evaluate clinical educational environments. The quantitative Dundee Ready Education Environment Measure (DREEM) is a de facto standard for that purpose. Its 50 items and 5 subscales were developed by consensus. Reasoning that an instrument would perform best if it were underpinned by a clearly conceptualized link between environment and learning as well as psychometric evidence, we developed the mixed methods Manchester Clinical Placement Index (MCPI), eliminated redundant items, and published validity evidence for its 8-item, 2-subscale structure. Here, we set out to compare MCPI with DREEM. 104 students on full-time clinical placements completed both measures three times during a single academic year. There was good agreement and at least as good discrimination between placements with the smaller MCPI. Total MCPI scores and the mean score of its 5-item learning environment subscale allowed ten raters to distinguish between the quality of educational environments. Twenty raters were needed for the 3-item MCPI training subscale and the DREEM scale and its subscales. MCPI compares favourably with DREEM in that one-sixth the number of items perform at least as well psychometrically, it provides formative free text data, and it is founded on the widely shared assumption that communities of practice make good learning environments.
Abstract:
Purpose
The Strengths and Difficulties Questionnaire (SDQ) is a behavioural screening tool for children. The SDQ is increasingly used as the primary outcome measure in population health interventions involving children, but it is not preference based; therefore, its role in allocative economic evaluation is limited. The Child Health Utility 9D (CHU9D) is a generic preference-based health-related quality-of-life measure. This study investigates the applicability of the SDQ outcome measure for use in economic evaluations and examines its relationship with the CHU9D by testing previously published mapping algorithms. The aim of the paper is to explore the feasibility of using the SDQ within economic evaluations of school-based population health interventions.
Methods
Data were available from children participating in a cluster randomised controlled trial of the school-based Roots of Empathy programme in Northern Ireland. Utility was calculated using the original and alternative CHU9D tariffs along with two SDQ mapping algorithms. t tests were performed for pairwise differences in utility values from the preference-based tariffs and mapping algorithms.
Results
Mean (standard deviation) SDQ total difficulties and prosocial scores were 12 (3.2) and 8.3 (2.1), respectively. Utility values obtained from the original tariff, alternative tariff, and mapping algorithms using five and three SDQ subscales were 0.84 (0.11), 0.80 (0.13), 0.84 (0.05), and 0.83 (0.04), respectively. Each method for calculating utility produced statistically significantly different values except the original tariff and five SDQ subscale algorithm.
Conclusion
Initial evidence suggests the SDQ and CHU9D are related in some of their measurement properties. The mapping algorithm using five SDQ subscales was found to be optimal in predicting mean child health utility. Future research valuing changes in the SDQ scores would contribute to this research.
Abstract:
Temperament tests are widely accepted as instruments for profiling behavioral variability in dogs, and they are applied in numerous areas of investigation (e.g. suitability for adoption or for breeding). During testing, to elicit a dog's reaction toward novel stimuli and predict its behavior in everyday life, model devices such as a child-like doll, or a fake dog, are often employed. However, the reliability of these devices to accurately stimulate dogs' reactions to children or dogs, is unknown and perhaps overestimated. This may be a particular concern in the case of aggressive behavior toward humans, a significant public health issue. The aim of this study was to: (1) evaluate the correlation between dogs' reactions to these devices, and owners' reports of their dog's aggression history (using the C-BARQ); (2) compare reactions toward the devices of dogs with and without histories of aggression. Subjects were selected among those visiting for behavioral consultation at the Veterinary Hospital of the University of Pennsylvania, and previously categorized as aggressive toward unfamiliar children, conspecifics, or as non-aggressive dogs (control). The test consisted of different components: an unfamiliar female tester approaching the dog; the presentation of a child-like doll, an ambiguous object, and a fake plastic dog. All tests were videotaped and durations of behaviors were later analyzed on the basis of a specified ethogram. Dogs' reactions were compared to C-BARQ scores, and interesting correlations emerged for 'dog-directed aggression/fear' (R = 0.48, P = 0.004), and 'stranger-directed aggression' (R = 0.58, P < 0.001) factors. Dogs differed in their reactions toward the devices: the child-like doll and the fake dog elicited more social behaviors than the ambiguous object used as a control stimulus. Issues concerning the reliability of these tools to assess canine temperament are discussed. © 2012 Elsevier B.V. All rights reserved.