828 results for VALIDITY INDICES
Abstract:
The outcomes of educational assessments undoubtedly have real implications for students, teachers, schools and education in the widest sense. Assessment results are, for example, used to award qualifications that determine future educational or vocational pathways of students. The results obtained by students in assessments are also used to gauge individual teacher quality, to hold schools to account for the standards achieved by their students, and to compare international education systems. Given the current high-stakes nature of educational assessment, it is imperative that the measurement practices involved have stable philosophical foundations. However, this paper casts doubt on the theoretical underpinnings of contemporary educational measurement models. Aspects of Wittgenstein’s later philosophy and Bohr’s philosophy of quantum theory are used to argue that a quantum theoretical rather than a Newtonian model is appropriate for educational measurement, and the associated implications for the concept of validity are elucidated. Whilst it is acknowledged that the transition to a quantum theoretical framework would not lead to the demise of educational assessment, it is argued that, where practical, current high-stakes assessments should be reformed to become as ‘low-stakes’ as possible. The paper also undermines some of the pro high-stakes testing rhetoric that has a tendency to afflict education.
Abstract:
Cold-pressed rapeseed oil (CPRSO) is produced when seeds from an oilseed rape crop are mechanically crushed at a low temperature. CPRSO is rapidly growing in popularity and is now produced in most Northern European countries, including both Northern Ireland and the Republic of Ireland. The CPRSO industry is still relatively new and is therefore not as widely researched as other high-quality oils. Fifteen CPRSOs from the UK, Ireland and France were examined to determine characteristic differences between the oils. Two samples of extra-virgin olive oil and two samples of refined rapeseed oil were also included in the investigation to assess performance against market competitors. The antioxidant potential of the oils was assessed using the ABTS and DPPH radical scavenging assays. Both assays unexpectedly showed that refined rapeseed oil had the highest potential, whilst there were significant differences between many of the CPRSOs. The acid value (AOCS method Cd 3d-63) ranged widely from 0.47 to 3.41. To predict stability during storage, an accelerated oxidation test was carried out in which the oils were placed in an oven (60°C) and the peroxide value was monitored. The results showed that extra-virgin olive oil underwent the least oxidation during the trial. The refined rapeseed oil suffered the worst levels of oxidation, whilst the CPRSOs performed similarly but with some variation. Fatty acid composition was investigated with GC-MS, and some of the major fatty acids were found to differ significantly between producers. Minor compound analysis was achieved with extraction and identification through HPLC. All results are critically discussed and compared to relevant published studies.
Abstract:
This study explored the validity of using critical thinking tests to predict final psychology degree marks over and above that already predicted by traditional admission exams (A-levels). Participants were a longitudinal sample of 109 psychology students from a university in the United Kingdom. The outcome measures were: total degree marks; and end of year marks. The predictor measures were: university admission exam results (A-levels); critical thinking test scores (skills & dispositions); and non-verbal intelligence scores. Hierarchical regressions showed A-levels significantly predicted 10% of the final degree score and the 11-item measure of ‘Inference skills’ from the California Critical Thinking Skills Test significantly predicted an additional 6% of degree outcome variance. The findings from this study should inform decisions about the precise measurement constructs included in aptitude tests used in the higher education admission process.
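The "additional 6% of degree outcome variance" reported above is an incremental R²: the gain in explained variance when a second predictor enters the regression after the first. A minimal sketch of that calculation for two predictors follows, assuming simple Pearson correlations as inputs; the function names and any data passed to them are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def incremental_r_squared(base, added, outcome):
    """Return (R2 with base only, R2 with both predictors, gain in R2).

    Uses the closed-form multiple R2 for two predictors, which requires
    that the predictors are not perfectly correlated with each other.
    """
    r_y1 = pearson(base, outcome)    # e.g. admission exam vs degree mark
    r_y2 = pearson(added, outcome)   # e.g. inference skills vs degree mark
    r_12 = pearson(base, added)      # overlap between the two predictors
    r2_base = r_y1 ** 2
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_base, r2_full, r2_full - r2_base
```

With more than two predictors the same comparison would be made by fitting the nested regression models directly rather than through this closed form.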
Abstract:
Drastic biodiversity declines have raised concerns about the deterioration of ecosystem functions and have motivated much recent research on the relationship between species diversity and ecosystem functioning. A functional trait framework has been proposed to improve the mechanistic understanding of this relationship, but this has rarely been tested for organisms other than plants. We analysed eight datasets, including five animal groups, to examine how well a trait-based approach, compared with a more traditional taxonomic approach, predicts seven ecosystem functions below- and above-ground. Trait-based indices consistently provided greater explanatory power than species richness or abundance. The frequency distributions of single or multiple traits in the community were the best predictors of ecosystem functioning. This implies that the ecosystem functions we investigated were underpinned by the combination of trait identities (i.e. single-trait indices) and trait complementarity (i.e. multi-trait indices) in the communities. Our study provides new insights into the general mechanisms that link biodiversity to ecosystem functioning in natural animal communities and suggests that the observed responses were due to the identity and dominance patterns of the trait composition rather than the number or abundance of species per se.
Abstract:
This paper addresses the representation of landscape complexity in stated preferences research. It integrates landscape ecology and landscape economics and conducts the landscape analysis in a three-dimensional space to provide ecologically meaningful quantitative landscape indicators that are used as variables for the monetary valuation of landscape in a stated preferences study. Expected heterogeneity in taste intensity across respondents is addressed with a mixed logit model in Willingness to Pay space. The results suggest that the integration of landscape ecology metrics in a stated preferences model provides useful insights for valuing landscape and landscape changes.
Abstract:
Background A 2014 national audit used the English General Practice Patient Survey (GPPS) to compare service users’ experience of out-of-hours general practitioner (GP) services, yet there is no published evidence on the validity of these GPPS items. Objectives Establish the construct and concurrent validity of GPPS items evaluating service users’ experience of GP out-of-hours care. Methods Cross-sectional postal survey of service users (n=1396) of six English out-of-hours providers. Participants reported on four GPPS items evaluating out-of-hours care (three items modified following cognitive interviews with service users), and 14 evaluative items from the Out-of-hours Patient Questionnaire (OPQ). Construct validity was assessed through correlations between any reliable (Cronbach's α>0.7) scales, as suggested by a principal component analysis of the modified GPPS items, with the ‘entry access’ (four items) and ‘consultation satisfaction’ (10 items) OPQ subscales. Concurrent validity was determined by investigating whether each modified GPPS item was associated with thematically related items from the OPQ using linear regressions. Results The modified GPPS item-set formed a single scale (α=0.77), which summarised the two-component structure of the OPQ moderately well, explaining 39.7% of variation in the ‘entry access’ scores (r=0.63) and 44.0% of variation in the ‘consultation satisfaction’ scores (r=0.66), demonstrating acceptable construct validity. Concurrent validity was verified as each modified GPPS item was highly associated with a distinct set of related items from the OPQ. Conclusions Minor modifications are required for the English GPPS items evaluating out-of-hours care to improve comprehension by service users. A modified question set was demonstrated to comprise a valid measure of service users’ overall satisfaction with out-of-hours care received. This demonstrates the potential for the use of as few as four items in benchmarking providers and assisting services in identifying, implementing and assessing quality improvement initiatives.
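The scale reliability quoted in this abstract (α=0.77, against a threshold of 0.7) is an internal-consistency coefficient. As a minimal sketch of how such a coefficient can be computed, assuming a respondents-by-items table of numeric scores and the standard Cronbach's alpha formula, the function below is illustrative only; its name and any data passed to it are hypothetical and do not come from the survey.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per respondent.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

A four-item scale, as in the modified GPPS item-set, would be passed as rows of four scores each; the result would then be compared with the chosen reliability threshold.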
Abstract:
Purpose
The Strengths and Difficulties Questionnaire (SDQ) is a behavioural screening tool for children. The SDQ is increasingly used as the primary outcome measure in population health interventions involving children, but it is not preference based; therefore, its role in allocative economic evaluation is limited. The Child Health Utility 9D (CHU9D) is a generic preference-based health-related quality-of-life measure. This study investigates the applicability of the SDQ outcome measure for use in economic evaluations and examines its relationship with the CHU9D by testing previously published mapping algorithms. The aim of the paper is to explore the feasibility of using the SDQ within economic evaluations of school-based population health interventions.
Methods
Data were available from children participating in a cluster randomised controlled trial of the school-based Roots of Empathy programme in Northern Ireland. Utility was calculated using the original and alternative CHU9D tariffs along with two SDQ mapping algorithms. t tests were performed for pairwise differences in utility values from the preference-based tariffs and mapping algorithms.
Results
Mean (standard deviation) SDQ total difficulties and prosocial scores were 12 (3.2) and 8.3 (2.1). Utility values obtained from the original tariff, alternative tariff, and mapping algorithms using five and three SDQ subscales were 0.84 (0.11), 0.80 (0.13), 0.84 (0.05), and 0.83 (0.04), respectively. Each method for calculating utility produced statistically significantly different values except the original tariff and five SDQ subscale algorithm.
Conclusion
Initial evidence suggests the SDQ and CHU9D are related in some of their measurement properties. The mapping algorithm using five SDQ subscales was found to be optimal in predicting mean child health utility. Future research valuing changes in the SDQ scores would contribute to this research.
Abstract:
Temperament tests are widely accepted as instruments for profiling behavioral variability in dogs, and they are applied in numerous areas of investigation (e.g. suitability for adoption or for breeding). During testing, to elicit a dog's reaction toward novel stimuli and predict its behavior in everyday life, model devices such as a child-like doll, or a fake dog, are often employed. However, the reliability of these devices to accurately stimulate dogs' reactions to children or dogs is unknown and perhaps overestimated. This may be a particular concern in the case of aggressive behavior toward humans, a significant public health issue. The aim of this study was to: (1) evaluate the correlation between dogs' reactions to these devices, and owners' reports of their dog's aggression history (using the C-BARQ); (2) compare reactions toward the devices of dogs with and without histories of aggression. Subjects were selected among those visiting for behavioral consultation at the Veterinary Hospital of the University of Pennsylvania, and previously categorized as aggressive toward unfamiliar children, conspecifics, or as non-aggressive dogs (control). The test consisted of different components: an unfamiliar female tester approaching the dog; the presentation of a child-like doll, an ambiguous object, and a fake plastic dog. All tests were videotaped and durations of behaviors were later analyzed on the basis of a specified ethogram. Dogs' reactions were compared to C-BARQ scores, and interesting correlations emerged for 'dog-directed aggression/fear' (R = 0.48, P = 0.004), and 'stranger-directed aggression' (R = 0.58, P < 0.001) factors. Dogs differed in their reactions toward the devices: the child-like doll and the fake dog elicited more social behaviors than the ambiguous object used as a control stimulus. Issues concerning the reliability of these tools to assess canine temperament are discussed. © 2012 Elsevier B.V. All rights reserved.
Abstract:
PURPOSE:
To determine the accuracy of a history of cataract and cataract surgery (self-report and for a sibling), and to determine which demographic, cognitive, and medical factors are predictive of an accurate history.
METHODS:
All participants in the Salisbury Eye Evaluation (SEE) project and their locally resident siblings were questioned about a personal and family history of cataract or cataract surgery. Lens grading at the slit lamp, using standardized photographs and a grading system, was performed for both SEE participants (probands) and their siblings. Cognitive testing and a history of systemic comorbidities were also obtained for all probands.
RESULTS:
Sensitivity of a history of cataract provided on behalf of a sibling was 32%, specificity 98%. The performance was better for a history of cataract surgery: sensitivity 90%, specificity 89%. For self-report of cataract, sensitivity was also low at 55%, with specificity at 77%. Self-report of cataract surgery gave a much better performance: sensitivity 94%, specificity 100%. Different cutoffs in the definition of cataract had little impact. Factors predicting a correct history of cataract included high school or greater education in the proband (odds ratio [OR] = 1.13, 95% confidence interval [CI] 1.02-1.25) and younger sibling (but not proband) age (OR = 0.94 for each year of age, 95% CI 0.90-0.99). Gender, race and Mini-Mental Status Examination (MMSE) result were not predictive.
CONCLUSIONS:
Whereas accurate self and family histories for cataract surgery may be obtainable, it is difficult to ascertain cataract status accurately from history alone.
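The sensitivity and specificity figures in this abstract compare a reported history against the slit-lamp grading used as the reference standard. A minimal sketch of the two ratios follows, assuming the usual definitions from a 2x2 table; the function names are hypothetical and the study's underlying counts are not given in the abstract, so any counts supplied are illustrative.

```python
def sensitivity(true_positives, false_negatives):
    """Share of people with the condition whose history correctly reports it."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of people without the condition whose history correctly denies it."""
    return true_negatives / (true_negatives + false_positives)
```

A history can score high on one ratio and low on the other, which is the pattern reported for sibling-provided cataract history: few false alarms, but most true cases missed.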
Abstract:
The contemporary literature investigating the construct broadly known as time perspective is replete with methodological and conceptual concerns. These concerns focus on the reliability and factorial validity of measurement tools, and the sample-specific modification of scales. These issues continue to hamper the development of this potentially useful psychological construct. An emerging body of evidence has supported the six-factor structure of scores on the Adolescent Time Inventory-Time Attitudes Scale, as well as their reliability. The present study utilized data from the first wave of a longitudinal study in the United Kingdom to examine the reliability, validity, and cross-cultural invariance of the scale. Results showed that the hypothesized six-factor model provided the best fit for the data; all alpha and omega estimates were > .70; scores on ATI-TA factors related meaningfully to self-efficacy scores; and the factor structure was invariant across both research sites. Results are discussed in the context of the extant temporal literature.
Abstract:
A key assumption of dual process theory is that reasoning is an explicit, effortful, deliberative process. The present study offers evidence for an implicit, possibly intuitive component of reasoning. Participants were shown sentences embedded in logically valid or invalid arguments. Participants were not asked to reason but instead rated the sentences for liking (Experiment 1) and physical brightness (Experiments 2-3). Sentences that followed logically from preceding sentences were judged to be more likable and brighter. Two other factors thought to be linked to implicit processing (sentence believability and facial expression) had similar effects on liking and brightness ratings. The authors conclude that sensitivity to logical structure was implicit, occurring potentially automatically and outside of awareness. They discuss the results within a fluency misattribution framework and make reference to the literature on discourse comprehension.