Digital Library

15 results for standardized test scores

in CentAUR: Central Archive University of Reading - UK

Long-term positive effects of repeating a year in school: Six-year longitudinal study of self-beliefs, anxiety, social relations, school grades, and test scores

Relevance:

100.00%

Abstract:

Consistent with a priori predictions, school retention (repeating a year in school) had largely positive effects for a diverse range of 10 outcomes (e.g., math self-concept, self-efficacy, anxiety, relations with teachers, parents and peers, school grades, and standardized achievement test scores). The design, based on a large, representative sample of German students (N = 1,325, M age = 11.75 years) measured each year during the first five years of secondary school, was particularly strong. It featured four independent retention groups (different groups of students, each repeating one of the four first years of secondary school, total N = 103), with multiple post-test waves to evaluate short- and long-term effects, controlling for covariates (gender, age, SES, primary school grades, IQ) and one or more sets of 10 outcomes realised prior to retention. Tests of developmental invariance demonstrated that the effects of retention (controlling for covariates and pre-retention outcomes) were highly consistent across this potentially volatile early-to-middle adolescent period; largely positive effects in the first year following retention were maintained in subsequent school years following retention. Particularly considering that these results are contrary to at least some of the accepted wisdom about school retention, the findings have important implications for educational researchers, policymakers and parents.

Standardized test to evaluate numerical weather prediction algorithms

Relevance:

100.00%

Abstract:

In order to assist in comparing the computational techniques used in different models, the authors propose a standardized set of one-dimensional numerical experiments that could be completed for each model. The results of these experiments, with a simplified form of the computational representation for advection, diffusion, pressure gradient term, Coriolis term, and filter used in the models, should be reported in the peer-reviewed literature. Specific recommendations are described in this paper.

The assessment of GCSE art: Criterion-referencing and cognitive abilities

Relevance:

80.00%

Abstract:

In this paper Cognitive Abilities Test scores are compared directly with moderated GCSE scores awarded to the same group of pupils. For ease of interpretation the comparisons are presented in a graphical form. Whilst some provisional and tentative conclusions are drawn about the reliability of GCSE art, questions are raised about the general validity of criterion-referenced assessment in this area.

Conversational success in Williams syndrome: communication in the face of cognitive and linguistic limitations

Relevance:

80.00%

Abstract:

Williams syndrome (WS) is characterized by apparent relative strengths in language, facial processing and social cognition but by profound impairment in spatial cognition, planning and problem solving. Following recent research which suggests that individuals with WS may be less linguistically able than was once thought, in this paper we begin to investigate why and how they may give the impression of linguistic proficiency despite poor standardized test results. This case study of Brendan, a 12-year-old boy with WS, who presents with a considerable lack of linguistic ability, suggests that impressions of linguistic competence may to some extent be the result of conversational strategies which enable him to compensate for various cognitive and linguistic deficits with a considerable degree of success. These conversational strengths are not predicted by his standardized language test results, and provide compelling support for the use of approaches such as Conversation Analysis in the assessment of individuals with communication impairments.

Measurement of apolipoprotein B-48 in the Svedberg flotation rate (Sf) > 400, Sf 60 – 400 and Sf 20 – 60 lipoprotein fractions reveals novel findings with respect to the effects of dietary fatty acids on triacylglycerol-rich lipoproteins in postmenopausal women

Relevance:

80.00%

Abstract:

The present study was designed to examine whether the type of fat ingested in an initial test meal influences the response and density distribution of dietary-derived lipoproteins in the Svedberg flotation rate (Sf)>400, Sf 60 - 400 and Sf 20 - 60 lipoprotein fractions. A single-blind randomized within-subject crossover design was used to study the effects of palm oil, safflower oil, a mixture of fish and safflower oil, and olive oil on postprandial apolipoprotein (apo) B-48, retinyl ester and triacylglycerol responses in each lipoprotein fraction following an initial test meal containing one of the oils and a second standardized test meal. For all dietary oils, late postprandial (300 min) concentrations of triacylglycerol and apo B-48 were significantly higher in the Sf 60 - 400 fraction than in the Sf>400 fraction (P<0.02). Significantly greater apo B-48 incremental areas under the curve (IAUCs) were also observed in the Sf 60 - 400 fraction than in the Sf>400 fraction following palm oil, safflower oil and olive oil (P<0.04), with a similar non-significant trend for fish/safflower oil. Olive oil resulted in a significantly greater apo B-48 IAUC in the Sf>400 fraction (P<0.02) than did any of the other dietary oils, as well as a tendency for a higher IAUC in the Sf 60 - 400 fraction compared with the palm, safflower and fish/safflower oils. In conclusion, we have found that the majority of intestinally derived lipoproteins present in the circulation following meals enriched with saturated, polyunsaturated or monounsaturated fatty acids are of the density and size of small chylomicrons and chylomicron remnants. Olive oil resulted in a greater apo B-48 response compared with the other dietary oils following sequential test meals, suggesting the formation of a greater number of small (Sf 60 - 400) and large (Sf>400) apo B-48-containing lipoproteins in response to this dietary oil.

Measuring lexical diversity among L2 learners of French: an exploration of the validity of D, MTLD and HD-D as measures of language ability

Relevance:

80.00%

Abstract:

In this study two new measures of lexical diversity are tested for the first time on French. The usefulness of these measures, MTLD (McCarthy and Jarvis 2010 and this volume) and HD-D (McCarthy and Jarvis 2007), in predicting different aspects of language proficiency is assessed and compared with D (Malvern and Richards 1997; Malvern, Richards, Chipere and Durán 2004) and Maas (1972) in analyses of stories told by two groups of learners (n=41) of two different proficiency levels and one group of native speakers of French (n=23). The importance of careful lemmatization in studies of lexical diversity which involve highly inflected languages is also demonstrated. The paper shows that the measures of lexical diversity under study are valid proxies for language ability in that they explain up to 62 percent of the variance in French C-test scores, and up to 33 percent of the variance in a measure of complexity. The paper also provides evidence that dependence on segment size continues to be a problem for the measures of lexical diversity discussed in this paper. The paper concludes that limiting the range of text lengths or even keeping text length constant is the safest option in analysing lexical diversity.

Breaking the double-edged sword of effort/trying hard: developmental equilibrium and longitudinal relations among effort, achievement, and academic self-concept

Relevance:

80.00%

Abstract:

Ever since the classic research of Nicholls (1976) and others, effort has been recognized as a double-edged sword: whilst it might enhance achievement, it undermines academic self-concept (ASC). However, there has not been a thorough evaluation of the longitudinal reciprocal effects of effort, ASC and achievement, in the context of modern self-concept theory and statistical methodology. Nor have there been developmental equilibrium tests of whether these effects are consistent across the potentially volatile early-to-middle adolescence. Hence, focusing on mathematics, we evaluate reciprocal effects models over the first four years of secondary school, relating effort, achievement (test scores and school grades), ASC, and ASCxEffort interactions for a representative sample of 3,421 German students (mean age = 11.75 years at Wave 1). ASC, effort and achievement were positively correlated at each wave, and there was a clear pattern of positive reciprocal effects among ASC, test scores and school grades, each contributing to the other, after controlling for the prior effects of all others. There was an asymmetrical pattern of effects for effort that is consistent with the double-edged sword premise: prior school grades had positive effects on subsequent effort, but prior effort had non-significant or negative effects on subsequent grades and ASC. However, on the basis of a synergistic application of new theory and methodology, we predicted and found a significant ASC-by-effort interaction, such that prior effort had more positive effects on subsequent ASC and school grades when prior ASC was high, thus providing a key to breaking the double-edged sword.

Turing Test: Mindless Game?

Relevance:

30.00%

Abstract:

The Turing Test, originally configured for a human to distinguish between an unseen man and an unseen woman through a text-based conversational measure of gender, is the ultimate test for thinking. So Alan Turing conceived it when he replaced the woman with a machine. His assertion was that once a machine deceived a human judge into believing that it was the human, that machine should be attributed with intelligence. But is the Turing Test nothing more than a mindless game? We present results from recent Loebner Prizes, a platform for the Turing Test, and find that machines in the contest appear conversationally worse rather than better from 2004 to 2006, showing a downward trend in the highest scores awarded to them by human judges. Thus the machines are not thinking in the same way as a human intelligent entity would.

The binding site distance test score: a robust method for the assessment of predicted protein binding sites

Relevance:

30.00%

Abstract:

We propose a novel method for scoring the accuracy of protein binding site predictions – the Binding-site Distance Test (BDT) score. Recently, the Matthews Correlation Coefficient (MCC) has been used to evaluate binding site predictions, both by developers of new methods and by the assessors for the community-wide prediction experiment – CASP8. Whilst being a rigorous scoring method, the MCC does not take into account the actual 3D location of the predicted residues from the observed binding site. Thus, an incorrectly predicted site that is nevertheless close to the observed binding site will obtain an identical score to the same number of nonbinding residues predicted at random. The MCC is somewhat affected by the subjectivity of determining observed binding residues and the ambiguity of choosing distance cutoffs. By contrast, the BDT method produces continuous scores ranging between 0 and 1, relating to the distance between the predicted and observed residues. Residues predicted close to the binding site will score higher than those more distant, providing a better reflection of the true accuracy of predictions. The CASP8 function predictions were evaluated using both the MCC and BDT methods and the scores were compared. The BDT was found to strongly correlate with the MCC scores whilst also being less susceptible to the subjectivity of defining binding residues. We therefore suggest that this new simple score is a potentially more robust method for future evaluations of protein-ligand binding site predictions.
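
The abstract's point about the MCC can be illustrated with a small sketch. It uses the standard MCC formula computed from confusion-matrix counts; the residue numbers, the 100-residue protein and the helper name `mcc_from_sets` are hypothetical and chosen only for illustration. The BDT score itself is not reproduced here, because the abstract does not give its formula.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a marginal count is zero
    return (tp * tn - fp * fn) / denom


def mcc_from_sets(predicted, observed, all_residues):
    """MCC for a predicted residue set against an observed binding-site set."""
    tp = len(predicted & observed)
    fp = len(predicted - observed)
    fn = len(observed - predicted)
    tn = len(all_residues - predicted - observed)
    return mcc(tp, fp, tn, fn)


# Hypothetical 100-residue protein with an observed site at residues 10-13.
residues = set(range(1, 101))
observed = {10, 11, 12, 13}

# Both predictions get two residues right and two wrong. In the first, the
# wrong residues are imagined to lie next to the site; in the second, far away.
near_miss = {10, 11, 14, 15}
far_miss = {10, 11, 80, 95}

score_near = mcc_from_sets(near_miss, observed, residues)
score_far = mcc_from_sets(far_miss, observed, residues)
print(score_near, score_far)
```

Because the MCC only counts correct and incorrect residues, the two predictions receive the same score, which is the limitation that a distance-aware score such as the BDT is meant to address.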

Evaluating the intonation of non-native speakers of English using a computerised test battery

Relevance:

30.00%

Abstract:

This study investigates the intonation of Chinese and Arabic learners of English using the computerized test battery Profiling Elements of Prosody for Speech and Communication (PEPS-C). The aims were to ascertain which aspects of intonation are difficult for these learners, and to determine whether PEPS-C can be used to assess the intonation of adult learners. Although some results were significantly different from native-speaker data, raw scores showed that the learner groups performed well in most tasks, which may indicate that the learners' level is too high for the PEPS-C to be useful. However, the PEPS-C did reveal that Arabic learners performed significantly worse at contrastive stress placement, and Chinese learners performed significantly worse at assessing likes and dislikes.

The challenge of validation: assessing the performance of a test of productive vocabulary

Relevance:

30.00%

Abstract:

This paper assesses the performance of a vocabulary test designed to measure second language productive vocabulary knowledge. The test, Lex30, uses a word association task to elicit vocabulary, and uses word frequency data to measure the vocabulary produced. Here we report firstly on the reliability of the test as measured by a test-retest study, a parallel test forms experiment and an internal consistency measure. We then investigate the construct validity of the test by looking at changes in test performance over time, analyses of correlations with scores on similar tests, and comparison of spoken and written test performance. Last, we examine the theoretical bases of the two main test components: eliciting vocabulary and measuring vocabulary. Interpretations of our findings are discussed in the context of test validation research literature. We conclude that the findings reported here present a robust argument for the validity of the test as a research tool, and encourage further investigation of its validity in an instructional context.

A simple syllogism-solving test: empirical findings and implications for g research

Relevance:

30.00%

Abstract:

It has been reported that the ability to solve syllogisms is highly g-loaded. In the present study, using a self-administered shortened version of a syllogism-solving test, the BAROCO Short, we examined whether robust findings generated by previous research regarding IQ scores were also applicable to BAROCO Short scores. Five syllogism-solving problems were included in a questionnaire as part of a postal survey conducted by the Keio Twin Research Center. Data were collected from 487 pairs of twins (1021 individuals) who were Japanese junior high or high school students (ages 13–18) and from 536 mothers and 431 fathers. Four findings related to IQ were replicated: 1) The mean level increased gradually during adolescence, stayed unchanged from the 30s to the early 50s, and subsequently declined after the late 50s. 2) The scores for both children and parents were predicted by the socioeconomic status of the family. 3) The genetic effect increased, although the shared environmental effect decreased during progression from adolescence to adulthood. 4) Children's scores were genetically correlated with school achievement. These findings further substantiate the close association between syllogistic reasoning ability and g.

Standardized evaluation of algorithms for computer-aided diagnosis of dementia based on structural MRI: The CADDementia challenge

Relevance:

30.00%

Abstract:

Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n = 30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org.

Overgeneral autobiographical memory in women: association with childhood abuse and history of depression in a community sample

Relevance:

30.00%

Abstract:

Objective. Numerous studies have reported elevated levels of overgeneral autobiographical memory among depressed patients and also among those previously exposed to a traumatic event. No previous study has examined their joint association with overgeneral memory in a community sample, nor examined whether the associations are with both juvenile- and adult-onset depression. Methods. The current study examined the relative importance of exposure to childhood abuse and neglect in overgeneral memory of women with and without a history of major depressive disorder (MDD). Autobiographical memory test together with standardized interviews of childhood experiences and MDD were assessed in a risk-stratified community sample of 103 women aged 25–37. Results. Overgenerality in memory was associated with recalled childhood sexual abuse (CSA) but not other adversities. A history of CSA was predictive of overgeneral memory bias even in the absence of MDD. Our analyses indicated no significant association between a history of MDD and overgeneral memory in women who reported no CSA. However, overgeneral memory was increased in women who reported CSA and MDD with a significant difference found in relation to positive cues, the highest scores being seen among those with adult rather than juvenile-onset depression. Conclusions. The findings highlight the significance of CSA in predicting overgeneral memory, differential response in relation to positive and negative cue memories, and point to a specific role in the development of depression for overgeneral memory following CSA.

Assessing species’ habitat associations from occurrence records, standardised monitoring data and expert opinion: a test with British butterflies

Relevance:

30.00%

Abstract:

Accurate knowledge of species’ habitat associations is important for conservation planning and policy. Assessing habitat associations is a vital precursor to selecting appropriate indicator species for prioritising sites for conservation or assessing trends in habitat quality. However, much existing knowledge is based on qualitative expert opinion or local scale studies, and may not remain accurate across different spatial scales or geographic locations. Data from biological recording schemes have the potential to provide objective measures of habitat association, with the ability to account for spatial variation. We used data on 50 British butterfly species as a test case to investigate the correspondence of data-derived measures of habitat association with expert opinion, from two different butterfly recording schemes. One scheme collected large quantities of occurrence data (c. 3 million records) and the other, lower quantities of standardised monitoring data (c. 1400 sites). We used general linear mixed effects models to derive scores of association with broad-leaf woodland for both datasets and compared them with scores canvassed from experts. Scores derived from occurrence and abundance data both showed strongly positive correlations with expert opinion. However, only for occurrence data did these fall within the range of correlations between experts. Data-derived scores showed regional spatial variation in the strength of butterfly associations with broad-leaf woodland, with a significant latitudinal trend in 26% of species. Sub-sampling of the data suggested a mean sample size of 5000 occurrence records per species to gain an accurate estimation of habitat association, although habitat specialists are likely to be readily detected using several hundred records. Occurrence data from recording schemes can thus provide easily obtained, objective, quantitative measures of habitat association.