102 results for rating speaking tests
in CentAUR: Central Archive University of Reading - UK
Abstract:
This paper develops and tests formulas for representing playing strength at chess by the quality of moves played, rather than by the results of games. Intrinsic quality is estimated via evaluations given by computer chess programs run to high depth, ideally so that their playing strength is sufficiently far ahead of the best human players as to be a 'relatively omniscient' guide. Several formulas, each having intrinsic skill parameters s for 'sensitivity' and c for 'consistency', are argued theoretically and tested by regression on large sets of tournament games played by humans of varying strength as measured by the internationally standard Elo rating system. This establishes a correspondence between Elo rating and the parameters. A smooth correspondence is shown between statistical results and the century points on the Elo scale, and ratings are shown to have stayed quite constant over time. That is, there has been little or no 'rating inflation'. The theory and empirical results are transferable to other rational-choice settings in which the alternatives have well-defined utilities, but in which complexity and bounded information constrain the perception of the utility values.
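For readers unfamiliar with the Elo scale that the abstract uses as its yardstick, the following minimal Python sketch shows the standard Elo expected-score formula. It is not one of the paper's move-quality formulas (the abstract does not state them); the function name is hypothetical and chosen only for illustration.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the standard
    Elo logistic formula (a 400-point gap corresponds to 10:1 odds)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # One "century" (100 points) of rating difference.
    print(round(elo_expected_score(1600, 1500), 4))
```

Under this formula the expected scores of the two players always sum to 1, and equal ratings give 0.5.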
Abstract:
This study tests Slobin’s (1996) claim that L2 learners struggle with conceptual restructuring in L2 acquisition. We suggest that learners can find themselves in four different reconceptualisation scenarios: the TRANSFER, RESTRUCTURING, CREATIVE/HYBRID and CONVERGENCE SCENARIOS. To test this proposal in the field of event conceptualisation, a comprehensive analysis was made of the frequency distribution of path, manner, caused motion and deictic verbs in narratives elicited from intermediate (N=20) and advanced learners (N=21) of French, as well as native speakers of French (N=23) and English (N=30). The productions of the intermediate level learners were found to correspond to the creative/hybrid scenario because they differed significantly in their motion expressions from English as well as French native speakers, except for path, which was verbalised in target-like ways early on. Advanced learners were found to be able to reconceptualise motion in the L2, as far as manner and path are concerned, but continued to struggle with deictic verbs and caused motion. The clearest evidence for transfer from the L1 was found in verbalisations among intermediate level learners of events which involved a boundary crossing.
Abstract:
Egger (2008) constructs some idealised experiments to test the usefulness of piecewise potential vorticity inversion (PPVI) in the diagnosis of Rossby wave dynamics and baroclinic development. He concludes that "PPVI does not help us to understand the dynamics of linear Rossby waves. It provides local tendencies of the streamfunction which are unrelated to the true ones. The same way, the motion of baroclinic waves in shear flow cannot be understood by using PPVI. Moreover, the effect of boundary temperatures as determined by PPVI is unrelated to the flow evolution." He goes further in arguing that we should not consider velocities as "induced" by PV anomalies defined by carving up the global domain. However, these conclusions partly reflect the limitations of his idealised experiments and the manner in which the PV components were partitioned from one another.
Abstract:
Simulations of the global atmosphere for weather and climate forecasting require fast and accurate solutions, and so operational models use high-order finite differences on regular structured grids. This precludes the use of local refinement; techniques allowing local refinement are either expensive (e.g. high-order finite element techniques) or have reduced accuracy at changes in resolution (e.g. unstructured finite-volume with linear differencing). We present solutions of the shallow-water equations for westerly flow over a mid-latitude mountain from a finite-volume model written using OpenFOAM. A second/third-order accurate differencing scheme is applied on arbitrarily unstructured meshes made up of various shapes and refinement patterns. The results are as accurate as equivalent-resolution spectral methods. Using lower-order differencing reduces accuracy at a refinement pattern, which allows errors from refinement of the mountain to accumulate and reduces the global accuracy over a 15-day simulation. We have therefore introduced a scheme which fits a 2D cubic polynomial approximately on a stencil around each cell. Using this scheme means that refinement of the mountain improves the accuracy after a 15-day simulation. This is a more severe test of local mesh refinement for global simulations than has been presented previously, but a realistic test if these techniques are to be used operationally. These efficient, high-order schemes may make it possible for local mesh refinement to be used by weather and climate forecast models.
Abstract:
Field populations of earthworms have shown a varied response in mortality to the fungicide carbendazim, the toxic reference substance used in agrochemical field trials. The aim of this study was to determine the influence of soil conditions as a potential cause of this variation. Laboratory acute toxicity tests were conducted using a range of artificial soils with varying soil components (organic matter, clay, pH and moisture). Batch adsorption/desorption studies were run to determine the influence of the soil properties on carbendazim behaviour. Adsorption was shown to be correlated with organic matter content and pH, and this in turn could be linked to Eisenia fetida mortality, with lower mortality occurring with increased adsorption. Overall, while E. fetida mortality did vary significantly between several of the soils, the calculated LC50 values in the different soils did not cover a wide range (6.04-16.00 mg kg⁻¹), showing that under these laboratory conditions soil components did not greatly influence carbendazim toxicity to E. fetida. (c) 2007 Elsevier Masson SAS. All rights reserved.
Abstract:
Systems Engineering often involves computer modelling the behaviour of proposed systems and their components. Where a component is human, fallibility must be modelled by a stochastic agent. The identification of a model of decision-making over quantifiable options is investigated using the game-domain of Chess. Bayesian methods are used to infer the distribution of players’ skill levels from the moves they play rather than from their competitive results. The approach is used on large sets of games by players across a broad FIDE Elo range, and is in principle applicable to any scenario where high-value decisions are being made under pressure.
Abstract:
This study examines idiom comprehension and metapragmatic knowledge in French-speaking people with Williams’ syndrome (WS). Idiomatic expressions are a nonliteral form of language where there is a considerable difference between what is said (literal interpretation) and what is meant (idiomatic interpretation). WS is characterized by a relatively preserved formal language, social interest and poor conversational skills. Using this framework, the present study aims to explore the comprehension of idiomatic expressions by 20 participants with WS. Participants performed a story completion task (comprehension task) and a metapragmatic knowledge task in which they justified their chosen answers. The performance of the WS group was compared with that of typically developing children of the same verbal mental age. The main results can be summarized as follows: (1) People with WS have difficulty understanding idioms; (3) the WS group seems to perform partly like typically developing children in the acquisition of metapragmatic knowledge of linguistic convention: this knowledge increases progressively with age. Our results indicate a delay in the acquisition of idiom comprehension in Williams’ syndrome.
Abstract:
The article considers screening human populations with two screening tests. If any of the two tests is positive, then full evaluation of the disease status is undertaken; however, if both diagnostic tests are negative, then disease status remains unknown. This procedure leads to a data constellation in which, for each disease status, the 2 × 2 table associated with the two diagnostic tests used in screening has exactly one empty, unknown cell. To estimate the unobserved cell counts, previous approaches assume independence of the two diagnostic tests and use specific models, including the special mixture model of Walter or unconstrained capture–recapture estimates. Often, as is also demonstrated in this article by means of a simple test, the independence of the two screening tests is not supported by the data. Two new estimators are suggested that allow associations of the screening test, although the form of association must be assumed to be homogeneous over disease status. These estimators are modifications of the simple capture–recapture estimator and easy to construct. The estimators are investigated for several screening studies with fully evaluated disease status in which the superior behavior of the new estimators compared to the previous conventional ones can be shown. Finally, the performance of the new estimators is compared with maximum likelihood estimators, which are more difficult to obtain in these models. The results indicate the loss of efficiency as minor.
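As a rough illustration of the conventional approach the abstract contrasts itself with, the Python sketch below applies the simple capture-recapture (Lincoln-Petersen) estimate to a 2 × 2 table whose both-negative cell is unobserved. It assumes the two tests are independent, which is exactly the assumption the article questions, and it does not reproduce the article's two new estimators, whose form is not given here. The function and argument names are hypothetical.

```python
def estimate_missing_cell(n11: int, n10: int, n01: int) -> float:
    """Estimate the unobserved both-negative cell count under independence.

    n11: positive on both tests
    n10: positive on test 1 only
    n01: positive on test 2 only
    """
    if n11 <= 0:
        raise ValueError("n11 must be positive for the estimate to exist")
    return n10 * n01 / n11


def estimate_total(n11: int, n10: int, n01: int) -> float:
    """Estimated total count for one disease status, including the
    estimated both-negative cell."""
    return n11 + n10 + n01 + estimate_missing_cell(n11, n10, n01)


if __name__ == "__main__":
    # Illustrative counts only, not data from the article.
    print(estimate_missing_cell(40, 10, 20))
    print(estimate_total(40, 10, 20))
```

In the screening setting described above, this calculation would be carried out separately for each disease status, since each status has its own table with one empty cell.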
Abstract:
Oral nutrition supplements (ONS) are routinely prescribed to those with, or at risk of, malnutrition. Previous research identified poor compliance due to taste and sweetness. This paper investigates taste and hedonic liking of ONS, of varying sweetness and metallic levels, over consumption volume; an important consideration as patients are prescribed large volumes of ONS daily. A sequential descriptive profile was developed to determine the perception of sensory attributes over repeat consumption of ONS. Changes in liking of ONS following repeat consumption were characterised by a boredom test. Certain flavour (metallic taste, soya milk flavour) and mouthfeel (mouthdrying, mouthcoating) attributes built up over increased consumption volume (p 0.002). Hedonic liking data from two cohorts, healthy older volunteers (n = 32, median age 73) and patients (n = 28, median age 85), suggested such build-up was disliked. Efforts made to improve the palatability of ONS must take account of the build-up of taste and mouthfeel characteristics over increased consumption volume.
Abstract:
1. We tested three pesticides used for field manipulations of herbivory for direct phytoactive effects on the germination and growth of 14 herbaceous plant species selected to provide a range of life-history strategies and functional groups. 2. We report three companion experiments: (A) Two insecticides, chlorpyrifos (granular soil insecticide) and dimethoate (foliar spray), were applied in fully-factorial combination to pot-germinated individuals of 12 species. (B) The same fully-factorial design was used to test for direct effects on the germination of four herbaceous legumes. (C) The molluscicide, metaldehyde, was tested for direct effects on the germination and growth of six plant species. 3. The insecticides had few significant effects on growth and germination. Dimethoate acted only on growth, stimulating Anisantha sterilis, Sonchus asper and Stellaria graminea. In contrast, chlorpyrifos acted on germination, increasing the germination of Trifolium dubium and Trifolium pratense. There was also a significant interactive effect of chlorpyrifos and dimethoate on the germination of T. pratense. However, all effects were relatively small in magnitude and explanatory power. The molluscicide had no significant effect on plant germination or growth. 4. The small number and size of direct effects of the pesticides on plant performance is encouraging for the use of these pesticides in manipulative experiments on herbivory, especially for the molluscicide. However, a small number of direct (positive) effects of the insecticides on some plant species need to be taken into account when interpreting field manipulations of herbivory with these compounds, which emphasises the importance of conducting tests for direct phytoactive effects. (C) 2004 Elsevier GmbH. All rights reserved.
Abstract:
Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull distributed lifetimes, using statistics based on the ordered Cox-Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k, provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
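The kind of statistic described above can be sketched in Python as follows, under several assumptions not stated in the abstract: the data are univariate and uncensored, the Weibull survivor function is parameterised as exp(-(t/scale)^shape) so that the Cox-Snell residual is the fitted cumulative hazard (t/scale)^shape, and the shape and scale have already been estimated under the no-frailty null model. The function names are hypothetical, and the paper's recommended values of k, its critical values and any normalisation it applies are not reproduced here.

```python
from typing import Sequence


def cox_snell_residuals(lifetimes: Sequence[float],
                        shape: float, scale: float) -> list:
    """Cox-Snell residuals for a fitted Weibull model: the estimated
    cumulative hazard (t / scale) ** shape at each observed lifetime."""
    return [(t / scale) ** shape for t in lifetimes]


def sum_of_extreme_residuals(residuals: Sequence[float], k: int,
                             largest: bool = True) -> float:
    """Sum of the k largest (or k smallest) ordered residuals."""
    if not 1 <= k <= len(residuals):
        raise ValueError("k must be between 1 and the number of residuals")
    ordered = sorted(residuals)
    return sum(ordered[-k:]) if largest else sum(ordered[:k])


if __name__ == "__main__":
    # Illustrative lifetimes and parameter values only.
    res = cox_snell_residuals([1.0, 2.0, 3.0, 4.0], shape=1.0, scale=2.0)
    print(sum_of_extreme_residuals(res, k=2, largest=True))
    print(sum_of_extreme_residuals(res, k=2, largest=False))
```

If the null model is correct, the residuals behave approximately like a sample from a unit exponential distribution, which is why unusually large or small sums of extreme order statistics can signal unmodelled heterogeneity.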