55 results for Standardised tests
Abstract:
This review is an output of the International Life Sciences Institute (ILSI) Europe Marker Initiative, which aims to identify evidence-based criteria for selecting adequate measures of nutrient effects on health through comprehensive literature review. Experts in cognitive and nutrition sciences examined the applicability of these proposed criteria to the field of cognition with respect to the various cognitive domains usually assessed to reflect brain or neurological function. This review covers cognitive domains important in the assessment of neuronal integrity and function, commonly used tests and their state of validation, and the application of the measures to studies of nutrition and nutritional intervention trials. The aim is to identify domain-specific cognitive tests that are sensitive to nutrient interventions and from which guidance can be provided to aid the application of selection criteria for choosing the most suitable tests for proposed nutritional intervention studies using cognitive outcomes. The material in this review serves as a background and guidance document for nutritionists, neuropsychologists, psychiatrists, and neurologists interested in assessing mental health in terms of cognitive test performance and for scientists intending to test the effects of food or food components on cognitive function.
Abstract:
This paper presents some important issues concerning the misidentification of human interlocutors in text-based communication during practical Turing tests. The study presents transcripts in which human judges succumbed to the confederate effect, misidentifying hidden human foils for machines. An attempt is made to assess the reasons for this. The practical Turing tests in question were held on 23 June 2012 at Bletchley Park, England. A selection of actual full transcripts from the tests is shown and an analysis is given in each case. As a result of these tests, conclusions are drawn with regard to the sort of strategies which can perhaps lead to erroneous conclusions when one is involved as an interrogator. Such results also serve to indicate conversational directions to avoid for those machine designers who wish to create a conversational entity that performs well on the Turing test.
Abstract:
Interpretation of utterances affects an interrogator’s determination of human from machine during live Turing tests. Here, we consider transcripts realised as a result of a series of practical Turing tests that were held on 23 June 2012 at Bletchley Park, England. The focus in this paper is to consider the effects on the human judges of lying and truth-telling by the hidden entities, whether human or machine. Turing test transcripts provide a glimpse into short text communication, the type that occurs in emails: how does the reader determine truth from the content of a stranger’s textual message? Different types of lying in the conversations are explored, and the judge’s attribution of human or machine is investigated in each test.
Abstract:
Short-term memory (STM) impairments are prevalent in adults with acquired brain injuries. While there are several published tests to assess these impairments, the majority require speech production, e.g. digit span (Wechsler, 1987). This feature may make them unsuitable for people with aphasia and motor speech disorders because of word finding difficulties and speech demands respectively. If patients perceive the speech demands of the test to be high, they may not engage with testing. Furthermore, existing STM tests are mainly ‘pen-and-paper’ tests, which can jeopardise accuracy. To address these shortcomings, we designed and standardised a novel computerised test that does not require speech output and, because of the computerised delivery, would enable clinicians to identify STM impairments with greater precision than current tests. The matching listening span task, similar to the non-normed PALPA 13 (Kay, Lesser & Coltheart, 1992), is used to test short-term memory for serial order of spoken items. Sequences of digits are presented in pairs. The person hears the first sequence, followed by the second sequence, and s/he decides whether the two sequences are the same or different. In the computerised test, the sequences are presented in live voice recordings on a portable computer through a software application (Molero Martin, Laird, Hwang & Salis 2013). We collected normative data from healthy older adults (N=22-24) using digits, real words (one- and two-syllables) and non-words (one- and two-syllables). Their performance was scored following two systems. The Highest Span system was the highest span length (e.g. 2-8) at which a participant correctly responded to over 7 out of 10 trials at the highest sequence length. Test-retest reliability was also tested in a subgroup of participants. The test will be available free of charge for clinicians and researchers to use.
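The Highest Span rule described in this abstract can be illustrated with a short sketch. This is not the authors' software: the function name, the data layout (a mapping from sequence length to ten same/different judgements scored as correct or incorrect), and the reading of "over 7 out of 10" as "at least 8 correct" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def highest_span(results: Dict[int, List[bool]], min_correct: int = 8) -> Optional[int]:
    """Return the longest sequence length passed by the participant.

    `results` maps a sequence length (e.g. 2 to 8) to the list of trial
    outcomes at that length (True = correct same/different judgement).
    A length counts as passed when at least `min_correct` trials are correct.
    The default of 8 is an assumed reading of "over 7 out of 10".
    """
    passed = [length for length, trials in results.items()
              if sum(trials) >= min_correct]
    return max(passed) if passed else None


# Hypothetical participant: ten trials at each of four sequence lengths.
example = {
    2: [True] * 10,
    3: [True] * 9 + [False],
    4: [True] * 8 + [False] * 2,
    5: [True] * 6 + [False] * 4,
}
span = highest_span(example)
```

The threshold is kept as a parameter so that a different reading of the pass criterion can be applied without changing the function.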
Abstract:
We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups RB above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number RA using a variety of regression techniques. It is found that a very high correlation between RA and RB (rAB > 0.98) does not prevent large errors in the intercalibration (for example sunspot maximum values can be over 30 % too large even for such levels of rAB). In generating the backbone sunspot number (RBB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q-Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
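The effect of forcing a fit through the origin can be seen in a minimal sketch. The numbers below are invented for illustration and are not RGO data; the relation `r_b = 0.7 * r_a - 0.5` is a hypothetical stand-in for a lower-acuity observer who misses a roughly constant number of small groups, and the function names are likewise assumptions. The sketch does not reproduce the paper's Q-Q residual test or its non-linear fits.

```python
def fit_through_origin(x, y):
    """Least-squares slope s for the model y = s * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def fit_ols(x, y):
    """Ordinary least-squares (slope, intercept) for the model y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical annual means: r_a is the full group count, r_b the count
# a lower-acuity observer would report (invented linear relation).
r_a = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
r_b = [0.7 * v - 0.5 for v in r_a]

# Re-scale r_b to r_a with each method and compare the reconstructed peak.
s = fit_through_origin(r_b, r_a)
slope, intercept = fit_ols(r_b, r_a)
peak_origin = s * max(r_b)
peak_ols = intercept + slope * max(r_b)
```

With these invented values the two series are perfectly correlated, yet the fit forced through the origin reconstructs a peak above the true maximum of 12, while the fit with a free intercept recovers it. This is only a toy case of the mechanism the abstract describes.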
Abstract:
Accurate knowledge of species’ habitat associations is important for conservation planning and policy. Assessing habitat associations is a vital precursor to selecting appropriate indicator species for prioritising sites for conservation or assessing trends in habitat quality. However, much existing knowledge is based on qualitative expert opinion or local scale studies, and may not remain accurate across different spatial scales or geographic locations. Data from biological recording schemes have the potential to provide objective measures of habitat association, with the ability to account for spatial variation. We used data on 50 British butterfly species as a test case to investigate the correspondence of data-derived measures of habitat association with expert opinion, from two different butterfly recording schemes. One scheme collected large quantities of occurrence data (c. 3 million records) and the other, lower quantities of standardised monitoring data (c. 1400 sites). We used general linear mixed effects models to derive scores of association with broad-leaf woodland for both datasets and compared them with scores canvassed from experts. Scores derived from occurrence and abundance data both showed strongly positive correlations with expert opinion. However, only for occurrence data did these fall within the range of correlations between experts. Data-derived scores showed regional spatial variation in the strength of butterfly associations with broad-leaf woodland, with a significant latitudinal trend in 26% of species. Sub-sampling of the data suggested a mean sample size of 5000 occurrence records per species to gain an accurate estimation of habitat association, although habitat specialists are likely to be readily detected using several hundred records. Occurrence data from recording schemes can thus provide easily obtained, objective, quantitative measures of habitat association.
Abstract:
More than 70 years ago it was recognised that ionospheric F2-layer critical frequencies [foF2] had a strong relationship to sunspot number. Using historic datasets from the Slough and Washington ionosondes, we evaluate the best statistical fits of foF2 to sunspot numbers (at each Universal Time [UT] separately) in order to search for drifts and abrupt changes in the fit residuals over Solar Cycles 17-21. This test is carried out for the original composite of the Wolf/Zürich/International sunspot number [R], the new “backbone” group sunspot number [RBB] and the proposed “corrected sunspot number” [RC]. Polynomial fits are made both with and without allowance for the white-light facular area, which has been reported as being associated with cycle-to-cycle changes in the sunspot number - foF2 relationship. Over the interval studied here, R, RBB, and RC largely differ in their allowance for the “Waldmeier discontinuity” around 1945 (the correction factor for which for R, RBB and RC is, respectively, zero, effectively over 20 %, and explicitly 11.6 %). It is shown that for Solar Cycles 18-21, all three sunspot data sequences perform well, but that the fit residuals are lowest and most uniform for RBB. We here use foF2 for those UTs for which R, RBB, and RC all give correlations exceeding 0.99 for intervals both before and after the Waldmeier discontinuity. The error introduced by the Waldmeier discontinuity causes R to underestimate the fitted values based on the foF2 data for 1932-1945 but RBB overestimates them by almost the same factor, implying that the correction for the Waldmeier discontinuity inherent in RBB is too large by a factor of two. Fit residuals are smallest and most uniform for RC and the ionospheric data support the optimum discontinuity multiplicative correction factor derived from the independent Royal Greenwich Observatory (RGO) sunspot group data for the same interval.
Abstract:
Creating non-word lists is a necessary but time-consuming exercise often needed when conducting behavioural language tasks involving lexical decision-making or non-word reading. The following article describes the process whereby we created a list of 226 non-words matching 226 of the Snodgrass picture set (Snodgrass & Vanderwart, 1980). The non-words were matched for number of syllables, stress pattern, number of phonemes, bigram count and presence and location of the target sound when relevant.