55 results for Standardised tests
Abstract:
Deception-detection is the crux of Turing’s experiment to examine machine thinking conveyed through a capacity to respond with sustained and satisfactory answers to unrestricted questions put by a human interrogator. However, in the 60 years to the month since the publication of Computing Machinery and Intelligence, little agreement exists on a canonical format for Turing’s textual game of imitation, deception and machine intelligence. This research raises from the trapped mine of philosophical claims, counter-claims and rebuttals Turing’s own distinct five-minute question-answer imitation game, which he envisioned practicalised in two different ways: (a) a two-participant, interrogator-witness viva voce; and (b) a three-participant comparison of a machine with a human, both questioned simultaneously by a human interrogator. Using Loebner’s 18th Prize for Artificial Intelligence contest, and Colby et al.’s 1972 transcript analysis paradigm, this research practicalised Turing’s imitation game with over 400 human participants and 13 machines across three original experiments. Results show that, at the current state of technology, a deception rate of 8.33% was achieved by machines in 60 human-machine simultaneous comparison tests. Results also show that more than 1 in 3 Reviewers succumbed to hidden interlocutor misidentification after reading transcripts from experiment 2. Deception-detection is essential to uncover the increasing number of malfeasant programmes, such as CyberLover, developed to steal identities and financially defraud users in chatrooms across the Internet. Practicalising Turing’s two tests can assist in understanding natural dialogue and mitigate the risk from cybercrime.
Abstract:
In view of the increasing interest in home-grown legumes as components of diets for non-ruminant livestock and in an attempt to reduce the reliance on imported soya bean meal (SBM), two experiments were conducted to evaluate samples of peas and faba beans for their standardised ileal digestibility (SID) of amino acids determined with young broiler chicks. Experiment 1 evaluated six faba bean and seven pea cultivars and Experiment 2 evaluated two faba bean and three pea cultivars as well as a sample of soya bean meal provided as a reference material. Peas and beans were added at 750g/kg as the only source of protein/amino acids in a semi-synthetic diet containing the inert marker titanium dioxide; SBM was added, in a control diet, at 500g/kg. Each diet was fed to six replicates of a cage containing two Ross-type broilers for 96h at which point birds were culled allowing removal of ileal digesta. Chemical analyses allowed the calculation of the coefficient of SID of amino acids. There were no differences between samples of the same pulse species (P>0.05) but peas had higher values (P<0.05), similar to SBM, than beans. Trypsin inhibitor content (expressed as g trypsin inhibitor units/mg sample) of all pea samples was low and in the range 0.83–1.77mg/kg. There was relatively little variation in bean tannin content and composition amongst the coloured-flowered varieties; however, the white-flowered cultivar had no tannins. There was no correlation between tannin content and coefficient of SID. The content of SID of amino acids (g/kg legume) was higher in SBM when compared with peas and beans by virtue of having higher total concentrations.
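The marker-based digestibility calculation described in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' procedure: it assumes the commonly used marker-ratio formula for apparent ileal digestibility and a correction for basal endogenous losses to obtain the standardised coefficient. The function names, the endogenous-loss term and all numbers are hypothetical.

```python
def apparent_ileal_digestibility(aa_diet, aa_digesta, marker_diet, marker_digesta):
    """Coefficient of apparent ileal digestibility from an inert marker.

    All concentrations are in the same units (e.g. g/kg dry matter).
    The inert marker (titanium dioxide in the abstract) is assumed to be
    fully recovered, so its ratio rescales digesta to a diet basis.
    """
    return 1.0 - (marker_diet / marker_digesta) * (aa_digesta / aa_diet)


def standardised_ileal_digestibility(aa_diet, aa_digesta, marker_diet,
                                     marker_digesta, basal_endogenous_loss):
    """Standardised coefficient: apparent value corrected for basal
    endogenous amino acid losses (assumed in g/kg dry matter intake)."""
    aid = apparent_ileal_digestibility(aa_diet, aa_digesta,
                                       marker_diet, marker_digesta)
    return aid + basal_endogenous_loss / aa_diet


# Hypothetical values, for illustration only.
aid = apparent_ileal_digestibility(aa_diet=10.0, aa_digesta=6.0,
                                   marker_diet=5.0, marker_digesta=20.0)
sid = standardised_ileal_digestibility(aa_diet=10.0, aa_digesta=6.0,
                                       marker_diet=5.0, marker_digesta=20.0,
                                       basal_endogenous_loss=0.5)
print(round(aid, 3), round(sid, 3))
```

Multiplying the standardised coefficient by the total amino acid concentration of the legume gives a content of digestible amino acid in g/kg, which is the quantity the abstract compares across SBM, peas and beans.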
Abstract:
We investigate for 26 OECD economies whether their current account imbalances to GDP are driven by stochastic trends. Regarding bounded stationarity as the more natural counterpart of sustainability, results from Phillips–Perron tests for unit root and bounded unit root processes are contrasted. While the former hint at stationarity of current account imbalances for 12 economies, the latter indicate bounded stationarity for only six economies. Through panel-based test statistics, current account imbalances are diagnosed as bounded non-stationary. Thus, (spurious) rejections of the unit root hypothesis might be due to the existence of bounds reflecting hidden policy controls or financial crises.
Abstract:
A series of imitation games involving 3-participant tests (simultaneous comparison of two hidden entities) and 2-participant tests (direct interrogation of a hidden entity) was conducted at Bletchley Park on the 100th anniversary of Alan Turing’s birth: 23 June 2012. From the ongoing analysis of over 150 games involving judges (expert and non-expert, male and female, adults and children), machines and hidden humans (foils for the machines), we present six particular conversations that took place between human judges and a hidden entity that produced unexpected results. From this sample we focus on features of Turing’s machine intelligence test that the mathematician/code breaker did not consider in his examination for machine thinking: the subjective nature of attributing intelligence to another mind.
Abstract:
With the growing number and significance of urban meteorological networks (UMNs) across the world, it is becoming critical to establish a standard metadata protocol. Indeed, a review of existing UMNs indicates large variations in the quality, quantity, and availability of metadata containing technical information (i.e., equipment, communication methods) and network practices (i.e., quality assurance/quality control and data management procedures). Without such metadata, the utility of UMNs is greatly compromised. There is a need to bring together the currently disparate sets of guidelines to ensure informed and well-documented future deployments. This should significantly improve the quality, and therefore the applicability, of the high-resolution data available from such networks. Here, the first metadata protocol for UMNs is proposed, drawing on current recommendations for urban climate stations and identified best practice in existing networks.
Abstract:
We consider tests of forecast encompassing for probability forecasts, for both quadratic and logarithmic scoring rules. We propose test statistics for the null of forecast encompassing, present the limiting distributions of the test statistics, and investigate the impact of estimating the forecasting models' parameters on these distributions. The small-sample performance is investigated, in terms of small numbers of forecasts and model estimation sample sizes. We show the usefulness of the tests for the evaluation of recession probability forecasts from logit models with different leading indicators as explanatory variables, and for evaluating survey-based probability forecasts.
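The two scoring rules named in the abstract above can be illustrated with a short sketch. It only computes average quadratic (Brier-type) and logarithmic scores for binary-event probability forecasts; it is not the encompassing test itself, whose statistics are not given in the abstract. The function names and the sample forecasts are hypothetical.

```python
import math


def quadratic_score(probs, outcomes):
    """Mean squared difference between forecast probability and the
    0/1 outcome (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def logarithmic_score(probs, outcomes):
    """Mean negative log-likelihood of the 0/1 outcomes under the
    forecast probabilities (lower is better). Probabilities must lie
    strictly between 0 and 1."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


# Hypothetical recession-probability forecasts and realised outcomes.
forecasts = [0.8, 0.3]
realised = [1, 0]
print(quadratic_score(forecasts, realised))
print(logarithmic_score(forecasts, realised))
```

Loosely, one forecast encompasses another when combining the two does not improve the expected score; the precise null hypothesis and limiting distributions are those developed in the paper, not in this sketch.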
Abstract:
Tests, as learning events, are often more effective than are additional study opportunities, especially when recall is tested after a long retention interval. To what degree, though, do prior test or study events support subsequent study activities? We set out to test an implication of Bjork and Bjork’s (1992) new theory of disuse—that, under some circumstances, prior study may facilitate subsequent study more than does prior testing. Participants learned English–Swahili translations and then underwent a practice phase during which some items were tested (without feedback) and other items were restudied. Although tested items were better recalled after a 1-week delay than were restudied items, this benefit did not persist after participants had the opportunity to study the items again via feedback. In fact, after this additional study opportunity, items that had been restudied earlier were better recalled than were items that had been tested earlier. These results suggest that measuring the memorial consequences of testing requires more than a single test of retention and, theoretically, a consideration of the differing status of initially recallable and nonrecallable items.
Abstract:
We test whether there are nonlinearities in the response of short- and long-term interest rates to the spread in interest rates, and assess the out-of-sample predictability of interest rates using linear and nonlinear models. We find strong evidence of nonlinearities in the response of interest rates to the spread. Nonlinearities are shown to result in more accurate short-horizon forecasts, especially of the spread.
Abstract:
This paper proposes and implements a new methodology for forecasting time series, based on bicorrelations and cross-bicorrelations. It is shown that the forecasting technique arises as a natural extension of, and as a complement to, existing univariate and multivariate non-linearity tests. The formulations are essentially modified autoregressive or vector autoregressive models respectively, which can be estimated using ordinary least squares. The techniques are applied to a set of high-frequency exchange rate returns, and their out-of-sample forecasting performance is compared to that of other time series models.
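As a rough illustration of the quantity behind the abstract above, the sketch below computes a sample bicorrelation at lags (r, s). It assumes the usual third-order definition on a standardised series, the average of z(t)·z(t+r)·z(t+s) over the available observations; the abstract does not state the exact formula used in the paper, so the definition, function names and data here are assumptions.

```python
import math


def standardise(x):
    """Rescale a series to zero mean and unit (population) variance."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / std for v in x]


def bicorrelation(z, r, s):
    """Sample bicorrelation of an already standardised series z at
    non-negative lags r and s: the mean of z[t] * z[t+r] * z[t+s]."""
    m = max(r, s)
    n_terms = len(z) - m
    if n_terms <= 0:
        raise ValueError("series too short for the requested lags")
    total = sum(z[t] * z[t + r] * z[t + s] for t in range(n_terms))
    return total / n_terms


# Hypothetical returns series, for illustration only.
returns = [0.4, -1.2, 0.7, 0.1, -0.3, 1.5, -0.9, 0.2]
z = standardise(returns)
print(bicorrelation(z, r=1, s=2))
```

Products of lagged values such as z[t-r] * z[t-s] could then enter an autoregression as extra regressors estimated by ordinary least squares, which is one reading of the "modified autoregressive" formulation the abstract mentions.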
Abstract:
A number of recent papers have employed the BDS test as a general test for mis-specification for linear and nonlinear models. We show that for a particular class of conditionally heteroscedastic models, the BDS test is unable to detect a common mis-specification. Our results also demonstrate that specific rather than portmanteau diagnostics are required to detect neglected asymmetry in volatility. However, for both classes of tests, reasonable power is only obtained using very large sample sizes.
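For readers unfamiliar with the BDS test, the sketch below computes the correlation integral on which it is built. It is an illustration only: it stops at the raw difference C_m(eps) - C_1(eps)^m, which should be close to zero for an i.i.d. series, and omits the variance normalisation needed for the actual test statistic. Function names, the embedding dimension and the distance threshold are hypothetical choices.

```python
from itertools import combinations


def correlation_integral(x, m, eps):
    """Fraction of pairs of m-histories (x[t], ..., x[t+m-1]) whose
    maximum-norm distance is strictly below eps."""
    histories = [x[t:t + m] for t in range(len(x) - m + 1)]
    pairs = list(combinations(histories, 2))
    if not pairs:
        raise ValueError("series too short for embedding dimension m")
    close = sum(
        1 for a, b in pairs
        if max(abs(u - v) for u, v in zip(a, b)) < eps
    )
    return close / len(pairs)


def bds_raw_difference(x, m, eps):
    """Unnormalised quantity C_m(eps) - C_1(eps)**m. The BDS statistic
    scales this by an estimated standard deviation, which is omitted
    here."""
    return correlation_integral(x, m, eps) - correlation_integral(x, 1, eps) ** m


# Hypothetical residual series, for illustration only.
residuals = [0.1, -0.4, 0.3, 0.0, -0.2, 0.5, -0.1, 0.2]
print(bds_raw_difference(residuals, m=2, eps=0.3))
```

Because the quantity depends only on how often nearby histories recur, it responds to many kinds of dependence at once, which is why the test is used as a portmanteau diagnostic and why, as the abstract argues, it can miss specific forms of mis-specification.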
Abstract:
This paper employs an extensive Monte Carlo study to test the size and power of the BDS and close return methods of testing for departures from independent and identical distribution. It is found that the finite sample properties of the BDS test are far superior and that the close return method cannot be recommended as a model diagnostic. Neither test can be reliably used for very small samples, while the close return test has low power even at large sample sizes.
Abstract:
This paper presents and implements a number of tests for non-linear dependence and a test for chaos using transactions prices on three LIFFE futures contracts: the Short Sterling interest rate contract, the Long Gilt government bond contract, and the FTSE 100 stock index futures contract. While previous studies of high frequency futures market data use only those transactions which involve a price change, we use all of the transaction prices on these contracts whether they involve a price change or not. Our results indicate irrefutable evidence of non-linearity in two of the three contracts, although we find no evidence of a chaotic process in any of the series. We are also able to provide some indications of the effect of the duration of the trading day on the degree of non-linearity of the underlying contract. The trading day for the Long Gilt contract was extended in August 1994, and prior to this date there is no evidence of any structure in the return series. However, after the extension of the trading day we do find evidence of a non-linear return structure.
Abstract:
The present study aims to evaluate the probiotic potential of lactic acid bacteria (LAB) isolated from naturally fermented olives and select candidates to be used as probiotic starters for the improvement of the traditional fermentation process and the production of new value-added functional foods. Seventy-one (71) lactic acid bacterial strains (17 Leuconostoc mesenteroides, 1 Ln. pseudomesenteroides, 13 Lactobacillus plantarum, 37 Lb. pentosus, 1 Lb. paraplantarum, and 2 Lb. paracasei subsp. paracasei) isolated from table olives were screened for their probiotic potential. Lb. rhamnosus GG and Lb. casei Shirota were used as reference strains. The in vitro tests included survival in simulated gastrointestinal tract conditions, antimicrobial activity (against Listeria monocytogenes, Salmonella Enteritidis, Escherichia coli O157:H7), Caco-2 surface adhesion, resistance to 9 antibiotics and haemolytic activity. Three (3) Lb. pentosus, 4 Lb. plantarum and 2 Lb. paracasei subsp. paracasei strains demonstrated the highest final population (>8 log cfu/ml) after 3 h of exposure at low pH. The majority of the tested strains were resistant to bile salts even after 4 h of exposure, while 5 Lb. plantarum and 7 Lb. pentosus strains exhibited partial bile salt hydrolase activity. None of the strains inhibited the growth of the pathogens tested. The strains varied in their efficiency of adhesion to Caco-2 cells, and likewise in their susceptibility to the different antibiotics. None of the strains exhibited β-haemolytic activity. As a whole, 4 strains of Lb. pentosus, 3 strains of Lb. plantarum and 2 strains of Lb. paracasei subsp. paracasei were found to possess desirable in vitro probiotic properties similar to or even better than the reference probiotic strains Lb. casei Shirota and Lb. rhamnosus GG. These strains are good candidates for further investigation both with in vivo studies to elucidate their potential health benefits and in olive fermentation processes to assess their technological performance as novel probiotic starters.
Abstract:
To date, only one study has investigated educational attainment in poor (reading) comprehenders, providing evidence of poor performance on national UK school tests at age 11 years relative to peers (Cain & Oakhill, 2006). In the present study, we adopted a longitudinal approach, tracking attainment on such tests from 11 years to the end of compulsory schooling in the UK (age 16 years). We aimed to investigate the proposal that educational weaknesses (defined as poor performance on national assessments) might become more pronounced over time, as the curriculum places increasing demands on reading comprehension. Participants comprised 15 poor comprehenders and 15 controls; groups were matched for chronological age, nonverbal reasoning ability and decoding skill. Children were identified at age 9 years using standardised measures of nonverbal reasoning, decoding and reading comprehension. These measures, along with a measure of oral vocabulary knowledge, were repeated at age 11 years. Data on educational attainment were collected from all participants (N = 30) at age 11 and from a subgroup (n = 21) at 16 years. Compared to controls, educational attainment in poor comprehenders was lower at ages 11 and 16 years, an effect that was significant at 11 years. When poor comprehenders were compared to national performance levels, they showed significantly lower performance at both time points. Low educational attainment was not evident for all poor comprehenders. Nonetheless, our findings point to a link between reading comprehension difficulties in mid to late childhood and poor educational outcomes at ages 11 and 16 years. At these ages, pupils in the UK are making key transitions: they move from primary to secondary schools at 11, and out of compulsory schooling at 16.