885 resultados para Technological tests
Resumo:
A series of imitation games involving 3-participant (simultaneous comparison of two hidden entities) and 2-participant (direct interrogation of a hidden entity) were conducted at Bletchley Park on the 100th anniversary of Alan Turing’s birth: 23 June 2012. From the ongoing analysis of over 150 games involving (expert and non-expert, males and females, adults and child) judges, machines and hidden humans (foils for the machines), we present six particular conversations that took place between human judges and a hidden entity that produced unexpected results. From this sample we focus on features of Turing’s machine intelligence test that the mathematician/code breaker did not consider in his examination for machine thinking: the subjective nature of attributing intelligence to another mind.
Resumo:
We consider tests of forecast encompassing for probability forecasts, for both quadratic and logarithmic scoring rules. We propose test statistics for the null of forecast encompassing, present the limiting distributions of the test statistics, and investigate the impact of estimating the forecasting models' parameters on these distributions. The small-sample performance is investigated, in terms of small numbers of forecasts and model estimation sample sizes. We show the usefulness of the tests for the evaluation of recession probability forecasts from logit models with different leading indicators as explanatory variables, and for evaluating survey-based probability forecasts.
Resumo:
The character of settlement patterns within the late Mesolithic communities of north-west Europe is a topic of substantial debate. An important case study concerns the five shell middens on the island of Oronsay, Inner Hebrides, western Scotland. Two conflicting interpretations have been proposed: the evidence from seasonality indicators and stable isotope analysis of human bones has been used to support a model of year-round settlement on this small island; alternatively, the middens have been interpreted as resulting from short-term intermittent visits to Oronsay within a regionally mobile settlement pattern. We contribute to this debate by describing Storakaig, a newly discovered site on the nearby island of Islay, undertaking a Bayesian chronological analysis and providing evidence for technological continuity between Oronsay and sites elsewhere in the region. While this new evidence remains open to alternative interpretation, we suggest that it makes regional mobility rather than year-round settlement on Oronsay a more viable interpretation for the Oronsay middens. Our analysis also confirms the likely overlap of the late Mesolithic with the earliest Neolithic within western Scotland.
Resumo:
Tests, as learning events, are often more effective than are additional study opportunities, especially when recall is tested after a long retention interval. To what degree, though, do prior test or study events support subsequent study activities? We set out to test an implication of Bjork and Bjork’s (1992) new theory of disuse—that, under some circumstances, prior study may facilitate subsequent study more than does prior testing. Participants learned English–Swahili translations and then underwent a practice phase during which some items were tested (without feedback) and other items were restudied. Although tested items were better recalled after a 1-week delay than were restudied items, this benefit did not persist after participants had the opportunity to study the items again via feedback. In fact, after this additional study opportunity, items that had been restudied earlier were better recalled than were items that had been tested earlier. These results suggest that measuring the memorial consequences of testing requires more than a single test of retention and, theoretically, a consideration of the differing status of initially recallable and nonrecallable items.
Resumo:
We test whether there are nonlinearities in the response of short- and long-term interest rates to the spread in interest rates, and assess the out-of-sample predictability of interest rates using linear and nonlinear models. We find strong evidence of nonlinearities in the response of interest rates to the spread. Nonlinearities are shown to result in more accurate short-horizon forecasts, especially of the spread.
Resumo:
This paper proposes and implements a new methodology for forecasting time series, based on bicorrelations and cross-bicorrelations. It is shown that the forecasting technique arises as a natural extension of, and as a complement to, existing univariate and multivariate non-linearity tests. The formulations are essentially modified autoregressive or vector autoregressive models respectively, which can be estimated using ordinary least squares. The techniques are applied to a set of high-frequency exchange rate returns, and their out-of-sample forecasting performance is compared to that of other time series models
Resumo:
A number of recent papers have employed the BDS test as a general test for mis-specification for linear and nonlinear models. We show that for a particular class of conditionally heteroscedastic models, the BDS test is unable to detect a common mis-specification. Our results also demonstrate that specific rather than portmanteau diagnostics are required to detect neglected asymmetry in volatility. However for both classes of tests reasonable power is only obtained using very large sample sizes.
Resumo:
This paper employs an extensive Monte Carlo study to test the size and power of the BDS and close return methods of testing for departures from independent and identical distribution. It is found that the finite sample properties of the BDS test are far superior and that the close return method cannot be recommended as a model diagnostic. Neither test can be reliably used for very small samples, while the close return test has low power even at large sample sizes
Resumo:
This paper presents and implements a number of tests for non-linear dependence and a test for chaos using transactions prices on three LIFFE futures contracts: the Short Sterling interest rate contract, the Long Gilt government bond contract, and the FTSE 100 stock index futures contract. While previous studies of high frequency futures market data use only those transactions which involve a price change, we use all of the transaction prices on these contracts whether they involve a price change or not. Our results indicate irrefutable evidence of non-linearity in two of the three contracts, although we find no evidence of a chaotic process in any of the series. We are also able to provide some indications of the effect of the duration of the trading day on the degree of non-linearity of the underlying contract. The trading day for the Long Gilt contract was extended in August 1994, and prior to this date there is no evidence of any structure in the return series. However, after the extension of the trading day we do find evidence of a non-linear return structure.
Resumo:
This review is an output of the International Life Sciences Institute (ILSI) Europe Marker Initiative, which aims to identify evidence-based criteria for selecting adequate measures of nutrient effects on health through comprehensive literature review. Experts in cognitive and nutrition sciences examined the applicability of these proposed criteria to the field of cognition with respect to the various cognitive domains usually assessed to reflect brain or neurological function. This review covers cognitive domains important in the assessment of neuronal integrity and function, commonly used tests and their state of validation, and the application of the measures to studies of nutrition and nutritional intervention trials. The aim is to identify domain-specific cognitive tests that are sensitive to nutrient interventions and from which guidance can be provided to aid the application of selection criteria for choosing the most suitable tests for proposed nutritional intervention studies using cognitive outcomes. The material in this review serves as a background and guidance document for nutritionists, neuropsychologists, psychiatrists, and neurologists interested in assessing mental health in terms of cognitive test performance and for scientists intending to test the effects of food or food components on cognitive function.
Resumo:
This paper presents some important issues on misidentification of human interlocutors in text-based communication during practical Turing tests. The study here presents transcripts in which human judges succumbed to theconfederate effect, misidentifying hidden human foils for machines. An attempt is made to assess the reasons for this. The practical Turing tests in question were held on 23 June 2012 at Bletchley Park, England. A selection of actual full transcripts from the tests is shown and an analysis is given in each case. As a result of these tests, conclusions are drawn with regard to the sort of strategies which can perhaps lead to erroneous conclusions when one is involved as an interrogator. Such results also serve to indicate conversational directions to avoid for those machine designers who wish to create a conversational entity that performs well on the Turing test.
Resumo:
Interpretation of utterances affects an interrogator’s determination of human from machine during live Turing tests. Here, we consider transcripts realised as a result of a series of practical Turing tests that were held on 23 June 2012 at Bletchley Park, England. The focus in this paper is to consider the effects of lying and truth-telling on the human judges by the hidden entities, whether human or a machine. Turing test transcripts provide a glimpse into short text communication, the type that occurs in emails: how does the reader determine truth from the content of a stranger’s textual message? Different types of lying in the conversations are explored, and the judge’s attribution of human or machine is investigated in each test.
Resumo:
Using a case study approach, this paper examines the role of privatisation on the industrial landscape of Tanzania. We examine the impact of foreign direct investment (FDI) through acquisitions on technology transfer within the acquired firm as well as the development of linkages to other firms based in the host country. Our results suggest that technological upgrading has occurred following FDI, the intensity of which reflects the type of firm-specific assets of the parent multinational enterprise (MNE), as well as the pre-acquisition state of these industrial activities. Improved backward linkages are also evident with local economic agents, but their type and extent – reflecting Tanzania's comparative advantage in the primary sector – confirm that capabilities both within the acquired firms and also in the industrial base of the host country greatly influence the magnitude and intensity of technological upgrading. 'Narrower' technology gaps between the MNE affiliate and the domestic sector are more likely to result in backward linkages and determine the type of technological content of inputs sourced locally rather than within the MNE.
Resumo:
We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups RB above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number RA using a variety of regression techniques. It is found that a very high correlation between RA and RB (rAB > 0.98) does not prevent large errors in the intercalibration (for example sunspot maximum values can be over 30 % too large even for such levels of rAB). In generating the backbone sunspot number (RBB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.