43 results for Crosshole tests
Abstract:
This paper proposes new methodologies for evaluating out-of-sample forecasting performance that are robust to the choice of the estimation window size. The methodologies involve evaluating the predictive ability of forecasting models over a wide range of window sizes. We show that the tests proposed in the literature may lack the power to detect predictive ability and might be subject to data snooping across different window sizes if used repeatedly. An empirical application shows the usefulness of the methodologies for evaluating exchange rate models' forecasting ability.
Abstract:
We propose new methods for evaluating predictive densities that focus on the models' actual predictive ability in finite samples. The tests offer a simple way of evaluating the correct specification of predictive densities, either parametric or non-parametric. The results indicate that our tests are well sized and have good power in detecting mis-specification in predictive densities. An empirical application to the Survey of Professional Forecasters and a baseline Dynamic Stochastic General Equilibrium model shows the usefulness of our methodology.
Abstract:
This work is divided into three parts: contextualization, the state of the art in usability evaluation on mobile devices, and the proposal and validation of a method that combines a desktop eye tracker with mobile devices. The work concludes with an experimental study with a dual purpose: to carry out a first assessment of the method's validity and to analyze empirically how to get the most out of it, trying at all times to keep it comparable to the real use of a physical device.
A priori parameterisation of the CERES soil-crop models and tests against several European data sets
Abstract:
Mechanistic soil-crop models have become indispensable tools to investigate the effect of management practices on the productivity or environmental impacts of arable crops. Ideally these models may claim to be universally applicable because they simulate the major processes governing the fate of inputs such as fertiliser nitrogen or pesticides. However, because they deal with complex systems and uncertain phenomena, site-specific calibration is usually a prerequisite to ensure their predictions are realistic. This statement implies that some experimental knowledge on the system to be simulated should be available prior to any modelling attempt, and places a severe limitation on practical applications of models. Because the demand for more general simulation results is high, modellers have nevertheless taken the bold step of extrapolating a model tested within a limited sample of real conditions to a much larger domain. While methodological questions are often disregarded in this extrapolation process, they are specifically addressed in this paper, in particular the issue of the a priori parameterisation of models. We thus implemented and tested a standard procedure to parameterise the soil components of a modified version of the CERES models. The procedure converts routinely available soil properties into functional characteristics by means of pedo-transfer functions. The resulting predictions of soil water and nitrogen dynamics, as well as crop biomass, nitrogen content and leaf area index, were compared to observations from trials conducted in five locations across Europe (southern Italy, northern Spain, northern France and northern Germany). In three cases, the model's performance was judged acceptable when compared to experimental errors on the measurements, based on a test of the model's root mean squared error (RMSE). Significant deviations between observations and model outputs were however noted in all sites, and could be ascribed to various model routines.
In decreasing order of importance, these were: water balance, the turnover of soil organic matter, and crop N uptake. A better match to field observations could therefore be achieved by visually adjusting related parameters, such as field-capacity water content or the size of soil microbial biomass. As a result, model predictions fell within the measurement errors in all sites for most variables, and the model's RMSE was within the range of published values for similar tests. We conclude that the proposed a priori method yields acceptable simulations with only a 50% probability, a figure which may be greatly increased through a posteriori calibration. Modellers should thus exercise caution when extrapolating their models to a large sample of pedo-climatic conditions for which they have only limited information.
Abstract:
Evaluating the information architecture of an already deployed website is not a simple task. Most techniques focus on examining the usability of the system, which, although it affects the information architecture, is not the only factor that influences it. The main technique used is the navigation stress test. This paper presents a methodological contribution that makes the technique more informative, taking it beyond the user simply writing down on paper the answers to the navigation questions posed. We propose combining it with other usability evaluation techniques: the thinking-aloud technique and a usability questionnaire. An eye-tracking system was used to complement the information obtained through these techniques. The proposed methodological approach was tested by analyzing a set of library websites of Spanish public universities. The results show the validity of the approach, as well as the value that it and the use of eye tracking add to the analysis of information architecture compared with the traditional navigation stress test.
Abstract:
This study deals with the statistical properties of a randomization test applied to an ABAB design in cases where the desirable random assignment of the points of change in phase is not possible. In order to obtain information about each possible data division we carried out a conditional Monte Carlo simulation with 100,000 samples for each systematically chosen triplet. Robustness and power are studied under several experimental conditions: different autocorrelation levels and different effect sizes, as well as different phase lengths determined by the points of change. Type I error rates were distorted by the presence of autocorrelation for the majority of data divisions. Satisfactory Type II error rates were obtained only for large treatment effects. The relationship between the lengths of the four phases appeared to be an important factor for the robustness and the power of the randomization test.
Abstract:
Monte Carlo simulations were used to generate data for ABAB designs of different lengths. The points of change in phase are randomly determined before gathering behaviour measurements, which allows the use of a randomization test as an analytic technique. Data simulation and analysis can be based either on data-division-specific or on common distributions. Following one method or another affects the results obtained after the randomization test has been applied. Therefore, the goal of the study was to examine these effects in more detail. The discrepancies in these approaches are obvious when data with zero treatment effect are considered and such approaches have implications for statistical power studies. Data-division-specific distributions provide more detailed information about the performance of the statistical technique.
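The randomization test referred to in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the test statistic (mean of the B-phase observations minus mean of the A-phase observations), the minimum phase length `min_len`, the example data and all function names are assumptions made for the example. The idea is that, when the three points of change are drawn at random from the admissible ones, the observed statistic can be compared with the statistics obtained under every other admissible data division.

```python
from statistics import mean


def admissible_triplets(n, min_len):
    """Yield every (b1, b2, b3) that splits n points into phases
    A1, B1, A2, B2 with at least min_len observations per phase."""
    for b1 in range(min_len, n - 3 * min_len + 1):
        for b2 in range(b1 + min_len, n - 2 * min_len + 1):
            for b3 in range(b2 + min_len, n - min_len + 1):
                yield (b1, b2, b3)


def b_minus_a(data, triplet):
    """Assumed test statistic: mean of B observations minus mean of A observations."""
    b1, b2, b3 = triplet
    a_obs = data[:b1] + data[b2:b3]   # phases A1 and A2
    b_obs = data[b1:b2] + data[b3:]   # phases B1 and B2
    return mean(b_obs) - mean(a_obs)


def randomization_p_value(data, observed_triplet, min_len):
    """One-sided p-value: share of admissible data divisions whose
    statistic is at least as large as the observed one."""
    observed = b_minus_a(data, observed_triplet)
    stats = [b_minus_a(data, t)
             for t in admissible_triplets(len(data), min_len)]
    return sum(s >= observed for s in stats) / len(stats)


# Hypothetical series: 12 measurements, phase changes at points 3, 6 and 9.
series = [1, 1, 1, 5, 5, 5, 1, 1, 1, 5, 5, 5]
p_value = randomization_p_value(series, (3, 6, 9), min_len=2)
```

With 12 measurements and a minimum of 2 per phase there are 35 admissible data divisions, so the smallest attainable p-value in this hypothetical setting is 1/35. Short series therefore limit the power of the test regardless of effect size.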
Abstract:
Sobriety checkpoints are not usually randomly located by traffic authorities. As such, information provided by non-random alcohol tests cannot be used to infer the characteristics of the general driving population. In this paper a case study is presented in which the prevalence of alcohol-impaired driving is estimated for the general population of drivers. A stratified probabilistic sample was designed to represent vehicles circulating in non-urban areas of Catalonia (Spain), a region characterized by its complex transportation network and dense traffic around the metropolis of Barcelona. Random breath alcohol concentration tests were performed during spring 2012 on 7,596 drivers. The estimated prevalence of alcohol-impaired drivers was 1.29%, which is roughly a third of the rate obtained in non-random tests. Higher rates were found on weekends (1.90% on Saturdays, 4.29% on Sundays) and especially at night. The rate is higher for men (1.45%) than for women (0.64%) and the percentage of positive outcomes shows an increasing pattern with age. In vehicles with two occupants, the proportion of alcohol-impaired drivers is estimated at 2.62%, but when the driver was alone the rate drops to 0.84%, which might reflect the socialization of drinking habits. The results are compared with outcomes in previous surveys, showing a decreasing trend in the prevalence of alcohol-impaired drivers over time.
Abstract:
This study evaluated the performance of the Tuberculin Skin Test (TST) and QuantiFERON-TB Gold In-Tube (QFT) and the possible association of factors which may modify their results in young children (0-6 years) with recent contact with an index tuberculosis case. Materials and Methods: A cross-sectional study including 135 children was conducted in Manaus, Amazonas, Brazil. The TST and QFT were performed and the test results were analyzed in relation to the personal characteristics of the children studied and their relationship with the index case. Results: The rates of positivity were 34.8% (TST) and 26.7% (QFT), with 14.1% indeterminate QFT results. Concordance between tests was fair (Kappa = 0.35, P < 0.001). Both the TST and QFT were associated with the intensity of exposure (Linear OR = 1.286, P = 0.005; Linear OR = 1.161, P = 0.035, respectively), with only the TST being associated with the time of exposure (Linear OR = 1.149, P = 0.009). The presence of intestinal helminths in the TST+ group was associated with negative QFT results (OR = 0.064, P = 0.049). In the TST- group, lower levels of ferritin were associated with QFT+ results (Linear OR = 0.956, P = 0.036). Conclusions: Concordance between the TST and QFT was lower than expected. The factors associated with the discordant results were intestinal helminths, ferritin levels and exposure time to the index tuberculosis case. In the TST+ group, helminths were associated with negative QFT results, suggesting impaired cell-mediated immunity. The TST-/QFT+ group had a shorter exposure time and lower ferritin levels, suggesting that QFT is faster and that ferritin may be a potential biomarker of early stages of tuberculosis infection.
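The concordance figure in this abstract is a kappa coefficient. As a minimal sketch, Cohen's kappa for two binary tests can be computed from a 2x2 agreement table as shown here. The counts are hypothetical and are not taken from the study, the function and argument names are illustrative, and the sketch assumes indeterminate results have already been excluded.

```python
def cohen_kappa(both_pos, first_only, second_only, both_neg):
    """Cohen's kappa for two binary tests from a 2x2 agreement table.

    both_pos    -- subjects positive on both tests
    first_only  -- positive on the first test only
    second_only -- positive on the second test only
    both_neg    -- negative on both tests
    """
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n          # observed agreement
    p1_pos = (both_pos + first_only) / n          # first test positivity
    p2_pos = (both_pos + second_only) / n         # second test positivity
    # Agreement expected by chance if the two tests were independent.
    expected = p1_pos * p2_pos + (1 - p1_pos) * (1 - p2_pos)
    return (observed - expected) / (1 - expected)


# Hypothetical counts, chosen only to illustrate the arithmetic.
kappa = cohen_kappa(both_pos=20, first_only=5, second_only=10, both_neg=15)
```

For these made-up counts the observed agreement is 0.70 and the chance agreement is 0.50, which gives a kappa of 0.40.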
Abstract:
This study highlights the importance of considering an appropriate level of aggregation in demand analyses, since working with an inappropriate level of aggregation can lead to biased estimates. This is shown through the analysis of different fresh hake products sold at Mercabarna, the wholesale market of Barcelona. The literature on fish demand treats hake as a single product and species. However, in the Spanish market many fish are sold as hake, and they show very different behaviours (from inferior goods to luxury goods). The results obtained, in agreement with empirical observations, show that the analysis must be carried out at a greater level of detail than the species level. This calls into question the results of earlier demand studies and most databases, in which the appropriate level of aggregation of products is not taken into account.
Abstract:
The identification of biomarkers of vascular cognitive impairment is urgent for its early diagnosis. The aim of this study was to detect and monitor changes in brain structure and connectivity, and to correlate them with the decline in executive function. We examined the feasibility of early diagnostic magnetic resonance imaging (MRI) to predict cognitive impairment before onset in an animal model of chronic hypertension: Spontaneously Hypertensive Rats. Cognitive performance was tested in an operant conditioning paradigm that evaluated learning, memory, and behavioral flexibility skills. Behavioral tests were coupled with longitudinal diffusion weighted imaging acquired with 126 diffusion gradient directions and 0.3 mm³ isometric resolution at 10, 14, 18, 22, 26, and 40 weeks after birth. Diffusion weighted imaging was analyzed in two different ways: by regional characterization of diffusion tensor imaging (DTI) indices, and by assessing changes in structural brain network organization based on Q-Ball tractography. Already at the first evaluated times, DTI scalar maps revealed significant differences in many regions, suggesting loss of integrity in white and gray matter of spontaneously hypertensive rats when compared to normotensive control rats. In addition, graph theory analysis of the structural brain network demonstrated a significant decrease of hierarchical modularity and of global and local efficacy, with predictive value as shown by a regional three-fold cross-validation study. Moreover, these decreases were significantly correlated with the behavioral performance deficits observed at subsequent time points, suggesting that diffusion weighted imaging and connectivity studies can unravel neuroimaging alterations even before overt signs of cognitive impairment become apparent.
Abstract:
In the current study, we evaluated various robust statistical methods for comparing two independent groups. Two simulation scenarios were generated: one of equality and another of population mean differences. In each scenario, 33 experimental conditions were used as a function of sample size, standard deviation and asymmetry. For each condition, 5000 replications per group were generated. The results of this study show an adequate Type I error rate but not high power for the confidence intervals. In general, for the two scenarios studied (population mean differences and no population mean differences) and across the different conditions analysed, the Mann-Whitney U-test demonstrated strong performance, with the Yuen-Welch t-test performing slightly worse.