Biblioteca Digital

6 resultados para testing method

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna

Response times in computerized adaptive testing: a method for cheating detection

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In the field of educational and psychological measurement, the shift from paper-based to computerized tests has become a prominent trend in recent years. Computerized tests allow for more complex and personalized test administration procedures, like Computerized Adaptive Testing (CAT). CAT, following the Item Response Theory (IRT) models, dynamically generates tests based on test-taker responses, driven by complex statistical algorithms. Even if CAT structures are complex, they are flexible and convenient, but concerns about test security should be addressed. Frequent item administration can lead to item exposure and cheating, necessitating preventive and diagnostic measures. In this thesis a method called "CHeater identification using Interim Person fit Statistic" (CHIPS) is developed, designed to identify and limit cheaters in real-time during test administration. CHIPS utilizes response times (RTs) to calculate an Interim Person fit Statistic (IPS), allowing for on-the-fly intervention using a more secret item bank. Also, a slight modification is proposed to overcome situations with constant speed, called Modified-CHIPS (M-CHIPS). A simulation study assesses CHIPS, highlighting its effectiveness in identifying and controlling cheaters. However, it reveals limitations when cheaters possess all correct answers. The M-CHIPS overcame this limitation. Furthermore, the method has shown not to be influenced by the cheaters’ ability distribution or the level of correlation between ability and speed of test-takers. Finally, the method has demonstrated flexibility for the choice of significance level and the transition from fixed-length tests to variable-length ones. The thesis discusses potential applications, including the suitability of the method for multiple-choice tests, assumptions about RT distribution and level of item pre-knowledge. Also limitations are discussed to explore future developments such as different RT distributions, unusual honest respondent behaviors, and field testing in real-world scenarios. In summary, CHIPS and M-CHIPS offer real-time cheating detection in CAT, enhancing test security and ability estimation while not penalizing test respondents.

Veja mais

Multiple testing in spatial epidemiology: a Bayesian approach

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this work we aim to propose a new approach for preliminary epidemiological studies on Standardized Mortality Ratios (SMR) collected in many spatial regions. A preliminary study on SMRs aims to formulate hypotheses to be investigated via individual epidemiological studies that avoid bias carried on by aggregated analyses. Starting from collecting disease counts and calculating expected disease counts by means of reference population disease rates, in each area an SMR is derived as the MLE under the Poisson assumption on each observation. Such estimators have high standard errors in small areas, i.e. where the expected count is low either because of the low population underlying the area or the rarity of the disease under study. Disease mapping models and other techniques for screening disease rates among the map aiming to detect anomalies and possible high-risk areas have been proposed in literature according to the classic and the Bayesian paradigm. Our proposal is approaching this issue by a decision-oriented method, which focus on multiple testing control, without however leaving the preliminary study perspective that an analysis on SMR indicators is asked to. We implement the control of the FDR, a quantity largely used to address multiple comparisons problems in the eld of microarray data analysis but which is not usually employed in disease mapping. Controlling the FDR means providing an estimate of the FDR for a set of rejected null hypotheses. The small areas issue arises diculties in applying traditional methods for FDR estimation, that are usually based only on the p-values knowledge (Benjamini and Hochberg, 1995; Storey, 2003). Tests evaluated by a traditional p-value provide weak power in small areas, where the expected number of disease cases is small. Moreover tests cannot be assumed as independent when spatial correlation between SMRs is expected, neither they are identical distributed when population underlying the map is heterogeneous. The Bayesian paradigm oers a way to overcome the inappropriateness of p-values based methods. Another peculiarity of the present work is to propose a hierarchical full Bayesian model for FDR estimation in testing many null hypothesis of absence of risk.We will use concepts of Bayesian models for disease mapping, referring in particular to the Besag York and Mollié model (1991) often used in practice for its exible prior assumption on the risks distribution across regions. The borrowing of strength between prior and likelihood typical of a hierarchical Bayesian model takes the advantage of evaluating a singular test (i.e. a test in a singular area) by means of all observations in the map under study, rather than just by means of the singular observation. This allows to improve the power test in small areas and addressing more appropriately the spatial correlation issue that suggests that relative risks are closer in spatially contiguous regions. The proposed model aims to estimate the FDR by means of the MCMC estimated posterior probabilities b i's of the null hypothesis (absence of risk) for each area. An estimate of the expected FDR conditional on data (\FDR) can be calculated in any set of b i's relative to areas declared at high-risk (where thenull hypothesis is rejected) by averaging the b i's themselves. The\FDR can be used to provide an easy decision rule for selecting high-risk areas, i.e. selecting as many as possible areas such that the\FDR is non-lower than a prexed value; we call them\FDR based decision (or selection) rules. The sensitivity and specicity of such rule depend on the accuracy of the FDR estimate, the over-estimation of FDR causing a loss of power and the under-estimation of FDR producing a loss of specicity. Moreover, our model has the interesting feature of still being able to provide an estimate of relative risk values as in the Besag York and Mollié model (1991). A simulation study to evaluate the model performance in FDR estimation accuracy, sensitivity and specificity of the decision rule, and goodness of estimation of relative risks, was set up. We chose a real map from which we generated several spatial scenarios whose counts of disease vary according to the spatial correlation degree, the size areas, the number of areas where the null hypothesis is true and the risk level in the latter areas. In summarizing simulation results we will always consider the FDR estimation in sets constituted by all b i's selected lower than a threshold t. We will show graphs of the\FDR and the true FDR (known by simulation) plotted against a threshold t to assess the FDR estimation. Varying the threshold we can learn which FDR values can be accurately estimated by the practitioner willing to apply the model (by the closeness between\FDR and true FDR). By plotting the calculated sensitivity and specicity (both known by simulation) vs the\FDR we can check the sensitivity and specicity of the corresponding\FDR based decision rules. For investigating the over-smoothing level of relative risk estimates we will compare box-plots of such estimates in high-risk areas (known by simulation), obtained by both our model and the classic Besag York Mollié model. All the summary tools are worked out for all simulated scenarios (in total 54 scenarios). Results show that FDR is well estimated (in the worst case we get an overestimation, hence a conservative FDR control) in small areas, low risk levels and spatially correlated risks scenarios, that are our primary aims. In such scenarios we have good estimates of the FDR for all values less or equal than 0.10. The sensitivity of\FDR based decision rules is generally low but specicity is high. In such scenario the use of\FDR = 0:05 or\FDR = 0:10 based selection rule can be suggested. In cases where the number of true alternative hypotheses (number of true high-risk areas) is small, also FDR = 0:15 values are well estimated, and \FDR = 0:15 based decision rules gains power maintaining an high specicity. On the other hand, in non-small areas and non-small risk level scenarios the FDR is under-estimated unless for very small values of it (much lower than 0.05); this resulting in a loss of specicity of a\FDR = 0:05 based decision rule. In such scenario\FDR = 0:05 or, even worse,\FDR = 0:1 based decision rules cannot be suggested because the true FDR is actually much higher. As regards the relative risk estimation, our model achieves almost the same results of the classic Besag York Molliè model. For this reason, our model is interesting for its ability to perform both the estimation of relative risk values and the FDR control, except for non-small areas and large risk level scenarios. A case of study is nally presented to show how the method can be used in epidemiology.

Veja mais

Astronomical site testing in the era of the extremely large telescopes

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The quality of astronomical sites is the first step to be considered to have the best performances from the telescopes. In particular, the efficiency of large telescopes in UV, IR, radio etc. is critically dependent on atmospheric transparency. It is well known that the random optical effects induced on the light propagation by turbulent atmosphere also limit telescope’s performances. Nowadays, clear appears the importance to correlate the main atmospheric physical parameters with the optical quality reachable by large aperture telescopes. The sky quality evaluation improved with the introduction of new techniques, new instrumentations and with the understanding of the link between the meteorological (or synoptical parameters and the observational conditions thanks to the application of the theories of electromagnetic waves propagation in turbulent medias: what we actually call astroclimatology. At the present the site campaigns are evolved and are performed using the classical scheme of optical seeing properties, meteorological parameters, sky transparency, sky darkness and cloudiness. New concept are added and are related to the geophysical properties such as seismicity, microseismicity, local variability of the climate, atmospheric conditions related to the ground optical turbulence and ground wind regimes, aerosol presence, use of satellite data. The purpose of this project is to provide reliable methods to analyze the atmospheric properties that affect ground-based optical astronomical observations and to correlate them with the main atmospheric parameters generating turbulence and affecting the photometric accuracy. The first part of the research concerns the analysis and interpretation of longand short-time scale meteorological data at two of the most important astronomical sites located in very different environments: the Paranal Observatory in the Atacama Desert (Chile), and the Observatorio del Roque de Los Muchachos(ORM) located in La Palma (Canary Islands, Spain). The optical properties of airborne dust at ORM have been investigated collecting outdoor data using a ground-based dust monitor. Because of its dryness, Paranal is a suitable observatory for near-IR observations, thus the extinction properties in the spectral range 1.00-2.30 um have been investigated using an empirical method. Furthermore, this PhD research has been developed using several turbulence profilers in the selection of the site for the European Extremely Large Telescope(E-ELT). During the campaigns the properties of the turbulence at different heights at Paranal and in the sites located in northern Chile and Argentina have been studied. This given the possibility to characterize the surface layer turbulence at Paranal and its connection with local meteorological conditions.

Veja mais

Testing future weak lensing surveys through simulations of observations

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Weak lensing experiments such as the future ESA-accepted mission Euclid aim to measure cosmological parameters with unprecedented accuracy. It is important to assess the precision that can be obtained in these measurements by applying analysis software on mock images that contain many sources of noise present in the real data. In this Thesis, we show a method to perform simulations of observations, that produce realistic images of the sky according to characteristics of the instrument and of the survey. We then use these images to test the performances of the Euclid mission. In particular, we concentrate on the precision of the photometric redshift measurements, which are key data to perform cosmic shear tomography. We calculate the fraction of the total observed sample that must be discarded to reach the required level of precision, that is equal to 0.05(1+z) for a galaxy with measured redshift z, with different ancillary ground-based observations. The results highlight the importance of u-band observations, especially to discriminate between low (z < 0.5) and high (z ~ 3) redshifts, and the need for good observing sites, with seeing FWHM < 1. arcsec. We then construct an optimal filter to detect galaxy clusters through photometric catalogues of galaxies, and we test it on the COSMOS field, obtaining 27 lensing-confirmed detections. Applying this algorithm on mock Euclid data, we verify the possibility to detect clusters with mass above 10^14.2 solar masses with a low rate of false detections.

Veja mais

Earthquake forecasting and seismic hazard analysis: some insights on the testing phase and the modeling

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis is divided in three chapters. In the first chapter we analyse the results of the world forecasting experiment run by the Collaboratory for the Study of Earthquake Predictability (CSEP). We take the opportunity of this experiment to contribute to the definition of a more robust and reliable statistical procedure to evaluate earthquake forecasting models. We first present the models and the target earthquakes to be forecast. Then we explain the consistency and comparison tests that are used in CSEP experiments to evaluate the performance of the models. Introducing a methodology to create ensemble forecasting models, we show that models, when properly combined, are almost always better performing that any single model. In the second chapter we discuss in depth one of the basic features of PSHA: the declustering of the seismicity rates. We first introduce the Cornell-McGuire method for PSHA and we present the different motivations that stand behind the need of declustering seismic catalogs. Using a theorem of the modern probability (Le Cam's theorem) we show that the declustering is not necessary to obtain a Poissonian behaviour of the exceedances that is usually considered fundamental to transform exceedance rates in exceedance probabilities in the PSHA framework. We present a method to correct PSHA for declustering, building a more realistic PSHA. In the last chapter we explore the methods that are commonly used to take into account the epistemic uncertainty in PSHA. The most widely used method is the logic tree that stands at the basis of the most advanced seismic hazard maps. We illustrate the probabilistic structure of the logic tree, and then we show that this structure is not adequate to describe the epistemic uncertainty. We then propose a new probabilistic framework based on the ensemble modelling that properly accounts for epistemic uncertainties in PSHA.

Veja mais

Supporting requirement elicitation and ontology testing in knowledge graph engineering

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Knowledge graphs and ontologies are closely related concepts in the field of knowledge representation. In recent years, knowledge graphs have gained increasing popularity and are serving as essential components in many knowledge engineering projects that view them as crucial to their success. The conceptual foundation of the knowledge graph is provided by ontologies. Ontology modeling is an iterative engineering process that consists of steps such as the elicitation and formalization of requirements, the development, testing, refactoring, and release of the ontology. The testing of the ontology is a crucial and occasionally overlooked step of the process due to the lack of integrated tools to support it. As a result of this gap in the state-of-the-art, the testing of the ontology is completed manually, which requires a considerable amount of time and effort from the ontology engineers. The lack of tool support is noticed in the requirement elicitation process as well. In this aspect, the rise in the adoption and accessibility of knowledge graphs allows for the development and use of automated tools to assist with the elicitation of requirements from such a complementary source of data. Therefore, this doctoral research is focused on developing methods and tools that support the requirement elicitation and testing steps of an ontology engineering process. To support the testing of the ontology, we have developed XDTesting, a web application that is integrated with the GitHub platform that serves as an ontology testing manager. Concurrently, to support the elicitation and documentation of competency questions, we have defined and implemented RevOnt, a method to extract competency questions from knowledge graphs. Both methods are evaluated through their implementation and the results are promising.

Veja mais

6 resultados para testing method

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna

Filtro por publicador

Response times in computerized adaptive testing: a method for cheating detection

Multiple testing in spatial epidemiology: a Bayesian approach

Astronomical site testing in the era of the extremely large telescopes

Testing future weak lensing surveys through simulations of observations

Earthquake forecasting and seismic hazard analysis: some insights on the testing phase and the modeling

Supporting requirement elicitation and ontology testing in knowledge graph engineering