928 resultados para Uncertainty quantification
Resumo:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation or/and inference algorithms. Recent advances in machine learning offer a novel approach to model spatial distribution of petrophysical properties in complex reservoirs alternative to geostatistics. The approach is based of semisupervised learning, which handles both ?labelled? observed data and ?unlabelled? data, which have no measured value but describe prior knowledge and other relevant data in forms of manifolds in the input space where the modelled property is continuous. Proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe stochastic variability and non-uniqueness of spatial properties. On the other hand, it is able to capture and preserve key spatial dependencies such as connectivity of high permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR as a data driven algorithm is designed to integrate various kind of conditioning information and learn dependences from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, stochastic semi-supervised SVR geomodel is integrated into Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate posterior probability distribution. Uncertainty of the model is described by posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, smoothness/variability of spatial property distribution. The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Resumo:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation or/and inference algorithms. The paper considers a data driven approach in modelling uncertainty in spatial predictions. Proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and describe stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. Semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependences from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.
Resumo:
In groundwater applications, Monte Carlo methods are employed to model the uncertainty on geological parameters. However, their brute-force application becomes computationally prohibitive for highly detailed geological descriptions, complex physical processes, and a large number of realizations. The Distance Kernel Method (DKM) overcomes this issue by clustering the realizations in a multidimensional space based on the flow responses obtained by means of an approximate (computationally cheaper) model; then, the uncertainty is estimated from the exact responses that are computed only for one representative realization per cluster (the medoid). Usually, DKM is employed to decrease the size of the sample of realizations that are considered to estimate the uncertainty. We propose to use the information from the approximate responses for uncertainty quantification. The subset of exact solutions provided by DKM is then employed to construct an error model and correct the potential bias of the approximate model. Two error models are devised that both employ the difference between approximate and exact medoid solutions, but differ in the way medoid errors are interpolated to correct the whole set of realizations. The Local Error Model rests upon the clustering defined by DKM and can be seen as a natural way to account for intra-cluster variability; the Global Error Model employs a linear interpolation of all medoid errors regardless of the cluster to which the single realization belongs. These error models are evaluated for an idealized pollution problem in which the uncertainty of the breakthrough curve needs to be estimated. For this numerical test case, we demonstrate that the error models improve the uncertainty quantification provided by the DKM algorithm and are effective in correcting the bias of the estimate computed solely from the MsFV results. The framework presented here is not specific to the methods considered and can be applied to other combinations of approximate models and techniques to select a subset of realizations
Resumo:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
Resumo:
Approximate models (proxies) can be employed to reduce the computational costs of estimating uncertainty. The price to pay is that the approximations introduced by the proxy model can lead to a biased estimation. To avoid this problem and ensure a reliable uncertainty quantification, we propose to combine functional data analysis and machine learning to build error models that allow us to obtain an accurate prediction of the exact response without solving the exact model for all realizations. We build the relationship between proxy and exact model on a learning set of geostatistical realizations for which both exact and approximate solvers are run. Functional principal components analysis (FPCA) is used to investigate the variability in the two sets of curves and reduce the dimensionality of the problem while maximizing the retained information. Once obtained, the error model can be used to predict the exact response of any realization on the basis of the sole proxy response. This methodology is purpose-oriented as the error model is constructed directly for the quantity of interest, rather than for the state of the system. Also, the dimensionality reduction performed by FPCA allows a diagnostic of the quality of the error model to assess the informativeness of the learning set and the fidelity of the proxy to the exact model. The possibility of obtaining a prediction of the exact response for any newly generated realization suggests that the methodology can be effectively used beyond the context of uncertainty quantification, in particular for Bayesian inference and optimization.
Resumo:
Uncertainty quantification (UQ) is both an old and new concept. The current novelty lies in the interactions and synthesis of mathematical models, computer experiments, statistics, field/real experiments, and probability theory, with a particular emphasize on the large-scale simulations by computer models. The challenges not only come from the complication of scientific questions, but also from the size of the information. It is the focus in this thesis to provide statistical models that are scalable to massive data produced in computer experiments and real experiments, through fast and robust statistical inference.
Chapter 2 provides a practical approach for simultaneously emulating/approximating massive number of functions, with the application on hazard quantification of Soufri\`{e}re Hills volcano in Montserrate island. Chapter 3 discusses another problem with massive data, in which the number of observations of a function is large. An exact algorithm that is linear in time is developed for the problem of interpolation of Methylation levels. Chapter 4 and Chapter 5 are both about the robust inference of the models. Chapter 4 provides a new criteria robustness parameter estimation criteria and several ways of inference have been shown to satisfy such criteria. Chapter 5 develops a new prior that satisfies some more criteria and is thus proposed to use in practice.
Resumo:
In the framework of a global transition to a low-carbon energy mix, the interest in advanced nuclear Small Modular Reactors (SMRs) has been growing at the international level. Due to the high level of maturity reached by Severe Accident Codes for currently operating rectors, their applicability to advanced SMRs is starting to be studied. Within the present work of thesis and in the framework of a collaboration between ENEA, UNIBO and IRSN, an ASTEC code model of a generic IRIS reactor has been developed. The simulation of a DBA sequence involving the operation of all the passive safety systems of the generic IRIS has been carried out to investigate the code model capability in the prediction of the thermal-hydraulics characterizing an integral SMR adopting a passive mitigation strategy. The following simulation of 4 BDBAs sequences explores the applicability of Severe Accident Codes to advance SMRs in beyond-design and core-degradation conditions. The uncertainty affecting a code simulation can be estimated by using the method of Input Uncertainty Propagation, whose application has been realized through the RAVEN-ASTEC coupling and implementation on an HPC platform. This probabilistic methodology has been employed in a study of the uncertainty affecting the passive safety system operation in the DBA simulation of ASTEC, providing a further characterization of the thermal-hydraulics of this sequence. The application of the Uncertainty Quantification method to early core-melt phenomena has been investigated in the framework of a BEPU analysis of the ASTEC simulation of the QUENCH test-6 experiment. A possible solution to the encountered challenges has been proposed through the application of a Limit Surface search algorithm.
Resumo:
Despite the many models developed for phosphorus concentration prediction at differing spatial and temporal scales, there has been little effort to quantify uncertainty in their predictions. Model prediction uncertainty quantification is desirable, for informed decision-making in river-systems management. An uncertainty analysis of the process-based model, integrated catchment model of phosphorus (INCA-P), within the generalised likelihood uncertainty estimation (GLUE) framework is presented. The framework is applied to the Lugg catchment (1,077 km2), a River Wye tributary, on the England–Wales border. Daily discharge and monthly phosphorus (total reactive and total), for a limited number of reaches, are used to initially assess uncertainty and sensitivity of 44 model parameters, identified as being most important for discharge and phosphorus predictions. This study demonstrates that parameter homogeneity assumptions (spatial heterogeneity is treated as land use type fractional areas) can achieve higher model fits, than a previous expertly calibrated parameter set. The model is capable of reproducing the hydrology, but a threshold Nash-Sutcliffe co-efficient of determination (E or R 2) of 0.3 is not achieved when simulating observed total phosphorus (TP) data in the upland reaches or total reactive phosphorus (TRP) in any reach. Despite this, the model reproduces the general dynamics of TP and TRP, in point source dominated lower reaches. This paper discusses why this application of INCA-P fails to find any parameter sets, which simultaneously describe all observed data acceptably. The discussion focuses on uncertainty of readily available input data, and whether such process-based models should be used when there isn’t sufficient data to support the many parameters.
Resumo:
The anticipated growth of air traffic worldwide requires enhanced Air Traffic Management (ATM) technologies and procedures to increase the system capacity, efficiency, and resilience, while reducing environmental impact and maintaining operational safety. To deal with these challenges, new automation and information exchange capabilities are being developed through different modernisation initiatives toward a new global operational concept called Trajectory Based Operations (TBO), in which aircraft trajectory information becomes the cornerstone of advanced ATM applications. This transformation will lead to higher levels of system complexity requiring enhanced Decision Support Tools (DST) to aid humans in the decision making processes. These will rely on accurate predicted aircraft trajectories, provided by advanced Trajectory Predictors (TP). The trajectory prediction process is subject to stochastic effects that introduce uncertainty into the predictions. Regardless of the assumptions that define the aircraft motion model underpinning the TP, deviations between predicted and actual trajectories are unavoidable. This thesis proposes an innovative method to characterise the uncertainty associated with a trajectory prediction based on the mathematical theory of Polynomial Chaos Expansions (PCE). Assuming univariate PCEs of the trajectory prediction inputs, the method describes how to generate multivariate PCEs of the prediction outputs that quantify their associated uncertainty. Arbitrary PCE (aPCE) was chosen because it allows a higher degree of flexibility to model input uncertainty. The obtained polynomial description can be used in subsequent prediction sensitivity analyses thanks to the relationship between polynomial coefficients and Sobol indices. The Sobol indices enable ranking the input parameters according to their influence on trajectory prediction uncertainty. The applicability of the aPCE-based uncertainty quantification detailed herein is analysed through a study case. This study case represents a typical aircraft trajectory prediction problem in ATM, in which uncertain parameters regarding aircraft performance, aircraft intent description, weather forecast, and initial conditions are considered simultaneously. Numerical results are compared to those obtained from a Monte Carlo simulation, demonstrating the advantages of the proposed method. The thesis includes two examples of DSTs (Demand and Capacity Balancing tool, and Arrival Manager) to illustrate the potential benefits of exploiting the proposed uncertainty quantification method.
Resumo:
The aim of this work is to present a general overview of state-of-the-art related to design for uncertainty with a focus on aerospace structures. In particular, a simulation on a FCCZ lattice cell and on the profile shape of a nozzle will be performed. Optimization under uncertainty is characterized by the need to make decisions without complete knowledge of the problem data. When dealing with a complex problem, non-linearity, or optimization, two main issues are raised: the uncertainty of the feasibility of the solution and the uncertainty of the objective value of the function. In the first part, the Design Of Experiments (DOE) methodologies, Uncertainty Quantification (UQ), and then Uncertainty optimization will be deepened. The second part will show an application of the previous theories on through a commercial software. Nowadays multiobjective optimization on high non-linear problem can be a powerful tool to approach new concept solutions or to develop cutting-edge design. In this thesis an effective improvement have been reached on a rocket nozzle. Future work could include the introduction of multi scale modelling, multiphysics approach and every strategy useful to simulate as much possible real operative condition of the studied design.
Resumo:
O ensaio de dureza, e mais concretamente o ensaio de micro dureza Vickers, é no universo dos ensaios mecânicos um dos mais utilizados quer seja na indústria, no ensino ou na investigação e desenvolvimento de produto no âmbito das ciências dos materiais. Na grande maioria dos casos, a utilização deste ensaio tem como principal aplicação a caracterização ou controlo da qualidade de fabrico de materiais metálicos. Sendo um ensaio de relativa simplicidade de execução, rapidez e com resultados comparáveis e relacionáveis a outras grandezas físicas das propriedades dos materiais. Contudo, e tratando-se de um método de ensaio cuja intervenção humana é importante, na medição da indentação gerada por penetração mecânica através de um sistema ótico, não deixa de exibir algumas debilidades que daí advêm, como sendo o treino dos técnicos e respetivas acuidades visuais, fenómenos de fadiga visual que afetam os resultados ao longo de um turno de trabalho; ora estes fenómenos afetam a repetibilidade e reprodutibilidade dos resultados obtidos no ensaio. O CINFU possui um micro durómetro Vickers, cuja realização dos ensaios depende de um técnico treinado para a execução do mesmo, apresentando todas as debilidades já mencionadas e que o tornou elegível para o estudo e aplicação de uma solução alternativa. Assim, esta dissertação apresenta o desenvolvimento de uma solução alternativa ao método ótico convencional na medição de micro dureza Vickers. Utilizando programação em LabVIEW da National Instruments, juntamente com as ferramentas de visão computacional (NI Vision), o programa começa por solicitar ao técnico a seleção da câmara para aquisição da imagem digital acoplada ao micro durómetro, seleção do método de ensaio (Força de ensaio); posteriormente o programa efetua o tratamento da imagem (aplicação de filtros para eliminação do ruído de fundo da imagem original), segue-se, por indicação do operador, a zona de interesse (ROI) e por sua vez são identificadas automaticamente os vértices da calote e respetivas distâncias das diagonais geradas concluindo, após aceitação das mesmas, com o respetivo cálculo de micro dureza resultante. Para validação dos resultados foram utilizados blocos-padrão de dureza certificada (CRM), cujos resultados foram satisfatórios, tendo-se obtido um elevado nível de exatidão nas medições efetuadas. Por fim, desenvolveu-se uma folha de cálculo em Excel com a determinação da incerteza associada às medições de micro dureza Vickers. Foram então comparados os resultados nas duas metodologias possíveis, pelo método ótico convencional e pela utilização das ferramentas de visão computacional, tendo-se obtido bons resultados com a solução proposta.
Resumo:
L'utilisation efficace des systèmes géothermaux, la séquestration du CO2 pour limiter le changement climatique et la prévention de l'intrusion d'eau salée dans les aquifères costaux ne sont que quelques exemples qui démontrent notre besoin en technologies nouvelles pour suivre l'évolution des processus souterrains à partir de la surface. Un défi majeur est d'assurer la caractérisation et l'optimisation des performances de ces technologies à différentes échelles spatiales et temporelles. Les méthodes électromagnétiques (EM) d'ondes planes sont sensibles à la conductivité électrique du sous-sol et, par conséquent, à la conductivité électrique des fluides saturant la roche, à la présence de fractures connectées, à la température et aux matériaux géologiques. Ces méthodes sont régies par des équations valides sur de larges gammes de fréquences, permettant détudier de manières analogues des processus allant de quelques mètres sous la surface jusqu'à plusieurs kilomètres de profondeur. Néanmoins, ces méthodes sont soumises à une perte de résolution avec la profondeur à cause des propriétés diffusives du champ électromagnétique. Pour cette raison, l'estimation des modèles du sous-sol par ces méthodes doit prendre en compte des informations a priori afin de contraindre les modèles autant que possible et de permettre la quantification des incertitudes de ces modèles de façon appropriée. Dans la présente thèse, je développe des approches permettant la caractérisation statique et dynamique du sous-sol à l'aide d'ondes EM planes. Dans une première partie, je présente une approche déterministe permettant de réaliser des inversions répétées dans le temps (time-lapse) de données d'ondes EM planes en deux dimensions. Cette stratégie est basée sur l'incorporation dans l'algorithme d'informations a priori en fonction des changements du modèle de conductivité électrique attendus. Ceci est réalisé en intégrant une régularisation stochastique et des contraintes flexibles par rapport à la gamme des changements attendus en utilisant les multiplicateurs de Lagrange. J'utilise des normes différentes de la norme l2 pour contraindre la structure du modèle et obtenir des transitions abruptes entre les régions du model qui subissent des changements dans le temps et celles qui n'en subissent pas. Aussi, j'incorpore une stratégie afin d'éliminer les erreurs systématiques de données time-lapse. Ce travail a mis en évidence l'amélioration de la caractérisation des changements temporels par rapport aux approches classiques qui réalisent des inversions indépendantes à chaque pas de temps et comparent les modèles. Dans la seconde partie de cette thèse, j'adopte un formalisme bayésien et je teste la possibilité de quantifier les incertitudes sur les paramètres du modèle dans l'inversion d'ondes EM planes. Pour ce faire, je présente une stratégie d'inversion probabiliste basée sur des pixels à deux dimensions pour des inversions de données d'ondes EM planes et de tomographies de résistivité électrique (ERT) séparées et jointes. Je compare les incertitudes des paramètres du modèle en considérant différents types d'information a priori sur la structure du modèle et différentes fonctions de vraisemblance pour décrire les erreurs sur les données. Les résultats indiquent que la régularisation du modèle est nécessaire lorsqu'on a à faire à un large nombre de paramètres car cela permet d'accélérer la convergence des chaînes et d'obtenir des modèles plus réalistes. Cependent, ces contraintes mènent à des incertitudes d'estimations plus faibles, ce qui implique des distributions a posteriori qui ne contiennent pas le vrai modèledans les régions ou` la méthode présente une sensibilité limitée. Cette situation peut être améliorée en combinant des méthodes d'ondes EM planes avec d'autres méthodes complémentaires telles que l'ERT. De plus, je montre que le poids de régularisation des paramètres et l'écart-type des erreurs sur les données peuvent être retrouvés par une inversion probabiliste. Finalement, j'évalue la possibilité de caractériser une distribution tridimensionnelle d'un panache de traceur salin injecté dans le sous-sol en réalisant une inversion probabiliste time-lapse tridimensionnelle d'ondes EM planes. Etant donné que les inversions probabilistes sont très coûteuses en temps de calcul lorsque l'espace des paramètres présente une grande dimension, je propose une stratégie de réduction du modèle ou` les coefficients de décomposition des moments de Legendre du panache de traceur injecté ainsi que sa position sont estimés. Pour ce faire, un modèle de résistivité de base est nécessaire. Il peut être obtenu avant l'expérience time-lapse. Un test synthétique montre que la méthodologie marche bien quand le modèle de résistivité de base est caractérisé correctement. Cette méthodologie est aussi appliquée à un test de trac¸age par injection d'une solution saline et d'acides réalisé dans un système géothermal en Australie, puis comparée à une inversion time-lapse tridimensionnelle réalisée selon une approche déterministe. L'inversion probabiliste permet de mieux contraindre le panache du traceur salin gr^ace à la grande quantité d'informations a priori incluse dans l'algorithme. Néanmoins, les changements de conductivités nécessaires pour expliquer les changements observés dans les données sont plus grands que ce qu'expliquent notre connaissance actuelle des phénomenès physiques. Ce problème peut être lié à la qualité limitée du modèle de résistivité de base utilisé, indiquant ainsi que des efforts plus grands devront être fournis dans le futur pour obtenir des modèles de base de bonne qualité avant de réaliser des expériences dynamiques. Les études décrites dans cette thèse montrent que les méthodes d'ondes EM planes sont très utiles pour caractériser et suivre les variations temporelles du sous-sol sur de larges échelles. Les présentes approches améliorent l'évaluation des modèles obtenus, autant en termes d'incorporation d'informations a priori, qu'en termes de quantification d'incertitudes a posteriori. De plus, les stratégies développées peuvent être appliquées à d'autres méthodes géophysiques, et offrent une grande flexibilité pour l'incorporation d'informations additionnelles lorsqu'elles sont disponibles. -- The efficient use of geothermal systems, the sequestration of CO2 to mitigate climate change, and the prevention of seawater intrusion in coastal aquifers are only some examples that demonstrate the need for novel technologies to monitor subsurface processes from the surface. A main challenge is to assure optimal performance of such technologies at different temporal and spatial scales. Plane-wave electromagnetic (EM) methods are sensitive to subsurface electrical conductivity and consequently to fluid conductivity, fracture connectivity, temperature, and rock mineralogy. These methods have governing equations that are the same over a large range of frequencies, thus allowing to study in an analogous manner processes on scales ranging from few meters close to the surface down to several hundreds of kilometers depth. Unfortunately, they suffer from a significant resolution loss with depth due to the diffusive nature of the electromagnetic fields. Therefore, estimations of subsurface models that use these methods should incorporate a priori information to better constrain the models, and provide appropriate measures of model uncertainty. During my thesis, I have developed approaches to improve the static and dynamic characterization of the subsurface with plane-wave EM methods. In the first part of this thesis, I present a two-dimensional deterministic approach to perform time-lapse inversion of plane-wave EM data. The strategy is based on the incorporation of prior information into the inversion algorithm regarding the expected temporal changes in electrical conductivity. This is done by incorporating a flexible stochastic regularization and constraints regarding the expected ranges of the changes by using Lagrange multipliers. I use non-l2 norms to penalize the model update in order to obtain sharp transitions between regions that experience temporal changes and regions that do not. I also incorporate a time-lapse differencing strategy to remove systematic errors in the time-lapse inversion. This work presents improvements in the characterization of temporal changes with respect to the classical approach of performing separate inversions and computing differences between the models. In the second part of this thesis, I adopt a Bayesian framework and use Markov chain Monte Carlo (MCMC) simulations to quantify model parameter uncertainty in plane-wave EM inversion. For this purpose, I present a two-dimensional pixel-based probabilistic inversion strategy for separate and joint inversions of plane-wave EM and electrical resistivity tomography (ERT) data. I compare the uncertainties of the model parameters when considering different types of prior information on the model structure and different likelihood functions to describe the data errors. The results indicate that model regularization is necessary when dealing with a large number of model parameters because it helps to accelerate the convergence of the chains and leads to more realistic models. These constraints also lead to smaller uncertainty estimates, which imply posterior distributions that do not include the true underlying model in regions where the method has limited sensitivity. This situation can be improved by combining planewave EM methods with complimentary geophysical methods such as ERT. In addition, I show that an appropriate regularization weight and the standard deviation of the data errors can be retrieved by the MCMC inversion. Finally, I evaluate the possibility of characterizing the three-dimensional distribution of an injected water plume by performing three-dimensional time-lapse MCMC inversion of planewave EM data. Since MCMC inversion involves a significant computational burden in high parameter dimensions, I propose a model reduction strategy where the coefficients of a Legendre moment decomposition of the injected water plume and its location are estimated. For this purpose, a base resistivity model is needed which is obtained prior to the time-lapse experiment. A synthetic test shows that the methodology works well when the base resistivity model is correctly characterized. The methodology is also applied to an injection experiment performed in a geothermal system in Australia, and compared to a three-dimensional time-lapse inversion performed within a deterministic framework. The MCMC inversion better constrains the water plumes due to the larger amount of prior information that is included in the algorithm. The conductivity changes needed to explain the time-lapse data are much larger than what is physically possible based on present day understandings. This issue may be related to the base resistivity model used, therefore indicating that more efforts should be given to obtain high-quality base models prior to dynamic experiments. The studies described herein give clear evidence that plane-wave EM methods are useful to characterize and monitor the subsurface at a wide range of scales. The presented approaches contribute to an improved appraisal of the obtained models, both in terms of the incorporation of prior information in the algorithms and the posterior uncertainty quantification. In addition, the developed strategies can be applied to other geophysical methods, and offer great flexibility to incorporate additional information when available.
Resumo:
Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.
Resumo:
Cette thèse porte sur l’évaluation de la cohérence du réseau conceptuel démontré par des étudiants de niveau collégial inscrits en sciences de la nature. L’évaluation de cette cohérence s’est basée sur l’analyse des tableaux de Burt issus des réponses à des questionnaires à choix multiples, sur l’étude détaillée des indices de discrimination spécifique qui seront décrits plus en détail dans le corps de l’ouvrage et sur l’analyse de séquences vidéos d’étudiants effectuant une expérimentation en contexte réel. Au terme de ce projet, quatre grands axes de recherche ont été exploré. 1) Quelle est la cohérence conceptuelle démontrée en physique newtonienne ? 2) Est-ce que la maîtrise du calcul d’incertitude est corrélée au développement de la pensée logique ou à la maîtrise des mathématiques ? 3) Quelle est la cohérence conceptuelle démontrée dans la quantification de l’incertitude expérimentale ? 4) Quelles sont les procédures concrètement mise en place par des étudiants pour quantifier l’incertitude expérimentale dans un contexte de laboratoire semi-dirigé ? Les principales conclusions qui ressortent pour chacun des axes peuvent se formuler ainsi. 1) Les conceptions erronées les plus répandues ne sont pas solidement ancrées dans un réseau conceptuel rigide. Par exemple, un étudiant réussissant une question sur la troisième loi de Newton (sujet le moins bien réussi du Force Concept Inventory) montre une probabilité à peine supérieure de réussir une autre question sur ce même sujet que les autres participants. De nombreux couples de questions révèlent un indice de discrimination spécifique négatif indiquant une faible cohérence conceptuelle en prétest et une cohérence conceptuelle légèrement améliorée en post-test. 2) Si une petite proportion des étudiants ont montré des carences marquées pour les questions reliées au contrôle des variables et à celles traitant de la relation entre la forme graphique de données expérimentales et un modèle mathématique, la majorité des étudiants peuvent être considérés comme maîtrisant adéquatement ces deux sujets. Toutefois, presque tous les étudiants démontrent une absence de maîtrise des principes sous-jacent à la quantification de l’incertitude expérimentale et de la propagation des incertitudes (ci-après appelé métrologie). Aucune corrélation statistiquement significative n’a été observée entre ces trois domaines, laissant entendre qu’il s’agit d’habiletés cognitives largement indépendantes. Le tableau de Burt a pu mettre en lumière une plus grande cohérence conceptuelle entre les questions de contrôle des variables que n’aurait pu le laisser supposer la matrice des coefficients de corrélation de Pearson. En métrologie, des questions équivalentes n’ont pas fait ressortir une cohérence conceptuelle clairement démontrée. 3) L’analyse d’un questionnaire entièrement dédié à la métrologie laisse entrevoir des conceptions erronées issues des apprentissages effectués dans les cours antérieurs (obstacles didactiques), des conceptions erronées basées sur des modèles intuitifs et une absence de compréhension globale des concepts métrologiques bien que certains concepts paraissent en voie d’acquisition. 4) Lorsque les étudiants sont laissés à eux-mêmes, les mêmes difficultés identifiées par l’analyse du questionnaire du point 3) reviennent ce qui corrobore les résultats obtenus. Cependant, nous avons pu observer d’autres comportements reliés à la mesure en laboratoire qui n’auraient pas pu être évalués par le questionnaire à choix multiples. Des entretiens d’explicitations tenus immédiatement après chaque séance ont permis aux participants de détailler certains aspects de leur méthodologie métrologique, notamment, l’emploi de procédures de répétitions de mesures expérimentales, leurs stratégies pour quantifier l’incertitude et les raisons sous-tendant l’estimation numérique des incertitudes de lecture. L’emploi des algorithmes de propagation des incertitudes a été adéquat dans l’ensemble. De nombreuses conceptions erronées en métrologie semblent résister fortement à l’apprentissage. Notons, entre autres, l’assignation de la résolution d’un appareil de mesure à affichage numérique comme valeur de l’incertitude et l’absence de procédures d’empilement pour diminuer l’incertitude. La conception que la précision d’une valeur numérique ne peut être inférieure à la tolérance d’un appareil semble fermement ancrée.