111 resultados para Hidden Markov random fields
Resumo:
Des progrès significatifs ont été réalisés dans le domaine de l'intégration quantitative des données géophysique et hydrologique l'échelle locale. Cependant, l'extension à de plus grandes échelles des approches correspondantes constitue encore un défi majeur. Il est néanmoins extrêmement important de relever ce défi pour développer des modèles fiables de flux des eaux souterraines et de transport de contaminant. Pour résoudre ce problème, j'ai développé une technique d'intégration des données hydrogéophysiques basée sur une procédure bayésienne de simulation séquentielle en deux étapes. Cette procédure vise des problèmes à plus grande échelle. L'objectif est de simuler la distribution d'un paramètre hydraulique cible à partir, d'une part, de mesures d'un paramètre géophysique pertinent qui couvrent l'espace de manière exhaustive, mais avec une faible résolution (spatiale) et, d'autre part, de mesures locales de très haute résolution des mêmes paramètres géophysique et hydraulique. Pour cela, mon algorithme lie dans un premier temps les données géophysiques de faible et de haute résolution à travers une procédure de réduction déchelle. Les données géophysiques régionales réduites sont ensuite reliées au champ du paramètre hydraulique à haute résolution. J'illustre d'abord l'application de cette nouvelle approche dintégration des données à une base de données synthétiques réaliste. Celle-ci est constituée de mesures de conductivité hydraulique et électrique de haute résolution réalisées dans les mêmes forages ainsi que destimations des conductivités électriques obtenues à partir de mesures de tomographic de résistivité électrique (ERT) sur l'ensemble de l'espace. Ces dernières mesures ont une faible résolution spatiale. La viabilité globale de cette méthode est testée en effectuant les simulations de flux et de transport au travers du modèle original du champ de conductivité hydraulique ainsi que du modèle simulé. Les simulations sont alors comparées. Les résultats obtenus indiquent que la procédure dintégration des données proposée permet d'obtenir des estimations de la conductivité en adéquation avec la structure à grande échelle ainsi que des predictions fiables des caractéristiques de transports sur des distances de moyenne à grande échelle. Les résultats correspondant au scénario de terrain indiquent que l'approche d'intégration des données nouvellement mise au point est capable d'appréhender correctement les hétérogénéitées à petite échelle aussi bien que les tendances à gande échelle du champ hydraulique prévalent. Les résultats montrent également une flexibilté remarquable et une robustesse de cette nouvelle approche dintégration des données. De ce fait, elle est susceptible d'être appliquée à un large éventail de données géophysiques et hydrologiques, à toutes les gammes déchelles. Dans la deuxième partie de ma thèse, j'évalue en détail la viabilité du réechantillonnage geostatique séquentiel comme mécanisme de proposition pour les méthodes Markov Chain Monte Carlo (MCMC) appliquées à des probmes inverses géophysiques et hydrologiques de grande dimension . L'objectif est de permettre une quantification plus précise et plus réaliste des incertitudes associées aux modèles obtenus. En considérant une série dexemples de tomographic radar puits à puits, j'étudie deux classes de stratégies de rééchantillonnage spatial en considérant leur habilité à générer efficacement et précisément des réalisations de la distribution postérieure bayésienne. Les résultats obtenus montrent que, malgré sa popularité, le réechantillonnage séquentiel est plutôt inefficace à générer des échantillons postérieurs indépendants pour des études de cas synthétiques réalistes, notamment pour le cas assez communs et importants où il existe de fortes corrélations spatiales entre le modèle et les paramètres. Pour résoudre ce problème, j'ai développé un nouvelle approche de perturbation basée sur une déformation progressive. Cette approche est flexible en ce qui concerne le nombre de paramètres du modèle et lintensité de la perturbation. Par rapport au rééchantillonage séquentiel, cette nouvelle approche s'avère être très efficace pour diminuer le nombre requis d'itérations pour générer des échantillons indépendants à partir de la distribution postérieure bayésienne. - Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending corresponding approaches beyond the local scale still represents a major challenge, yet is critically important for the development of reliable groundwater flow and contaminant transport models. To address this issue, I have developed a hydrogeophysical data integration technique based on a two-step Bayesian sequential simulation procedure that is specifically targeted towards larger-scale problems. The objective is to simulate the distribution of a target hydraulic parameter based on spatially exhaustive, but poorly resolved, measurements of a pertinent geophysical parameter and locally highly resolved, but spatially sparse, measurements of the considered geophysical and hydraulic parameters. To this end, my algorithm links the low- and high-resolution geophysical data via a downscaling procedure before relating the downscaled regional-scale geophysical data to the high-resolution hydraulic parameter field. I first illustrate the application of this novel data integration approach to a realistic synthetic database consisting of collocated high-resolution borehole measurements of the hydraulic and electrical conductivities and spatially exhaustive, low-resolution electrical conductivity estimates obtained from electrical resistivity tomography (ERT). The overall viability of this method is tested and verified by performing and comparing flow and transport simulations through the original and simulated hydraulic conductivity fields. The corresponding results indicate that the proposed data integration procedure does indeed allow for obtaining faithful estimates of the larger-scale hydraulic conductivity structure and reliable predictions of the transport characteristics over medium- to regional-scale distances. The approach is then applied to a corresponding field scenario consisting of collocated high- resolution measurements of the electrical conductivity, as measured using a cone penetrometer testing (CPT) system, and the hydraulic conductivity, as estimated from electromagnetic flowmeter and slug test measurements, in combination with spatially exhaustive low-resolution electrical conductivity estimates obtained from surface-based electrical resistivity tomography (ERT). The corresponding results indicate that the newly developed data integration approach is indeed capable of adequately capturing both the small-scale heterogeneity as well as the larger-scale trend of the prevailing hydraulic conductivity field. The results also indicate that this novel data integration approach is remarkably flexible and robust and hence can be expected to be applicable to a wide range of geophysical and hydrological data at all scale ranges. In the second part of my thesis, I evaluate in detail the viability of sequential geostatistical resampling as a proposal mechanism for Markov Chain Monte Carlo (MCMC) methods applied to high-dimensional geophysical and hydrological inverse problems in order to allow for a more accurate and realistic quantification of the uncertainty associated with the thus inferred models. Focusing on a series of pertinent crosshole georadar tomographic examples, I investigated two classes of geostatistical resampling strategies with regard to their ability to efficiently and accurately generate independent realizations from the Bayesian posterior distribution. The corresponding results indicate that, despite its popularity, sequential resampling is rather inefficient at drawing independent posterior samples for realistic synthetic case studies, notably for the practically common and important scenario of pronounced spatial correlation between model parameters. To address this issue, I have developed a new gradual-deformation-based perturbation approach, which is flexible with regard to the number of model parameters as well as the perturbation strength. Compared to sequential resampling, this newly proposed approach was proven to be highly effective in decreasing the number of iterations required for drawing independent samples from the Bayesian posterior distribution.
Resumo:
In this paper, we study the average crossing number of equilateral random walks and polygons. We show that the mean average crossing number ACN of all equilateral random walks of length n is of the form . A similar result holds for equilateral random polygons. These results are confirmed by our numerical studies. Furthermore, our numerical studies indicate that when random polygons of length n are divided into individual knot types, the for each knot type can be described by a function of the form where a, b and c are constants depending on and n0 is the minimal number of segments required to form . The profiles diverge from each other, with more complex knots showing higher than less complex knots. Moreover, the profiles intersect with the ACN profile of all closed walks. These points of intersection define the equilibrium length of , i.e., the chain length at which a statistical ensemble of configurations with given knot type -upon cutting, equilibration and reclosure to a new knot type -does not show a tendency to increase or decrease . This concept of equilibrium length seems to be universal, and applies also to other length-dependent observables for random knots, such as the mean radius of gyration Rg.
Resumo:
This paper presents general problems and approaches for the spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of the machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most of the machines learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in a high dimensional geo-feature spaces, when the dimension of space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of models concerns considering of real space constrains like geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve modelling of environmental phenomena taking into account on geo-manifolds. An important part of the study deals with the analysis of relevant variables and models' inputs. This problem is approached by using different feature selection/feature extraction nonlinear tools. To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Resumo:
The quantity of interest for high-energy photon beam therapy recommended by most dosimetric protocols is the absorbed dose to water. Thus, ionization chambers are calibrated in absorbed dose to water, which is the same quantity as what is calculated by most treatment planning systems (TPS). However, when measurements are performed in a low-density medium, the presence of the ionization chamber generates a perturbation at the level of the secondary particle range. Therefore, the measured quantity is close to the absorbed dose to a volume of water equivalent to the chamber volume. This quantity is not equivalent to the dose calculated by a TPS, which is the absorbed dose to an infinitesimally small volume of water. This phenomenon can lead to an overestimation of the absorbed dose measured with an ionization chamber of up to 40% in extreme cases. In this paper, we propose a method to calculate correction factors based on the Monte Carlo simulations. These correction factors are obtained by the ratio of the absorbed dose to water in a low-density medium □D(w,Q,V1)(low) averaged over a scoring volume V₁ for a geometry where V₁ is filled with the low-density medium and the absorbed dose to water □D(w,QV2)(low) averaged over a volume V₂ for a geometry where V₂ is filled with water. In the Monte Carlo simulations, □D(w,QV2)(low) is obtained by replacing the volume of the ionization chamber by an equivalent volume of water, according to the definition of the absorbed dose to water. The method is validated in two different configurations which allowed us to study the behavior of this correction factor as a function of depth in phantom, photon beam energy, phantom density and field size.
Resumo:
OBJECTIVES: To determine whether nalmefene combined with psychosocial support is cost-effective compared with psychosocial support alone for reducing alcohol consumption in alcohol-dependent patients with high/very high drinking risk levels (DRLs) as defined by the WHO, and to evaluate the public health benefit of reducing harmful alcohol-attributable diseases, injuries and deaths. DESIGN: Decision modelling using Markov chains compared costs and effects over 5 years. SETTING: The analysis was from the perspective of the National Health Service (NHS) in England and Wales. PARTICIPANTS: The model considered the licensed population for nalmefene, specifically adults with both alcohol dependence and high/very high DRLs, who do not require immediate detoxification and who continue to have high/very high DRLs after initial assessment. DATA SOURCES: We modelled treatment effect using data from three clinical trials for nalmefene (ESENSE 1 (NCT00811720), ESENSE 2 (NCT00812461) and SENSE (NCT00811941)). Baseline characteristics of the model population, treatment resource utilisation and utilities were from these trials. We estimated the number of alcohol-attributable events occurring at different levels of alcohol consumption based on published epidemiological risk-relation studies. Health-related costs were from UK sources. MAIN OUTCOME MEASURES: We measured incremental cost per quality-adjusted life year (QALY) gained and number of alcohol-attributable harmful events avoided. RESULTS: Nalmefene in combination with psychosocial support had an incremental cost-effectiveness ratio (ICER) of £5204 per QALY gained, and was therefore cost-effective at the £20,000 per QALY gained decision threshold. Sensitivity analyses showed that the conclusion was robust. Nalmefene plus psychosocial support led to the avoidance of 7179 alcohol-attributable diseases/injuries and 309 deaths per 100,000 patients compared to psychosocial support alone over the course of 5 years. CONCLUSIONS: Nalmefene can be seen as a cost-effective treatment for alcohol dependence, with substantial public health benefits. TRIAL REGISTRATION NUMBERS: This cost-effectiveness analysis was developed based on data from three randomised clinical trials: ESENSE 1 (NCT00811720), ESENSE 2 (NCT00812461) and SENSE (NCT00811941).
Resumo:
Cette thèse s'intéresse à étudier les propriétés extrémales de certains modèles de risque d'intérêt dans diverses applications de l'assurance, de la finance et des statistiques. Cette thèse se développe selon deux axes principaux, à savoir: Dans la première partie, nous nous concentrons sur deux modèles de risques univariés, c'est-à- dire, un modèle de risque de déflation et un modèle de risque de réassurance. Nous étudions le développement des queues de distribution sous certaines conditions des risques commun¬s. Les principaux résultats sont ainsi illustrés par des exemples typiques et des simulations numériques. Enfin, les résultats sont appliqués aux domaines des assurances, par exemple, les approximations de Value-at-Risk, d'espérance conditionnelle unilatérale etc. La deuxième partie de cette thèse est consacrée à trois modèles à deux variables: Le premier modèle concerne la censure à deux variables des événements extrême. Pour ce modèle, nous proposons tout d'abord une classe d'estimateurs pour les coefficients de dépendance et la probabilité des queues de distributions. Ces estimateurs sont flexibles en raison d'un paramètre de réglage. Leurs distributions asymptotiques sont obtenues sous certaines condi¬tions lentes bivariées de second ordre. Ensuite, nous donnons quelques exemples et présentons une petite étude de simulations de Monte Carlo, suivie par une application sur un ensemble de données réelles d'assurance. L'objectif de notre deuxième modèle de risque à deux variables est l'étude de coefficients de dépendance des queues de distributions obliques et asymétriques à deux variables. Ces distri¬butions obliques et asymétriques sont largement utiles dans les applications statistiques. Elles sont générées principalement par le mélange moyenne-variance de lois normales et le mélange de lois normales asymétriques d'échelles, qui distinguent la structure de dépendance de queue comme indiqué par nos principaux résultats. Le troisième modèle de risque à deux variables concerne le rapprochement des maxima de séries triangulaires elliptiques obliques. Les résultats théoriques sont fondés sur certaines hypothèses concernant le périmètre aléatoire sous-jacent des queues de distributions. -- This thesis aims to investigate the extremal properties of certain risk models of interest in vari¬ous applications from insurance, finance and statistics. This thesis develops along two principal lines, namely: In the first part, we focus on two univariate risk models, i.e., deflated risk and reinsurance risk models. Therein we investigate their tail expansions under certain tail conditions of the common risks. Our main results are illustrated by some typical examples and numerical simu¬lations as well. Finally, the findings are formulated into some applications in insurance fields, for instance, the approximations of Value-at-Risk, conditional tail expectations etc. The second part of this thesis is devoted to the following three bivariate models: The first model is concerned with bivariate censoring of extreme events. For this model, we first propose a class of estimators for both tail dependence coefficient and tail probability. These estimators are flexible due to a tuning parameter and their asymptotic distributions are obtained under some second order bivariate slowly varying conditions of the model. Then, we give some examples and present a small Monte Carlo simulation study followed by an application on a real-data set from insurance. The objective of our second bivariate risk model is the investigation of tail dependence coefficient of bivariate skew slash distributions. Such skew slash distributions are extensively useful in statistical applications and they are generated mainly by normal mean-variance mixture and scaled skew-normal mixture, which distinguish the tail dependence structure as shown by our principle results. The third bivariate risk model is concerned with the approximation of the component-wise maxima of skew elliptical triangular arrays. The theoretical results are based on certain tail assumptions on the underlying random radius.
Resumo:
Navigation by means of cognitive maps appears to require the hippocampus; hippocampal place cells (PCs) appear to store spatial memories because their discharge is confined to cell-specific places called firing fields (FFs). Experiments with rats manipulated idiothetic and landmark-related information to understand the relationship between PC activity and spatial rotation. Rotating a circular arena in the caused a discrepancy between these cuse. This discrepancy caused most FFs to disappear in both the arena and room reference frames. However, FFs persisted in the rotating arena frame when the discrepancy was reduced by darkness or by a card in the arena. The discrepancy was increased by "field clamping" the rat in a room-defined FF location by rotations that countered its locomotion. Most FFs disspared and reappeared an hour or more after the clamp. Place-avoidance experiments showed that navigation uses independent idiothetic and exteroceptive memories. Rats learned to avoid the unmarked footshock region within a circular arena. When acquired on the stable arena in the light, the location of the punishment was learned by using both room and idiothetic cues; extinction in the dark transferred to the following session in the light. If, however, extinction occured during rotation, only the arena-frame avoidance was extinguished in darkness; the room-defined location was avoided when the light were turned back on. Idiothetic memory of room-defined avoidance was not formed during rotation in light; regardless of rotation with a randomly dispersed pellet. The resulting behaviour alternated between random pellet searching and target-directed navigation, making it possible to examine PC correlates of these two classes of spatial behaviour. The independence of idiothetic and exteroceptive spatial memories and the disruption of PC firing during rotation suggest that PCs may not be necessary for spatial cognition; this idea can be tested by recording during place-avoidance and preference tasks.
The Mixture Transition Distribution Model for High-Order Markov Chains and Non-Gaussian Time Series.
Resumo:
OBJECTIVE To better define the concordance of visual loss in patients with nonarteritic anterior ischemic optic neuropathy (NAION). METHODS The medical records of 86 patients with bilateral sequential NAION were reviewed retrospectively, and visual function was assessed using visual acuity, Goldmann visual fields, color vision, and relative afferent papillary defect. A quantitative total visual field score and score per quadrant were analyzed for each eye using the numerical Goldmann visual field scoring method. RESULTS Outcome measures were visual acuity, visual field, color vision, and relative afferent papillary defect. A statistically significant correlation was found between fellow eyes for multiple parameters, including logMAR visual acuity (P = .01), global visual field (P < .001), superior visual field (P < .001), and inferior visual field (P < .001). The mean deviation of total (P < .001) and pattern (P < .001) deviation analyses was significantly less between fellow eyes than between first and second eyes of different patients. CONCLUSIONS Visual function between fellow eyes showed a fair to moderate correlation that was statistically significant. The pattern of vision loss was also more similar in fellow eyes than between eyes of different patients. These results may help allow better prediction of visual outcome for the second eye in patients with NAION.
Resumo:
In this paper, we study the average inter-crossing number between two random walks and two random polygons in the three-dimensional space. The random walks and polygons in this paper are the so-called equilateral random walks and polygons in which each segment of the walk or polygon is of unit length. We show that the mean average inter-crossing number ICN between two equilateral random walks of the same length n is approximately linear in terms of n and we were able to determine the prefactor of the linear term, which is a = (3 In 2)/(8) approximate to 0.2599. In the case of two random polygons of length n, the mean average inter-crossing number ICN is also linear, but the prefactor of the linear term is different from that of the random walks. These approximations apply when the starting points of the random walks and polygons are of a distance p apart and p is small compared to n. We propose a fitting model that would capture the theoretical asymptotic behaviour of the mean average ICN for large values of p. Our simulation result shows that the model in fact works very well for the entire range of p. We also study the mean ICN between two equilateral random walks and polygons of different lengths. An interesting result is that even if one random walk (polygon) has a fixed length, the mean average ICN between the two random walks (polygons) would still approach infinity if the length of the other random walk (polygon) approached infinity. The data provided by our simulations match our theoretical predictions very well.
Resumo:
We introduce disk matrices which encode the knotting of all subchains in circular knot configurations. The disk matrices allow us to dissect circular knots into their subknots, i.e. knot types formed by subchains of the global knot. The identification of subknots is based on the study of linear chains in which a knot type is associated to the chain by means of a spatially robust closure protocol. We characterize the sets of observed subknot types in global knots taking energy-minimized shapes such as KnotPlot configurations and ideal geometric configurations. We compare the sets of observed subknots to knot types obtained by changing crossings in the classical prime knot diagrams. Building upon this analysis, we study the sets of subknots in random configurations of corresponding knot types. In many of the knot types we analyzed, the sets of subknots from the ideal geometric configurations are found in each of the hundreds of random configurations of the same global knot type. We also compare the sets of subknots observed in open protein knots with the subknots observed in the ideal configurations of the corresponding knot type. This comparison enables us to explain the specific dispositions of subknots in the analyzed protein knots.