1000 resultados para Geostatistical models
Resumo:
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign aiming to distinguish the effluent plume from the receiving waters and characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted by the Mat`ern models using weighted least squares and maximum likelihood estimation methods as a way to detect eventual discrepancies. Typically, the maximum likelihood method estimated very low ranges which have limited the kriging process. So, at least for these data sets, weighted least squares showed to be the most appropriate estimation method for variogram fitting. The kriged maps show clearly the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results obtained show some guidelines for sewage monitoring if a geostatistical analysis of the data is in mind. It is important to treat properly the existence of anomalous values and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.
Resumo:
Soil properties have an enormous impact on economic and environmental aspects of agricultural production. Quantitative relationships between soil properties and the factors that influence their variability are the basis of digital soil mapping. The predictive models of soil properties evaluated in this work are statistical (multiple linear regression-MLR) and geostatistical (ordinary kriging and co-kriging). The study was conducted in the municipality of Bom Jardim, RJ, using a soil database with 208 sampling points. Predictive models were evaluated for sand, silt and clay fractions, pH in water and organic carbon at six depths according to the specifications of the consortium of digital soil mapping at the global level (GlobalSoilMap). Continuous covariates and categorical predictors were used and their contributions to the model assessed. Only the environmental covariates elevation, aspect, stream power index (SPI), soil wetness index (SWI), normalized difference vegetation index (NDVI), and b3/b2 band ratio were significantly correlated with soil properties. The predictive models had a mean coefficient of determination of 0.21. Best results were obtained with the geostatistical predictive models, where the highest coefficient of determination 0.43 was associated with sand properties between 60 to 100 cm deep. The use of a sparse data set of soil properties for digital mapping can explain only part of the spatial variation of these properties. The results may be related to the sampling density and the quantity and quality of the environmental covariates and predictive models used.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation or/and inference algorithms. The paper considers a data driven approach in modelling uncertainty in spatial predictions. Proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and describe stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. Semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependences from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.
Resumo:
Anaemia has a significant impact on child development and mortality and is a severe public health problem in most countries in sub-Saharan Africa. Nutritional and infectious causes of anaemia are geographically variable and anaemia maps based on information on the major aetiologies of anaemia are important for identifying communities most in need and the relative contribution of major causes. We investigated the consistency between ecological and individual-level approaches to anaemia mapping, by building spatial anaemia models for children aged ≤15 years using different modeling approaches. We aimed to a) quantify the role of malnutrition, malaria, Schistosoma haematobium and soil-transmitted helminths (STH) for anaemia endemicity in children aged ≤15 years and b) develop a high resolution predictive risk map of anaemia for the municipality of Dande in Northern Angola. We used parasitological survey data on children aged ≤15 years to build Bayesian geostatistical models of malaria (PfPR≤15), S. haematobium, Ascaris lumbricoides and Trichuris trichiura and predict small-scale spatial variation in these infections. The predictions and their associated uncertainty were used as inputs for a model of anemia prevalence to predict small-scale spatial variation of anaemia. Stunting, PfPR≤15, and S. haematobium infections were significantly associated with anaemia risk. An estimated 12.5%, 15.6%, and 9.8%, of anaemia cases could be averted by treating malnutrition, malaria, S. haematobium, respectively. Spatial clusters of high risk of anaemia (>86%) were identified. Using an individual-level approach to anaemia mapping at a small spatial scale, we found that anaemia in children aged ≤15 years is highly heterogeneous and that malnutrition and parasitic infections are important contributors to the spatial variation in anemia risk. The results presented in this study can help inform the integration of the current provincial malaria control program with ancillary micronutrient supplementation and control of neglected tropical diseases, such as urogenital schistosomiasis and STH infection.
Resumo:
Anaemia is known to have an impact on child development and mortality and is a severe public health problem in most countries in sub-Saharan Africa. We investigated the consistency between ecological and individual-level approaches to anaemia mapping by building spatial anaemia models for children aged ≤15 years using different modelling approaches. We aimed to (i) quantify the role of malnutrition, malaria, Schistosoma haematobium and soil-transmitted helminths (STHs) in anaemia endemicity; and (ii) develop a high resolution predictive risk map of anaemia for the municipality of Dande in northern Angola. We used parasitological survey data for children aged ≤15 years to build Bayesian geostatistical models of malaria (PfPR≤15), S. haematobium, Ascaris lumbricoides and Trichuris trichiura and predict small-scale spatial variations in these infections. Malnutrition, PfPR≤15, and S. haematobium infections were significantly associated with anaemia risk. An estimated 12.5%, 15.6% and 9.8% of anaemia cases could be averted by treating malnutrition, malaria and S. haematobium, respectively. Spatial clusters of high risk of anaemia (>86%) were identified. Using an individual-level approach to anaemia mapping at a small spatial scale, we found that anaemia in children aged ≤15 years is highly heterogeneous and that malnutrition and parasitic infections are important contributors to the spatial variation in anaemia risk. The results presented in this study can help inform the integration of the current provincial malaria control programme with ancillary micronutrient supplementation and control of neglected tropical diseases such as urogenital schistosomiasis and STH infections.
Resumo:
A set of radiation measurements were carried out in several public and private institutions. These were selected with basis on the people affluence and passage to these sites. These measurements were registration formed either indoor, outdoor or underground and were compiled in three Case Studies. Radiation doses measurements were also made, surface and underground locations, and compiled in other two Case Studies. There were sampled, at the same time, humidity, temperature, atmospheric pressure and relevant construction materials at sampling locations. They were collected and registration formed to analyse if there is any relation or contribution for the measured value in each specific place. Geostatistical models were used to elaborate maps of the results both for radiation values and for doses. Preliminary relations were established among the measured parameters.
Resumo:
Dissertação para obtenção do Grau de Mestre em Engenharia Geológica (Georrecursos)
Resumo:
The present research deals with an important public health threat, which is the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology as being closely associated to indoor radon. This association was indeed observed for the Swiss data but not probed to be the sole determinant for the spatial modeling. The statistical analysis of data, both at univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of Quan-tité Morisita Index (QMI) as a procedure to evaluate data clustering in function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-to-down resolution approach, from regional to local lev¬els in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with stationary conditions of geostatistical models. Common methods of spatial modeling such as Κ Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied for a par-ticular dataset. A bottom to top method complexity approach was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation was tested with the purpose of reducing noise at local scale (the CVMF). At the end of the chapter, a series of test for data consistency and methods robustness were performed. This lead to conclude about the importance of data splitting and the limitation of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the indoor radon pollution data uncertainty into consideration. The catego-rization transform was presented as a solution for extreme values modeling through clas-sification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. The sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Within the classification methods, probabilistic neural networks (PNN) show to be better adapted for modeling of high threshold categorization and for automation. Support vector machines (SVM) on the contrary performed well under balanced category conditions. In general, it was concluded that a particular prediction or estimation method is not better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to accomplish an efficient indoor radon decision making.
Resumo:
Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
Resumo:
The present study deals with the analysis and mapping of Swiss franc interest rates. Interest rates depend on time and maturity, defining term structure of the interest rate curves (IRC). In the present study IRC are considered in a two-dimensional feature space - time and maturity. Exploratory data analysis includes a variety of tools widely used in econophysics and geostatistics. Geostatistical models and machine learning algorithms (multilayer perceptron and Support Vector Machines) were applied to produce interest rate maps. IR maps can be used for the visualisation and pattern perception purposes, to develop and to explore economical hypotheses, to produce dynamic asset-liability simulations and for financial risk assessments. The feasibility of an application of interest rates mapping approach for the IRC forecasting is considered as well. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
In view of the limited number of drill holes, interpolation of the data becomes a relatively complex task. In this study, we sought to make estimates associated with lithological types, since a quantification based on lithology can be extracted from the empty spaces in the sampling. For example, QBarton is always below the median of the biotitic litotype, information which can be used in the elaboration of geostatistical models in situations where samples are lacking. To overcome bias in the data, required by geostatistical conceptualization, we worked with the residuals obtained from the adjustment of a surface and the observed values, for the variographic analysis. The final results made possible a more optimized evaluation of the final costs required for the construction project.
Resumo:
Pós-graduação em Agronomia (Produção Vegetal) - FCAV
Resumo:
Mycobacterium bovis infects the wildlife species badgers Meles meles who are linked with the spread of the associated disease tuberculosis (TB) in cattle. Control of livestock infections depends in part on the spatial and social structure of the wildlife host. Here we describe spatial association of M. bovis infection in a badger population using data from the first year of the Four Area Project in Ireland. Using second-order intensity functions, we show there is strong evidence of clustering of TB cases in each the four areas, i.e. a global tendency for infected cases to occur near other infected cases. Using estimated intensity functions, we identify locations where particular strains of TB cluster. Generalized linear geostatistical models are used to assess the practical range at which spatial correlation occurs and is found to exceed 6 in all areas. The study is of relevance concerning the scale of localized badger culling in the control of the disease in cattle.
Resumo:
Preservation of rivers and water resources is crucial in most environmental policies and many efforts are made to assess water quality. Environmental monitoring of large river networks are based on measurement stations. Compared to the total length of river networks, their number is often limited and there is a need to extend environmental variables that are measured locally to the whole river network. The objective of this paper is to propose several relevant geostatistical models for river modeling. These models use river distance and are based on two contrasting assumptions about dependency along a river network. Inference using maximum likelihood, model selection criterion and prediction by kriging are then developed. We illustrate our approach on two variables that differ by their distributional and spatial characteristics: summer water temperature and nitrate concentration. The data come from 141 to 187 monitoring stations in a network on a large river located in the Northeast of France that is more than 5000 km long and includes Meuse and Moselle basins. We first evaluated different spatial models and then gave prediction maps and error variance maps for the whole stream network.