989 resultados para scattered data interpolation


Relevância:

30.00% 30.00%

Publicador:

Resumo:

This letter discusses the detection and correction ofresidual motion errors that appear in airborne synthetic apertureradar (SAR) interferograms due to the lack of precision in the navigationsystem. As it is shown, the effect of this lack of precision istwofold: azimuth registration errors and phase azimuth undulations.Up to now, the correction of the former was carried out byestimating the registration error and interpolating, while the latterwas based on the estimation of the phase azimuth undulations tocompensate the phase of the computed interferogram. In this letter,a new correction method is proposed, which avoids the interpolationstep and corrects at the same time the azimuth phase undulations.Additionally, the spectral diversity technique, used to estimateregistration errors, is critically analyzed. Airborne L-bandrepeat-pass interferometric data of the German Aerospace Center(DLR) experimental airborne SAR is used to validate the method

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The structural modeling of spatial dependence, using a geostatistical approach, is an indispensable tool to determine parameters that define this structure, applied on interpolation of values at unsampled points by kriging techniques. However, the estimation of parameters can be greatly affected by the presence of atypical observations in sampled data. The purpose of this study was to use diagnostic techniques in Gaussian spatial linear models in geostatistics to evaluate the sensitivity of maximum likelihood and restrict maximum likelihood estimators to small perturbations in these data. For this purpose, studies with simulated and experimental data were conducted. Results with simulated data showed that the diagnostic techniques were efficient to identify the perturbation in data. The results with real data indicated that atypical values among the sampled data may have a strong influence on thematic maps, thus changing the spatial dependence structure. The application of diagnostic techniques should be part of any geostatistical analysis, to ensure a better quality of the information from thematic maps.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper deals with the development and application of the generic methodology for automatic processing (mapping and classification) of environmental data. General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool to solve the problem of spatial data mapping (regression). The Probabilistic Neural Network (PNN) is considered as an automatic tool for spatial classifications. The automatic tuning of isotropic and anisotropic GRNN/PNN models using cross-validation procedure is presented. Results are compared with the k-Nearest-Neighbours (k-NN) interpolation algorithm using independent validation data set. Real case studies are based on decision-oriented mapping and classification of radioactively contaminated territories.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present research deals with an important public health threat, which is the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology as being closely associated to indoor radon. This association was indeed observed for the Swiss data but not probed to be the sole determinant for the spatial modeling. The statistical analysis of data, both at univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of Quan-tité Morisita Index (QMI) as a procedure to evaluate data clustering in function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-to-down resolution approach, from regional to local lev¬els in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with stationary conditions of geostatistical models. Common methods of spatial modeling such as Κ Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied for a par-ticular dataset. A bottom to top method complexity approach was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation was tested with the purpose of reducing noise at local scale (the CVMF). At the end of the chapter, a series of test for data consistency and methods robustness were performed. This lead to conclude about the importance of data splitting and the limitation of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the indoor radon pollution data uncertainty into consideration. The catego-rization transform was presented as a solution for extreme values modeling through clas-sification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. The sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Within the classification methods, probabilistic neural networks (PNN) show to be better adapted for modeling of high threshold categorization and for automation. Support vector machines (SVM) on the contrary performed well under balanced category conditions. In general, it was concluded that a particular prediction or estimation method is not better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to accomplish an efficient indoor radon decision making.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The objective of this work was to build mock-ups of complete yerba mate plants in several stages of development, using the InterpolMate software, and to compute photosynthesis on the interpolated structure. The mock-ups of yerba-mate were first built in the VPlants software for three growth stages. Male and female plants grown in two contrasting environments (monoculture and forest understory) were considered. To model the dynamic 3D architecture of yerba-mate plants during the biennial growth interval between two subsequent prunings, data sets of branch development collected in 38 dates were used. The estimated values obtained from the mock-ups, including leaf photosynthesis and sexual dimorphism, are very close to those observed in the field. However, this similarity was limited to reconstructions that included growth units from original data sets. The modeling of growth dynamics enables the estimation of photosynthesis for the entire yerba mate plant, which is not easily measurable in the field. The InterpolMate software is efficient for building yerba mate mock-ups.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper deals with the development and application of the methodology for automatic mapping of pollution/contamination data. General Regression Neural Network (GRNN) is considered in detail and is proposed as an efficient tool to solve this problem. The automatic tuning of isotropic and an anisotropic GRNN model using cross-validation procedure is presented. Results are compared with k-nearest-neighbours interpolation algorithm using independent validation data set. Quality of mapping is controlled by the analysis of raw data and the residuals using variography. Maps of probabilities of exceeding a given decision level and ?thick? isoline visualization of the uncertainties are presented as examples of decision-oriented mapping. Real case study is based on mapping of radioactively contaminated territories.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

After decades of mergers and acquisitions and successive technology trends such as CRM, ERP and DW, the data in enterprise systems is scattered and inconsistent. Global organizations face the challenge of addressing local uses of shared business entities, such as customer and material, and at the same time have a consistent, unique, and consolidate view of financial indicators. In addition, current enterprise systems do not accommodate the pace of organizational changes and immense efforts are required to maintain data. When it comes to systems integration, ERPs are considered “closed” and expensive. Data structures are complex and the “out-of-the-box” integration options offered are not based on industry standards. Therefore expensive and time-consuming projects are undertaken in order to have required data flowing according to business processes needs. Master Data Management (MDM) emerges as one discipline focused on ensuring long-term data consistency. Presented as a technology-enabled business discipline, it emphasizes business process and governance to model and maintain the data related to key business entities. There are immense technical and organizational challenges to accomplish the “single version of the truth” MDM mantra. Adding one central repository of master data might prove unfeasible in a few scenarios, thus an incremental approach is recommended, starting from areas most critically affected by data issues. This research aims at understanding the current literature on MDM and contrasting it with views from professionals. The data collected from interviews revealed details on the complexities of data structures and data management practices in global organizations, reinforcing the call for more in-depth research on organizational aspects of MDM. The most difficult piece of master data to manage is the “local” part, the attributes related to the sourcing and storing of materials in one particular warehouse in The Netherlands or a complex set of pricing rules for a subsidiary of a customer in Brazil. From a practical perspective, this research evaluates one MDM solution under development at a Finnish IT solution-provider. By means of applying an existing assessment method, the research attempts at providing the company with one possible tool to evaluate its product from a vendor-agnostics perspective.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Most of the applications of airborne laser scanner data to forestry require that the point cloud be normalized, i.e., each point represents height from the ground instead of elevation. To normalize the point cloud, a digital terrain model (DTM), which is derived from the ground returns in the point cloud, is employed. Unfortunately, extracting accurate DTMs from airborne laser scanner data is a challenging task, especially in tropical forests where the canopy is normally very thick (partially closed), leading to a situation in which only a limited number of laser pulses reach the ground. Therefore, robust algorithms for extracting accurate DTMs in low-ground-point-densitysituations are needed in order to realize the full potential of airborne laser scanner data to forestry. The objective of this thesis is to develop algorithms for processing airborne laser scanner data in order to: (1) extract DTMs in demanding forest conditions (complex terrain and low number of ground points) for applications in forestry; (2) estimate canopy base height (CBH) for forest fire behavior modeling; and (3) assess the robustness of LiDAR-based high-resolution biomass estimation models against different field plot designs. Here, the aim is to find out if field plot data gathered by professional foresters can be combined with field plot data gathered by professionally trained community foresters and used in LiDAR-based high-resolution biomass estimation modeling without affecting prediction performance. The question of interest in this case is whether or not the local forest communities can achieve the level technical proficiency required for accurate forest monitoring. The algorithms for extracting DTMs from LiDAR point clouds presented in this thesis address the challenges of extracting DTMs in low-ground-point situations and in complex terrain while the algorithm for CBH estimation addresses the challenge of variations in the distribution of points in the LiDAR point cloud caused by things like variations in tree species and season of data acquisition. These algorithms are adaptive (with respect to point cloud characteristics) and exhibit a high degree of tolerance to variations in the density and distribution of points in the LiDAR point cloud. Results of comparison with existing DTM extraction algorithms showed that DTM extraction algorithms proposed in this thesis performed better with respect to accuracy of estimating tree heights from airborne laser scanner data. On the other hand, the proposed DTM extraction algorithms, being mostly based on trend surface interpolation, can not retain small artifacts in the terrain (e.g., bumps, small hills and depressions). Therefore, the DTMs generated by these algorithms are only suitable for forestry applications where the primary objective is to estimate tree heights from normalized airborne laser scanner data. On the other hand, the algorithm for estimating CBH proposed in this thesis is based on the idea of moving voxel in which gaps (openings in the canopy) which act as fuel breaks are located and their height is estimated. Test results showed a slight improvement in CBH estimation accuracy over existing CBH estimation methods which are based on height percentiles in the airborne laser scanner data. However, being based on the idea of moving voxel, this algorithm has one main advantage over existing CBH estimation methods in the context of forest fire modeling: it has great potential in providing information about vertical fuel continuity. This information can be used to create vertical fuel continuity maps which can provide more realistic information on the risk of crown fires compared to CBH.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a general method of generating continuous fractal interpolation surfaces by iterated function systems on an arbitrary data set over rectangular grids and estimate their Box-counting dimension.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A recurrent iterated function system (RIFS) is a genaralization of an IFS and provides nonself-affine fractal sets which are closer to natural objects. In general, it's attractor is not a continuous surface in R3. A recurrent fractal interpolation surface (RFIS) is an attractor of RIFS which is a graph of bivariate continuous interpolation function. We introduce a general method of generating recurrent interpolation surface which are at- tractors of RIFSs about any data set on a grid.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper investigates the use of data assimilation in coastal area morphodynamic modelling using Morecambe Bay as a study site. A simple model of the bay has been enhanced with a data assimilation scheme to better predict large-scale changes in bathymetry observed in the bay over a 3-year period. The 2DH decoupled morphodynamic model developed for the work is described, as is the optimal interpolation scheme used to assimilate waterline observations into the model run. Each waterline was acquired from a SAR satellite image and is essentially a contour of the bathymetry at some level within the inter-tidal zone of the bay. For model parameters calibrated against validation observations, model performance is good, even without data assimilation. However the use of data assimilation successfully compensates for a particular failing of the model, and helps to keep the model bathymetry on track. It also improves the ability of the model to predict future bathymetry. Although the benefits of data assimilation are demonstrated using waterline observations, any observations of morphology could potentially be used. These results suggest that data assimilation should be considered for use in future coastal area morphodynamic models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Data assimilation – the set of techniques whereby information from observing systems and models is combined optimally – is rapidly becoming prominent in endeavours to exploit Earth Observation for Earth sciences, including climate prediction. This paper explains the broad principles of data assimilation, outlining different approaches (optimal interpolation, three-dimensional and four-dimensional variational methods, the Kalman Filter), together with the approximations that are often necessary to make them practicable. After pointing out a variety of benefits of data assimilation, the paper then outlines some practical applications of the exploitation of Earth Observation by data assimilation in the areas of operational oceanography, chemical weather forecasting and carbon cycle modelling. Finally, some challenges for the future are noted.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Although accuracy of digital elevation models (DEMs) can be quantified and measured in different ways, each is influenced by three main factors: terrain character, sampling strategy and interpolation method. These parameters, and their interaction, are discussed. The generation of DEMs from digitised contours is emphasised because this is the major source of DEMs, particularly within member countries of OEEPE. Such DEMs often exhibit unwelcome artifacts, depending on the interpolation method employed. The origin and magnitude of these effects and how they can be reduced to improve the accuracy of the DEMs are also discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Four-dimensional variational data assimilation (4D-Var) combines the information from a time sequence of observations with the model dynamics and a background state to produce an analysis. In this paper, a new mathematical insight into the behaviour of 4D-Var is gained from an extension of concepts that are used to assess the qualitative information content of observations in satellite retrievals. It is shown that the 4D-Var analysis increments can be written as a linear combination of the singular vectors of a matrix which is a function of both the observational and the forecast model systems. This formulation is used to consider the filtering and interpolating aspects of 4D-Var using idealized case-studies based on a simple model of baroclinic instability. The results of the 4D-Var case-studies exhibit the reconstruction of the state in unobserved regions as a consequence of the interpolation of observations through time. The results also exhibit the filtering of components with small spatial scales that correspond to noise, and the filtering of structures in unobserved regions. The singular vector perspective gives a very clear view of this filtering and interpolating by the 4D-Var algorithm and shows that the appropriate specification of the a priori statistics is vital to extract the largest possible amount of useful information from the observations. Copyright © 2005 Royal Meteorological Society