906 results for Probabilistic estimation


Relevance:

20.00%

Publisher:

Abstract:

Machine Learning for geospatial data: algorithms, software tools and case studies

The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence. It is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions.

Why machine learning? In short, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can solve classification, regression and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems for environmental data mining, including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with that of geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces.

The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from the theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (MLP, a workhorse of machine learning), general regression neural networks (GRNN), probabilistic neural networks (PNN), self-organising (Kohonen) maps (SOM), Gaussian mixture models (GMM), radial basis function networks (RBF) and mixture density networks (MDN). This set of models covers machine learning tasks such as classification, regression and density estimation.

Exploratory data analysis (EDA) is the initial and a very important part of data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are considered using both the traditional geostatistical approach, namely experimental variography, and machine learning. Experimental variography, which studies the relations between pairs of points, is a basic tool for the geostatistical analysis of anisotropic spatial correlations and helps to detect the presence of spatial patterns, at least those described by two-point statistics. A machine learning approach to ESDA is presented by applying the k-nearest neighbors (k-NN) method, which is simple and has very good interpretation and visualization properties.

An important part of the thesis deals with a topical subject, namely the automatic mapping of geospatial data. The general regression neural network (GRNN) is proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on the Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions.

The thesis consists of four chapters: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed during the last 15 years and were used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools for exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
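To make the GRNN idea concrete, here is a minimal sketch in Python. It relies on the standard formulation of a general regression neural network as Nadaraya-Watson kernel regression with a Gaussian kernel. The function name `grnn_predict`, the single isotropic bandwidth `sigma` and the toy data are assumptions for illustration only; the thesis's Machine Learning Office implementation may differ, for example in how the bandwidth is tuned.

```python
import numpy as np

def grnn_predict(train_xy, train_z, query_xy, sigma):
    """Kernel-weighted average of training values (Nadaraya-Watson form).

    train_xy: (n, d) coordinates, train_z: (n,) measured values,
    query_xy: (m, d) prediction locations, sigma: kernel bandwidth.
    """
    train_xy = np.asarray(train_xy, dtype=float)
    train_z = np.asarray(train_z, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    # Squared distance between every query point and every training point.
    diff = query_xy[:, None, :] - train_xy[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Gaussian kernel weights; closer training points count more.
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    # Weighted mean of the training values for each query point.
    return weights @ train_z / weights.sum(axis=1)
```

The only free parameter in this sketch is `sigma`. In practice it would typically be chosen by cross-validation, and a query point far from all data relative to `sigma` can make every weight underflow to zero, which this sketch does not guard against.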

Relevance:

20.00%

Publisher:

Abstract:

Many transportation agencies maintain grade as an attribute in roadway inventory databases; however, the information is often in an aggregated format. Cross slope is rarely included in large roadway inventories. Accurate methods available to collect grade and cross slope include global positioning systems, traditional surveying, and mobile mapping systems. However, most agencies do not have the resources to use these methods to collect grade and cross slope on a large scale. This report discusses the use of LIDAR to extract roadway grade and cross slope for large-scale inventories. Current data collection methods and their advantages and disadvantages are discussed. A pilot study to extract grade and cross slope from a LIDAR data set, including methodology, results, and conclusions, is presented. This report describes the regression methodology used to extract grade and cross slope from three-dimensional surfaces created from LIDAR data and to evaluate their accuracy. The use of LIDAR data to extract grade and cross slope on tangent highway segments was evaluated and compared against grade and cross slope collected using an automatic level for 10 test segments along Iowa Highway 1. Grade and cross slope were measured from a surface model created from LIDAR data points collected for the study area. While grade could be estimated to within 1%, study results indicate that cross slope cannot practically be estimated using a LIDAR-derived surface model.
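One plausible form of such a regression is sketched below in Python: an ordinary least-squares plane fit to elevation points, with the slope along the road read as grade and the slope across the road read as cross slope. The abstract does not spell out the report's exact procedure, so the plane model, the function name `fit_grade_and_cross_slope` and the coordinate convention (distance along and across the centerline, in the same units as elevation) are all assumptions.

```python
import numpy as np

def fit_grade_and_cross_slope(along, across, elevation):
    """Fit elevation = a + g * along + c * across by least squares.

    Returns (grade_percent, cross_slope_percent).
    """
    along = np.asarray(along, dtype=float)
    across = np.asarray(across, dtype=float)
    elevation = np.asarray(elevation, dtype=float)
    # Design matrix: intercept, along-road distance, cross-road offset.
    design = np.column_stack([np.ones_like(along), along, across])
    coef, *_ = np.linalg.lstsq(design, elevation, rcond=None)
    _, grade, cross = coef
    # Slopes are rise over run; multiply by 100 to express as percent.
    return 100.0 * grade, 100.0 * cross
```

Because a lane is only a few meters wide, the cross-road offsets span a much shorter range than the along-road distances, so vertical noise in the points inflates the uncertainty of the cross-slope coefficient far more than that of the grade. This is consistent with the report's finding, though the sketch does not reproduce its analysis.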

Relevance:

20.00%

Publisher:

Abstract:

Well-developed experimental procedures currently exist for retrieving and analyzing particle evidence from the hands of individuals suspected of being associated with the discharge of a firearm. Although analytical approaches (e.g. automated Scanning Electron Microscopy with Energy Dispersive X-ray (SEM-EDS) microanalysis) allow the determination of the presence of elements typically found in gunshot residue (GSR) particles, such analyses provide no information about a given particle's actual source. Possible origins that scientists may need to account for are a primary exposure to the discharge of a firearm or a secondary transfer due to a contaminated environment. In order to approach such sources of uncertainty in the context of evidential assessment, this paper studies the construction and practical implementation of graphical probability models (i.e. Bayesian networks). These can assist forensic scientists in making the issue tractable within a probabilistic perspective. The proposed models focus on likelihood ratio calculations at various levels of detail as well as case pre-assessment.
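The likelihood ratio at the core of such models can be illustrated with the smallest possible network: one hypothesis node with one observation node beneath it. The Python sketch below is only that illustration. The particle-count categories, every probability in `CPT` and both function names are hypothetical and are not taken from the paper, whose networks are more detailed.

```python
# Hypothetical P(particle-count category | hypothesis); NOT values from the paper.
CPT = {
    "primary_exposure":   {"0": 0.05, "1-2": 0.25, "3+": 0.70},
    "secondary_transfer": {"0": 0.60, "1-2": 0.33, "3+": 0.07},
}

def likelihood_ratio(observation, cpt=CPT):
    """LR = P(observation | primary exposure) / P(observation | secondary transfer)."""
    return cpt["primary_exposure"][observation] / cpt["secondary_transfer"][observation]

def posterior_primary(prior_primary, lr):
    """Update a prior probability of primary exposure using Bayes' rule in odds form."""
    prior_odds = prior_primary / (1.0 - prior_primary)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1.0 + posterior_odds)
```

With these made-up numbers, observing three or more particles gives a likelihood ratio of about 10 in favor of primary exposure. The scientist reports the ratio; combining it with a prior, as `posterior_primary` does, is normally left to the fact finder.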

Relevance:

20.00%

Publisher:

Abstract:

The goal of this study was to investigate the impact of computing parameters and the location of volumes of interest (VOI) on the calculation of the 3D noise power spectrum (NPS) in order to determine an optimal set of computing parameters and propose a robust method for evaluating the noise properties of imaging systems. Noise stationarity in noise volumes acquired with a water phantom on a 128-MDCT and a 320-MDCT scanner was analyzed in the spatial domain in order to define locally stationary VOIs. The influence of the computing parameters in the 3D NPS measurement (the sampling distances b_{x,y,z}, the VOI lengths L_{x,y,z}, the number of VOIs N_VOI and the structured noise) was investigated to minimize measurement errors. The effect of the VOI locations on the NPS was also investigated. Results showed that the noise (standard deviation) varies more in the r-direction (phantom radius) than in the z-direction. A 25 × 25 × 40 mm³ VOI associated with DFOV = 200 mm (L_{x,y,z} = 64, b_{x,y} = 0.391 mm with a 512 × 512 matrix) and a first-order detrending method to reduce structured noise led to an accurate NPS estimation. NPS estimated from off-centered small VOIs had a directional dependency, contrary to NPS obtained from large VOIs located in the center of the volume or from small VOIs located on a concentric circle. This showed that the VOI size and location play a major role in the determination of NPS when images are not stationary. This study emphasizes the need for consistent measurement methods to assess and compare image quality in CT.
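The quantities named in the abstract fit the usual discrete estimator NPS(f) = (b_x b_y b_z) / (L_x L_y L_z) · ⟨|DFT(VOI − trend)|²⟩, averaged over the N_VOI volumes. The Python sketch below implements that estimator under one simplifying assumption: it subtracts each VOI's mean (zeroth-order detrending) rather than the first-order detrending the study reports. The function name `nps_3d` and the array layout are illustrative choices.

```python
import numpy as np

def nps_3d(vois, voxel_size):
    """Ensemble-averaged 3D noise power spectrum.

    vois: array of shape (n_voi, Lz, Ly, Lx) holding noise-only VOIs.
    voxel_size: (bz, by, bx) sampling distances in mm.
    Returns an (Lz, Ly, Lx) array in units of (pixel value)^2 * mm^3.
    """
    vois = np.asarray(vois, dtype=float)
    # Zeroth-order detrending: remove the mean of each VOI.
    detrended = vois - vois.mean(axis=(1, 2, 3), keepdims=True)
    # 3D discrete Fourier transform of each VOI.
    dft = np.fft.fftn(detrended, axes=(1, 2, 3))
    # Normalisation: voxel volume divided by the number of voxels per VOI.
    scale = np.prod(voxel_size) / np.prod(vois.shape[1:])
    # Average the squared magnitude over the VOI ensemble.
    return scale * np.mean(np.abs(dft) ** 2, axis=0)
```

A useful sanity check is that the NPS integrated over frequency, with bin width 1 / (L · b) along each axis, equals the mean variance of the detrended VOIs.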

Relevance:

20.00%

Publisher:

Abstract:

We present a seabed profile estimation and following method for close proximity inspection of 3D underwater structures using autonomous underwater vehicles (AUVs). The presented method is used to determine a path allowing the AUV to pass its sensors over all points of the target structure, which is known as coverage path planning. Our profile following method goes beyond traditional seabed following at a safe altitude and exploits the hovering capabilities of recent AUV developments. A range sonar is used to incrementally construct a local probabilistic map representation of the environment, and estimates of the local profile are obtained via linear regression. Two behavior-based controllers use these estimates to perform horizontal and vertical profile following. We build upon these tools to address coverage path planning for 3D underwater structures using a (potentially inaccurate) prior map and following cross-section profiles of the target structure. The feasibility of the proposed method is demonstrated using the GIRONA 500 AUV, both in simulation using synthetic and real-world bathymetric data and in pool trials.

Relevance:

20.00%

Publisher:

Abstract:

The objective of this work was to develop a procedure to estimate soybean crop areas in Rio Grande do Sul state, Brazil. Estimations were made based on the temporal profiles of the enhanced vegetation index (EVI) calculated from moderate resolution imaging spectroradiometer (MODIS) images. The methodology developed for soybean classification was named the MODIS crop detection algorithm (MCDA). The MCDA provides soybean area estimates in December (first forecast), using images from the sowing period, and in March (second forecast), using images from the sowing and maximum crop development periods. The results obtained by the MCDA were compared with the official estimates of soybean area from the Instituto Brasileiro de Geografia e Estatística. The coefficients of determination ranged from 0.91 to 0.95, indicating good agreement between the estimates. For the 2000/2001 crop year, the MCDA soybean crop map was evaluated using a soybean crop map derived from Landsat images, and the overall map accuracy was approximately 82%, with similar commission and omission errors. The MCDA was able to estimate soybean crop areas in Rio Grande do Sul state and to generate an annual thematic map with the geographic position of the soybean fields.
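For reference, the EVI itself is computed from surface reflectances with the standard MODIS coefficients: EVI = 2.5 (NIR − Red) / (NIR + 6 Red − 7.5 Blue + 1). The short Python sketch below evaluates it for a pixel's time series. The function name `evi` is an illustrative choice, and the sketch covers only the index: the MCDA classification rules are not described in the abstract and are not reproduced here.

```python
import numpy as np

def evi(nir, red, blue):
    """Enhanced vegetation index from NIR, red and blue surface reflectances (0 to 1)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    # Standard MODIS coefficients: gain 2.5, C1 = 6, C2 = 7.5, L = 1.
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
```

Passing arrays with one entry per acquisition date yields the temporal profile of a pixel. A crop such as soybean would be expected to show low values around sowing and a pronounced peak at maximum development.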

Relevance:

20.00%

Publisher:

Abstract:

Osteoporosis (OP) is a systemic skeletal disease characterized by a low bone mineral density (BMD) and a micro-architectural (MA) deterioration. Clinical risk factors (CRF) are often used as an approximation of MA. MA can now be evaluated in daily practice by the trabecular bone score (TBS). TBS is very simple to obtain, by reanalyzing a lumbar DXA scan. TBS has been shown to have diagnostic and prognostic value, partially independent of CRF and BMD. The aim of the OsteoLaus cohort is to combine in daily practice the CRF and the information given by DXA (BMD, TBS and vertebral fracture assessment (VFA)) to better identify women at high fracture risk. The OsteoLaus cohort (1400 women aged 50 to 80 years living in Lausanne, Switzerland) started in 2010. This study is derived from the COLAUS cohort, which started in Lausanne in 2003. The main goal of COLAUS is to obtain information on the epidemiology and genetic determinants of cardiovascular risk in 6700 men and women. CRF for OP, bone ultrasound of the heel, lumbar spine and hip BMD, VFA by DXA and MA evaluation by TBS are recorded in OsteoLaus. Preliminary results are reported. We included 631 women: mean age 67.4 ± 6.7 years, BMI 26.1 ± 4.6, mean lumbar spine BMD 0.943 ± 0.168 (T-score −1.4 SD), and TBS 1.271 ± 0.103. As expected, the correlation between BMD and site-matched TBS is low (r² = 0.16). The prevalence of VFx grade 2/3, major OP Fx and all OP Fx is 8.4%, 17.0% and 26.0% respectively. Age- and BMI-adjusted ORs (per SD decrease) are 1.8 (1.2-2.5), 1.6 (1.2-2.1), and 1.3 (1.1-1.6) for BMD for the different categories of fractures and 2.0 (1.4-3.0), 1.9 (1.4-2.5), and 1.4 (1.1-1.7) for TBS respectively. Only 32 to 37% of women with OP Fx have a BMD < −2.5 SD or a TBS < 1.200. If we combine a BMD < −2.5 SD or a TBS < 1.200, 54 to 60% of women with an osteoporotic Fx are identified. As in the already published studies, these preliminary results confirm the partial independence between BMD and TBS. More importantly, using TBS in addition to BMD significantly increases the identification of women with prevalent OP Fx who would have been misclassified by BMD alone. For the first time, we are able to obtain complementary information about fracture (VFA), density (BMD), and micro- and macro-architecture (TBS and HAS) from a simple, cheap, low-ionizing-radiation device: DXA. Such complementary information is very useful for the patient in daily practice and, moreover, will likely have an impact on cost-effectiveness analysis.

Relevance:

20.00%

Publisher:

Abstract:

Abstract

Relevance:

20.00%

Publisher:

Abstract:

PURPOSE: To use measurements from cycling power meters (Pmes) to evaluate the accuracy of commonly used models for estimating uphill cycling power (Pest). Experiments were designed to explore the influence of wind speed and steepness of climb on the accuracy of Pest. The authors hypothesized that the random error in Pest would be largely influenced by windy conditions, the bias would be diminished in steeper climbs, and windy conditions would induce larger bias in Pest. METHODS: Sixteen well-trained cyclists performed 15 uphill-cycling trials (range: length 1.3-6.3 km, slope 4.4-10.7%) in a random order. Trials included different riding positions in a group (lead or follow) and different wind speeds. Pmes was quantified using a power meter, and Pest was calculated with a methodology used by journalists reporting on the Tour de France. RESULTS: Overall, the difference between Pmes and Pest was -0.95% (95% CI: -10.4%, +8.5%) for all trials and 0.24% (-6.1%, +6.6%) in conditions without wind (<2 m/s). The relationship between percent slope and the error between Pest and Pmes was considered trivial. CONCLUSIONS: Aerodynamic drag (affected by wind velocity and orientation, frontal area, drafting, and speed) is the most confounding factor. The mean estimated values are close to the power-output values measured by power meters, but the random error is between ±6% and ±10%. Moreover, at the power outputs (>400 W) produced by professional riders, this error is likely to be higher. This observation calls into question the validity of releasing individual values without reporting the range of random errors.
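The abstract does not specify the estimation model, so the following Python sketch shows only a common physics-based form, assumed here for illustration: power equals speed times the sum of the gravity, rolling-resistance and aerodynamic-drag forces. The function name `climbing_power` and the default values for `crr`, `cda` and `rho` are hypothetical placeholders, not parameters from the study.

```python
import math

def climbing_power(speed, slope, mass, crr=0.004, cda=0.35, rho=1.2,
                   headwind=0.0, g=9.81):
    """Estimated power (W) needed to ride uphill at constant speed.

    speed: ground speed in m/s; slope: rise over run (0.08 means 8%);
    mass: rider plus bicycle in kg; headwind: m/s, positive when opposing.
    crr, cda (m^2) and rho (kg/m^3) defaults are illustrative assumptions.
    """
    angle = math.atan(slope)
    gravity = mass * g * math.sin(angle)
    rolling = crr * mass * g * math.cos(angle)
    # Drag depends on speed relative to the air, not to the ground.
    air_speed = speed + headwind
    drag = 0.5 * rho * cda * air_speed * abs(air_speed)
    return (gravity + rolling + drag) * speed
```

The drag term is the only one that depends on wind and on the hard-to-measure drag area `cda`, which matches the abstract's conclusion that aerodynamic drag is the main source of error in such estimates.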

Relevance:

20.00%

Publisher:

Abstract:

In the current issue of Epidemiology, Danaei and colleagues elegantly estimated both the direct effect and the indirect effect (that is, the effect mediated by blood pressure, cholesterol, glucose, fibrinogen, and high-sensitivity C-reactive protein) of body mass index (BMI) on the risk of coronary heart disease (CHD). They analyzed data from 9 cohort studies including 58,322 patients and 9459 CHD events, with baseline measurements between 1954 and 2001. Using sophisticated and cutting-edge methods for direct and indirect effect estimations, the authors estimated that half of the risk of overweight and obesity would be mediated by blood pressure, cholesterol, and glucose. A few additional percentage points of the risk would be mediated by fibrinogen and hs-CRP. How should we understand these estimates? Can we say that if obese persons reduce their body weight and reach a normal body weight, their excess risk of CHD would be reduced by half through an improvement in these mediators and by half through the reduction in BMI itself? Is that also true if these individuals are prevented from becoming obese in the first place? Can we also conclude that if these mediators are well controlled in obese individuals through means other than a body weight reduction, their excess risk of CHD would be reduced by half? Let us confront these estimates with observations from studies evaluating 2 interventions to reduce body weight, that is, bariatric surgery in patients with severe obesity and intensive lifestyle intervention in overweight patients with diabetes.