161 resultados para machine theory
em Université de Lausanne, Switzerland
Resumo:
The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are given, and real case studies based on environmental and pollution data are carried out. The book provides a CD-ROM with the Machine Learning Office software, including sample sets of data, that will allow both students and researchers to put the concepts rapidly to practice.
Resumo:
The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well known nonparametric principles of probability density modeling using kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can be easily used in decision support systems dealing with problems of automatic classification. Support vector machine is an implementation of the principles of statistical learning theory for the classification tasks. Recently they were successfully applied for different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.
Resumo:
The class of Schoenberg transformations, embedding Euclidean distances into higher dimensional Euclidean spaces, is presented, and derived from theorems on positive definite and conditionally negative definite matrices. Original results on the arc lengths, angles and curvature of the transformations are proposed, and visualized on artificial data sets by classical multidimensional scaling. A distance-based discriminant algorithm and a robust multidimensional centroid estimate illustrate the theory, closely connected to the Gaussian kernels of Machine Learning.
Resumo:
Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
Resumo:
Le corps humain est l'objet privilégié d'action de la médecine, mais aussi réalité vécue, image, symbole, représentation et l'objet d'interprétation et de théorisation. Tous ces éléments constitutifs du corps influencent la façon dont la médecine le traite. Dans cette série de trois articles, nous abordons le corps sous différentes perspectives : médicale (1), phénoménologique (2), psychosomatique et socio-anthropologique (3). Ce premier article traite des représentations du corps en médecine, dont nous décrivons quatre types distincts, qui renvoient à autant de démarches scientifiques spécifiques et de formes de légitimité clinique : le corps-objet de l'anatomie, le corps-machine de la physiologie, le corps cybernétique de la biologie et le corps statistique de l'épidémiologie. The human body is the object upon which medicine is acting, but also lived reality, image, symbol, representation and the object of elaboration and theory. All these elements which constitute the body influence the way medicine is treating it. In this series of three articles, we address the human body from various perspectives: medical (1), phenomenological (2), psychosomatic and socio-anthropological (3). This first article discusses four distinct types of representation of the body within medicine, each related to a specific epistemology and shaping a distinct kind of clinical legitimacy: the body-object of anatomy, the body-machine of physiology, the cybernetic body of biology, the statistical body of epidemiology.
Resumo:
Game theory describes and analyzes strategic interaction. It is usually distinguished between static games, which are strategic situations in which the players choose only once as well as simultaneously, and dynamic games, which are strategic situations involving sequential choices. In addition, dynamic games can be further classified according to perfect and imperfect information. Indeed, a dynamic game is said to exhibit perfect information, whenever at any point of the game every player has full informational access to all choices that have been conducted so far. However, in the case of imperfect information some players are not fully informed about some choices. Game-theoretic analysis proceeds in two steps. Firstly, games are modelled by so-called form structures which extract and formalize the significant parts of the underlying strategic interaction. The basic and most commonly used models of games are the normal form, which rather sparsely describes a game merely in terms of the players' strategy sets and utilities, and the extensive form, which models a game in a more detailed way as a tree. In fact, it is standard to formalize static games with the normal form and dynamic games with the extensive form. Secondly, solution concepts are developed to solve models of games in the sense of identifying the choices that should be taken by rational players. Indeed, the ultimate objective of the classical approach to game theory, which is of normative character, is the development of a solution concept that is capable of identifying a unique choice for every player in an arbitrary game. However, given the large variety of games, it is not at all certain whether it is possible to device a solution concept with such universal capability. Alternatively, interactive epistemology provides an epistemic approach to game theory of descriptive character. This rather recent discipline analyzes the relation between knowledge, belief and choice of game-playing agents in an epistemic framework. The description of the players' choices in a given game relative to various epistemic assumptions constitutes the fundamental problem addressed by an epistemic approach to game theory. In a general sense, the objective of interactive epistemology consists in characterizing existing game-theoretic solution concepts in terms of epistemic assumptions as well as in proposing novel solution concepts by studying the game-theoretic implications of refined or new epistemic hypotheses. Intuitively, an epistemic model of a game can be interpreted as representing the reasoning of the players. Indeed, before making a decision in a game, the players reason about the game and their respective opponents, given their knowledge and beliefs. Precisely these epistemic mental states on which players base their decisions are explicitly expressible in an epistemic framework. In this PhD thesis, we consider an epistemic approach to game theory from a foundational point of view. In Chapter 1, basic game-theoretic notions as well as Aumann's epistemic framework for games are expounded and illustrated. Also, Aumann's sufficient conditions for backward induction are presented and his conceptual views discussed. In Chapter 2, Aumann's interactive epistemology is conceptually analyzed. In Chapter 3, which is based on joint work with Conrad Heilmann, a three-stage account for dynamic games is introduced and a type-based epistemic model is extended with a notion of agent connectedness. Then, sufficient conditions for backward induction are derived. In Chapter 4, which is based on joint work with Jérémie Cabessa, a topological approach to interactive epistemology is initiated. In particular, the epistemic-topological operator limit knowledge is defined and some implications for games considered. In Chapter 5, which is based on joint work with Jérémie Cabessa and Andrés Perea, Aumann's impossibility theorem on agreeing to disagree is revisited and weakened in the sense that possible contexts are provided in which agents can indeed agree to disagree.
Advanced mapping of environmental data: Geostatistics, Machine Learning and Bayesian Maximum Entropy
Resumo:
This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model dependent (geostatistics) and data driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.
Resumo:
A growing number of studies have been addressing the relationship between theory of mind (TOM) and executive functions (EF) in patients with acquired neurological pathology. In order to provide a global overview on the main findings, we conducted a systematic review on group studies where we aimed to (1) evaluate the patterns of impaired and preserved abilities of both TOM and EF in groups of patients with acquired neurological pathology and (2) investigate the existence of particular relations between different EF domains and TOM tasks. The search was conducted in Pubmed/Medline. A total of 24 articles met the inclusion criteria. We considered for analysis classical clinically accepted TOM tasks (first- and second-order false belief stories, the Faux Pas test, Happe's stories, the Mind in the Eyes task, and Cartoon's tasks) and EF domains (updating, shifting, inhibition, and access). The review suggests that (1) EF and TOM appear tightly associated. However, the few dissociations observed suggest they cannot be reduced to a single function; (2) no executive subprocess could be specifically associated with TOM performances; (3) the first-order false belief task and the Happe's story task seem to be less sensitive to neurological pathologies and less associated to EF. Even though the analysis of the reviewed studies demonstrates a close relationship between TOM and EF in patients with acquired neurological pathology, the nature of this relationship must be further investigated. Studies investigating ecological consequences of TOM and EF deficits, and intervention researches may bring further contributions to this question.