17 resultados para learning theory
em Université de Lausanne, Switzerland
Resumo:
In recent years there has been an explosive growth in the development of adaptive and data driven methods. One of the efficient and data-driven approaches is based on statistical learning theory (Vapnik 1998). The theory is based on Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce training error ? to fit the available data with a model, but also to reduce the complexity of the model and to reduce generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust and non linear data models with excellent generalisation abilities that is very important both for monitoring and forecasting. SVM are extremely good when input space is high dimensional and training data set i not big enough to develop corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. It opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy etc. Presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
Resumo:
BACKGROUND: The structure and organisation of ecological interactions within an ecosystem is modified by the evolution and coevolution of the individual species it contains. Understanding how historical conditions have shaped this architecture is vital for understanding system responses to change at scales from the microbial upwards. However, in the absence of a group selection process, the collective behaviours and ecosystem functions exhibited by the whole community cannot be organised or adapted in a Darwinian sense. A long-standing open question thus persists: Are there alternative organising principles that enable us to understand and predict how the coevolution of the component species creates and maintains complex collective behaviours exhibited by the ecosystem as a whole? RESULTS: Here we answer this question by incorporating principles from connectionist learning, a previously unrelated discipline already using well-developed theories on how emergent behaviours arise in simple networks. Specifically, we show conditions where natural selection on ecological interactions is functionally equivalent to a simple type of connectionist learning, 'unsupervised learning', well-known in neural-network models of cognitive systems to produce many non-trivial collective behaviours. Accordingly, we find that a community can self-organise in a well-defined and non-trivial sense without selection at the community level; its organisation can be conditioned by past experience in the same sense as connectionist learning models habituate to stimuli. This conditioning drives the community to form a distributed ecological memory of multiple past states, causing the community to: a) converge to these states from any random initial composition; b) accurately restore historical compositions from small fragments; c) recover a state composition following disturbance; and d) to correctly classify ambiguous initial compositions according to their similarity to learned compositions. We examine how the formation of alternative stable states alters the community's response to changing environmental forcing, and we identify conditions under which the ecosystem exhibits hysteresis with potential for catastrophic regime shifts. CONCLUSIONS: This work highlights the potential of connectionist theory to expand our understanding of evo-eco dynamics and collective ecological behaviours. Within this framework we find that, despite not being a Darwinian unit, ecological communities can behave like connectionist learning systems, creating internal conditions that habituate to past environmental conditions and actively recalling those conditions. REVIEWERS: This article was reviewed by Prof. Ricard V Solé, Universitat Pompeu Fabra, Barcelona and Prof. Rob Knight, University of Colorado, Boulder.
Resumo:
The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well known nonparametric principles of probability density modeling using kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can be easily used in decision support systems dealing with problems of automatic classification. Support vector machine is an implementation of the principles of statistical learning theory for the classification tasks. Recently they were successfully applied for different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.
Resumo:
RésuméCette étude a pour objectif d'observer l'évolution des actes agressifs dans deux sports d'équipes en fonction de facteurs situationnels (périodes de jeu, lieu de la faute, état du score) et du type d'agressions (instrumentales, hostile). 60 matchs professionnels de football et de hockey sur glace ont été filmés puis analysés à l'aide de grilles d'observation différenciant les deux types d'agressions. Les résultats révèlent que dans ces deux sports, les agressions instrumentales sont plus fréquentes dans les zones importantes du terrain (milieu ou défense) ou lorsque le score est serré. En revanche, les agressions hostiles ne varient pas (ou peu) selon ces facteurs. Les résultats sont discutés au regard de la théorie de l'apprentissage social et de l'hypothèse frustration-agression.AbstractThis study aims at examining observed aggression in two team sports as a function of situational triggers (periods, zones of field, games score) and of type of aggression (instrumental, hostile). 60 soccer and ice hockey games were recorded and analyzed using a grid that differentiates the two types of aggression. The results revealed that theses two sports, instrumental aggressions were more frequent in important zones of field (neutral or defensive ones) and in tied score situations. However, no difference was found for hostile aggression according to these factors. The discussion focused on the social learning theory and frustration-aggression hypothesis.
Resumo:
Recently, kernel-based Machine Learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing etc. The paper describes the use of kernel methods to approach the processing of large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of environmental and pollution continuous information (pollution of soil by radionuclides), mapping with auxiliary information (climatic data from Aral Sea region). The promising developments, such as automatic emergency hot spot detection and monitoring network optimization are discussed as well.
Resumo:
To be diagnostically useful, structural MRI must reliably distinguish Alzheimer's disease (AD) from normal aging in individual scans. Recent advances in statistical learning theory have led to the application of support vector machines to MRI for detection of a variety of disease states. The aims of this study were to assess how successfully support vector machines assigned individual diagnoses and to determine whether data-sets combined from multiple scanners and different centres could be used to obtain effective classification of scans. We used linear support vector machines to classify the grey matter segment of T1-weighted MR scans from pathologically proven AD patients and cognitively normal elderly individuals obtained from two centres with different scanning equipment. Because the clinical diagnosis of mild AD is difficult we also tested the ability of support vector machines to differentiate control scans from patients without post-mortem confirmation. Finally we sought to use these methods to differentiate scans between patients suffering from AD from those with frontotemporal lobar degeneration. Up to 96% of pathologically verified AD patients were correctly classified using whole brain images. Data from different centres were successfully combined achieving comparable results from the separate analyses. Importantly, data from one centre could be used to train a support vector machine to accurately differentiate AD and normal ageing scans obtained from another centre with different subjects and different scanner equipment. Patients with mild, clinically probable AD and age/sex matched controls were correctly separated in 89% of cases which is compatible with published diagnosis rates in the best clinical centres. This method correctly assigned 89% of patients with post-mortem confirmed diagnosis of either AD or frontotemporal lobar degeneration to their respective group. Our study leads to three conclusions: Firstly, support vector machines successfully separate patients with AD from healthy aging subjects. Secondly, they perform well in the differential diagnosis of two different forms of dementia. Thirdly, the method is robust and can be generalized across different centres. This suggests an important role for computer based diagnostic image analysis for clinical practice.
Resumo:
Due to the advances in sensor networks and remote sensing technologies, the acquisition and storage rates of meteorological and climatological data increases every day and ask for novel and efficient processing algorithms. A fundamental problem of data analysis and modeling is the spatial prediction of meteorological variables in complex orography, which serves among others to extended climatological analyses, for the assimilation of data into numerical weather prediction models, for preparing inputs to hydrological models and for real time monitoring and short-term forecasting of weather.In this thesis, a new framework for spatial estimation is proposed by taking advantage of a class of algorithms emerging from the statistical learning theory. Nonparametric kernel-based methods for nonlinear data classification, regression and target detection, known as support vector machines (SVM), are adapted for mapping of meteorological variables in complex orography.With the advent of high resolution digital elevation models, the field of spatial prediction met new horizons. In fact, by exploiting image processing tools along with physical heuristics, an incredible number of terrain features which account for the topographic conditions at multiple spatial scales can be extracted. Such features are highly relevant for the mapping of meteorological variables because they control a considerable part of the spatial variability of meteorological fields in the complex Alpine orography. For instance, patterns of orographic rainfall, wind speed and cold air pools are known to be correlated with particular terrain forms, e.g. convex/concave surfaces and upwind sides of mountain slopes.Kernel-based methods are employed to learn the nonlinear statistical dependence which links the multidimensional space of geographical and topographic explanatory variables to the variable of interest, that is the wind speed as measured at the weather stations or the occurrence of orographic rainfall patterns as extracted from sequences of radar images. Compared to low dimensional models integrating only the geographical coordinates, the proposed framework opens a way to regionalize meteorological variables which are multidimensional in nature and rarely show spatial auto-correlation in the original space making the use of classical geostatistics tangled.The challenges which are explored during the thesis are manifolds. First, the complexity of models is optimized to impose appropriate smoothness properties and reduce the impact of noisy measurements. Secondly, a multiple kernel extension of SVM is considered to select the multiscale features which explain most of the spatial variability of wind speed. Then, SVM target detection methods are implemented to describe the orographic conditions which cause persistent and stationary rainfall patterns. Finally, the optimal splitting of the data is studied to estimate realistic performances and confidence intervals characterizing the uncertainty of predictions.The resulting maps of average wind speeds find applications within renewable resources assessment and opens a route to decrease the temporal scale of analysis to meet hydrological requirements. Furthermore, the maps depicting the susceptibility to orographic rainfall enhancement can be used to improve current radar-based quantitative precipitation estimation and forecasting systems and to generate stochastic ensembles of precipitation fields conditioned upon the orography.
Resumo:
Spatial data analysis mapping and visualization is of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations, methods of geostatistics: the family of kriging estimators (Deutsch and Journel, 1997), machine learning algorithms such as artificial neural networks (ANN) of different architectures, hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated/corrupted by noise, and often with noise of unknown nature. That's one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as the realization of some spatial randomn process. To obtain the estimation with kriging one has to model the spatial structure of the data: spatial correlation function or (semi-)variogram. This task can be complicated if there is not sufficient number of measurements and variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from the number of reasons. of a special type ? multiplayer perceptrons ? are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignank, 2004). Therefore, development and adaptation of the method that would be nonlinear and robust to noise in measurements, would deal with the small empirical datasets and which has solid mathematical background is of great importance. The present paper deals with such model, based on Statistical Learning Theory (SLT) - Support Vector Regression. SLT is a general mathematical framework devoted to the problem of estimation of the dependencies from empirical data (Hastie et al, 2004; Vapnik, 1998). SLT models for classification - Support Vector Machines - have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al, 2002). The properties of SVM for regression - Support Vector Regression (SVR) are less studied. First results of the application of SVR for spatial mapping of physical quantities were obtained by the authorsin for mapping of medium porosity (Kanevski et al, 1999), and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of SVR model for spatial data analysis and mapping. Detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996) and basic equations for the nonlinear modeling are given in section 2. Section 3 discusses the application of SVR for spatial data mapping on the real case study - soil pollution by Cs137 radionuclide. Section 4 discusses the properties of the modelapplied to noised data or data with outliers.
Resumo:
The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are given, and real case studies based on environmental and pollution data are carried out. The book provides a CD-ROM with the Machine Learning Office software, including sample sets of data, that will allow both students and researchers to put the concepts rapidly to practice.
Resumo:
In the parallel map theory, the hippocampus encodes space with 2 mapping systems. The bearing map is constructed primarily in the dentate gyrus from directional cues such as stimulus gradients. The sketch map is constructed within the hippocampus proper from positional cues. The integrated map emerges when data from the bearing and sketch maps are combined. Because the component maps work in parallel, the impairment of one can reveal residual learning by the other. Such parallel function may explain paradoxes of spatial learning, such as learning after partial hippocampal lesions, taxonomic and sex differences in spatial learning, and the function of hippocampal neurogenesis. By integrating evidence from physiology to phylogeny, the parallel map theory offers a unified explanation for hippocampal function.
Resumo:
The Baldwin effect can be observed if phenotypic learning influences the evolutionary fitness of individuals, which can in turn accelerate or decelerate evolutionary change. Evidence for both learning-induced acceleration and deceleration can be found in the literature. Although the results for both outcomes were supported by specific mathematical or simulation models, no general predictions have been achieved so far. Here we propose a general framework to predict whether evolution benefits from learning or not. It is formulated in terms of the gain function, which quantifies the proportional change of fitness due to learning depending on the genotype value. With an inductive proof we show that a positive gain-function derivative implies that learning accelerates evolution, and a negative one implies deceleration under the condition that the population is distributed on a monotonic part of the fitness landscape. We show that the gain-function framework explains the results of several specific simulation models. We also use the gain-function framework to shed some light on the results of a recent biological experiment with fruit flies.
Resumo:
The class of Schoenberg transformations, embedding Euclidean distances into higher dimensional Euclidean spaces, is presented, and derived from theorems on positive definite and conditionally negative definite matrices. Original results on the arc lengths, angles and curvature of the transformations are proposed, and visualized on artificial data sets by classical multidimensional scaling. A distance-based discriminant algorithm and a robust multidimensional centroid estimate illustrate the theory, closely connected to the Gaussian kernels of Machine Learning.
Resumo:
As a thorough aggregation of probability and graph theory, Bayesian networks currently enjoy widespread interest as a means for studying factors that affect the coherent evaluation of scientific evidence in forensic science. Paper I of this series of papers intends to contribute to the discussion of Bayesian networks as a framework that is helpful for both illustrating and implementing statistical procedures that are commonly employed for the study of uncertainties (e.g. the estimation of unknown quantities). While the respective statistical procedures are widely described in literature, the primary aim of this paper is to offer an essentially non-technical introduction on how interested readers may use these analytical approaches - with the help of Bayesian networks - for processing their own forensic science data. Attention is mainly drawn to the structure and underlying rationale of a series of basic and context-independent network fragments that users may incorporate as building blocs while constructing larger inference models. As an example of how this may be done, the proposed concepts will be used in a second paper (Part II) for specifying graphical probability networks whose purpose is to assist forensic scientists in the evaluation of scientific evidence encountered in the context of forensic document examination (i.e. results of the analysis of black toners present on printed or copied documents).
Resumo:
Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
Resumo:
In order to understand the development of non-genetically encoded actions during an animal's lifespan, it is necessary to analyze the dynamics and evolution of learning rules producing behavior. Owing to the intrinsic stochastic and frequency-dependent nature of learning dynamics, these rules are often studied in evolutionary biology via agent-based computer simulations. In this paper, we show that stochastic approximation theory can help to qualitatively understand learning dynamics and formulate analytical models for the evolution of learning rules. We consider a population of individuals repeatedly interacting during their lifespan, and where the stage game faced by the individuals fluctuates according to an environmental stochastic process. Individuals adjust their behavioral actions according to learning rules belonging to the class of experience-weighted attraction learning mechanisms, which includes standard reinforcement and Bayesian learning as special cases. We use stochastic approximation theory in order to derive differential equations governing action play probabilities, which turn out to have qualitative features of mutator-selection equations. We then perform agent-based simulations to find the conditions where the deterministic approximation is closest to the original stochastic learning process for standard 2-action 2-player fluctuating games, where interaction between learning rules and preference reversal may occur. Finally, we analyze a simplified model for the evolution of learning in a producer-scrounger game, which shows that the exploration rate can interact in a non-intuitive way with other features of co-evolving learning rules. Overall, our analyses illustrate the usefulness of applying stochastic approximation theory in the study of animal learning.