11 resultados para LEARNING-PROBLEMS

em Université de Lausanne, Switzerland


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The capacity to learn to associate sensory perceptions with appropriate motor actions underlies the success of many animal species, from insects to humans. The evolutionary significance of learning has long been a subject of interest for evolutionary biologists who emphasize the bene¬fit yielded by learning under changing environmental conditions, where it is required to flexibly switch from one behavior to another. However, two unsolved questions are particularly impor¬tant for improving our knowledge of the evolutionary advantages provided by learning, and are addressed in the present work. First, because it is possible to learn the wrong behavior when a task is too complex, the learning rules and their underlying psychological characteristics that generate truly adaptive behavior must be identified with greater precision, and must be linked to the specific ecological problems faced by each species. A framework for predicting behavior from the definition of a learning rule is developed here. Learning rules capture cognitive features such as the tendency to explore, or the ability to infer rewards associated to unchosen actions. It is shown that these features interact in a non-intuitive way to generate adaptive behavior in social interactions where individuals affect each other's fitness. Such behavioral predictions are used in an evolutionary model to demonstrate that, surprisingly, simple trial-and-error learn¬ing is not always outcompeted by more computationally demanding inference-based learning, when population members interact in pairwise social interactions. A second question in the evolution of learning is its link with and relative advantage compared to other simpler forms of phenotypic plasticity. After providing a conceptual clarification on the distinction between genetically determined vs. learned responses to environmental stimuli, a new factor in the evo¬lution of learning is proposed: environmental complexity. A simple mathematical model shows that a measure of environmental complexity, the number of possible stimuli in one's environ¬ment, is critical for the evolution of learning. In conclusion, this work opens roads for modeling interactions between evolving species and their environment in order to predict how natural se¬lection shapes animals' cognitive abilities. - La capacité d'apprendre à associer des sensations perceptives à des actions motrices appropriées est sous-jacente au succès évolutif de nombreuses espèces, depuis les insectes jusqu'aux êtres hu¬mains. L'importance évolutive de l'apprentissage est depuis longtemps un sujet d'intérêt pour les biologistes de l'évolution, et ces derniers mettent l'accent sur le bénéfice de l'apprentissage lorsque les conditions environnementales sont changeantes, car dans ce cas il est nécessaire de passer de manière flexible d'un comportement à l'autre. Cependant, deux questions non résolues sont importantes afin d'améliorer notre savoir quant aux avantages évolutifs procurés par l'apprentissage. Premièrement, puisqu'il est possible d'apprendre un comportement incorrect quand une tâche est trop complexe, les règles d'apprentissage qui permettent d'atteindre un com¬portement réellement adaptatif doivent être identifiées avec une plus grande précision, et doivent être mises en relation avec les problèmes écologiques spécifiques rencontrés par chaque espèce. Un cadre théorique ayant pour but de prédire le comportement à partir de la définition d'une règle d'apprentissage est développé ici. Il est démontré que les caractéristiques cognitives, telles que la tendance à explorer ou la capacité d'inférer les récompenses liées à des actions non ex¬périmentées, interagissent de manière non-intuitive dans les interactions sociales pour produire des comportements adaptatifs. Ces prédictions comportementales sont utilisées dans un modèle évolutif afin de démontrer que, de manière surprenante, l'apprentissage simple par essai-et-erreur n'est pas toujours battu par l'apprentissage basé sur l'inférence qui est pourtant plus exigeant en puissance de calcul, lorsque les membres d'une population interagissent socialement par pair. Une deuxième question quant à l'évolution de l'apprentissage concerne son lien et son avantage relatif vis-à-vis d'autres formes plus simples de plasticité phénotypique. Après avoir clarifié la distinction entre réponses aux stimuli génétiquement déterminées ou apprises, un nouveau fac¬teur favorisant l'évolution de l'apprentissage est proposé : la complexité environnementale. Un modèle mathématique permet de montrer qu'une mesure de la complexité environnementale - le nombre de stimuli rencontrés dans l'environnement - a un rôle fondamental pour l'évolution de l'apprentissage. En conclusion, ce travail ouvre de nombreuses perspectives quant à la mo¬délisation des interactions entre les espèces en évolution et leur environnement, dans le but de comprendre comment la sélection naturelle façonne les capacités cognitives des animaux.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents general problems and approaches for the spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of the machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most of the machines learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in a high dimensional geo-feature spaces, when the dimension of space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of models concerns considering of real space constrains like geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve modelling of environmental phenomena taking into account on geo-manifolds. An important part of the study deals with the analysis of relevant variables and models' inputs. This problem is approached by using different feature selection/feature extraction nonlinear tools. To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The potential of type-2 fuzzy sets for managing high levels of uncertainty in the subjective knowledge of experts or of numerical information has focused on control and pattern classification systems in recent years. One of the main challenges in designing a type-2 fuzzy logic system is how to estimate the parameters of type-2 fuzzy membership function (T2MF) and the Footprint of Uncertainty (FOU) from imperfect and noisy datasets. This paper presents an automatic approach for learning and tuning Gaussian interval type-2 membership functions (IT2MFs) with application to multi-dimensional pattern classification problems. T2MFs and their FOUs are tuned according to the uncertainties in the training dataset by a combination of genetic algorithm (GA) and crossvalidation techniques. In our GA-based approach, the structure of the chromosome has fewer genes than other GA methods and chromosome initialization is more precise. The proposed approach addresses the application of the interval type-2 fuzzy logic system (IT2FLS) for the problem of nodule classification in a lung Computer Aided Detection (CAD) system. The designed IT2FLS is compared with its type-1 fuzzy logic system (T1FLS) counterpart. The results demonstrate that the IT2FLS outperforms the T1FLS by more than 30% in terms of classification accuracy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well known nonparametric principles of probability density modeling using kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can be easily used in decision support systems dealing with problems of automatic classification. Support vector machine is an implementation of the principles of statistical learning theory for the classification tasks. Recently they were successfully applied for different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper reports on the purpose, design, methodology and target audience of E-learning courses in forensic interpretation offered by the authors since 2010, including practical experiences made throughout the implementation period of this project. This initiative was motivated by the fact that reporting results of forensic examinations in a logically correct and scientifically rigorous way is a daily challenge for any forensic practitioner. Indeed, interpretation of raw data and communication of findings in both written and oral statements are topics where knowledge and applied skills are needed. Although most forensic scientists hold educational records in traditional sciences, only few actually followed full courses that focussed on interpretation issues. Such courses should include foundational principles and methodology - including elements of forensic statistics - for the evaluation of forensic data in a way that is tailored to meet the needs of the criminal justice system. In order to help bridge this gap, the authors' initiative seeks to offer educational opportunities that allow practitioners to acquire knowledge and competence in the current approaches to the evaluation and interpretation of forensic findings. These cover, among other aspects, probabilistic reasoning (including Bayesian networks and other methods of forensic statistics, tools and software), case pre-assessment, skills in the oral and written communication of uncertainty, and the development of independence and self-confidence to solve practical inference problems. E-learning was chosen as a general format because it helps to form a trans-institutional online-community of practitioners from varying forensic disciplines and workfield experience such as reporting officers, (chief) scientists, forensic coordinators, but also lawyers who all can interact directly from their personal workplaces without consideration of distances, travel expenses or time schedules. In the authors' experience, the proposed learning initiative supports participants in developing their expertise and skills in forensic interpretation, but also offers an opportunity for the associated institutions and the forensic community to reinforce the development of a harmonized view with regard to interpretation across forensic disciplines, laboratories and judicial systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents multiple kernel learning (MKL) regression as an exploratory spatial data analysis and modelling tool. The MKL approach is introduced as an extension of support vector regression, where MKL uses dedicated kernels to divide a given task into sub-problems and to treat them separately in an effective way. It provides better interpretability to non-linear robust kernel regression at the cost of a more complex numerical optimization. In particular, we investigate the use of MKL as a tool that allows us to avoid using ad-hoc topographic indices as covariables in statistical models in complex terrains. Instead, MKL learns these relationships from the data in a non-parametric fashion. A study on data simulated from real terrain features confirms the ability of MKL to enhance the interpretability of data-driven models and to aid feature selection without degrading predictive performances. Here we examine the stability of the MKL algorithm with respect to the number of training data samples and to the presence of noise. The results of a real case study are also presented, where MKL is able to exploit a large set of terrain features computed at multiple spatial scales, when predicting mean wind speed in an Alpine region.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

BACKGROUND: Randomized controlled trials (RCTs) may be discontinued because of apparent harm, benefit, or futility. Other RCTs are discontinued early because of insufficient recruitment. Trial discontinuation has ethical implications, because participants consent on the premise of contributing to new medical knowledge, Research Ethics Committees (RECs) spend considerable effort reviewing study protocols, and limited resources for conducting research are wasted. Currently, little is known regarding the frequency and characteristics of discontinued RCTs. METHODS/DESIGN: Our aims are, first, to determine the prevalence of RCT discontinuation for specific reasons; second, to determine whether the risk of RCT discontinuation for specific reasons differs between investigator- and industry-initiated RCTs; third, to identify risk factors for RCT discontinuation due to insufficient recruitment; fourth, to determine at what stage RCTs are discontinued; and fifth, to examine the publication history of discontinued RCTs.We are currently assembling a multicenter cohort of RCTs based on protocols approved between 2000 and 2002/3 by 6 RECs in Switzerland, Germany, and Canada. We are extracting data on RCT characteristics and planned recruitment for all included protocols. Completion and publication status is determined using information from correspondence between investigators and RECs, publications identified through literature searches, or by contacting the investigators. We will use multivariable regression models to identify risk factors for trial discontinuation due to insufficient recruitment. We aim to include over 1000 RCTs of which an anticipated 150 will have been discontinued due to insufficient recruitment. DISCUSSION: Our study will provide insights into the prevalence and characteristics of RCTs that were discontinued. Effective recruitment strategies and the anticipation of problems are key issues in the planning and evaluation of trials by investigators, Clinical Trial Units, RECs and funding agencies. Identification and modification of barriers to successful study completion at an early stage could help to reduce the risk of trial discontinuation, save limited resources, and enable RCTs to better meet their ethical requirements.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Résumé Cette thèse est consacrée à l'analyse, la modélisation et la visualisation de données environnementales à référence spatiale à l'aide d'algorithmes d'apprentissage automatique (Machine Learning). L'apprentissage automatique peut être considéré au sens large comme une sous-catégorie de l'intelligence artificielle qui concerne particulièrement le développement de techniques et d'algorithmes permettant à une machine d'apprendre à partir de données. Dans cette thèse, les algorithmes d'apprentissage automatique sont adaptés pour être appliqués à des données environnementales et à la prédiction spatiale. Pourquoi l'apprentissage automatique ? Parce que la majorité des algorithmes d'apprentissage automatiques sont universels, adaptatifs, non-linéaires, robustes et efficaces pour la modélisation. Ils peuvent résoudre des problèmes de classification, de régression et de modélisation de densité de probabilités dans des espaces à haute dimension, composés de variables informatives spatialisées (« géo-features ») en plus des coordonnées géographiques. De plus, ils sont idéaux pour être implémentés en tant qu'outils d'aide à la décision pour des questions environnementales allant de la reconnaissance de pattern à la modélisation et la prédiction en passant par la cartographie automatique. Leur efficacité est comparable au modèles géostatistiques dans l'espace des coordonnées géographiques, mais ils sont indispensables pour des données à hautes dimensions incluant des géo-features. Les algorithmes d'apprentissage automatique les plus importants et les plus populaires sont présentés théoriquement et implémentés sous forme de logiciels pour les sciences environnementales. Les principaux algorithmes décrits sont le Perceptron multicouches (MultiLayer Perceptron, MLP) - l'algorithme le plus connu dans l'intelligence artificielle, le réseau de neurones de régression généralisée (General Regression Neural Networks, GRNN), le réseau de neurones probabiliste (Probabilistic Neural Networks, PNN), les cartes auto-organisées (SelfOrganized Maps, SOM), les modèles à mixture Gaussiennes (Gaussian Mixture Models, GMM), les réseaux à fonctions de base radiales (Radial Basis Functions Networks, RBF) et les réseaux à mixture de densité (Mixture Density Networks, MDN). Cette gamme d'algorithmes permet de couvrir des tâches variées telle que la classification, la régression ou l'estimation de densité de probabilité. L'analyse exploratoire des données (Exploratory Data Analysis, EDA) est le premier pas de toute analyse de données. Dans cette thèse les concepts d'analyse exploratoire de données spatiales (Exploratory Spatial Data Analysis, ESDA) sont traités selon l'approche traditionnelle de la géostatistique avec la variographie expérimentale et selon les principes de l'apprentissage automatique. La variographie expérimentale, qui étudie les relations entre pairs de points, est un outil de base pour l'analyse géostatistique de corrélations spatiales anisotropiques qui permet de détecter la présence de patterns spatiaux descriptible par une statistique. L'approche de l'apprentissage automatique pour l'ESDA est présentée à travers l'application de la méthode des k plus proches voisins qui est très simple et possède d'excellentes qualités d'interprétation et de visualisation. Une part importante de la thèse traite de sujets d'actualité comme la cartographie automatique de données spatiales. Le réseau de neurones de régression généralisée est proposé pour résoudre cette tâche efficacement. Les performances du GRNN sont démontrées par des données de Comparaison d'Interpolation Spatiale (SIC) de 2004 pour lesquelles le GRNN bat significativement toutes les autres méthodes, particulièrement lors de situations d'urgence. La thèse est composée de quatre chapitres : théorie, applications, outils logiciels et des exemples guidés. Une partie importante du travail consiste en une collection de logiciels : Machine Learning Office. Cette collection de logiciels a été développée durant les 15 dernières années et a été utilisée pour l'enseignement de nombreux cours, dont des workshops internationaux en Chine, France, Italie, Irlande et Suisse ainsi que dans des projets de recherche fondamentaux et appliqués. Les cas d'études considérés couvrent un vaste spectre de problèmes géoenvironnementaux réels à basse et haute dimensionnalité, tels que la pollution de l'air, du sol et de l'eau par des produits radioactifs et des métaux lourds, la classification de types de sols et d'unités hydrogéologiques, la cartographie des incertitudes pour l'aide à la décision et l'estimation de risques naturels (glissements de terrain, avalanches). Des outils complémentaires pour l'analyse exploratoire des données et la visualisation ont également été développés en prenant soin de créer une interface conviviale et facile à l'utilisation. Machine Learning for geospatial data: algorithms, software tools and case studies Abstract The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense machine learning can be considered as a subfield of artificial intelligence. It mainly concerns with the development of techniques and algorithms that allow computers to learn from data. In this thesis machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In few words most of machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions for the classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well-suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modeling and predictions as well as automatic data mapping. They have competitive efficiency to the geostatistical models in low dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models interesting for geo- and environmental sciences are presented in details: from theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis functions networks, mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) is considered using both traditional geostatistical approach such as_experimental variography and machine learning. Experimental variography is a basic tool for geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least described by two-point statistics. A machine learning approach for ESDA is presented by applying the k-nearest neighbors (k-NN) method which is simple and has very good interpretation and visualization properties. Important part of the thesis deals with a hot topic of nowadays, namely, an automatic mapping of geospatial data. General regression neural networks (GRNN) is proposed as efficient model to solve this task. Performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data where GRNN model significantly outperformed all other approaches, especially in case of emergency conditions. The thesis consists of four chapters and has the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools - Machine Learning Office. Machine Learning Office tools were developed during last 15 years and was used both for many teaching courses, including international workshops in China, France, Italy, Ireland, Switzerland and for realizing fundamental and applied research projects. Case studies considered cover wide spectrum of the real-life low and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, soil types and hydro-geological units classification, decision-oriented mapping with uncertainties, natural hazards (landslides, avalanches) assessments and susceptibility mapping. Complementary tools useful for the exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper presents some contemporary approaches to spatial environmental data analysis. The main topics are concentrated on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, development and application of conditional stochastic simulation models. The innovative part of the paper presents integrated/hybrid model-machine learning (ML) residuals sequential simulations-MLRSS. The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends and sequential simulations of the residuals. NIL algorithms deliver non-linear solution for the spatial non-stationary problems, which are difficult for geostatistical approach. Geostatistical tools (variography) are used to characterize performance of ML algorithms, by analyzing quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide efficient assessment of uncertainty and spatial variability. Case study from the Chernobyl fallouts illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data driven and geostatistical model based approaches, can be efficiently used in decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.