52 results for Weakly Supervised Learning
at Université de Lausanne, Switzerland
Abstract:
We show how nonlinear embedding algorithms popular for use with shallow semi-supervised learning techniques such as kernel methods can be applied to deep multilayer architectures, either as a regularizer at the output layer, or on each layer of the architecture. This provides a simple alternative to existing approaches to deep learning whilst yielding competitive error rates compared to those methods, and existing shallow semi-supervised techniques.
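The idea of using an embedding criterion as a regularizer can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the embedding loss, so a contrastive margin loss is assumed here, and the names `embedding_regularizer`, `total_loss`, `margin` and `weight` are hypothetical.

```python
import numpy as np

def embedding_regularizer(outputs, neighbors, margin=1.0):
    """Pairwise embedding penalty on a layer's outputs (assumed contrastive form).

    outputs   : (n, d) array of layer activations f(x_i)
    neighbors : (n, n) 0/1 array, 1 where x_i and x_j are treated as similar
    margin    : distance wanted between dissimilar pairs (assumed value)
    """
    outputs = np.asarray(outputs, dtype=float)
    neighbors = np.asarray(neighbors)
    # Euclidean distance between every pair of outputs.
    diff = outputs[:, None, :] - outputs[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Similar pairs are pulled together.
    pull = neighbors * dist ** 2
    # Dissimilar pairs are pushed apart until they are `margin` away.
    push = (1 - neighbors) * np.maximum(0.0, margin - dist) ** 2
    # Ignore the i == j entries on the diagonal.
    off_diagonal = ~np.eye(len(outputs), dtype=bool)
    return (pull + push)[off_diagonal].mean()

def total_loss(supervised_loss, outputs, neighbors, weight=0.1):
    # Supervised loss on labelled points plus the weighted embedding penalty.
    return supervised_loss + weight * embedding_regularizer(outputs, neighbors)
```

Applying the penalty to the output layer or to each hidden layer corresponds to passing that layer's activations as `outputs`; unlabelled points contribute only through the penalty term.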
Abstract:
This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases where the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, that is, when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models is to account for real-space constraints such as geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms, several case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides); assessment of renewable resources (wind fields) with SVM and ANN models; etc. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Abstract:
This article presents an experimental study of the classification ability of several classifiers for multi-class classification of cannabis seedlings. As the cultivation of drug type cannabis is forbidden in Switzerland, law enforcement authorities regularly ask forensic laboratories to determine the chemotype of a seized cannabis plant and then to conclude whether the plantation is legal or not. This classification is mainly performed when the plant is mature, as required by the EU official protocol, and so the classification of cannabis seedlings is a time-consuming and costly procedure. A previous study by the authors investigated this problem [1] and showed that it is possible to differentiate between drug type (illegal) and fibre type (legal) cannabis at an early stage of growth using gas chromatography interfaced with mass spectrometry (GC-MS), based on the relative proportions of eight major leaf compounds. The aims of the present work are, on the one hand, to continue the former work and optimize the methodology developed in the previous study for the discrimination of drug- and fibre type cannabis and, on the other hand, to investigate the possibility of predicting illegal cannabis varieties. Seven classifiers for differentiating between cannabis seedlings are evaluated in this paper, namely Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Nearest Neighbour Classification (NNC), Learning Vector Quantization (LVQ), Radial Basis Function Support Vector Machines (RBF SVMs), Random Forest (RF) and Artificial Neural Networks (ANN). The performance of each method was assessed using the same analytical dataset, which consists of 861 samples split into drug- and fibre type cannabis, with drug type cannabis being made up of 12 varieties (i.e. 12 classes). The results show that, for both classification problems, linear classifiers are not able to handle class distributions in which some overlap areas exist.
Unlike linear classifiers, NNC and RBF SVMs best differentiate cannabis samples for both 2-class and 12-class classifications, with average classification results up to 99% and 98%, respectively. Furthermore, RBF SVMs correctly classified as drug type cannabis the independent validation set, which consists of cannabis plants coming from police seizures. For forensic casework, this study shows that discrimination between cannabis samples at an early stage of growth is possible with fairly high classification performance, both between cannabis chemotypes and between drug type cannabis varieties.
Abstract:
Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, as well as the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving the model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership, and the user is then asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee, large margin, and posterior probability-based. For each of them, the most recent advances in the remote sensing community are discussed and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing a suitable architecture are provided for new and/or inexperienced users.
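The ranking step described above can be sketched for the posterior probability-based family. This is an illustrative example only: it assumes class probabilities are already available for each unlabeled pixel and uses the gap between the two most probable classes as the uncertainty measure; the function name `most_uncertain` and the `batch_size` parameter are hypothetical and not taken from the paper.

```python
import numpy as np

def most_uncertain(probabilities, batch_size=5):
    """Return indices of the unlabeled samples to send to the user for labelling.

    probabilities : (n_samples, n_classes) array of class posteriors
    batch_size    : number of samples requested per iteration (assumed value)
    """
    probabilities = np.asarray(probabilities, dtype=float)
    # Sort each row so the two largest posteriors sit in the last two columns.
    ordered = np.sort(probabilities, axis=1)
    # A small gap between the top two classes means high uncertainty.
    gap = ordered[:, -1] - ordered[:, -2]
    # Smallest gaps first; a stable sort keeps the original order on ties.
    return np.argsort(gap, kind="stable")[:batch_size]

# Three pixels, two classes: the second pixel is the most ambiguous.
posteriors = [[0.90, 0.10], [0.55, 0.45], [0.60, 0.40]]
selected = most_uncertain(posteriors, batch_size=2)
```

In an active learning loop the selected pixels would be labelled by the user, added to the training set, and the model retrained before the next ranking.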
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach, alternative to geostatistics, for modelling the spatial distribution of petrophysical properties in complex reservoirs. The approach is based on semi-supervised learning, which handles both "labelled" observed data and "unlabelled" data, which have no measured value but describe prior knowledge and other relevant data in the form of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and to describe the stochastic variability and non-uniqueness of spatial properties. At the same time, it is able to capture and preserve key spatial dependencies such as the connectivity of high-permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR, as a data-driven algorithm, is designed to integrate various kinds of conditioning information and learn dependencies from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in the available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify the uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history-matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate the posterior probability distribution. The uncertainty of the model is described by the posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, and smoothness/variability of the spatial property distribution.
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Abstract:
Fluvial deposits are a challenge for modelling flow in sub-surface reservoirs. Connectivity and continuity of permeable bodies have a major impact on fluid flow in porous media. Contemporary object-based and multipoint statistics methods face the problem of robustly representing connected structures. An alternative approach to modelling petrophysical properties is based on a machine learning algorithm, Support Vector Regression (SVR). Semi-supervised SVR is able to establish spatial connectivity by taking into account prior knowledge of natural similarities. SVR as a learning algorithm is robust to noise and captures dependencies from all available data. Semi-supervised SVR applied to a synthetic fluvial reservoir demonstrated robust results, which are well matched to the flow performance.
Abstract:
The present research reviews the analysis and modeling of Swiss franc interest rate curves (IRC) using unsupervised (SOM, Gaussian Mixtures) and supervised (MLP) machine learning algorithms. IRC are considered as objects embedded in different feature spaces: maturities; maturity-date; and parameters of the Nelson-Siegel model (NSM). Analysis of the NSM parameters and their temporal and clustering structures helps to understand the relevance of the model and its potential use for forecasting. A mapping of IRC in the maturity-date feature space is presented and analyzed for visualization and forecasting purposes.
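For readers unfamiliar with the Nelson-Siegel model, the sketch below evaluates the curve in its standard three-factor form (level, slope, curvature). It is a generic illustration, not code from the study: the parameterisation with a decay parameter `tau` is one common convention, and the function name and example values are assumptions.

```python
import numpy as np

def nelson_siegel(maturity, beta0, beta1, beta2, tau):
    """Nelson-Siegel yield for the given maturities (maturity > 0).

    beta0 : long-term level
    beta1 : short-term (slope) component
    beta2 : medium-term (curvature) component
    tau   : decay parameter, in the same time units as maturity
    """
    m = np.asarray(maturity, dtype=float) / tau
    slope = (1.0 - np.exp(-m)) / m        # loading on beta1
    curvature = slope - np.exp(-m)        # loading on beta2
    return beta0 + beta1 * slope + beta2 * curvature

# Hypothetical parameters: an upward-sloping curve evaluated at 1, 5 and 10 years.
yields = nelson_siegel([1.0, 5.0, 10.0], beta0=2.0, beta1=-1.0, beta2=0.5, tau=1.5)
```

Fitting the four parameters to each observed curve turns every IRC into a point in a four-dimensional feature space, which is the kind of representation the abstract refers to when it mentions NSM parameters.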
Abstract:
We conducted an experiment to assess the use of olfactory traces for spatial orientation in an open environment in rats, Rattus norvegicus. We trained rats to locate a food source at a fixed location from different starting points, in the presence or absence of visual information. A single food source was hidden in an array of 19 petri dishes regularly arranged in an open-field arena. Rats were trained to locate the food source either in white light (with full access to distant visuospatial information) or in darkness (without any visual information). In both cases, the goal was in a fixed location relative to the spatial frame of reference. The results of this experiment revealed that the presence of noncontrolled olfactory traces coherent with the spatial frame of reference enables rats to locate a unique position as accurately in darkness as with full access to visuospatial information. We hypothesize that the olfactory traces complement the use of other orientation mechanisms, such as path integration or the reliance on visuospatial information. This experiment demonstrates that rats can rely on olfactory traces for accurate orientation, and raises questions about the establishment of such traces in the absence of any other orientation mechanism. Copyright 1998 The Association for the Study of Animal Behaviour.
Abstract:
The principal objective of the present work was to explore parent-child relationships and family learning processes associated with anxiety disorders. To this end, families with and without an anxious family member (mother or child) were compared. In a first study, observation of mother-child interaction during a standardized play situation revealed that mothers with panic disorder were more likely to display verbal control and criticism, and less likely to display sensitivity toward their children, than mothers without panic disorder. A second study examined family members' perceptions of family relationships and indicated that, compared to non-anxious adolescents, anxious adolescents were more prone to experience a diminished sense of individual autonomy in relation to their parents. Finally, a third study examined the effect of less direct learning experiences in the aetiology of anxiety. Results indicated that mothers with panic disorder were more likely to engage in panic-maintaining behaviour and to involve their children in this behaviour than mothers without panic disorder. Building on previous research showing a relationship between parental control, children's perception of control, and anxiety disorders, the present work not only adds further evidence to support this link but also proposes a model summarizing the current knowledge concerning family processes and the development of anxiety disorders. Two pathways have been suggested through which anxiety may be intergenerationally transmitted. Both pathways assign an important role to children's perception of control.
The idea is that whenever children have a predisposition towards interpreting their parents' behaviour as beyond their control, they may be more prone to develop anxiety. As such, perceived control may represent a buffer between parental overcontrolling/overprotective behaviours and childhood anxiety disorder.
Abstract:
Locating new wind farms is of crucial importance for energy policies of the next decade. To select the new location, an accurate picture of the wind fields is necessary. However, characterizing wind fields is a difficult task, since the phenomenon is highly nonlinear and related to complex topographical features. In this paper, we propose both a nonparametric model to estimate wind speed at different time instants and a procedure to discover underrepresented topographic conditions, where new measuring stations could be added. Compared to space filling techniques, this last approach privileges optimization of the output space, thus locating new potential measuring sites through the uncertainty of the model itself.
Abstract:
The aim of the present study was to assess the influence of local environmental olfactory cues on place learning in rats. We developed a new experimental design allowing the comparison of the use of local olfactory and visual cues in spatial and discrimination learning. We compared the effect of both types of cues on the discrimination of a single food source in an open-field arena. The goal was either in a fixed or in a variable location, and could be indicated by local olfactory and/or visual cues. The local cues enhanced the discrimination of the goal dish, whether it was in a fixed or in a variable location. However, we did not observe any overshadowing of the spatial information by the local olfactory or visual cue. Rats relied primarily on distant visuospatial information to locate the goal, neglecting local information when it was in conflict with the spatial information.