903 results for Learning set


Relevance: 30.00%

Abstract:

In recent years there has been an explosive growth in the development of adaptive and data-driven methods. One of the efficient data-driven approaches is based on statistical learning theory (SLT) (Vapnik 1998). The theory is based on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we try not only to reduce the training error, i.e. to fit the available data with a model, but also to reduce the complexity of the model and the generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is Support Vector Machines (SVM). At present SLT is still under intensive development and SVM are finding new areas of application (www.kernel-machines.org). SVM develop robust and nonlinear data models with excellent generalisation abilities, which is very important both for monitoring and for forecasting. SVM are extremely good when the input space is high-dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries, which opens a way to sampling optimisation, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
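As a hedged illustration of the kind of SVM-based modelling described above, the sketch below fits a support vector regression model to synthetic spatially distributed data with scikit-learn; the coordinates, target values and hyperparameters are invented for the example and are not taken from the cited work.

```python
# Minimal sketch: support vector regression on spatially distributed data
# (synthetic coordinates and values; hyperparameters are illustrative only).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(500, 2))                      # easting/northing coordinates
y = np.sin(X[:, 0] / 10) + 0.1 * rng.standard_normal(500)   # noisy spatial signal

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# RBF-kernel SVR; C and epsilon control model complexity and the insensitive tube.
model = SVR(kernel="rbf", C=10.0, epsilon=0.05, gamma="scale")
model.fit(X_train, y_train)

print("test R^2:", model.score(X_test, y_test))
print("support vectors used:", len(model.support_))  # only these define the model
```

Note how the fitted model is defined only by its support vectors, which is the property the abstract relates to sampling optimisation and redundancy analysis.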

Relevance: 30.00%

Abstract:

In this paper, we propose two active learning algorithms for semiautomatic definition of training samples in remote sensing image classification. Based on predefined heuristics, the classifier ranks the unlabeled pixels and automatically chooses those that are considered the most valuable for its improvement. Once the pixels have been selected, the analyst labels them manually and the process is iterated. Starting with a small and nonoptimal training set, the model itself builds the optimal set of samples which minimizes the classification error. We have applied the proposed algorithms to a variety of remote sensing data, including very high resolution and hyperspectral images, using support vector machines. Experimental results confirm the consistency of the methods. The required number of training samples can be reduced to 10% using the methods proposed, reaching the same level of accuracy as larger data sets. A comparison with a state-of-the-art active learning method, margin sampling, is provided, highlighting advantages of the methods proposed. The effect of spatial resolution and separability of the classes on the quality of the selection of pixels is also discussed.
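The following sketch illustrates a margin-sampling-style active learning loop with an SVM classifier, in the spirit of the heuristics discussed above; the dataset, batch size and number of iterations are arbitrary stand-ins rather than the authors' experimental setup.

```python
# Illustrative margin-sampling active learning loop with an SVM classifier.
# Data, pool sizes and batch size are arbitrary; this is not the authors' exact setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=10, n_classes=3,
                           n_informative=6, random_state=0)
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=20, replace=False))   # small initial training set
pool = [i for i in range(len(X)) if i not in labeled]

for iteration in range(10):
    clf = SVC(kernel="rbf", probability=True, gamma="scale").fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])
    sorted_p = np.sort(proba, axis=1)
    margin = sorted_p[:, -1] - sorted_p[:, -2]           # small margin = uncertain sample
    query = [pool[i] for i in np.argsort(margin)[:10]]   # these would go to the analyst
    labeled.extend(query)                                # analyst labels are simulated here
    pool = [i for i in pool if i not in query]
    print(f"iter {iteration}: {len(labeled)} labels, accuracy {clf.score(X, y):.3f}")
```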

Relevance: 30.00%

Abstract:

In this paper we propose a novel unsupervised approach to learning domain-specific ontologies from large open-domain text collections. The method is based on the joint exploitation of Semantic Domains and Super Sense Tagging for Information Retrieval tasks. Our approach is able to retrieve domain-specific terms and concepts while associating them with a set of high-level ontological types, named supersenses, providing flat ontologies characterized by very high accuracy and pertinence to the domain.
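As a rough illustration of the supersense inventory mentioned above, the snippet below maps a few hypothetical candidate terms to WordNet lexicographer file names (the labels typically used as supersenses) via NLTK; a real Super Sense Tagging pipeline would tag terms in context rather than look up isolated words.

```python
# Toy illustration: mapping candidate domain terms to WordNet supersenses
# (lexicographer file names). The term list is hypothetical.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

candidate_terms = ["enzyme", "glacier", "mortgage", "symphony"]
for term in candidate_terms:
    synsets = wn.synsets(term, pos=wn.NOUN)
    if synsets:
        # lexname() returns a supersense-style label such as 'noun.substance'
        print(term, "->", synsets[0].lexname())
```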

Relevance: 30.00%

Abstract:

This paper presents multiple kernel learning (MKL) regression as an exploratory spatial data analysis and modelling tool. The MKL approach is introduced as an extension of support vector regression, where MKL uses dedicated kernels to divide a given task into sub-problems and to treat them separately in an effective way. It provides better interpretability to non-linear robust kernel regression at the cost of a more complex numerical optimization. In particular, we investigate the use of MKL as a tool that allows us to avoid using ad hoc topographic indices as covariables in statistical models in complex terrain. Instead, MKL learns these relationships from the data in a non-parametric fashion. A study on data simulated from real terrain features confirms the ability of MKL to enhance the interpretability of data-driven models and to aid feature selection without degrading predictive performance. Here we examine the stability of the MKL algorithm with respect to the number of training data samples and to the presence of noise. The results of a real case study are also presented, where MKL is able to exploit a large set of terrain features computed at multiple spatial scales when predicting mean wind speed in an Alpine region.
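A simplified sketch of the "one dedicated kernel per feature group" idea is given below: per-group RBF kernels are combined as a weighted sum and fed to a precomputed-kernel SVR. The weights are fixed by hand here, whereas actual MKL would learn them jointly with the regression model; the data, groups and parameters are synthetic.

```python
# Simplified illustration of combining dedicated kernels per feature group.
# True MKL would optimize the kernel weights jointly with the model.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n = 300
coords = rng.uniform(0, 10, size=(n, 2))        # x, y coordinates
terrain = rng.standard_normal((n, 3))           # e.g. slope, curvature, elevation (synthetic)
y = np.sin(coords[:, 0]) + 0.5 * terrain[:, 0] + 0.1 * rng.standard_normal(n)

groups = {"coords": coords, "terrain": terrain}
weights = {"coords": 0.6, "terrain": 0.4}       # hand-fixed; MKL would learn these

# Composite kernel: weighted sum of per-group RBF kernels.
K = sum(w * rbf_kernel(groups[g], groups[g], gamma=0.5) for g, w in weights.items())

model = SVR(kernel="precomputed", C=10.0).fit(K, y)
print("training R^2:", model.score(K, y))
```

Inspecting the learned weights (in a genuine MKL solver) is what gives the interpretability and feature-selection behaviour described in the abstract.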

Relevance: 30.00%

Abstract:

Our work focuses on alleviating the workload of designers of adaptive courses in the complex task of authoring adaptive learning designs adjusted to specific user characteristics and the user context. We propose an adaptation platform that consists of a set of intelligent agents, where each agent carries out an independent adaptation task. The agents apply machine learning techniques to support user modelling for the adaptation process.

Relevance: 30.00%

Abstract:

Context: Typing of ovarian tumors (OT) is a competency expected of pathologists, with significant clinical implications. OT, however, come in numerous different types, some rather rare, with the consequence that some departments have few opportunities for practice. Aim: Our aim was to design a tool for pathologists to train in the typing of less common OT. Method and Results: Representative slides of 20 less common OT were scanned (Nano Zoomer Digital Hamamatsu®) and the diagnostic algorithm proposed by Young and Scully was applied to each case (Young RH and Scully RE, Seminars in Diagnostic Pathology 2001, 18: 161-235), to include: recognition of morphological pattern(s); shortlisting of differential diagnoses; and proposition of relevant immunohistochemical markers. The next steps of this project will be: evaluation of the tool in several post-graduate training centers in Europe and Québec; improvement of its design based on evaluation results; and diffusion to a larger public. Discussion: In clinical medicine, solving many cases is recognized as of utmost importance for a novice to become an expert. This project relies on virtual slide technology to provide pathologists with a learning tool aimed at increasing their skills in OT typing. After due evaluation, this model might be extended to other uncommon tumors.

Relevance: 30.00%

Abstract:

Machine Learning for geospatial data: algorithms, software tools and case studies

The thesis is devoted to the analysis, modelling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence; it is mainly concerned with the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning? In a few words, most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modelling tools. They can find solutions to classification, regression and probability density modelling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to be implemented as predictive engines in decision support systems, for the purposes of environmental data mining including pattern recognition, modelling and prediction as well as automatic data mapping. They have competitive efficiency with geostatistical models in low-dimensional geographical spaces but are indispensable in high-dimensional geo-feature spaces.

The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from a theoretical description of the concepts to their software implementation. The main algorithms and models considered are the following: the multilayer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks, and mixture density networks. This set of models covers machine learning tasks such as classification, regression and density estimation.

Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis the concepts of exploratory spatial data analysis (ESDA) are considered using both a traditional geostatistical approach, experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least those described by two-point statistics. A machine learning approach to ESDA is presented by applying the k-nearest neighbours (k-NN) method, which is simple and has very good interpretation and visualisation properties.

An important part of the thesis deals with a current hot topic, namely the automatic mapping of geospatial data. General regression neural networks (GRNN) are proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions.

The thesis consists of four chapters with the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and have been used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and natural hazard (landslides, avalanches) assessment and susceptibility mapping. Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user friendly and easy to use.
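For illustration, a GRNN prediction step essentially reduces to Nadaraya-Watson kernel regression with a Gaussian kernel and a single smoothing parameter. The minimal numpy sketch below shows that idea on toy 2-D data; it is not the Machine Learning Office implementation.

```python
# Minimal sketch of a general regression neural network (GRNN), whose prediction
# step reduces to Nadaraya-Watson kernel regression with a Gaussian kernel.
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.5):
    """Predict each query point as a kernel-weighted average of training values."""
    # Squared Euclidean distances between query and training points.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2 * sigma ** 2))       # Gaussian weights
    return (w @ y_train) / w.sum(axis=1)     # normalised weighted average

# Toy 2-D spatial example; sigma is the single smoothing parameter.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))
y = np.sin(4 * X[:, 0]) + 0.1 * rng.standard_normal(200)
X_new = rng.uniform(0, 1, size=(5, 2))
print(grnn_predict(X, y, X_new, sigma=0.1))
```

The single smoothing parameter is what makes GRNN attractive for automatic mapping: tuning it by cross-validation is straightforward compared with fitting a full geostatistical model.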

Relevance: 30.00%

Abstract:

A reinforcement learning (RL) method was used to train a virtual character to move participants to a specified location. The virtual environment depicted an alleyway displayed through a wide field-of-view, head-tracked stereo head-mounted display. Based on proxemics theory, we predicted that when the character approached within a personal or intimate distance of the participants, they would be inclined to move backwards out of the way. We carried out a between-groups experiment with 30 female participants, with 10 assigned arbitrarily to each of three groups: in the Intimate condition the character could approach within 0.38 m; in the Social condition no nearer than 1.2 m; and in the Random condition the actions of the virtual character were chosen randomly from the same set as in the RL method, and the character could approach within 0.38 m. The experiment continued in each case until the participant either reached the target or 7 minutes had elapsed. The distributions of the times taken to reach the target showed significant differences between the three groups, with the 9 out of 10 who reached the target in the Intimate condition doing so significantly faster than the 6 out of 10 who reached it in the Social condition. Only 1 out of 10 in the Random condition reached the target. The experiment is an example of applied presence theory: we rely on the many findings that people tend to respond realistically in immersive virtual environments, and use this to get people to achieve a task of which they had been unaware. This method opens the door to many applications in which the virtual environment adapts to the responses of the human participants with the aim of achieving particular goals.
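The abstract does not specify the RL formulation used, so the toy sketch below only illustrates the general trial-and-error mechanism with generic tabular Q-learning on a hypothetical corridor task; it is not the study's environment or algorithm.

```python
# Generic tabular Q-learning sketch, included only to illustrate the kind of
# trial-and-error learning referred to above; the corridor task is hypothetical.
import numpy as np

n_states, n_actions = 10, 2          # positions along a corridor; actions: left/right
goal = n_states - 1
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

for episode in range(500):
    s = 0
    while s != goal:
        a = rng.integers(n_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next = max(0, min(goal, s + (1 if a == 1 else -1)))
        r = 1.0 if s_next == goal else -0.01      # reward for reaching the target
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(np.argmax(Q, axis=1))   # learned policy: should mostly point toward the goal
```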

Relevance: 30.00%

Abstract:

Learning is predicted to affect manifold ecological and evolutionary processes, but the extent to which animals rely on learning in nature remains poorly known, especially for short-lived non-social invertebrates. This is particularly the case for Drosophila, a favourite laboratory system for studying the molecular mechanisms of learning. Here we tested whether Drosophila melanogaster use learned information to choose food while free-flying in a large greenhouse emulating the natural environment. In a series of experiments, flies were first given an opportunity to learn which of two food odours was associated with good versus unpalatable taste; subsequently, their preference for the two odours was assessed with olfactory traps set up in the greenhouse. Flies that had experienced palatable apple-flavoured food and unpalatable orange-flavoured food were more likely to be attracted to the odour of apple than flies with the opposite experience. This was true both when the flies first learned in the laboratory and were then released and recaptured in the greenhouse, and when the learning occurred under free-flying conditions in the greenhouse. Furthermore, flies retained the memory of their experience while exploring the greenhouse overnight in the absence of the focal odours, pointing to the involvement of consolidated memory. These results support the notion that even small, short-lived insects that are not central-place foragers make use of learned cues in their natural environments.

Relevance: 30.00%

Abstract:

Recognition of environmental sounds is believed to proceed through discrimination steps from broad to narrower categories. Very little is known about the neural processes that underlie fine-grained discrimination within narrow categories or about their plasticity in relation to newly acquired expertise. We investigated how the cortical representation of birdsongs is modulated by brief training to recognize individual species. During a 60-minute session, participants learned to recognize a set of birdsongs; they significantly improved their performance for trained (T) but not control (C) species, which were counterbalanced across participants. Auditory evoked potentials (AEPs) were recorded during pre- and post-training sessions. Pre- vs. post-training changes in AEPs differed significantly between T and C species i) at 206-232 ms post stimulus onset within a cluster on the anterior part of the left superior temporal gyrus; ii) at 246-291 ms in the left middle frontal gyrus; and iii) at 512-545 ms in the left middle temporal gyrus as well as bilaterally in the cingulate cortex. All effects were driven by weaker activity for T than for C species. Thus, expertise in discriminating T species modulated early stages of semantic processing, during and immediately after the time window that sustains the discrimination between human and animal vocalizations. Moreover, the training-induced plasticity is reflected in the sharpening of a left-lateralized semantic network, including the anterior part of the temporal convexity and the frontal cortex. Training to identify birdsongs, however, also influenced the processing of C species, but at a much later stage. Correct discrimination of untrained sounds seems to require an additional step resulting from lower-level feature analysis such as apperception. We therefore suggest that access to objects within an auditory semantic category differs depending on the subject's level of expertise. More specifically, correct intra-categorical auditory discrimination for untrained items follows the temporal hierarchy and transpires at a late stage of semantic processing. In contrast, correct categorization of individually trained stimuli occurs earlier, during a period contemporaneous with human vs. animal vocalization discrimination, and involves a parallel semantic pathway requiring expertise.

Relevance: 30.00%

Abstract:

Peer-reviewed

Relevance: 30.00%

Abstract:

Open Innovation is a relatively new concept which involves a change of paradigm in the R+D+i processes of companies whose aim is to create new technologies or new processes. If to this change we add the need for innovation in the new green and sustainability economy, and we set out to create a collaborative platform with a learning space where this can happen, we face an overwhelming challenge which requires the application of intelligent programming technologies and languages at the service of education.

The aim of the Green IDI (Green Open Innovation) project ("Economic development and job creation vector in SMEs, based on the environment and sustainability") is to create a platform where companies and individual researchers can perform open innovation processes in the field of sustainability and the environment. The Green IDI (Green Open Innovation) project is funded under the INNPACTO program of the Ministry of Science and Innovation of Spain and is being developed by a consortium formed by the following institutions: GRUPO ICA, COMPARTIA, GRUPO INTERCOM, CETAQUA and the Instituto de Investigación en Inteligencia Artificial (IIIA) of the Consejo Superior de Investigaciones Científicas (CSIC). The consortium also includes FUNDACIÓ PRIVADA BARCELONA DIGITAL, PIMEC and the UNIVERSITAT OBERTA DE CATALUNYA (UOC).

Sustainability and positive action for the environment are considered the principal vector of economic development for companies. As Nicolás Scoli says (2007), "in short, preventing unnecessary consumption and the efficient consumption of resources means producing greater wealth with less. Both effects lead to reduced pollution linked to production and consumption." The Spanish Sustainable Development Strategy (EEDS) plan defends consumption and sustainable production linked to social and economic development by adhering to the commitment not to endanger ecosystems and abolishing the idea that economic growth is directly proportional to the deterioration of the environment. Uniting the Open Innovation and New Green Economy concepts leads to the "Green Open Innovation" Platform creation project.

This article analyses the concept of open innovation and defines the importance of the new green and sustainable economy. Lastly, it proposes the creation of eLab, defined as a personal and collaborative education space on the Green Open Innovation Platform which is fed by the interactions of users and which enables innovation processes based on new green economy concepts to be carried out. The creation of a personal learning environment such as eLab on the Green Open Innovation Platform meets the need to offer a collaborative space where platform users can improve their skills regarding the environment and sustainability based on collaborative synergies through Information and Communication Technologies.

Relevance: 30.00%

Abstract:

Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks; however, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, used to take advantage of various non-vectorial data representations, and preference learning algorithms suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be trained efficiently with large amounts of data. For situations when a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
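As a hedged illustration of reducing preference learning to a regularized least-squares problem, the sketch below turns pairwise preferences into training examples on feature differences and fits a ridge model; this mirrors the general idea discussed above but is not the thesis's exact RankRLS formulation, and the data are synthetic.

```python
# Illustrative reduction of preference learning to regularized least squares:
# each preference "a is ranked above b" becomes a training example on the
# feature difference a - b with target +1 (and b - a with -1).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
scores = X @ true_w                         # hidden utility that induces the ordering

# Build pairwise preference examples from random pairs.
i, j = rng.integers(0, len(X), size=(2, 1000))
keep = scores[i] != scores[j]
i, j = i[keep], j[keep]
X_pair = np.vstack([X[i] - X[j], X[j] - X[i]])
y_pair = np.concatenate([np.sign(scores[i] - scores[j]),
                         np.sign(scores[j] - scores[i])])

ranker = Ridge(alpha=1.0).fit(X_pair, y_pair)
predicted = X @ ranker.coef_                # scores whose ordering approximates the truth
print("rank agreement:", np.corrcoef(predicted, scores)[0, 1].round(3))
```

Kernelizing this difference-based formulation and exploiting its algebraic structure is what allows the least-squares family to handle structured data and large training sets efficiently.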

Relevance: 30.00%

Abstract:

Learning from demonstration is becoming increasingly popular as an efficient way of robot programming. Not only scientific interest acts as an inspiration in this case, but also the possibility of producing machines that would find application in different areas of life: robots helping with the daily routine at home, high-performance automata in industry, or friendly toys for children. One way to teach a robot to fulfil complex tasks is to start with simple training exercises and combine them to form more difficult behaviour. The objective of this Master's thesis work was to study robot programming with visual input. Dynamic movement primitives (DMPs) were chosen as the tool for motion learning and generation. Treating a movement as a spring system driven by an external force that makes the system move, DMPs represent the motion as a set of non-linear differential equations. During the experiments, the properties of DMPs, such as temporal and spatial invariance, were examined. The effects of the DMP parameters, including the spring coefficient, damping factor and temporal scaling, on the generated trajectory were studied.
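A minimal sketch of a one-dimensional discrete DMP is given below: a critically damped spring system pulled toward a goal, modulated by a phase-dependent forcing term (set to zero here, where a learned DMP would insert a weighted sum of basis functions fitted to a demonstration). Parameter values are illustrative only.

```python
# Minimal 1-D dynamic movement primitive (DMP): a damped spring system pulled
# toward the goal g, modulated by a forcing term f driven by the phase x.
import numpy as np

alpha_z, beta_z, alpha_x = 25.0, 25.0 / 4.0, 1.0   # spring, damping, phase decay
tau, dt = 1.0, 0.001                               # temporal scaling and time step
y, z, x = 0.0, 0.0, 1.0                            # position, scaled velocity, phase
g = 1.0                                            # goal position

trajectory = []
for _ in range(int(tau / dt)):
    f = 0.0                                        # learned forcing term would go here
    z += dt * (alpha_z * (beta_z * (g - y) - z) + f) / tau
    y += dt * z / tau
    x += dt * (-alpha_x * x) / tau                 # canonical system: phase variable
    trajectory.append(y)

print("final position:", round(trajectory[-1], 4), "(goal =", g, ")")
```

Changing tau rescales the motion in time and changing g shifts it in space without refitting the forcing term, which is the temporal and spatial invariance examined in the thesis.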