969 resultados para Supervised learning


Relevância:

70.00% 70.00%

Publicador:

Resumo:

In machine learning, Gaussian process latent variable model (GP-LVM) has been extensively applied in the field of unsupervised dimensionality reduction. When some supervised information, e.g., pairwise constraints or labels of the data, is available, the traditional GP-LVM cannot directly utilize such supervised information to improve the performance of dimensionality reduction. In this case, it is necessary to modify the traditional GP-LVM to make it capable of handing the supervised or semi-supervised learning tasks. For this purpose, we propose a new semi-supervised GP-LVM framework under the pairwise constraints. Through transferring the pairwise constraints in the observed space to the latent space, the constrained priori information on the latent variables can be obtained. Under this constrained priori, the latent variables are optimized by the maximum a posteriori (MAP) algorithm. The effectiveness of the proposed algorithm is demonstrated with experiments on a variety of data sets. © 2010 Elsevier B.V.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This dissertation establishes a novel system for human face learning and recognition based on incremental multilinear Principal Component Analysis (PCA). Most of the existing face recognition systems need training data during the learning process. The system as proposed in this dissertation utilizes an unsupervised or weakly supervised learning approach, in which the learning phase requires a minimal amount of training data. It also overcomes the inability of traditional systems to adapt to the testing phase as the decision process for the newly acquired images continues to rely on that same old training data set. Consequently when a new training set is to be used, the traditional approach will require that the entire eigensystem will have to be generated again. However, as a means to speed up this computational process, the proposed method uses the eigensystem generated from the old training set together with the new images to generate more effectively the new eigensystem in a so-called incremental learning process. In the empirical evaluation phase, there are two key factors that are essential in evaluating the performance of the proposed method: (1) recognition accuracy and (2) computational complexity. In order to establish the most suitable algorithm for this research, a comparative analysis of the best performing methods has been carried out first. The results of the comparative analysis advocated for the initial utilization of the multilinear PCA in our research. As for the consideration of the issue of computational complexity for the subspace update procedure, a novel incremental algorithm, which combines the traditional sequential Karhunen-Loeve (SKL) algorithm with the newly developed incremental modified fast PCA algorithm, was established. In order to utilize the multilinear PCA in the incremental process, a new unfolding method was developed to affix the newly added data at the end of the previous data. The results of the incremental process based on these two methods were obtained to bear out these new theoretical improvements. Some object tracking results using video images are also provided as another challenging task to prove the soundness of this incremental multilinear learning method.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

We present and evaluate a novel supervised recurrent neural network architecture, the SARASOM, based on the associative self-organizing map. The performance of the SARASOM is evaluated and compared with the Elman network as well as with a hidden Markov model (HMM) in a number of prediction tasks using sequences of letters, including some experiments with a reduced lexicon of 15 words. The results were very encouraging with the SARASOM learning better and performing with better accuracy than both the Elman network and the HMM.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Due to the prevalence of unlabeled data, semi-supervised learning has drawn significant attention and has been found applicable in many real-world applications. In this paper, we present the so-called Budgeted Semi-supervised Sup-port Vector Machine (BS3VM), a method that leverages the excellent generalization capacity of kernel-based method with the adjacent and distributive information carried in a spectral graph for semi-supervised learning purpose. The fact that the optimization problem of BS3VM can be solved directly in the primal form makes it fast and efficient in memory usage. We validate the proposed method on several benchmark datasets to demonstrate its accuracy and efficiency. The experimental results show that BS3VM can scale up efficiently to the large-scale datasets where it yields a comparable classification accuracy while simultaneously achieving a significant computational speed-up compared with the baselines.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The XML Document Mining track was launched for exploring two main ideas: (1) identifying key problems and new challenges of the emerging field of mining semi-structured documents, and (2) studying and assessing the potential of Machine Learning (ML) techniques for dealing with generic ML tasks in the structured domain, i.e., classification and clustering of semi-structured documents. This track has run for six editions during INEX 2005, 2006, 2007, 2008, 2009 and 2010. The first five editions have been summarized in previous editions and we focus here on the 2010 edition. INEX 2010 included two tasks in the XML Mining track: (1) unsupervised clustering task and (2) semi-supervised classification task where documents are organized in a graph. The clustering task requires the participants to group the documents into clusters without any knowledge of category labels using an unsupervised learning algorithm. On the other hand, the classification task requires the participants to label the documents in the dataset into known categories using a supervised learning algorithm and a training set. This report gives the details of clustering and classification tasks.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Field robots often rely on laser range finders (LRFs) to detect obstacles and navigate autonomously. Despite recent progress in sensing technology and perception algorithms, adverse environmental conditions, such as the presence of smoke, remain a challenging issue for these robots. In this paper, we investigate the possibility to improve laser-based perception applications by anticipating situations when laser data are affected by smoke, using supervised learning and state-of-the-art visual image quality analysis. We propose to train a k-nearest-neighbour (kNN) classifier to recognise situations where a laser scan is likely to be affected by smoke, based on visual data quality features. This method is evaluated experimentally using a mobile robot equipped with LRFs and a visual camera. The strengths and limitations of the technique are identified and discussed, and we show that the method is beneficial if conservative decisions are the most appropriate.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Accurate and detailed measurement of an individual's physical activity is a key requirement for helping researchers understand the relationship between physical activity and health. Accelerometers have become the method of choice for measuring physical activity due to their small size, low cost, convenience and their ability to provide objective information about physical activity. However, interpreting accelerometer data once it has been collected can be challenging. In this work, we applied machine learning algorithms to the task of physical activity recognition from triaxial accelerometer data. We employed a simple but effective approach of dividing the accelerometer data into short non-overlapping windows, converting each window into a feature vector, and treating each feature vector as an i.i.d training instance for a supervised learning algorithm. In addition, we improved on this simple approach with a multi-scale ensemble method that did not need to commit to a single window size and was able to leverage the fact that physical activities produced time series with repetitive patterns and discriminative features for physical activity occurred at different temporal scales.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

One main challenge in developing a system for visual surveillance event detection is the annotation of target events in the training data. By making use of the assumption that events with security interest are often rare compared to regular behaviours, this paper presents a novel approach by using Kullback-Leibler (KL) divergence for rare event detection in a weakly supervised learning setting, where only clip-level annotation is available. It will be shown that this approach outperforms state-of-the-art methods on a popular real-world dataset, while preserving real time performance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Debates on gene patents have necessitated the analysis of patents that disclose and reference human sequences. In this study, we built an automated classifier that assigns sequences to one of nine predefined categories according to their functional roles in patent claims by applying natural language processing and supervised learning techniques. To improve its correctness, we experimented with various feature mappings, resulting in the maximal accuracy of 79%.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Although robotics research has seen advances over the last decades robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together, helping and coexisting with humans in daily life. In all these a clear need to deal with a more unstructured, changing environment arises. I herein present a system that aims to overcome the limitations of highly complex robotic systems, in terms of autonomy and adaptation. The main focus of research is to investigate the use of visual feedback for improving reaching and grasping capabilities of complex robots. To facilitate this a combined integration of computer vision and machine learning techniques is employed. From a robot vision point of view the combination of domain knowledge from both imaging processing and machine learning techniques, can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in the incoming camera streams and successfully demonstrated on many different problem domains. The approach requires only a few training images (it was tested with 5 to 10 images per experiment) is fast, scalable and robust yet requires very small training sets. Additionally, it can generate human readable programs that can be further customized and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub, that allows for the autonomous learning of object detection and identification. Finally this dissertation includes two proof-of-concepts that integrate the motion and action sides. First, reactive reaching and grasping is shown. It allows the robot to avoid obstacles detected in the visual stream, while reaching for the intended target object. Furthermore the integration enables us to use the robot in non-static environments, i.e. the reaching is adapted on-the- fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks, by improving the visual detection by performing object manipulation actions.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The work is based on the assumption that words with similar syntactic usage have similar meaning, which was proposed by Zellig S. Harris (1954,1968). We study his assumption from two aspects: Firstly, different meanings (word senses) of a word should manifest themselves in different usages (contexts), and secondly, similar usages (contexts) should lead to similar meanings (word senses). If we start with the different meanings of a word, we should be able to find distinct contexts for the meanings in text corpora. We separate the meanings by grouping and labeling contexts in an unsupervised or weakly supervised manner (Publication 1, 2 and 3). We are confronted with the question of how best to represent contexts in order to induce effective classifiers of contexts, because differences in context are the only means we have to separate word senses. If we start with words in similar contexts, we should be able to discover similarities in meaning. We can do this monolingually or multilingually. In the monolingual material, we find synonyms and other related words in an unsupervised way (Publication 4). In the multilingual material, we ?nd translations by supervised learning of transliterations (Publication 5). In both the monolingual and multilingual case, we first discover words with similar contexts, i.e., synonym or translation lists. In the monolingual case we also aim at finding structure in the lists by discovering groups of similar words, e.g., synonym sets. In this introduction to the publications of the thesis, we consider the larger background issues of how meaning arises, how it is quantized into word senses, and how it is modeled. We also consider how to define, collect and represent contexts. We discuss how to evaluate the trained context classi?ers and discovered word sense classifications, and ?nally we present the word sense discovery and disambiguation methods of the publications. This work supports Harris' hypothesis by implementing three new methods modeled on his hypothesis. The methods have practical consequences for creating thesauruses and translation dictionaries, e.g., for information retrieval and machine translation purposes. Keywords: Word senses, Context, Evaluation, Word sense disambiguation, Word sense discovery.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The problem of unsupervised anomaly detection arises in a wide variety of practical applications. While one-class support vector machines have demonstrated their effectiveness as an anomaly detection technique, their ability to model large datasets is limited due to their memory and time complexity for training. To address this issue for supervised learning of kernel machines, there has been growing interest in random projection methods as an alternative to the computationally expensive problems of kernel matrix construction and sup-port vector optimisation. In this paper we leverage the theory of nonlinear random projections and propose the Randomised One-class SVM (R1SVM), which is an efficient and scalable anomaly detection technique that can be trained on large-scale datasets. Our empirical analysis on several real-life and synthetic datasets shows that our randomised 1SVM algorithm achieves comparable or better accuracy to deep auto encoder and traditional kernelised approaches for anomaly detection, while being approximately 100 times faster in training and testing.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Carbon fiber reinforced polymer (CFRP) composite specimens with different thickness, geometry, and stacking sequences were subjected to fatigue spectrum loading in stages. Another set of specimens was subjected to static compression load. On-line acoustic Emission (AE) monitoring was carried out during these tests. Two artificial neural networks, Kohonen-self organizing feature map (KSOM), and multi-layer perceptron (MLP) have been developed for AE signal analysis. AE signals from specimens were clustered using the unsupervised learning KSOM. These clusters were correlated to the failure modes using available a priori information such as AE signal amplitude distributions, time of occurrence of signals, ultrasonic imaging, design of the laminates (stacking sequences, orientation of fibers), and AE parametric plots. Thereafter, AE signals generated from the rest of the specimens were classified by supervised learning MLP. The network developed is made suitable for on-line monitoring of AE signals in the presence of noise, which can be used for detection and identification of failure modes and their growth. The results indicate that the characteristics of AE signals from different failure modes in CFRP remain largely unaffected by the type of load, fiber orientation, and stacking sequences, they being representatives of the type of failure phenomena. The type of loading can have effect only on the extent of damage allowed before the specimens fail and hence on the number of AE signals during the test. The artificial neural networks (ANN) developed and the methods and procedures adopted show significant success in AE signal characterization under noisy environment (detection and identification of failure modes and their growth).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Urbanisation is the increase in the population of cities in proportion to the region's rural population. Urbanisation in India is very rapid with urban population growing at around 2.3 percent per annum. Urban sprawl refers to the dispersed development along highways or surrounding the city and in rural countryside with implications such as loss of agricultural land, open space and ecologically sensitive habitats. Sprawl is thus a pattern and pace of land use in which the rate of land consumed for urban purposes exceeds the rate of population growth resulting in an inefficient and consumptive use of land and its associated resources. This unprecedented urbanisation trend due to burgeoning population has posed serious challenges to the decision makers in the city planning and management process involving plethora of issues like infrastructure development, traffic congestion, and basic amenities (electricity, water, and sanitation), etc. In this context, to aid the decision makers in following the holistic approaches in the city and urban planning, the pattern, analysis, visualization of urban growth and its impact on natural resources has gained importance. This communication, analyses the urbanisation pattern and trends using temporal remote sensing data based on supervised learning using maximum likelihood estimation of multivariate normal density parameters and Bayesian classification approach. The technique is implemented for Greater Bangalore – one of the fastest growing city in the World, with Landsat data of 1973, 1992 and 2000, IRS LISS-3 data of 1999, 2006 and MODIS data of 2002 and 2007. The study shows that there has been a growth of 466% in urban areas of Greater Bangalore across 35 years (1973 to 2007). The study unravels the pattern of growth in Greater Bangalore and its implication on local climate and also on the natural resources, necessitating appropriate strategies for the sustainable management.