157 results for Supervised machine learning
Abstract:
Because exposure to air pollutants in urban areas harms health, monitoring and forecasting air quality parameters has become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate prediction of pollutant concentrations. This paper focuses on an innovative method of daily air pollution prediction that combines a Support Vector Machine (SVM) as predictor with Partial Least Squares (PLS) as a data selection tool, based on measured CO concentrations. The CO concentrations at the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, were used to test the effectiveness of this method. Hourly CO concentrations were predicted using the SVM and the hybrid PLS–SVM models. Similarly, daily CO concentrations were predicted from the same four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS–SVM is more accurate. Statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error were employed to compare the performance of the models. The errors decrease after size reduction, and the coefficients of determination increase from 56–81% for the SVM model to 65–85% for the hybrid PLS–SVM model. The hybrid PLS–SVM model also required less computational time than the SVM model, as expected, which supports its more accurate and faster prediction ability.
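The error estimators named in this abstract can be illustrated with a minimal sketch. The formulas below are the common textbook definitions and may differ from the exact variants used in the paper; the function names and the sample values are hypothetical, chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(observed))


def mean_absolute_relative_error(observed, predicted):
    """Mean of |error| / |observed|; assumes no observed value is zero."""
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return sum(ratios) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (one common definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations, not data from the paper.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(rmse(observed, predicted))
print(mean_absolute_relative_error(observed, predicted))
print(r_squared(observed, predicted))
```

Comparing two models then amounts to computing these values for each model's predictions on the same held-out observations.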
Abstract:
Recent advances in computer vision and machine learning suggest that a wide range of problems can be addressed more appropriately by considering non-Euclidean geometry. In this paper we explore sparse dictionary learning over the space of linear subspaces, which form Riemannian structures known as Grassmann manifolds. To this end, we propose to embed Grassmann manifolds into the space of symmetric matrices by an isometric mapping, which enables us to devise a closed-form solution for updating a Grassmann dictionary, atom by atom. Furthermore, to handle non-linearity in data, we propose a kernelised version of the dictionary learning algorithm. Experiments on several classification tasks (face recognition, action recognition, dynamic texture classification) show that the proposed approach achieves considerable improvements in discrimination accuracy, in comparison to state-of-the-art methods such as kernelised Affine Hull Method and graph-embedding Grassmann discriminant analysis.
Abstract:
Accurate and detailed measurement of an individual's physical activity is a key requirement for helping researchers understand the relationship between physical activity and health. Accelerometers have become the method of choice for measuring physical activity due to their small size, low cost, convenience and their ability to provide objective information about physical activity. However, interpreting accelerometer data once it has been collected can be challenging. In this work, we applied machine learning algorithms to the task of physical activity recognition from triaxial accelerometer data. We employed a simple but effective approach of dividing the accelerometer data into short non-overlapping windows, converting each window into a feature vector, and treating each feature vector as an i.i.d. training instance for a supervised learning algorithm. In addition, we improved on this simple approach with a multi-scale ensemble method that did not need to commit to a single window size and was able to leverage the fact that physical activities produce time series with repetitive patterns and that discriminative features for physical activity occur at different temporal scales.
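The windowing step described in this abstract might be sketched as follows. The abstract does not say which features were extracted, so the per-axis mean and standard deviation used here are an assumption, as are the function name and the sample readings.

```python
import math


def window_features(samples, window_size):
    """Split triaxial samples into non-overlapping windows.

    Returns one feature vector per complete window:
    [mean_x, std_x, mean_y, std_y, mean_z, std_z].
    A trailing partial window is dropped.
    """
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        vector = []
        for axis in range(3):
            values = [sample[axis] for sample in window]
            mean = sum(values) / window_size
            variance = sum((v - mean) ** 2 for v in values) / window_size
            vector.extend([mean, math.sqrt(variance)])
        features.append(vector)
    return features


# Hypothetical (x, y, z) accelerometer readings.
samples = [
    (0.0, 1.0, 9.8),
    (2.0, 1.0, 9.8),
    (0.0, 1.0, 9.8),
    (2.0, 1.0, 9.8),
    (5.0, 5.0, 5.0),  # leftover sample, not enough for a full window
]
features = window_features(samples, window_size=2)
print(features)
```

Each returned vector would then be paired with an activity label and passed to any supervised classifier. A multi-scale variant could call the same function with several window sizes and combine the resulting models.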
Abstract:
We present a Connected Learning Analytics (CLA) toolkit, which enables data to be extracted from social media and imported into a Learning Record Store (LRS), as defined by the new xAPI standard. Core to the toolkit is the notion of learner access to their own data. A number of implementation issues are discussed, and an ontology of xAPI verb/object/activity statements as they might be unified across 7 different social media and online environments is introduced. After considering some of the analytics that learners might be interested in discovering about their own processes (the delivery of which is prioritised for the toolkit), we propose a set of learning activities that could be easily implemented, and their data tracked, by anyone using the toolkit and an LRS.
Abstract:
This thesis develops a novel approach to robot control that learns to account for a robot's dynamic complexities while executing various control tasks using inspiration from biological sensorimotor control and machine learning. A robot that can learn its own control system can account for complex situations and adapt to changes in control conditions to maximise its performance and reliability in the real world. This research has developed two novel learning methods, with the aim of solving issues with learning control of non-rigid robots that incorporate additional dynamic complexities. The new learning control system was evaluated on a real three degree-of-freedom elastic joint robot arm with a number of experiments: initially validating the learning method and testing its ability to generalise to new tasks, then evaluating the system during a learning control task requiring continuous online model adaptation.
Abstract:
This paper presents a new active learning query strategy for information extraction, called Domain Knowledge Informativeness (DKI). Active learning is often used to reduce the amount of annotation effort required to obtain training data for machine learning algorithms. A key component of an active learning approach is the query strategy, which is used to iteratively select samples for annotation. Knowledge resources have been used in information extraction as a means to derive additional features for sample representation. DKI is, however, the first query strategy that exploits such resources to inform sample selection. To evaluate the merits of DKI, in particular the reduction in annotation effort that the new query strategy makes possible, we conduct a comprehensive empirical comparison of active learning query strategies for information extraction within the clinical domain. The clinical domain was chosen for this work because of the availability of extensive structured knowledge resources, which have often been exploited for feature generation. In addition, the clinical domain offers a compelling use case for active learning because of the high costs and hurdles associated with obtaining annotations in this domain. Our experimental findings demonstrated that 1) amongst existing query strategies, the ones based on the classification model’s confidence are a better choice for clinical data as they perform equally well with a much lighter computational load, and 2) significant reductions in annotation effort are achievable by exploiting knowledge resources within active learning query strategies, with up to 14% fewer tokens and concepts to manually annotate than with state-of-the-art query strategies.
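To make the notion of a confidence-based query strategy concrete, here is a minimal sketch of least-confidence sampling, one standard strategy of that family. It is not DKI, whose details the abstract does not give; the function name and the probabilities are hypothetical.

```python
def least_confidence_query(probabilities, batch_size):
    """Pick the unlabeled samples the model is least sure about.

    probabilities: one list of predicted class probabilities per sample.
    Returns the indices of the batch_size samples whose highest class
    probability is lowest, least confident first.
    """
    ranked = sorted((max(p), index) for index, p in enumerate(probabilities))
    return [index for _, index in ranked[:batch_size]]


# Hypothetical model outputs for four unlabeled samples.
probabilities = [
    [0.90, 0.10],
    [0.55, 0.45],
    [0.60, 0.40],
    [0.99, 0.01],
]
selected = least_confidence_query(probabilities, batch_size=2)
print(selected)  # → [1, 2]
```

In an active learning loop, the selected samples would be sent for annotation, added to the training set, and the model retrained before the next query.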
Abstract:
Aerial surveys conducted using manned or unmanned aircraft with customized camera payloads can generate a large number of images. Manual review of these images to extract data is prohibitive in terms of time and financial resources, thus providing strong incentive to automate this process using computer vision systems. There are potential applications for these automated systems in areas such as surveillance and monitoring, precision agriculture, law enforcement, asset inspection, and wildlife assessment. In this paper, we present an efficient machine learning system for automating the detection of marine species in aerial imagery. The effectiveness of our approach can be credited to the combination of a well-suited region proposal method and the use of Deep Convolutional Neural Networks (DCNNs). In comparison to previous algorithms designed for the same purpose, we have been able to dramatically improve recall to more than 80% and improve precision to 27% by using DCNNs as the core approach.
Abstract:
Although robotics research has advanced over the last decades, robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together, helping and coexisting with humans in daily life. All of these share a clear need to deal with a more unstructured, changing environment. I herein present a system that aims to overcome the limitations of highly complex robotic systems in terms of autonomy and adaptation. The main focus of the research is to investigate the use of visual feedback for improving the reaching and grasping capabilities of complex robots. To facilitate this, computer vision and machine learning techniques are integrated. From a robot vision point of view, combining domain knowledge from both image processing and machine learning can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in the incoming camera streams and has been successfully demonstrated on many different problem domains. The approach is fast, scalable and robust, yet requires only very small training sets (it was tested with 5 to 10 images per experiment). Additionally, it can generate human-readable programs that can be further customized and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub that allows for the autonomous learning of object detection and identification. Finally, this dissertation includes two proofs of concept that integrate the motion and action sides. First, reactive reaching and grasping is shown. It allows the robot to avoid obstacles detected in the visual stream while reaching for the intended target object. Furthermore, the integration enables us to use the robot in non-static environments, i.e. the reaching is adapted on the fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks by improving visual detection through object manipulation actions.
Abstract:
This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images and without any prior knowledge of configuration is shown for the first time. We build upon the success of recent deep reinforcement learning and develop a system for learning target reaching with a three-joint robot manipulator using external visual observation. A Deep Q Network (DQN) was demonstrated to perform target reaching after training in simulation. Transferring the network to real hardware and real observation in a naive approach failed, but experiments show that the network works when replacing camera images with synthetic images.
Abstract:
Data-driven approaches such as Gaussian Process (GP) regression have been used extensively in recent robotics literature to achieve estimation by learning from experience. To ensure satisfactory performance, in most cases, multiple learning inputs are required. Intuitively, adding new inputs can often contribute to better estimation accuracy; however, it may come at the cost of a new sensor, a larger training dataset and/or more complex learning, sometimes for limited benefits. Therefore, it is crucial to have a systematic procedure to determine the actual impact each input has on the estimation performance. To address this issue, in this paper we propose to analyse the impact of each input on the estimate using a variance-based sensitivity analysis method. We propose an approach built on Analysis of Variance (ANOVA) decomposition, which can characterise how the prediction changes as one or more of the inputs change, and also quantify the prediction uncertainty attributed to each of the inputs in the framework of dependent inputs. We apply the proposed approach to a terrain-traversability estimation method we proposed in prior work, which is based on multi-task GP regression, and we validate this implementation experimentally using a rover on a Mars-analogue terrain.
Abstract:
With the advent of Service Oriented Architecture, Web services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. Considering the semantic relationships of words used in describing the services, as well as the input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements which the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse domains of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase.
Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pairs shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both the information-retrieval and the machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
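The link analysis phase can be illustrated with a minimal sketch of an all-pairs shortest-path computation over a service graph. The abstract does not name the specific algorithm, so Floyd–Warshall is used here as an assumption; the node numbering and link costs are hypothetical.

```python
import math


def floyd_warshall(num_nodes, edges):
    """All-pairs shortest-path costs for a directed, weighted graph.

    edges: iterable of (source, target, cost) with nodes numbered from 0.
    Returns a matrix dist where dist[i][j] is the minimum cost from i to j,
    or math.inf when j cannot be reached from i.
    """
    dist = [[math.inf] * num_nodes for _ in range(num_nodes)]
    for node in range(num_nodes):
        dist[node][node] = 0.0
    for source, target, cost in edges:
        dist[source][target] = min(dist[source][target], cost)
    # Allow each node in turn to act as an intermediate hop.
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical graph: each node is a Web service, each edge a feasible link.
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 5.0)]
dist = floyd_warshall(4, edges)
print(dist[0][1])  # → 3.0, going through service 2 is cheaper than the direct link
print(dist[0][3])  # → 8.0
```

Recovering the actual chain of services, rather than only its cost, would additionally require a predecessor matrix, which is omitted here for brevity.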
Abstract:
Experience plays an important role in building management. “How often will this asset need repair?” or “How much time is this repair going to take?” are types of questions that project and facility managers face daily in planning activities. Failure or success in developing good schedules, budgets and other project management tasks depends on the project manager's ability to obtain reliable information to answer these types of questions. Young practitioners tend to rely on information that is based on regional averages and provided by publishing companies. This is in contrast to experienced project managers, who tend to rely heavily on personal experience. Another aspect of building management is that many practitioners are seeking to improve available scheduling algorithms, estimating spreadsheets and other project management tools. Such “micro-scale” levels of research are important in providing the required tools for the project manager's tasks. However, even with such tools, low quality input information will produce inaccurate schedules and budgets as output. Thus, it is also important to have a broad approach to research at a more “macro-scale.” Recent trends show that the Architectural, Engineering, Construction (AEC) industry is experiencing explosive growth in its capabilities to generate and collect data. There is a great deal of valuable knowledge that can be obtained from the appropriate use of this data, and therefore the need has arisen to analyse this increasing amount of available data. Data Mining can be applied as a powerful tool to extract relevant and useful information from this sea of data. Knowledge Discovery in Databases (KDD) and Data Mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so that large amounts of project data may be analysed.
These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. The project involves the development of a prototype tool to support facility managers, building owners and designers. This final report presents the AIMM™ prototype system and documents which data mining techniques can be applied and how, the results of their application and the benefits gained from the system. The AIMM™ system is capable of searching for useful patterns of knowledge and correlations within the existing building maintenance data to support decision making about future maintenance operations. The application of the AIMM™ prototype system to building models and their maintenance data (supplied by industry partners) utilises various data mining algorithms, and the maintenance data is analysed using interactive visual tools. The application of the AIMM™ prototype system to help in improving maintenance management and building life cycle includes: (i) data preparation and cleaning, (ii) integrating meaningful domain attributes, (iii) performing extensive data mining experiments in which visual analysis (using stacked histograms), classification and clustering techniques, and associative rule mining algorithms such as “Apriori” are applied, and (iv) filtering and refining data mining results, including the potential implications of these results for improving maintenance management. Maintenance data of a variety of asset types were selected for demonstration with the aim of discovering meaningful patterns to assist facility managers in strategic planning and provide a knowledge base to help shape future requirements and design briefing. Utilising the prototype system developed here, positive and interesting results regarding patterns and structures of data have been obtained.
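The frequent-itemset stage of an Apriori-style algorithm, as mentioned in step (iii), might be sketched as follows. This is a simplified illustration and not the prototype's implementation; the function name, the maintenance records and the support threshold are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Find itemsets that occur in at least min_support transactions.

    Returns a dict mapping each frequent itemset (a frozenset) to its count.
    """
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates of the next size.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


# Hypothetical maintenance records: attributes logged for each work order.
transactions = [
    {"pump", "leak", "summer"},
    {"pump", "leak"},
    {"pump", "vibration"},
    {"valve", "leak"},
]
frequent = apriori(transactions, min_support=2)
print(frequent[frozenset({"pump", "leak"})])  # → 2
```

Association rules such as "pump implies leak" would then be derived from these counts by comparing the support of an itemset with the support of its subsets.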