867 resultados para least square-support vector machine
Resumo:
Hand gesture recognition based on surface electromyography (sEMG) signals is a promising approach for the development of intuitive human-machine interfaces (HMIs) in domains such as robotics and prosthetics. The sEMG signal arises from the muscles' electrical activity, and can thus be used to recognize hand gestures. The decoding from sEMG signals to actual control signals is non-trivial; typically, control systems map sEMG patterns into a set of gestures using machine learning, failing to incorporate any physiological insight. This master thesis aims at developing a bio-inspired hand gesture recognition system based on neuromuscular spike extraction rather than on simple pattern recognition. The system relies on a decomposition algorithm based on independent component analysis (ICA) that decomposes the sEMG signal into its constituent motor unit spike trains, which are then forwarded to a machine learning classifier. Since ICA does not guarantee a consistent motor unit ordering across different sessions, 3 approaches are proposed: 2 ordering criteria based on firing rate and negative entropy, and a re-calibration approach that allows the decomposition model to retain information about previous sessions. Using a multilayer perceptron (MLP), the latter approach results in an accuracy up to 99.4% in a 1-subject, 1-degree of freedom scenario. Afterwards, the decomposition and classification pipeline for inference is parallelized and profiled on the PULP platform, achieving a latency < 50 ms and an energy consumption < 1 mJ. Both the classification models tested (a support vector machine and a lightweight MLP) yielded an accuracy > 92% in a 1-subject, 5-classes (4 gestures and rest) scenario. These results prove that the proposed system is suitable for real-time execution on embedded platforms and also capable of matching the accuracy of state-of-the-art approaches, while also giving some physiological insight on the neuromuscular spikes underlying the sEMG.
Resumo:
La segmentazione prevede la partizione di un'immagine in aree strutturalmente o semanticamente coerenti. Nell'imaging medico, è utilizzata per identificare, contornandole, Regioni di Interesse (ROI) clinico, quali lesioni tumorali, oggetto di approfondimento tramite analisi semiautomatiche e automatiche, o bersaglio di trattamenti localizzati. La segmentazione di lesioni tumorali, assistita o automatica, consiste nell’individuazione di pixel o voxel, in immagini o volumi, appartenenti al tumore. La tecnica assistita prevede che il medico disegni la ROI, mentre quella automatica è svolta da software addestrati, tra cui i sistemi Computer Aided Detection (CAD). Mediante tecniche di visione artificiale, dalle ROI si estraggono caratteristiche numeriche, feature, con valore diagnostico, predittivo, o prognostico. L’obiettivo di questa Tesi è progettare e sviluppare un software di segmentazione assistita che permetta al medico di disegnare in modo semplice ed efficace una o più ROI in maniera organizzata e strutturata per futura elaborazione ed analisi, nonché visualizzazione. Partendo da Aliza, applicativo open-source, visualizzatore di esami radiologici in formato DICOM, è stata estesa l’interfaccia grafica per gestire disegno, organizzazione e memorizzazione automatica delle ROI. Inoltre, è stata implementata una procedura automatica di elaborazione ed analisi di ROI disegnate su lesioni tumorali prostatiche, per predire, di ognuna, la probabilità di cancro clinicamente non-significativo e significativo (con prognosi peggiore). Per tale scopo, è stato addestrato un classificatore lineare basato su Support Vector Machine, su una popolazione di 89 pazienti con 117 lesioni (56 clinicamente significative), ottenendo, in test, accuratezza = 77%, sensibilità = 86% e specificità = 69%. Il sistema sviluppato assiste il radiologo, fornendo una seconda opinione, non vincolante, adiuvante nella definizione del quadro clinico e della prognosi, nonché delle scelte terapeutiche.
Resumo:
Propolis is a chemically complex biomass produced by honeybees (Apis mellifera) from plant resins added of salivary enzymes, beeswax, and pollen. The biological activities described for propolis were also identified for donor plants resin, but a big challenge for the standardization of the chemical composition and biological effects of propolis remains on a better understanding of the influence of seasonality on the chemical constituents of that raw material. Since propolis quality depends, among other variables, on the local flora which is strongly influenced by (a)biotic factors over the seasons, to unravel the harvest season effect on the propolis chemical profile is an issue of recognized importance. For that, fast, cheap, and robust analytical techniques seem to be the best choice for large scale quality control processes in the most demanding markets, e.g., human health applications. For that, UV-Visible (UV-Vis) scanning spectrophotometry of hydroalcoholic extracts (HE) of seventy-three propolis samples, collected over the seasons in 2014 (summer, spring, autumn, and winter) and 2015 (summer and autumn) in Southern Brazil was adopted. Further machine learning and chemometrics techniques were applied to the UV-Vis dataset aiming to gain insights as to the seasonality effect on the claimed chemical heterogeneity of propolis samples determined by changes in the flora of the geographic region under study. Descriptive and classification models were built following a chemometric approach, i.e. principal component analysis (PCA) and hierarchical clustering analysis (HCA) supported by scripts written in the R language. The UV-Vis profiles associated with chemometric analysis allowed identifying a typical pattern in propolis samples collected in the summer. Importantly, the discrimination based on PCA could be improved by using the dataset of the fingerprint region of phenolic compounds ( = 280-400m), suggesting that besides the biological activities of those secondary metabolites, they also play a relevant role for the discrimination and classification of that complex matrix through bioinformatics tools. Finally, a series of machine learning approaches, e.g., partial least square-discriminant analysis (PLS-DA), k-Nearest Neighbors (kNN), and Decision Trees showed to be complementary to PCA and HCA, allowing to obtain relevant information as to the sample discrimination.
Resumo:
Computational intelligent support for decision making is becoming increasingly popular and essential among medical professionals. Also, with the modern medical devices being capable to communicate with ICT, created models can easily find practical translation into software. Machine learning solutions for medicine range from the robust but opaque paradigms of support vector machines and neural networks to the also performant, yet more comprehensible, decision trees and rule-based models. So how can such different techniques be combined such that the professional obtains the whole spectrum of their particular advantages? The presented approaches have been conceived for various medical problems, while permanently bearing in mind the balance between good accuracy and understandable interpretation of the decision in order to truly establish a trustworthy ‘artificial’ second opinion for the medical expert.
Resumo:
This paper presents general problems and approaches for the spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of the machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most of the machines learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in a high dimensional geo-feature spaces, when the dimension of space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of models concerns considering of real space constrains like geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve modelling of environmental phenomena taking into account on geo-manifolds. An important part of the study deals with the analysis of relevant variables and models' inputs. This problem is approached by using different feature selection/feature extraction nonlinear tools. To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.
Resumo:
BACKGROUND Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide the clinicians in the Alzheimer's Disease (AD) diagnosis. However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) Systems. METHODS It is proposed a novel combination of feature extraction techniques to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with a LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and its benefits as: i) linear transformation of the PLS or PCA reduced data, ii) feature reduction technique, and iii) classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when a NMSE-PLS-LMNN feature extraction method was used in combination with a SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS All the proposed methods turned out to be a valid solution for the presented problem. One of the advances is the robustness of the LMNN algorithm that not only provides higher separation rate between the classes but it also makes (in combination with NMSE and PLS) this rate variation more stable. In addition, their generalization ability is another advance since several experiments were performed on two image modalities (SPECT and PET).
Resumo:
Avalanche forecasting is a complex process involving the assimilation of multiple data sources to make predictions over varying spatial and temporal resolutions. Numerically assisted forecasting often uses nearest neighbour methods (NN), which are known to have limitations when dealing with high dimensional data. We apply Support Vector Machines to a dataset from Lochaber, Scotland to assess their applicability in avalanche forecasting. Support Vector Machines (SVMs) belong to a family of theoretically based techniques from machine learning and are designed to deal with high dimensional data. Initial experiments showed that SVMs gave results which were comparable with NN for categorical and probabilistic forecasts. Experiments utilising the ability of SVMs to deal with high dimensionality in producing a spatial forecast show promise, but require further work.
Resumo:
The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are given, and real case studies based on environmental and pollution data are carried out. The book provides a CD-ROM with the Machine Learning Office software, including sample sets of data, that will allow both students and researchers to put the concepts rapidly to practice.
Resumo:
Spatial data analysis mapping and visualization is of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations, methods of geostatistics: the family of kriging estimators (Deutsch and Journel, 1997), machine learning algorithms such as artificial neural networks (ANN) of different architectures, hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated/corrupted by noise, and often with noise of unknown nature. That's one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as the realization of some spatial randomn process. To obtain the estimation with kriging one has to model the spatial structure of the data: spatial correlation function or (semi-)variogram. This task can be complicated if there is not sufficient number of measurements and variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from the number of reasons. of a special type ? multiplayer perceptrons ? are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignank, 2004). Therefore, development and adaptation of the method that would be nonlinear and robust to noise in measurements, would deal with the small empirical datasets and which has solid mathematical background is of great importance. The present paper deals with such model, based on Statistical Learning Theory (SLT) - Support Vector Regression. SLT is a general mathematical framework devoted to the problem of estimation of the dependencies from empirical data (Hastie et al, 2004; Vapnik, 1998). SLT models for classification - Support Vector Machines - have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al, 2002). The properties of SVM for regression - Support Vector Regression (SVR) are less studied. First results of the application of SVR for spatial mapping of physical quantities were obtained by the authorsin for mapping of medium porosity (Kanevski et al, 1999), and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of SVR model for spatial data analysis and mapping. Detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996) and basic equations for the nonlinear modeling are given in section 2. Section 3 discusses the application of SVR for spatial data mapping on the real case study - soil pollution by Cs137 radionuclide. Section 4 discusses the properties of the modelapplied to noised data or data with outliers.
Resumo:
The paper presents some contemporary approaches to spatial environmental data analysis. The main topics are concentrated on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, development and application of conditional stochastic simulation models. The innovative part of the paper presents integrated/hybrid model-machine learning (ML) residuals sequential simulations-MLRSS. The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends and sequential simulations of the residuals. NIL algorithms deliver non-linear solution for the spatial non-stationary problems, which are difficult for geostatistical approach. Geostatistical tools (variography) are used to characterize performance of ML algorithms, by analyzing quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide efficient assessment of uncertainty and spatial variability. Case study from the Chernobyl fallouts illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data driven and geostatistical model based approaches, can be efficiently used in decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.
Resumo:
The subject of the thesis is automatic sentence compression with machine learning, so that the compressed sentences remain both grammatical and retain their essential meaning. There are multiple possible uses for the compression of natural language sentences. In this thesis the focus is generation of television program subtitles, which often are compressed version of the original script of the program. The main part of the thesis consists of machine learning experiments for automatic sentence compression using different approaches to the problem. The machine learning methods used for this work are linear-chain conditional random fields and support vector machines. Also we take a look which automatic text analysis methods provide useful features for the task. The data used for machine learning is supplied by Lingsoft Inc. and consists of subtitles in both compressed an uncompressed form. The models are compared to a baseline system and comparisons are made both automatically and also using human evaluation, because of the potentially subjective nature of the output. The best result is achieved using a CRF - sequence classification using a rich feature set. All text analysis methods help classification and most useful method is morphological analysis. Tutkielman aihe on suomenkielisten lauseiden automaattinen tiivistäminen koneellisesti, niin että lyhennetyt lauseet säilyttävät olennaisen informaationsa ja pysyvät kieliopillisina. Luonnollisen kielen lauseiden tiivistämiselle on monta käyttötarkoitusta, mutta tässä tutkielmassa aihetta lähestytään television ohjelmien tekstittämisen kautta, johon käytännössä kuuluu alkuperäisen tekstin lyhentäminen televisioruudulle paremmin sopivaksi. Tutkielmassa kokeillaan erilaisia koneoppimismenetelmiä tekstin automaatiseen lyhentämiseen ja tarkastellaan miten hyvin erilaiset luonnollisen kielen analyysimenetelmät tuottavat informaatiota, joka auttaa näitä menetelmiä lyhentämään lauseita. Lisäksi tarkastellaan minkälainen lähestymistapa tuottaa parhaan lopputuloksen. Käytetyt koneoppimismenetelmät ovat tukivektorikone ja lineaarisen sekvenssin mallinen CRF. Koneoppimisen tukena käytetään tekstityksiä niiden eri käsittelyvaiheissa, jotka on saatu Lingsoft OY:ltä. Luotuja malleja vertaillaan Lopulta mallien lopputuloksia evaluoidaan automaattisesti ja koska teksti lopputuksena on jossain määrin subjektiivinen myös ihmisarviointiin perustuen. Vertailukohtana toimii kirjallisuudesta poimittu menetelmä. Tutkielman tuloksena paras lopputulos saadaan aikaan käyttäen CRF sekvenssi-luokittelijaa laajalla piirrejoukolla. Kaikki kokeillut teksin analyysimenetelmät auttavat luokittelussa, joista tärkeimmän panoksen antaa morfologinen analyysi.
Resumo:
Intelligent Transportation System (ITS) is a system that builds a safe, effective and integrated transportation environment based on advanced technologies. Road signs detection and recognition is an important part of ITS, which offer ways to collect the real time traffic data for processing at a central facility.This project is to implement a road sign recognition model based on AI and image analysis technologies, which applies a machine learning method, Support Vector Machines, to recognize road signs. We focus on recognizing seven categories of road sign shapes and five categories of speed limit signs. Two kinds of features, binary image and Zernike moments, are used for representing the data to the SVM for training and test. We compared and analyzed the performances of SVM recognition model using different features and different kernels. Moreover, the performances using different recognition models, SVM and Fuzzy ARTMAP, are observed.
Resumo:
In a global economy, manufacturers mainly compete with cost efficiency of production, as the price of raw materials are similar worldwide. Heavy industry has two big issues to deal with. On the one hand there is lots of data which needs to be analyzed in an effective manner, and on the other hand making big improvements via investments in cooperate structure or new machinery is neither economically nor physically viable. Machine learning offers a promising way for manufacturers to address both these problems as they are in an excellent position to employ learning techniques with their massive resource of historical production data. However, choosing modelling a strategy in this setting is far from trivial and this is the objective of this article. The article investigates characteristics of the most popular classifiers used in industry today. Support Vector Machines, Multilayer Perceptron, Decision Trees, Random Forests, and the meta-algorithms Bagging and Boosting are mainly investigated in this work. Lessons from real-world implementations of these learners are also provided together with future directions when different learners are expected to perform well. The importance of feature selection and relevant selection methods in an industrial setting are further investigated. Performance metrics have also been discussed for the sake of completion.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)