861 results for "support vector machine"
Abstract:
In this article, we aim to reduce the error rate of an online Tamil symbol recognition system by employing multiple experts to reevaluate certain decisions of the primary support vector machine classifier. Motivated by the relatively high percentage of occurrence of base consonants in the script, a reevaluation technique has been proposed to correct any ambiguities arising in the base consonants. Secondly, a dynamic time-warping method is proposed to automatically extract the discriminative regions for each set of confused characters. Class-specific features derived from these regions aid in reducing the degree of confusion. Thirdly, statistics of specific features are proposed for resolving any confusions in vowel modifiers. The reevaluation approaches are tested on two databases: (a) the isolated Tamil symbols in the IWFHR test set, and (b) the symbols segmented from a set of 10,000 Tamil words. The recognition rate of the isolated test symbols of the IWFHR database improves by 1.9 %. For the word database, the incorporation of the reevaluation step improves the symbol recognition rate by 3.5 % (from 88.4 to 91.9 %). This, in turn, boosts the word recognition rate by 11.9 % (from 65.0 to 76.9 %). The reduction in the word error rate has been achieved using a generic approach, without the incorporation of language models.
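The dynamic time-warping step this abstract relies on can be sketched in a few lines. The toy implementation below computes the DTW distance between two 1-D feature sequences; the sequences are illustrative stand-ins, not the paper's character data or code.

```python
def dtw_distance(a, b):
    """DTW distance between two 1-D sequences via dynamic programming."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = DTW distance between prefixes a[:i] and b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # match step
    return cost[n][m]
```

Aligning the warped pairs along the optimal path (rather than just returning the final distance) is what would expose the per-region differences the paper uses to pick discriminative regions.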
Abstract:
Several statistical downscaling models have been developed in the past couple of decades to assess the hydrologic impacts of climate change by projecting the station-scale hydrological variables from large-scale atmospheric variables simulated by general circulation models (GCMs). This paper presents and compares different statistical downscaling models that use multiple linear regression (MLR), positive coefficient regression (PCR), stepwise regression (SR), and support vector machine (SVM) techniques for estimating monthly rainfall amounts in the state of Florida. Mean sea level pressure, air temperature, geopotential height, specific humidity, U wind, and V wind are used as the explanatory variables/predictors in the downscaling models. Data for these variables are obtained from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis dataset and the Canadian Centre for Climate Modelling and Analysis (CCCma) Coupled Global Climate Model, version 3 (CGCM3) GCM simulations. Principal component analysis (PCA) and the fuzzy c-means clustering method (FCM) are used as part of the downscaling model to reduce the dimensionality of the dataset and to identify the clusters in the data, respectively. Evaluation of the performances of the models using different error and statistical measures indicates that the SVM-based model performed better than all the other models in reproducing most monthly rainfall statistics at 18 sites. Output from the third-generation CGCM3 GCM for the A1B scenario was used for future projections. For the projection period 2001-10, MLR was used to relate variables at the GCM and NCEP grid scales. Use of MLR in linking the predictor variables at the GCM and NCEP grid scales yielded better reproduction of monthly rainfall statistics at most of the stations (12 out of 18) compared to those obtained with the spatial interpolation technique used in earlier studies.
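As a hedged illustration of the SVM-based downscaling idea, the sketch below fits support vector regression to six synthetic "atmospheric predictor" columns. All values are randomly generated stand-ins, not NCEP-NCAR or CGCM3 data, and the predictor names are only assumptions mirroring the abstract.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Six synthetic columns standing in for MSLP, air temperature, geopotential
# height, specific humidity, and U/V wind (random, purely illustrative).
X = rng.normal(size=(200, 6))
# Synthetic station-scale rainfall anomaly driven by two of the predictors.
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + 0.1 * rng.normal(size=200)

# Scaling matters for RBF-kernel SVMs, hence the pipeline.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[:150], y[:150])
r2 = model.score(X[150:], y[150:])  # held-out coefficient of determination
```

A real study would replace `X` with gridded reanalysis predictors and tune `C`, `epsilon` and the kernel width by cross-validation.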
Abstract:
We propose a human action recognition system based on 3-D optical flow features. Optical flow based features are employed here since, by design, they capture the apparent movement of objects. Moreover, they can represent information hierarchically, from the local pixel level to the global object level. In this work, 3-D optical flow based features are extracted by combining 2-D optical flow based features with depth flow features obtained from a depth camera. To build the action recognition system, we employ a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). The aim of McFIS is to find the decision boundary separating different classes based on their respective optical flow based features. McFIS consists of a neuro-fuzzy inference system (cognitive component) and a self-regulatory learning mechanism (meta-cognitive component). During supervised learning, the self-regulatory learning mechanism monitors the knowledge of the current sample with respect to the existing knowledge in the network and controls learning by deciding among sample deletion, sample learning and sample reserve strategies. The performance of the proposed action recognition system was evaluated on a proprietary data set consisting of eight subjects. Comparison with a standard support vector machine classifier and an extreme learning machine indicates the improved performance of McFIS in recognizing actions based on 3-D optical flow features.
Abstract:
The wrist pulse signal contains important information about the health status of a person, and pulse diagnosis has been employed in oriental medicine for a very long time. In this paper we use signal processing techniques to extract information from wrist pulse signals. For this purpose we acquired radial artery pulse signals noninvasively at the wrist for different cases of interest. The wrist pulse waveforms were analyzed using spatial features. Results were obtained for wrist pulse signals recorded for several subjects before and after exercise. It is shown that the spatial features exhibit statistically significant changes between the two cases and hence are effective in distinguishing the changes taking place due to exercise. A support vector machine classifier is used to classify between the groups, and a high classification accuracy of 99.71% is achieved. This paper thus demonstrates the utility of spatial features in studying wrist pulse signals obtained under various recording conditions. The ability of the model to distinguish changes occurring under two different recording conditions can potentially be used in health care applications.
Abstract:
Wrist pulse signals contain important information about the health of a person, and hence diagnosis based on pulse signals has assumed great importance. In this paper we demonstrate the efficacy of a two-term Gaussian model in extracting information from pulse signals. Results were obtained by conducting experiments on several subjects, recording wrist pulse signals before and after exercise. Parameters were extracted from the recorded signals using the model, and a paired t-test shows that the parameters differ significantly between the two groups. Further, a recursive cluster elimination based support vector machine is used to perform classification between the groups. An average classification accuracy of 99.46% is obtained, along with the top classifiers. It is thus shown that the parameters of the Gaussian model change across groups, and hence the model is effective in distinguishing the changes taking place due to the two different recording conditions. The study has potential applications in healthcare.
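The paired t-test step described above is straightforward to sketch. The "before" and "after" parameter values below are synthetic (a deliberate 0.2 shift with small noise), standing in for the Gaussian-model parameters the paper extracts.

```python
import numpy as np
from scipy.stats import ttest_rel

rng = np.random.default_rng(1)
# Synthetic model parameter for 30 subjects before exercise...
before = rng.normal(loc=1.00, scale=0.05, size=30)
# ...and after exercise, shifted upward for the same subjects (paired design).
after = before + rng.normal(loc=0.20, scale=0.05, size=30)

# Paired t-test: compares the per-subject differences against zero.
t_stat, p_value = ttest_rel(before, after)
```

A p-value below the chosen significance level (the paper implies p < 0.05 at least) would justify feeding the parameter into the downstream SVM classifier.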
Abstract:
Blood travels throughout the body and thus its flow is modulated by changes in body condition. As a consequence, the wrist pulse signal contains important information about the status of the human body. In this work we have employed signal processing techniques to extract important information from these signals. Radial artery pulse pressure signals are acquired at wrist position noninvasively for several subjects for two cases of interest, viz. before and after exercise, and before and after lunch. Further analysis is performed by fitting a bi-modal Gaussian model to the data and extracting spatial features from the fit. The spatial features show statistically significant (p < 0.001) changes between the groups for both the cases, which indicates that they are effective in distinguishing the changes taking place due to exercise or food intake. A recursive cluster elimination based support vector machine classifier is used to classify between the groups. A high classification accuracy of 99.71% is achieved for the exercise case and 99.94% for the lunch case. This paper demonstrates the utility of certain spatial features in studying wrist pulse signals obtained under various experimental conditions. The ability of the spatial features to distinguish changing body conditions can potentially be used for various healthcare applications. (C) 2015 Elsevier Ltd. All rights reserved.
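The bi-modal Gaussian fitting step can be illustrated with a minimal sketch. The synthetic "pulse" below is generated from the same two-Gaussian family (peak positions and widths are assumptions, not the paper's values), so a least-squares fit should recover it closely.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_gauss(t, a1, m1, s1, a2, m2, s2):
    """Sum of two Gaussian bumps: a bi-modal model of one pulse period."""
    return (a1 * np.exp(-((t - m1) / s1) ** 2)
            + a2 * np.exp(-((t - m2) / s2) ** 2))

t = np.linspace(0.0, 1.0, 200)
# Illustrative parameters: a main (percussion-like) peak and a smaller
# secondary peak later in the cycle.
true = (1.0, 0.3, 0.05, 0.5, 0.6, 0.08)
pulse = two_gauss(t, *true)

# Fit with a rough initial guess; popt holds the six recovered parameters.
popt, _ = curve_fit(two_gauss, t, pulse,
                    p0=(0.8, 0.25, 0.10, 0.4, 0.65, 0.10))
residual = np.max(np.abs(two_gauss(t, *popt) - pulse))
```

In the paper's pipeline, spatial features derived from such fitted parameters (not the raw samples) are what feed the statistical tests and the SVM.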
Abstract:
In optical character recognition of very old books, recognition accuracy drops mainly due to the merging or breaking of characters. In this paper, we propose the first algorithm to segment merged Kannada characters, using a hypothesis to select the positions at which to cut. The method searches for the best possible segmentation positions by taking into account the support vector machine classifier's recognition score and the validity of the aspect ratio (width-to-height ratio) of the segments between every pair of cut positions. The hypothesis for selecting the cut position is based on the fact that a concave surface exists above and below the touching portion. These concave surfaces are identified by tracing the valleys in the top contour of the image, and likewise for the image rotated upside-down. The cut positions are then derived as closely matching valleys of the original and rotated images. The proposed segmentation algorithm works across different font styles, shapes and sizes, and performs better than the existing vertical projection profile based segmentation. It has been tested on 1125 different word images, each containing multiple merged characters, from an old Kannada book; 89.6% correct segmentation is achieved, and the character recognition accuracy of merged words is 91.2%. A few merge points are still missed, owing to the absence of a matched valley caused by the specific shapes of the characters meeting at those merges.
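The valley-matching idea can be sketched with a toy 1-D version: find concave "valleys" in the top-contour profile and in the profile of the flipped image, then keep the closely matching positions as candidate cut columns. The profiles below are invented, and real contours would come from the binarized word image.

```python
def valleys(profile):
    """Indices that are strict local minima of a 1-D contour profile."""
    return [i for i in range(1, len(profile) - 1)
            if profile[i] < profile[i - 1] and profile[i] < profile[i + 1]]

def cut_candidates(top, flipped, tol=1):
    """Keep valleys of the top contour that match a valley of the flipped
    image within `tol` columns, per the matched-valley hypothesis."""
    vt, vf = valleys(top), valleys(flipped)
    return [i for i in vt if any(abs(i - j) <= tol for j in vf)]

# Synthetic contour heights: dips at columns 2 and 7 in both profiles.
top = [5, 4, 3, 4, 5, 5, 4, 2, 4, 5]
flipped = [5, 5, 3, 5, 5, 5, 5, 2, 3, 5]
cuts = cut_candidates(top, flipped)
```

In the full algorithm each candidate cut is then scored by the SVM recognition result and the aspect-ratio check before it is accepted.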
Abstract:
Imaging flow cytometry is an emerging technology that combines the statistical power of flow cytometry with the spatial and quantitative morphology of digital microscopy. It allows high-throughput imaging of cells with good spatial resolution while they are in flow. This paper proposes a general framework for the processing and classification of cells imaged using an imaging flow cytometer. Each cell is localized by finding an accurate cell contour; then, features reflecting cell size, circularity and complexity are extracted for classification using an SVM. Unlike conventional iterative, semi-automatic segmentation algorithms such as active contours, we propose a noniterative, fully automatic graph-based cell localization. To evaluate the performance of the proposed framework, we successfully classified unstained label-free leukaemia cell lines MOLT, K562 and HL60 from video streams captured using a custom-fabricated, cost-effective microfluidics-based imaging flow cytometer. The proposed system is a significant step towards a cost-effective cell analysis platform that would facilitate affordable mass screening camps relying on cellular morphology for disease diagnosis.
Lay description: In this article, we propose a novel framework for processing the raw data generated by microfluidics-based imaging flow cytometers. Microfluidics microscopy, or microfluidics-based imaging flow cytometry (mIFC), is a recent microscopy paradigm that combines the statistical power of flow cytometry with the spatial and quantitative morphology of digital microscopy, allowing cells to be imaged while they are in flow. In comparison to conventional slide-based imaging systems, mIFC is a nascent technology enabling high-throughput imaging of cells, and it is yet to take the form of a clinical diagnostic tool. The proposed framework processes the raw data generated by mIFC systems.
The framework incorporates several steps: beginning from pre-processing of the raw video frames to enhance the contents of the cell, localising the cell by a novel, fully automatic, non-iterative graph-based algorithm, extraction of different quantitative morphological parameters and subsequent classification of cells. In order to evaluate the performance of the proposed framework, we have successfully classified unstained label-free leukaemia cell-lines MOLT, K562 and HL60 from video streams captured using a cost-effective microfluidics-based imaging flow cytometer. The cell lines of HL60, K562 and MOLT were obtained from ATCC (American Type Culture Collection) and are separately cultured in the lab. Thus, each culture contains cells from its own category alone and thereby provides the ground truth. Each cell is localised by finding a closed cell contour by defining a directed, weighted graph from the Canny edge images of the cell such that the closed contour lies along the shortest weighted path surrounding the centroid of the cell from a starting point on a good curve segment to an immediate endpoint. Once the cell is localised, morphological features reflecting size, shape and complexity of the cells are extracted and used to develop a support vector machine based classification system. We could classify the cell-lines with good accuracy and the results were quite consistent across different cross-validation experiments. We hope that imaging flow cytometers equipped with the proposed framework for image processing would enable cost-effective, automated and reliable disease screening in over-loaded facilities, which cannot afford to hire skilled personnel in large numbers. Such platforms would potentially facilitate screening camps in low income group countries; thereby transforming the current health care paradigms by enabling rapid, automated diagnosis for diseases like cancer.
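The final step, an SVM on per-cell morphological features, can be sketched as follows. The three feature columns (size, circularity, a complexity proxy) and the class separations are synthetic assumptions standing in for the MOLT, K562 and HL60 measurements.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)

def make_class(size_mu, circ_mu, n=60):
    """Synthetic per-cell features for one cell line (values illustrative)."""
    return np.column_stack([
        rng.normal(size_mu, 0.5, n),               # cell size
        rng.normal(circ_mu, 0.05, n),              # circularity
        rng.normal(size_mu * circ_mu, 0.3, n),     # boundary-complexity proxy
    ])

# Three classes standing in for MOLT, K562 and HL60.
X = np.vstack([make_class(8, 0.9), make_class(10, 0.8), make_class(12, 0.7)])
y = np.repeat([0, 1, 2], 60)

# Stratified 5-fold cross-validation, echoing the abstract's consistency claim.
scores = cross_val_score(SVC(kernel="rbf", gamma="scale"), X, y, cv=5)
mean_acc = scores.mean()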
Abstract:
This project improves the vision capability of the Robotino robot operating under the ROS platform. A method for recognizing an object class using binary features has been developed. The proposed method performs a binary classification of the descriptors of each training image to characterize the appearance of the object class. It uses a binary descriptor based on differences in gray intensity between pixels in the image, and shows that binary features are suitable for representing an object class despite low resolution and weak detail information about the object in the image. It also introduces a boosting method (AdaBoost) for feature selection, eliminating redundancy and noise in order to improve the performance of the classifier. Finally, a kernel SVM (Support Vector Machine) classifier is trained on the available database and applied to predictions on new images. One possible future work is to establish visual servo-control, that is, the reaction of the robot to the detection of the object.
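The boosting-based feature selection followed by an SVM can be sketched as below: AdaBoost ranks the binary-descriptor dimensions by importance, and the top-ranked ones feed an RBF SVM. The descriptors are random bits with only three informative dimensions, a synthetic assumption, not the project's data.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(3)
n, d = 300, 32
# Synthetic binary descriptors; only bits 0-2 carry class information.
X = rng.integers(0, 2, size=(n, d)).astype(float)
y = (X[:, 0] + X[:, 1] + X[:, 2] >= 2).astype(int)

# AdaBoost over decision stumps; its feature_importances_ act as a ranking.
ada = AdaBoostClassifier(n_estimators=50).fit(X, y)
top = np.argsort(ada.feature_importances_)[::-1][:5]  # keep the 5 best bits

# SVM trained only on the selected descriptor dimensions.
svm = SVC(kernel="rbf").fit(X[:200][:, top], y[:200])
acc = svm.score(X[200:][:, top], y[200:])
```

Dropping the 27 noise dimensions before the SVM is exactly the redundancy/noise elimination role the abstract assigns to AdaBoost.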
Abstract:
Interruptions in cardiopulmonary resuscitation (CPR) compromise defibrillation success. However, CPR must be interrupted to analyze the rhythm because, although current methods for rhythm analysis during CPR have high sensitivity for shockable rhythms, the specificity for nonshockable rhythms is still too low. This paper introduces a new approach to rhythm analysis during CPR that combines two strategies: a state-of-the-art CPR artifact suppression filter and a shock advice algorithm (SAA) designed to optimally classify the filtered signal. Emphasis is on designing an algorithm with high specificity. The SAA includes a detector for low electrical activity rhythms to increase the specificity, and a shock/no-shock decision algorithm based on a support vector machine classifier using slope and frequency features. For this study, 1185 shockable and 6482 nonshockable 9-s segments corrupted by CPR artifacts were obtained from 247 patients suffering out-of-hospital cardiac arrest. The segments were split into a training and a test set. For the test set, the sensitivity and specificity for rhythm analysis during CPR were 91.0% and 96.6%, respectively. This new approach shows an important increase in specificity without compromising the sensitivity when compared to previous studies.
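The shock/no-shock stage, an SVM over slope and frequency features, can be illustrated with synthetic segments. The sinusoids below (with an assumed 250 Hz sampling rate) merely stand in for filtered ECG rhythms; the real features and training data are the paper's, not reproduced here.

```python
import numpy as np
from sklearn.svm import SVC

fs = 250  # assumed sampling rate in Hz (illustrative)
rng = np.random.default_rng(4)

def features(seg):
    """Two per-segment features: mean absolute slope and dominant frequency."""
    slope = np.mean(np.abs(np.diff(seg))) * fs
    spec = np.abs(np.fft.rfft(seg))
    freqs = np.fft.rfftfreq(seg.size, 1 / fs)
    dom = freqs[1:][np.argmax(spec[1:])]  # skip the DC bin
    return [slope, dom]

def segment(f0):
    """A noisy 2-s sinusoid standing in for one rhythm segment."""
    t = np.arange(0, 2, 1 / fs)
    return np.sin(2 * np.pi * f0 * t) + 0.05 * rng.normal(size=t.size)

# Two synthetic rhythm classes at different dominant frequencies.
X = np.array([features(segment(f)) for f in [4] * 40 + [9] * 40])
y = np.array([1] * 40 + [0] * 40)  # 1 = "shock", 0 = "no shock" (toy labels)

clf = SVC(kernel="rbf").fit(X[::2], y[::2])   # train on every other segment
acc = clf.score(X[1::2], y[1::2])             # test on the rest
```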
Abstract:
This work presents the development of intelligent systems applied to the monitoring of aeronautical structures, addressing two distinct models. The first is the analysis and classification of ultrasound images of aeronautical structures, with the goal of supporting decisions on structural repair. The scope of the work was defined as a cross-section of the wing of a Boeing 707 aircraft. After removal of surface material in areas compromised by corrosion, the thickness is measured over the area of the part. Based on these measurements, Engineering performs the structural analysis, observing the limits set by the maintenance manual, and determines whether or not repair is needed. The second model involves the electromechanical impedance method. A low-cost monitoring system is proposed, applied to an aeronautical-aluminium bar with 10 nut-and-bolt fastening positions. The goal of the system is to assess, from the impedance curves extracted from a PZT transducer attached to the bar, its ability to classify whether or not damage exists in the structure and, if damage exists, to indicate its location and degree of severity. The following classifiers were used in this work: support vector machine, artificial neural networks and K-nearest neighbours.
Abstract:
With ever more intense urban and industrial development, a key challenge today is to eliminate or reduce the impact of pollutant emissions into the atmosphere. In 2012, Rio de Janeiro hosted Rio+20, the United Nations Conference on Sustainable Development, attended by representatives from all over the world; among other topics, the green economy and sustainable development were discussed. Tropospheric O3 is an extremely important variable due to its strong environmental impact, and knowing the behaviour of the parameters that affect a region's air quality is useful for forecasting scenarios. Atmospheric chemistry and meteorology are highly nonlinear, which makes air-quality parameters difficult to predict. Air quality depends on emissions, meteorology and topography. The observed variables were nitrogen dioxide (NO2), nitrogen monoxide (NO), nitrogen oxides (NOx), carbon monoxide (CO), ozone (O3), scalar wind speed (VEV), global solar radiation (RSG), temperature (TEM) and relative humidity (UR); they were collected by the mobile monitoring station of the Rio de Janeiro Environment Secretariat (SMAC) at two sites in the metropolitan area, the Pontifical Catholic University (PUC-Rio) and the Rio de Janeiro State University (UERJ), in 2011 and 2012.
This study had three objectives: (1) to analyse the behaviour of the variables using principal component analysis (PCA) as an exploratory method; (2) to forecast O3 levels from primary pollutants and meteorological factors, comparing the effectiveness of nonlinear methods such as artificial neural networks (ANN) and support vector machine regression (SVM-R); and (3) to perform data classification using support vector machine classification (SVM-C). PCA showed that, for the PUC dataset, the variables NO, NOx and VEV had the greatest impact on O3 concentration, while for the UERJ dataset TEM and RSG were the most important variables. The nonlinear regression techniques ANN and SVM gave very close and acceptable results for the UERJ dataset, with validation coefficients of determination (R2) of 0.9122 and 0.9152 and root mean square errors (RMSECV) of 7.66 and 7.85, respectively. For the PUC and PUC+UERJ datasets, both techniques gave less satisfactory results; for these datasets, SVM showed slightly better results. PCA, SVM and ANN demonstrated their robustness, proving to be useful tools for understanding, classifying and forecasting air-quality scenarios.
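Objective (2), support vector regression of O3 on primary pollutants and meteorology, can be sketched as follows. All values are synthetic; the variable roles in the comments only echo the abstract, and the real study used the SMAC monitoring data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(5)
# Nine synthetic columns standing in for NO2, NO, NOx, CO, wind speed,
# solar radiation, temperature, relative humidity and one extra covariate.
X = rng.normal(size=(300, 9))
# Synthetic O3 driven mainly by "temperature", "radiation" and "NO".
y = (40.0 + 5.0 * X[:, 6] + 4.0 * X[:, 5] - 3.0 * X[:, 1]
     + rng.normal(scale=1.0, size=300))

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=50.0))
model.fit(X[:250], y[:250])

pred = model.predict(X[250:])
r2 = model.score(X[250:], y[250:])                  # validation R2
rmse = mean_squared_error(y[250:], pred) ** 0.5     # validation RMSE
```

R2 and RMSE on a held-out set are the same two figures of merit the study reports for its ANN and SVM-R comparison.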
Abstract:
From 2011 onwards, events of great significance for the city of Rio de Janeiro took place or were scheduled, such as the United Nations Rio+20 conference and sporting events of worldwide importance (the FIFA World Cup, the Olympic and Paralympic Games). These events attract financial resources to the city, as well as generating jobs, infrastructure improvements and real-estate appreciation, both in land and in buildings. When choosing a residential property in a given neighbourhood, one evaluates not only the property itself but also the urban amenities available in the area. In this context, it was possible to define a qualitative linguistic interpretation of the neighbourhoods of Rio de Janeiro by integrating three computational intelligence techniques for benefit evaluation: fuzzy logic, support vector machines and genetic algorithms. The database was built from information on the web and from government institutes, covering the cost of residential properties and the benefits and weaknesses of the city's neighbourhoods. Fuzzy logic was first implemented as an unsupervised clustering model, via ellipsoidal rules under the extension principle using the Mahalanobis distance, inferentially configuring the linguistic groups (Good, Fair and Poor) according to twelve urban characteristics. From this discrimination, the support vector machine was combined with genetic algorithms as a supervised method, in order to search for and select the smallest subset of the clustering variables that best classifies the neighbourhoods (parsimony principle).
Analysis of the error rates made it possible to choose the best classification model with a reduced variable space, resulting in a subset containing information on: HDI, number of bus lines, educational institutions, average price per square metre, open-air spaces, entertainment venues and crime. The modelling that combined the three computational intelligence techniques ranked the neighbourhoods of Rio de Janeiro with acceptable error rates, supporting decision-making in the purchase and sale of residential properties. Regarding public transport in the city, it was apparent that the bus network still takes priority.
Abstract:
Rhododendron L. is the largest genus of seed plants in China; its modern centre of distribution and diversification is the Hengduan Mountains of southwest China and the eastern Himalaya. Yunnan, Sichuan, Tibet and other parts of western and southwestern China together hold some 450 Rhododendron species, of which about 300 are endemic. In-depth study of the distribution of Rhododendron is an indispensable part of biodiversity conservation in the Hengduan Mountains. Because species distributions are closely tied to environmental factors, using environmental factors as predictor variables is currently the most common approach to species distribution modelling. However, most species distribution models suffer from the intractable "high-dimension, small-sample" problem: they cannot produce reasonable predictions when specimen data are scarce, or they cannot handle large numbers of environmental variables. Theory and practice in machine learning have shown that the support vector machine (SVM), based on the structural risk minimisation principle, is well suited to high-dimensional, small-sample classification problems. To explore its applicability to species distribution prediction, this thesis implements a novel SVM-based species distribution prediction system. Using 30 Rhododendron species as test cases, with their specimen data and 11 raster environmental layers at 1-km resolution as model variables, the system predicts their potential distributions in China. Model performance is compared through comprehensive evaluation: expert assessment, receiver operating characteristic (ROC) curves, and the area under the curve (AUC). The experiments show that the SVM-based prediction system far outperforms the widely used GARP (Genetic Algorithm for Rule-Set Prediction) system in both computational speed and predictive quality. The thesis then examines how the SVM system's performance depends on the number of environmental variables and the number of specimen records. The results show that SVM predictions remain reasonably sound even for species with few specimen records; the system thus largely solves a problem that defeated many earlier models, namely modelling the potential distributions of rare species and species with sparse records. The study also finds that a large number of environmental variables (high dimensionality) is decisive for predicting potential distributions, so a model's ability to handle high-dimensional data is crucial. Finally, using all available Rhododendron specimen data for China and 83 raster environmental layers at 1-km resolution, the potential distributions of 400 Rhododendron species are predicted. From these predicted ranges, patterns of potential diversity are derived for the genus as a whole, for endemic species, for endangered species, for each subgenus, and for species of different life forms. These distribution maps not only support analysis and testing of hypotheses on the origin of Rhododendron, but also provide a spatial basis for its introduction, its conservation, and the search for new species.
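A toy version of SVM-based species distribution modelling with AUC evaluation can be sketched as follows: presence records against background points described by environmental variables. The two "environmental layers" and the presence cluster are synthetic assumptions, not the thesis's 1-km rasters or specimen data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.svm import SVC

rng = np.random.default_rng(6)
# Presence records clustered in environmental space (e.g. temperature in
# degrees C and annual precipitation in mm; values illustrative).
pres = rng.normal([15.0, 1200.0], [1.0, 100.0], size=(40, 2))
# Background points drawn uniformly over the whole environmental range.
back = rng.uniform([0.0, 0.0], [30.0, 2400.0], size=(400, 2))

X = np.vstack([pres, back])
y = np.array([1] * 40 + [0] * 400)
Xn = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize both variables

# class_weight="balanced" offsets the 1:10 presence/background imbalance.
clf = SVC(kernel="rbf", probability=True,
          class_weight="balanced").fit(Xn, y)
auc = roc_auc_score(y, clf.predict_proba(Xn)[:, 1])
```

Ranking grid cells by the fitted probability surface (rather than thresholding it) is what the ROC/AUC evaluation in the thesis measures.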
Abstract:
An anomaly detection approach is considered for the mine hunting in sonar imagery problem. The authors exploit previous work that used dual-tree wavelets and fractal dimension to adaptively suppress sand ripples and a matched filter as an initial detector. Here, lacunarity inspired features are extracted from the remaining false positives, again using dual-tree wavelets. A one-class support vector machine is then used to learn a decision boundary, based only on these false positives. The approach exploits the large quantities of 'normal' natural background data available but avoids the difficult requirement of collecting examples of targets in order to train a classifier. © 2012 The Institution of Engineering and Technology.
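The one-class SVM step, learning a boundary from "normal" background only and flagging everything outside it, can be sketched as below. The 4-D feature vectors are synthetic stand-ins for the lacunarity-inspired features; no sonar data are involved.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(7)
# Training data: feature vectors of "normal" background false positives only.
background = rng.normal(0.0, 1.0, size=(500, 4))
# Unseen target-like outliers, deliberately far from the background cloud.
targets = rng.normal(5.0, 1.0, size=(20, 4))

# nu bounds the fraction of training points treated as outliers.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(background)

flags = ocsvm.predict(targets)           # -1 = anomaly, +1 = background
frac_flagged = np.mean(flags == -1)
```

Training on background alone is the point the abstract emphasizes: no target examples are ever needed to fit the decision boundary.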