750 resultados para CLASSIFIER
Resumo:
The aim of this study is to accurately distinguish Parkinson's disease (PD) participants from healthy controls using self-administered tests of gait and postural sway. Using consumer-grade smartphones with in-built accelerometers, we objectively measure and quantify key movement severity symptoms of Parkinson's disease. Specifically, we record tri-axial accelerations, and extract a range of different features based on the time and frequency-domain properties of the acceleration time series. The features quantify key characteristics of the acceleration time series, and enhance the underlying differences in the gait and postural sway accelerations between PD participants and controls. Using a random forest classifier, we demonstrate an average sensitivity of 98.5% and average specificity of 97.5% in discriminating PD participants from controls. © 2014 IEEE.
Resumo:
An experimental comparison of information features used by neural network is performed. The sensing method was used. Suboptimal classifier agreeable to the gaussian model of the training data was used as a probe. Neural nets with architectures of perceptron and feedforward net with one hidden layer were used. The experiments were carried out with spatial ultrasonic data, which are used for car’s passenger safety system neural controller learning. In this paper we show that a neural network doesn’t fully make use of gaussian components, which are first two moment coefficients of probability distribution. On the contrary, the network can find more complicated regularities inside data vectors and thus shows better results than suboptimal classifier. The parallel connection of suboptimal classifier improves work of modular neural network whereas its connection to the network input improves the specialization effect during training.
Resumo:
Formal grammars can used for describing complex repeatable structures such as DNA sequences. In this paper, we describe the structural composition of DNA sequences using a context-free stochastic L-grammar. L-grammars are a special class of parallel grammars that can model the growth of living organisms, e.g. plant development, and model the morphology of a variety of organisms. We believe that parallel grammars also can be used for modeling genetic mechanisms and sequences such as promoters. Promoters are short regulatory DNA sequences located upstream of a gene. Detection of promoters in DNA sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions which makes the promoter recognition a complex problem. We replace the problem of promoter recognition by induction of context-free stochastic L-grammar rules, which are later used for the structural analysis of promoter sequences. L-grammar rules are derived automatically from the drosophila and vertebrate promoter datasets using a genetic programming technique and their fitness is evaluated using a Support Vector Machine (SVM) classifier. The artificial promoter sequences generated using the derived L- grammar rules are analyzed and compared with natural promoter sequences.
Resumo:
DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotide, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve predicting performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experiential results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot with performance improvement in the range of 3.97-9.52% in ACC and 0.08-0.19 in MCC. Furthermore, when the benchmark dataset was expanded with negative samples, the performance of enDNA-Prot outperformed the three existing methods by 2.83-16.63% in terms of ACC and 0.02-0.16 in terms of MCC. It indicated that enDNA-Prot is an effective method for DNA-binding protein identification and expanding training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public. © 2014 Ruifeng Xu et al.
Resumo:
* Работа выполнена при поддержке РФФИ, гранты 07-01-00331-a и 08-01-00944-a
Resumo:
This paper examines the application of commercial and non-invasive electroencephalography (EEG)-based brain-computer (BCIs) interfaces with serious games. Two different EEG-based BCI devices were used to fully control the same serious game. The first device (NeuroSky MindSet) uses only a single dry electrode and requires no calibration. The second device (Emotiv EPOC) uses 14 wet sensors requiring additional training of a classifier. User testing was performed on both devices with sixty-two participants measuring the player experience as well as key aspects of serious games, primarily learnability, satisfaction, performance and effort. Recorded feedback indicates that the current state of BCIs can be used in the future as alternative game interfaces after familiarisation and in some cases calibration. Comparative analysis showed significant differences between the two devices. The first device provides more satisfaction to the players whereas the second device is more effective in terms of adaptation and interaction with the serious game.
Resumo:
Short text messages a.k.a Microposts (e.g. Tweets) have proven to be an effective channel for revealing information about trends and events, ranging from those related to Disaster (e.g. hurricane Sandy) to those related to Violence (e.g. Egyptian revolution). Being informed about such events as they occur could be extremely important to authorities and emergency professionals by allowing such parties to immediately respond. In this work we study the problem of topic classification (TC) of Microposts, which aims to automatically classify short messages based on the subject(s) discussed in them. The accurate TC of Microposts however is a challenging task since the limited number of tokens in a post often implies a lack of sufficient contextual information. In order to provide contextual information to Microposts, we present and evaluate several graph structures surrounding concepts present in linked knowledge sources (KSs). Traditional TC techniques enrich the content of Microposts with features extracted only from the Microposts content. In contrast our approach relies on the generation of different weighted semantic meta-graphs extracted from linked KSs. We introduce a new semantic graph, called category meta-graph. This novel meta-graph provides a more fine grained categorisation of concepts providing a set of novel semantic features. Our findings show that such category meta-graph features effectively improve the performance of a topic classifier of Microposts. Furthermore our goal is also to understand which semantic feature contributes to the performance of a topic classifier. For this reason we propose an approach for automatic estimation of accuracy loss of a topic classifier on new, unseen Microposts. We introduce and evaluate novel topic similarity measures, which capture the similarity between the KS documents and Microposts at a conceptual level, considering the enriched representation of these documents. Extensive evaluation in the context of Emergency Response (ER) and Violence Detection (VD) revealed that our approach outperforms previous approaches using single KS without linked data and Twitter data only up to 31.4% in terms of F1 measure. Our main findings indicate that the new category graph contains useful information for TC and achieves comparable results to previously used semantic graphs. Furthermore our results also indicate that the accuracy of a topic classifier can be accurately predicted using the enhanced text representation, outperforming previous approaches considering content-based similarity measures. © 2014 Elsevier B.V. All rights reserved.
Resumo:
This article presents the principal results of the doctoral thesis “Recognition of neume notation in historical documents” by Lasko Laskov (Institute of Mathematics and Informatics at Bulgarian Academy of Sciences), successfully defended before the Specialized Academic Council for Informatics and Mathematical Modelling on 07 June 2010.
Resumo:
his article presents some of the results of the Ph.D. thesis Class Association Rule Mining Using MultiDimensional Numbered Information Spaces by Iliya Mitov (Institute of Mathematics and Informatics, BAS), successfully defended at Hasselt University, Faculty of Science on 15 November 2011 in Belgium
Resumo:
Report published in the Proceedings of the National Conference on "Education in the Information Society", Plovdiv, May, 2013
Resumo:
Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014
Resumo:
Topic classification (TC) of short text messages offers an effective and fast way to reveal events happening around the world ranging from those related to Disaster (e.g. Sandy hurricane) to those related to Violence (e.g. Egypt revolution). Previous approaches to TC have mostly focused on exploiting individual knowledge sources (KS) (e.g. DBpedia or Freebase) without considering the graph structures that surround concepts present in KSs when detecting the topics of Tweets. In this paper we introduce a novel approach for harnessing such graph structures from multiple linked KSs, by: (i) building a conceptual representation of the KSs, (ii) leveraging contextual information about concepts by exploiting semantic concept graphs, and (iii) providing a principled way for the combination of KSs. Experiments evaluating our TC classifier in the context of Violence detection (VD) and Emergency Responses (ER) show promising results that significantly outperform various baseline models including an approach using a single KS without linked data and an approach using only Tweets. Copyright 2013 ACM.
Resumo:
The focus of this thesis is the extension of topographic visualisation mappings to allow for the incorporation of uncertainty. Few visualisation algorithms in the literature are capable of mapping uncertain data with fewer able to represent observation uncertainties in visualisations. As such, modifications are made to NeuroScale, Locally Linear Embedding, Isomap and Laplacian Eigenmaps to incorporate uncertainty in the observation and visualisation spaces. The proposed mappings are then called Normally-distributed NeuroScale (N-NS), T-distributed NeuroScale (T-NS), Probabilistic LLE (PLLE), Probabilistic Isomap (PIso) and Probabilistic Weighted Neighbourhood Mapping (PWNM). These algorithms generate a probabilistic visualisation space with each latent visualised point transformed to a multivariate Gaussian or T-distribution, using a feed-forward RBF network. Two types of uncertainty are then characterised dependent on the data and mapping procedure. Data dependent uncertainty is the inherent observation uncertainty. Whereas, mapping uncertainty is defined by the Fisher Information of a visualised distribution. This indicates how well the data has been interpolated, offering a level of ‘surprise’ for each observation. These new probabilistic mappings are tested on three datasets of vectorial observations and three datasets of real world time series observations for anomaly detection. In order to visualise the time series data, a method for analysing observed signals and noise distributions, Residual Modelling, is introduced. The performance of the new algorithms on the tested datasets is compared qualitatively with the latent space generated by the Gaussian Process Latent Variable Model (GPLVM). A quantitative comparison using existing evaluation measures from the literature allows performance of each mapping function to be compared. Finally, the mapping uncertainty measure is combined with NeuroScale to build a deep learning classifier, the Cascading RBF. This new structure is tested on the MNist dataset achieving world record performance whilst avoiding the flaws seen in other Deep Learning Machines.
Resumo:
Diabetes patients might suffer from an unhealthy life, long-term treatment and chronic complicated diseases. The decreasing hospitalization rate is a crucial problem for health care centers. This study combines the bagging method with base classifier decision tree and costs-sensitive analysis for diabetes patients' classification purpose. Real patients' data collected from a regional hospital in Thailand were analyzed. The relevance factors were selected and used to construct base classifier decision tree models to classify diabetes and non-diabetes patients. The bagging method was then applied to improve accuracy. Finally, asymmetric classification cost matrices were used to give more alternative models for diabetes data analysis.
Resumo:
Most research in the area of emotion detection in written text focused on detecting explicit expressions of emotions in text. In this paper, we present a rule-based pipeline approach for detecting implicit emotions in written text without emotion-bearing words based on the OCC Model. We have evaluated our approach on three different datasets with five emotion categories. Our results show that the proposed approach outperforms the lexicon matching method consistently across all the three datasets by a large margin of 17–30% in F-measure and gives competitive performance compared to a supervised classifier. In particular, when dealing with formal text which follows grammatical rules strictly, our approach gives an average F-measure of 82.7% on “Happy”, “Angry-Disgust” and “Sad”, even outperforming the supervised baseline by nearly 17% in F-measure. Our preliminary results show the feasibility of the approach for the task of implicit emotion detection in written text.