11 results for ensemble classifiers

in Aston University Research Archive


Relevance:

60.00%

Abstract:

Feature selection is important in the medical field for many reasons. However, selecting important variables is difficult in the presence of censoring, a feature unique to survival data analysis. This paper proposed an approach to deal with the censoring problem in endovascular aortic repair survival data through Bayesian networks. It was merged and embedded with a hybrid feature selection process that combines Cox's univariate analysis with machine learning approaches, such as ensemble artificial neural networks, to select the most relevant predictive variables. The proposed algorithm was compared with common survival variable selection approaches, such as the least absolute shrinkage and selection operator (LASSO) and Akaike information criterion (AIC) methods. The results showed that it was capable of dealing with high censoring in the datasets. Moreover, ensemble classifiers increased the area under the ROC curves of the two datasets collected from two centers located in the United Kingdom separately. Furthermore, ensembles constructed with center 1 enhanced the concordance index of center 2 prediction compared to the model built with a single network. Although the final reduced model using the neural networks and their ensembles is larger than those of other methods, the model outperformed the others in both concordance index and sensitivity for center 2 prediction. This indicates that the reduced model is more powerful for cross-center prediction.
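The concordance index mentioned in this abstract can be illustrated with a minimal sketch of Harrell's C-index for right-censored data. This is not the paper's implementation; the function name, argument names, and the handling of tied times (skipped here) are assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times:       observed follow-up times
    events:      1 if the event was observed, 0 if the record is censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Pairs with tied times are skipped in this simplified version.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranks the earlier failure as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Censored records still contribute whenever they outlast an observed event, which is why the measure remains usable under heavy censoring. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.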

Relevance:

20.00%

Abstract:

Using techniques from Statistical Physics, the annealed VC entropy for hyperplanes in high-dimensional spaces is calculated as a function of the margin for a spherical Gaussian distribution of inputs.

Relevance:

20.00%

Abstract:

We apply methods of Statistical Mechanics to study the generalization performance of Support Vector Machines in large data spaces.

Relevance:

20.00%

Abstract:

We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.

Relevance:

20.00%

Abstract:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance:

20.00%

Abstract:

Electrocardiography (ECG) has recently been proposed as a biometric trait for identification purposes. Intra-individual variations of ECG might affect identification performance. These variations are mainly due to Heart Rate Variability (HRV). In particular, HRV causes changes in the QT intervals along the ECG waveforms. This work is aimed at analysing the influence of seven QT interval correction methods (based on population models) on the performance of ECG-fiducial-based identification systems. In addition, we have also considered the influence of training set size, classifier, and classifier ensemble, as well as the number of consecutive heartbeats in a majority voting scheme. The ECG signals used in this study were collected from thirty-nine subjects within the Physionet open access database. Public domain software was used for fiducial point detection. Results suggested that QT correction is indeed required to improve the performance. However, there is no clear choice among the seven explored approaches for QT correction (identification rate between 0.97 and 0.99). MultiLayer Perceptron and Support Vector Machine seemed to have better generalization capabilities, in terms of classification performance, than Decision Tree-based classifiers. No such strong influence of the training-set size or of the number of consecutive heartbeats in the majority voting scheme was observed.
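The majority voting scheme over consecutive heartbeats can be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of non-overlapping windows, and the tie-breaking rule (first label seen wins) are hypothetical and are not taken from the study.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


def vote_over_heartbeats(beat_predictions, window):
    """Fuse per-heartbeat identity predictions over non-overlapping
    windows of `window` consecutive beats (incomplete tail is dropped)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    last_start = len(beat_predictions) - window
    return [
        majority_vote(beat_predictions[i:i + window])
        for i in range(0, last_start + 1, window)
    ]
```

With `window=1` the function returns the per-beat predictions unchanged, so the effect of the number of consecutive heartbeats can be studied by varying a single parameter.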

Relevance:

20.00%

Abstract:

DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotides, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve predictive performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experimental results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot, with performance improvements in the range of 3.97-9.52% in ACC and 0.08-0.19 in MCC. Furthermore, when the benchmark dataset was expanded with negative samples, enDNA-Prot outperformed the three existing methods by 2.83-16.63% in terms of ACC and 0.02-0.16 in terms of MCC. This indicated that enDNA-Prot is an effective method for DNA-binding protein identification and that expanding the training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public. © 2014 Ruifeng Xu et al.
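The two metrics reported in this abstract, accuracy (ACC) and the Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The sketch below uses the standard definitions; the function names and the convention of returning 0.0 when the MCC denominator vanishes are choices made here for illustration.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention assumed here: undefined cases are reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

MCC is commonly reported alongside ACC when classes are imbalanced, as happens when a dataset is expanded with many negative samples, because a classifier that predicts only the majority class can score a high ACC while its MCC stays at zero.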

Relevance:

20.00%

Abstract:

The aim of this study is to evaluate the application of ensemble averaging to the analysis of electromyography recordings under whole body vibratory stimulation. Recordings from Rectus Femoris, collected during vibratory stimulation at different frequencies, are used. Each signal is subdivided into intervals whose duration is related to the vibration frequency. Finally, the segmented intervals are averaged. With this method, the periodic components emerge for the majority of the recordings. The autocorrelation of a few seconds of signal confirms the presence of pseudosinusoidal components strictly related to the soft tissue oscillations caused by the mechanical waves. © 2014 IEEE.
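The segment-and-average procedure described in this abstract can be sketched as below. The sketch assumes each interval lasts exactly one vibration period and that the sampling rate is close to an integer multiple of the vibration frequency; the abstract only says the interval duration is "related to" the frequency, so both points, along with the function and parameter names, are assumptions.

```python
def ensemble_average(signal, fs, vib_freq):
    """Average consecutive segments of `signal`, each one vibration period long.

    signal:   sequence of samples
    fs:       sampling rate in Hz
    vib_freq: vibration frequency in Hz
    """
    period = round(fs / vib_freq)          # samples per vibration period
    if period < 1:
        raise ValueError("vibration frequency too high for this sampling rate")
    n_segments = len(signal) // period     # incomplete tail is discarded
    if n_segments == 0:
        raise ValueError("signal shorter than one vibration period")
    avg = [0.0] * period
    for k in range(n_segments):
        start = k * period
        for n in range(period):
            avg[n] += signal[start + n]
    return [value / n_segments for value in avg]
```

Components locked to the vibration period add coherently across segments, while activity that is not synchronized with the vibration tends to cancel, which is why the periodic components emerge in the average.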

Relevance:

20.00%

Abstract:

Social media has become an effective channel for communicating both trends and public opinion on current events. However, the automatic topic classification of social media content poses various challenges. Topic classification is a common technique used for automatically capturing themes that emerge from social media streams. However, such techniques are sensitive to the evolution of topics when new event-dependent vocabularies start to emerge (e.g., Crimea becoming relevant to War Conflict during the Ukraine crisis in 2014). Therefore, traditional supervised classification methods which rely on labelled data could rapidly become outdated. In this paper we propose a novel transfer learning approach to address the classification task of new data when the only available labelled data belong to a previous epoch. This approach relies on the incorporation of knowledge from DBpedia graphs. Our findings show promising results in understanding how features age, and how semantic features can support the evolution of topic classifiers.

Relevance:

20.00%

Abstract:

Motivation: Influenza A viral heterogeneity remains a significant threat due to unpredictable antigenic drift in seasonal influenza and antigenic shifts caused by the emergence of novel subtypes. Annual review of multivalent influenza vaccines targets strains of influenza A and B likely to be predominant in future influenza seasons. This does not induce broad, cross-protective immunity against emergent subtypes. Better strategies are needed to prevent future pandemics. Cross-protection can be achieved by activating CD8+ and CD4+ T cells against highly-conserved regions of the influenza genome. We combine available experimental data with informatics-based immunological predictions to help design vaccines potentially able to induce cross-protective T-cells against multiple influenza subtypes. Results: To exemplify our approach we designed two epitope ensemble vaccines comprising highly-conserved and experimentally-verified immunogenic influenza A epitopes as putative non-seasonal influenza vaccines; one specifically targets the US population and the other is a universal vaccine. The USA-specific vaccine comprised 6 CD8+ T cell epitopes (GILGFVFTL, FMYSDFHFI, GMDPRMCSL, SVKEKDMTK, FYIQMCTEL, DTVNRTHQY) and 3 CD4+ epitopes (KGILGFVFTLTVPSE, EYIMKGVYINTALLN, ILGFVFTLTVPSERG). The universal vaccine comprised 8 CD8+ epitopes (FMYSDFHFI, GILGFVFTL, ILRGSVAHK, FYIQMCTEL, ILKGKFQTA, YYLEKANKI, VSDGGPNLY, YSHGTGTGY) and the same 3 CD4+ epitopes. Our USA-specific vaccine has a population protection coverage (PPC, the portion of the population potentially responsive to one or more component epitopes of the vaccine) of over 96% and 95% coverage of observed influenza subtypes. The universal vaccine has a PPC value of over 97% and 88% coverage of observed subtypes.

Relevance:

20.00%

Abstract:

This study explored the effects on speech intelligibility of across-formant differences in fundamental frequency (ΔF0) and F0 contour. Sentence-length speech analogues were presented dichotically (left=F1+F3; right=F2), either alone or—because competition usually reveals grouping cues most clearly—accompanied in the left ear by a competitor for F2 (F2C) that listeners must reject to optimize recognition. F2C was created by inverting the F2 frequency contour. In experiment 1, all left-ear formants shared the same constant F0 and ΔF0F2 was 0 or ±4 semitones. In experiment 2, all left-ear formants shared the natural F0 contour and that for F2 was natural, constant, exaggerated, or inverted. Adding F2C lowered keyword scores, presumably because of informational masking. The results for experiment 1 were complicated by effects associated with the direction of ΔF0F2; this problem was avoided in experiment 2 because all four F0 contours had the same geometric mean frequency. When the target formants were presented alone, scores were relatively high and did not depend on the F0F2 contour. F2C impact was greater when F2 had a different F0 contour from the other formants. This effect was a direct consequence of the associated ΔF0; the F0F2 contour per se did not influence competitor impact.
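The arithmetic behind a ΔF0 expressed in semitones, and behind matching contours by geometric mean frequency, can be sketched as follows. The function names are hypothetical, and the inversion shown (mirroring the contour about its geometric mean on a logarithmic frequency scale) is an assumption chosen for illustration, not a description of how the study's stimuli were built.

```python
import math


def shift_semitones(f0_hz, semitones):
    """Shift a frequency by a number of equal-tempered semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


def geometric_mean(freqs_hz):
    """Geometric mean of a sequence of positive frequencies."""
    return math.exp(sum(math.log(f) for f in freqs_hz) / len(freqs_hz))


def invert_contour(freqs_hz):
    """Mirror a contour about its geometric mean on a log-frequency scale
    (assumed inversion rule; preserves the geometric mean)."""
    gm = geometric_mean(freqs_hz)
    return [gm * gm / f for f in freqs_hz]
```

Under these definitions a shift of +4 semitones multiplies the frequency by about 1.26, and a contour inverted in this way keeps the same geometric mean frequency as the original, so no net ΔF0 is introduced between the two.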