26 resultados para Support vector machines


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis studies survival analysis techniques dealing with censoring to produce predictive tools that predict the risk of endovascular aortic aneurysm repair (EVAR) re-intervention. Censoring indicates that some patients do not continue follow up, so their outcome class is unknown. Methods dealing with censoring have drawbacks and cannot handle the high censoring of the two EVAR datasets collected. Therefore, this thesis presents a new solution to high censoring by modifying an approach that was incapable of differentiating between risks groups of aortic complications. Feature selection (FS) becomes complicated with censoring. Most survival FS methods depends on Cox's model, however machine learning classifiers (MLC) are preferred. Few methods adopted MLC to perform survival FS, but they cannot be used with high censoring. This thesis proposes two FS methods which use MLC to evaluate features. The two FS methods use the new solution to deal with censoring. They combine factor analysis with greedy stepwise FS search which allows eliminated features to enter the FS process. The first FS method searches for the best neural networks' configuration and subset of features. The second approach combines support vector machines, neural networks, and K nearest neighbor classifiers using simple and weighted majority voting to construct a multiple classifier system (MCS) for improving the performance of individual classifiers. It presents a new hybrid FS process by using MCS as a wrapper method and merging it with the iterated feature ranking filter method to further reduce the features. The proposed techniques outperformed FS methods based on Cox's model such as; Akaike and Bayesian information criteria, and least absolute shrinkage and selector operator in the log-rank test's p-values, sensitivity, and concordance. This proves that the proposed techniques are more powerful in correctly predicting the risk of re-intervention. Consequently, they enable doctors to set patients’ appropriate future observation plan.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Using techniques from Statistical Physics, the annealed VC entropy for hyperplanes in high dimensional spaces is calculated as a function of the margin for a spherical Gaussian distribution of inputs.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present an assessment of the practical value of existing traditional and non-standard measures for discriminating healthy people from people with Parkinson's disease (PD) by detecting dysphonia. We introduce a new measure of dysphonia, Pitch Period Entropy (PPE), which is robust to many uncontrollable confounding effects including noisy acoustic environments and normal, healthy variations in voice frequency. We collected sustained phonations from 31 people, 23 with PD. We then selected 10 highly uncorrelated measures, and an exhaustive search of all possible combinations of these measures finds four that in combination lead to overall correct classification performance of 91.4%, using a kernel support vector machine. In conclusion, we find that non-standard methods in combination with traditional harmonics-to-noise ratios are best able to separate healthy from PD subjects. The selected non-standard methods are robust to many uncontrollable variations in acoustic environment and individual subjects, and are thus well-suited to telemonitoring applications.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

MOTIVATION: G protein-coupled receptors (GPCRs) play an important role in many physiological systems by transducing an extracellular signal into an intracellular response. Over 50% of all marketed drugs are targeted towards a GPCR. There is considerable interest in developing an algorithm that could effectively predict the function of a GPCR from its primary sequence. Such an algorithm is useful not only in identifying novel GPCR sequences but in characterizing the interrelationships between known GPCRs. RESULTS: An alignment-free approach to GPCR classification has been developed using techniques drawn from data mining and proteochemometrics. A dataset of over 8000 sequences was constructed to train the algorithm. This represents one of the largest GPCR datasets currently available. A predictive algorithm was developed based upon the simplest reasonable numerical representation of the protein's physicochemical properties. A selective top-down approach was developed, which used a hierarchical classifier to assign sequences to subdivisions within the GPCR hierarchy. The predictive performance of the algorithm was assessed against several standard data mining classifiers and further validated against Support Vector Machine-based GPCR prediction servers. The selective top-down approach achieves significantly higher accuracy than standard data mining methods in almost all cases.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Electrocardiography (ECG) has been recently proposed as biometric trait for identification purposes. Intra-individual variations of ECG might affect identification performance. These variations are mainly due to Heart Rate Variability (HRV). In particular, HRV causes changes in the QT intervals along the ECG waveforms. This work is aimed at analysing the influence of seven QT interval correction methods (based on population models) on the performance of ECG-fiducial-based identification systems. In addition, we have also considered the influence of training set size, classifier, classifier ensemble as well as the number of consecutive heartbeats in a majority voting scheme. The ECG signals used in this study were collected from thirty-nine subjects within the Physionet open access database. Public domain software was used for fiducial points detection. Results suggested that QT correction is indeed required to improve the performance. However, there is no clear choice among the seven explored approaches for QT correction (identification rate between 0.97 and 0.99). MultiLayer Perceptron and Support Vector Machine seemed to have better generalization capabilities, in terms of classification performance, with respect to Decision Tree-based classifiers. No such strong influence of the training-set size and the number of consecutive heartbeats has been observed on the majority voting scheme.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A novel versatile digital signal processing (DSP)-based equalizer using support vector machine regression (SVR) is proposed for 16-quadrature amplitude modulated (16-QAM) coherent optical orthogonal frequency-division multiplexing (CO-OFDM) and experimentally compared to traditional DSP-based deterministic fiber-induced nonlinearity equalizers (NLEs), namely the full-field digital back-propagation (DBP) and the inverse Volterra series transfer function-based NLE (V-NLE). For a 40 Gb/s 16-QAM CO-OFDM at 2000 km, SVR-NLE extends the optimum launched optical power (LOP) by 4 dB compared to V-NLE by means of reduction of fiber nonlinearity. In comparison to full-field DBP at a LOP of 6 dBm, SVR-NLE outperforms by ∼1 dB in Q-factor. In addition, SVR-NLE is the most computational efficient DSP-NLE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Background: Identifying biological markers to aid diagnosis of bipolar disorder (BD) is critically important. To be considered a possible biological marker, neural patterns in BD should be discriminant from those in healthy individuals (HI). We examined patterns of neuromagnetic responses revealed by magnetoencephalography (MEG) during implicit emotion-processing using emotional (happy, fearful, sad) and neutral facial expressions, in sixteen BD and sixteen age- and gender-matched healthy individuals. Methods: Neuromagnetic data were recorded using a 306-channel whole-head MEG ELEKTA Neuromag System, and preprocessed using Signal Space Separation as implemented in MaxFilter (ELEKTA). Custom Matlab programs removed EOG and ECG signals from filtered MEG data, and computed means of epoched data (0-250ms, 250-500ms, 500-750ms). A generalized linear model with three factors (individual, emotion intensity and time) compared BD and HI. A principal component analysis of normalized mean channel data in selected brain regions identified principal components that explained 95% of data variation. These components were used in a quadratic support vector machine (SVM) pattern classifier. SVM classifier performance was assessed using the leave-one-out approach. Results: BD and HI showed significantly different patterns of activation for 0-250ms within both left occipital and temporal regions, specifically for neutral facial expressions. PCA analysis revealed significant differences between BD and HI for mild fearful, happy, and sad facial expressions within 250-500ms. SVM quadratic classifier showed greatest accuracy (84%) and sensitivity (92%) for neutral faces, in left occipital regions within 500-750ms. Conclusions: MEG responses may be used in the search for disease specific neural markers.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The accuracy of a map is dependent on the reference dataset used in its construction. Classification analyses used in thematic mapping can, for example, be sensitive to a range of sampling and data quality concerns. With particular focus on the latter, the effects of reference data quality on land cover classifications from airborne thematic mapper data are explored. Variations in sampling intensity and effort are highlighted in a dataset that is widely used in mapping and modelling studies; these may need accounting for in analyses. The quality of the labelling in the reference dataset was also a key variable influencing mapping accuracy. Accuracy varied with the amount and nature of mislabelled training cases with the nature of the effects varying between classifiers. The largest impacts on accuracy occurred when mislabelling involved confusion between similar classes. Accuracy was also typically negatively related to the magnitude of mislabelled cases and the support vector machine (SVM), which has been claimed to be relatively insensitive to training data error, was the most sensitive of the set of classifiers investigated, with overall classification accuracy declining by 8% (significant at 95% level of confidence) with the use of a training set containing 20% mislabelled cases.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The thesis describes an investigation into methods for the specification, design and implementation of computer control systems for flexible manufacturing machines comprising multiple, independent, electromechanically-driven mechanisms. An analysis is made of the elements of conventional mechanically-coupled machines in order that the operational functions of these elements may be identified. This analysis is used to define the scope of requirements necessary to specify the format, function and operation of a flexible, independently driven mechanism machine. A discussion of how this type of machine can accommodate modern manufacturing needs of high-speed and flexibility is presented. A sequential method of capturing requirements for such machines is detailed based on a hierarchical partitioning of machine requirements from product to independent drive mechanism. A classification of mechanisms using notations, including Data flow diagrams and Petri-nets, is described which supports capture and allows validation of requirements. A generic design for a modular, IDM machine controller is derived based upon hierarchy of control identified in these machines. A two mechanism experimental machine is detailed which is used to demonstrate the application of the specification, design and implementation techniques. A computer controller prototype and a fully flexible implementation for the IDM machine, based on Petri-net models described using the concurrent programming language Occam, is detailed. The ability of this modular computer controller to support flexible, safe and fault-tolerant operation of the two intermittent motion, discrete-synchronisation independent drive mechanisms is presented. The application of the machine development methodology to industrial projects is established.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In recent years, learning word vector representations has attracted much interest in Natural Language Processing. Word representations or embeddings learned using unsupervised methods help addressing the problem of traditional bag-of-word approaches which fail to capture contextual semantics. In this paper we go beyond the vector representations at the word level and propose a novel framework that learns higher-level feature representations of n-grams, phrases and sentences using a deep neural network built from stacked Convolutional Restricted Boltzmann Machines (CRBMs). These representations have been shown to map syntactically and semantically related n-grams to closeby locations in the hidden feature space. We have experimented to additionally incorporate these higher-level features into supervised classifier training for two sentiment analysis tasks: subjectivity classification and sentiment classification. Our results have demonstrated the success of our proposed framework with 4% improvement in accuracy observed for subjectivity classification and improved the results achieved for sentiment classification over models trained without our higher level features.