102 resultados para Support vector regression

em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Most studies involving statistical time series analysis rely on assumptions of linearity, which by its simplicity facilitates parameter interpretation and estimation. However, the linearity assumption may be too restrictive for many practical applications. The implementation of nonlinear models in time series analysis involves the estimation of a large set of parameters, frequently leading to overfitting problems. In this article, a predictability coefficient is estimated using a combination of nonlinear autoregressive models and the use of support vector regression in this model is explored. We illustrate the usefulness and interpretability of results by using electroencephalographic records of an epileptic patient.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Age-related changes in running kinematics have been reported in the literature using classical inferential statistics. However, this approach has been hampered by the increased number of biomechanical gait variables reported and subsequently the lack of differences presented in these studies. Data mining techniques have been applied in recent biomedical studies to solve this problem using a more general approach. In the present work, we re-analyzed lower extremity running kinematic data of 17 young and 17 elderly male runners using the Support Vector Machine (SVM) classification approach. In total, 31 kinematic variables were extracted to train the classification algorithm and test the generalized performance. The results revealed different accuracy rates across three different kernel methods adopted in the classifier, with the linear kernel performing the best. A subsequent forward feature selection algorithm demonstrated that with only six features, the linear kernel SVM achieved 100% classification performance rate, showing that these features provided powerful combined information to distinguish age groups. The results of the present work demonstrate potential in applying this approach to improve knowledge about the age-related differences in running gait biomechanics and encourages the use of the SVM in other clinical contexts. (C) 2010 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

State of Sao Paulo Research Foundation (FAPESP)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The application of functional magnetic resonance imaging (fMRI) in neuroscience studies has increased enormously in the last decade. Although primarily used to map brain regions activated by specific stimuli, many studies have shown that fMRI can also be useful in identifying interactions between brain regions (functional and effective connectivity). Despite the widespread use of fMRI as a research tool, clinical applications of brain connectivity as studied by fMRI are not well established. One possible explanation is the lack of normal pattern, and intersubject variability-two variables that are still largely uncharacterized in most patient populations of interest. In the current study, we combine the identification of functional connectivity networks extracted by using Spearman partial correlation with the use of a one-class support vector machine in order construct a normative database. An application of this approach is illustrated using an fMRI dataset of 43 healthy Subjects performing a visual working memory task. In addition, the relationships between the results obtained and behavioral data are explored. Hum Brain Mapp 30:1068-1076, 2009. (C) 2008 Wiley-Liss. Inc.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Functional magnetic resonance imaging (fMRI) is currently one of the most widely used methods for studying human brain function in vivo. Although many different approaches to fMRI analysis are available, the most widely used methods employ so called ""mass-univariate"" modeling of responses in a voxel-by-voxel fashion to construct activation maps. However, it is well known that many brain processes involve networks of interacting regions and for this reason multivariate analyses might seem to be attractive alternatives to univariate approaches. The current paper focuses on one multivariate application of statistical learning theory: the statistical discrimination maps (SDM) based on support vector machine, and seeks to establish some possible interpretations when the results differ from univariate `approaches. In fact, when there are changes not only on the activation level of two conditions but also on functional connectivity, SDM seems more informative. We addressed this question using both simulations and applications to real data. We have shown that the combined use of univariate approaches and SDM yields significant new insights into brain activations not available using univariate methods alone. In the application to a visual working memory fMRI data, we demonstrated that the interaction among brain regions play a role in SDM`s power to detect discriminative voxels. (C) 2008 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes an improved voice activity detection (VAD) algorithm using wavelet and support vector machine (SVM) for European Telecommunication Standards Institution (ETS1) adaptive multi-rate (AMR) narrow-band (NB) and wide-band (WB) speech codecs. First, based on the wavelet transform, the original IIR filter bank and pitch/tone detector are implemented, respectively, via the wavelet filter bank and the wavelet-based pitch/tone detection algorithm. The wavelet filter bank can divide input speech signal into several frequency bands so that the signal power level at each sub-band can be calculated. In addition, the background noise level can be estimated in each sub-band by using the wavelet de-noising method. The wavelet filter bank is also derived to detect correlated complex signals like music. Then the proposed algorithm can apply SVM to train an optimized non-linear VAD decision rule involving the sub-band power, noise level, pitch period, tone flag, and complex signals warning flag of input speech signals. By the use of the trained SVM, the proposed VAD algorithm can produce more accurate detection results. Various experimental results carried out from the Aurora speech database with different noise conditions show that the proposed algorithm gives considerable VAD performances superior to the AMR-NB VAD Options 1 and 2, and AMR-WB VAD. (C) 2009 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Objective: To develop a model to predict the bleeding source and identify the cohort amongst patients with acute gastrointestinal bleeding (GIB) who require urgent intervention, including endoscopy. Patients with acute GIB, an unpredictable event, are most commonly evaluated and managed by non-gastroenterologists. Rapid and consistently reliable risk stratification of patients with acute GIB for urgent endoscopy may potentially improve outcomes amongst such patients by targeting scarce health-care resources to those who need it the most. Design and methods: Using ICD-9 codes for acute GIB, 189 patients with acute GIB and all. available data variables required to develop and test models were identified from a hospital medical records database. Data on 122 patients was utilized for development of the model and on 67 patients utilized to perform comparative analysis of the models. Clinical data such as presenting signs and symptoms, demographic data, presence of co-morbidities, laboratory data and corresponding endoscopic diagnosis and outcomes were collected. Clinical data and endoscopic diagnosis collected for each patient was utilized to retrospectively ascertain optimal management for each patient. Clinical presentations and corresponding treatment was utilized as training examples. Eight mathematical models including artificial neural network (ANN), support vector machine (SVM), k-nearest neighbor, linear discriminant analysis (LDA), shrunken centroid (SC), random forest (RF), logistic regression, and boosting were trained and tested. The performance of these models was compared using standard statistical analysis and ROC curves. Results: Overall the random forest model best predicted the source, need for resuscitation, and disposition with accuracies of approximately 80% or higher (accuracy for endoscopy was greater than 75%). The area under ROC curve for RF was greater than 0.85, indicating excellent performance by the random forest model Conclusion: While most mathematical models are effective as a decision support system for evaluation and management of patients with acute GIB, in our testing, the RF model consistently demonstrated the best performance. Amongst patients presenting with acute GIB, mathematical models may facilitate the identification of the source of GIB, need for intervention and allow optimization of care and healthcare resource allocation; these however require further validation. (c) 2007 Elsevier B.V. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Objective: We carry out a systematic assessment on a suite of kernel-based learning machines while coping with the task of epilepsy diagnosis through automatic electroencephalogram (EEG) signal classification. Methods and materials: The kernel machines investigated include the standard support vector machine (SVM), the least squares SVM, the Lagrangian SVM, the smooth SVM, the proximal SVM, and the relevance vector machine. An extensive series of experiments was conducted on publicly available data, whose clinical EEG recordings were obtained from five normal subjects and five epileptic patients. The performance levels delivered by the different kernel machines are contrasted in terms of the criteria of predictive accuracy, sensitivity to the kernel function/parameter value, and sensitivity to the type of features extracted from the signal. For this purpose, 26 values for the kernel parameter (radius) of two well-known kernel functions (namely. Gaussian and exponential radial basis functions) were considered as well as 21 types of features extracted from the EEG signal, including statistical values derived from the discrete wavelet transform, Lyapunov exponents, and combinations thereof. Results: We first quantitatively assess the impact of the choice of the wavelet basis on the quality of the features extracted. Four wavelet basis functions were considered in this study. Then, we provide the average accuracy (i.e., cross-validation error) values delivered by 252 kernel machine configurations; in particular, 40%/35% of the best-calibrated models of the standard and least squares SVMs reached 100% accuracy rate for the two kernel functions considered. Moreover, we show the sensitivity profiles exhibited by a large sample of the configurations whereby one can visually inspect their levels of sensitiveness to the type of feature and to the kernel function/parameter value. Conclusions: Overall, the results evidence that all kernel machines are competitive in terms of accuracy, with the standard and least squares SVMs prevailing more consistently. Moreover, the choice of the kernel function and parameter value as well as the choice of the feature extractor are critical decisions to be taken, albeit the choice of the wavelet family seems not to be so relevant. Also, the statistical values calculated over the Lyapunov exponents were good sources of signal representation, but not as informative as their wavelet counterparts. Finally, a typical sensitivity profile has emerged among all types of machines, involving some regions of stability separated by zones of sharp variation, with some kernel parameter values clearly associated with better accuracy rates (zones of optimality). (C) 2011 Elsevier B.V. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) into exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper, we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods-cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (M-L-kNN)were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model the reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

There is not a specific test to diagnose Alzheimer`s disease (AD). Its diagnosis should be based upon clinical history, neuropsychological and laboratory tests, neuroimaging and electroencephalography (EEG). Therefore, new approaches are necessary to enable earlier and more accurate diagnosis and to follow treatment results. In this study we used a Machine Learning (ML) technique, named Support Vector Machine (SVM), to search patterns in EEG epochs to differentiate AD patients from controls. As a result, we developed a quantitative EEG (qEEG) processing method for automatic differentiation of patients with AD from normal individuals, as a complement to the diagnosis of probable dementia. We studied EEGs from 19 normal subjects (14 females/5 males, mean age 71.6 years) and 16 probable mild to moderate symptoms AD patients (14 females/2 males, mean age 73.4 years. The results obtained from analysis of EEG epochs were accuracy 79.9% and sensitivity 83.2%. The analysis considering the diagnosis of each individual patient reached 87.0% accuracy and 91.7% sensitivity.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Here, we examine morphological changes in cortical thickness of patients with Alzheimer`s disease (AD) using image analysis algorithms for brain structure segmentation and study automatic classification of AD patients using cortical and volumetric data. Cortical thickness of AD patients (n = 14) was measured using MRI cortical surface-based analysis and compared with healthy subjects (n = 20). Data was analyzed using an automated algorithm for tissue segmentation and classification. A Support Vector Machine (SVM) was applied over the volumetric measurements of subcortical and cortical structures to separate AD patients from controls. The group analysis showed cortical thickness reduction in the superior temporal lobe, parahippocampal gyrus, and enthorhinal cortex in both hemispheres. We also found cortical thinning in the isthmus of cingulate gyrus and middle temporal gyrus at the right hemisphere, as well as a reduction of the cortical mantle in areas previously shown to be associated with AD. We also confirmed that automatic classification algorithms (SVM) could be helpful to distinguish AD patients from healthy controls. Moreover, the same areas implicated in the pathogenesis of AD were the main parameters driving the classification algorithm. While the patient sample used in this study was relatively small, we expect that using a database of regional volumes derived from MRI scans of a large number of subjects will increase the SVM power of AD patient identification.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Pattern recognition methods have been successfully applied in several functional neuroimaging studies. These methods can be used to infer cognitive states, so-called brain decoding. Using such approaches, it is possible to predict the mental state of a subject or a stimulus class by analyzing the spatial distribution of neural responses. In addition it is possible to identify the regions of the brain containing the information that underlies the classification. The Support Vector Machine (SVM) is one of the most popular methods used to carry out this type of analysis. The aim of the current study is the evaluation of SVM and Maximum uncertainty Linear Discrimination Analysis (MLDA) in extracting the voxels containing discriminative information for the prediction of mental states. The comparison has been carried out using fMRI data from 41 healthy control subjects who participated in two experiments, one involving visual-auditory stimulation and the other based on bimanual fingertapping sequences. The results suggest that MLDA uses significantly more voxels containing discriminative information (related to different experimental conditions) to classify the data. On the other hand, SVM is more parsimonious and uses less voxels to achieve similar classification accuracies. In conclusion, MLDA is mostly focused on extracting all discriminative information available, while SVM extracts the information which is sufficient for classification. (C) 2009 Elsevier Inc. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Various popular machine learning techniques, like support vector machines, are originally conceived for the solution of two-class (binary) classification problems. However, a large number of real problems present more than two classes. A common approach to generalize binary learning techniques to solve problems with more than two classes, also known as multiclass classification problems, consists of hierarchically decomposing the multiclass problem into multiple binary sub-problems, whose outputs are combined to define the predicted class. This strategy results in a tree of binary classifiers, where each internal node corresponds to a binary classifier distinguishing two groups of classes and the leaf nodes correspond to the problem classes. This paper investigates how measures of the separability between classes can be employed in the construction of binary-tree-based multiclass classifiers, adapting the decompositions performed to each particular multiclass problem. (C) 2010 Elsevier B.V. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper proposes a filter-based algorithm for feature selection. The filter is based on the partitioning of the set of features into clusters. The number of clusters, and consequently the cardinality of the subset of selected features, is automatically estimated from data. The computational complexity of the proposed algorithm is also investigated. A variant of this filter that considers feature-class correlations is also proposed for classification problems. Empirical results involving ten datasets illustrate the performance of the developed algorithm, which in general has obtained competitive results in terms of classification accuracy when compared to state of the art algorithms that find clusters of features. We show that, if computational efficiency is an important issue, then the proposed filter May be preferred over their counterparts, thus becoming eligible to join a pool of feature selection algorithms to be used in practice. As an additional contribution of this work, a theoretical framework is used to formally analyze some properties of feature selection methods that rely on finding clusters of features. (C) 2011 Elsevier Inc. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capturing shallow linguistic information. Complex background knowledge, such as semantic relationships, are typically either not used, or used in specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature-set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks; while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provide a way for making substantial progress in the field of WSD.