912 results for support vector regression


Relevance:

100.00%

Publisher:

Abstract:

Hundreds of terabytes of CMS (Compact Muon Solenoid) data are being accumulated in storage, day by day, at the University of Nebraska-Lincoln, which is one of the eight US CMS Tier-2 sites. Managing this data includes retaining useful CMS data sets and clearing storage space for newly arriving data by deleting less useful data sets. This important task is currently done manually and requires a large amount of time. The overall objective of this study was to develop a methodology to help identify the data sets to be deleted when storage space is required. CMS data is stored using HDFS (Hadoop Distributed File System), and HDFS logs give information regarding file access operations. Hadoop MapReduce was used to feed the information in these logs to Support Vector Machines (SVMs), a machine learning algorithm applicable to classification and regression, which is used in this thesis to develop a classifier. The time taken to classify data sets with this method depends on the size of the input HDFS log file, since the Hadoop MapReduce algorithms used here have O(n) complexity. The SVM methodology produces a list of data sets for deletion along with their respective sizes. This methodology was also compared with a heuristic called Retention Cost, which is calculated from the size of a data set and the time since its last access to help decide how useful the data set is. The accuracies of both were compared by calculating the percentage of data sets predicted for deletion that were accessed at a later time. Our methodology using SVMs proved to be more accurate than the Retention Cost heuristic. This methodology could be used to solve similar problems involving other large data sets.
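The comparison metric described above (the percentage of data sets predicted for deletion that were accessed later) can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `later_access_rate` and the data set names are hypothetical.

```python
def later_access_rate(predicted_for_deletion, accessed_later):
    """Percentage of data sets flagged for deletion that were accessed afterwards.

    A lower value means fewer still-useful data sets would have been deleted.
    """
    flagged = set(predicted_for_deletion)
    if not flagged:
        return 0.0
    wrongly_flagged = flagged & set(accessed_later)
    return 100.0 * len(wrongly_flagged) / len(flagged)


# Hypothetical data set names, for illustration only.
flagged = ["ds_a", "ds_b", "ds_c", "ds_d"]
accessed = ["ds_b", "ds_x"]
rate = later_access_rate(flagged, accessed)
```

The same function could be applied to the deletion lists from the SVM classifier and from the Retention Cost heuristic, so that both are scored against the same later access log.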

Relevance:

100.00%

Publisher:

Abstract:

Support Vector Machines (SVMs) have achieved very good performance on different learning problems. However, the success of SVMs depends on the adequate choice of the values of a number of parameters (e.g., the kernel and regularization parameters). In the current work, we propose the combination of meta-learning and search algorithms to deal with the problem of SVM parameter selection. In this combination, given a new problem to be solved, meta-learning is employed to recommend SVM parameter values based on parameter configurations that have been successfully adopted in previous similar problems. The parameter values returned by meta-learning are then used as initial search points by a search technique, which will further explore the parameter space. In this proposal, we envisioned that the initial solutions provided by meta-learning are located in good regions of the search space (i.e. they are closer to optimum solutions). Hence, the search algorithm would need to evaluate a lower number of candidate solutions when looking for an adequate solution. In this work, we investigate the combination of meta-learning with two search algorithms: Particle Swarm Optimization and Tabu Search. The implemented hybrid algorithms were used to select the values of two SVM parameters in the regression domain. These combinations were compared with the use of the search algorithms without meta-learning. The experimental results on a set of 40 regression problems showed that, on average, the proposed hybrid methods obtained lower error rates when compared to their components applied in isolation.
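The hybrid idea can be sketched roughly as below: a nearest-neighbour lookup over meta-features recommends starting points, which a search routine then refines. Everything here is an assumption made for illustration: the meta-features, the parameter values, the `toy_error` function (a placeholder for a cross-validated SVM regression error), and the simple hill climbing, which stands in for Particle Swarm Optimization or Tabu Search rather than implementing either.

```python
import math
import random


def recommend_start_points(new_meta, past_problems, k=2):
    """Return the best-known parameters of the k past problems most similar to the new one."""
    ranked = sorted(past_problems, key=lambda p: math.dist(p["meta"], new_meta))
    return [p["best_params"] for p in ranked[:k]]


def local_search(error_fn, start, steps=50, scale=0.5, seed=0):
    """Simple hill climbing in log-parameter space (a stand-in for PSO or Tabu Search)."""
    rng = random.Random(seed)
    best, best_err = start, error_fn(start)
    for _ in range(steps):
        candidate = tuple(x + rng.gauss(0.0, scale) for x in best)
        err = error_fn(candidate)
        if err < best_err:  # keep a candidate only if it lowers the error
            best, best_err = candidate, err
    return best, best_err


# Hypothetical meta-features and (log10 C, log10 gamma) values, for illustration only.
past_problems = [
    {"meta": (0.1, 0.9), "best_params": (1.2, -2.1)},
    {"meta": (0.8, 0.2), "best_params": (-1.0, 0.5)},
    {"meta": (0.2, 0.8), "best_params": (0.8, -1.8)},
]


def toy_error(params):
    # Placeholder for a cross-validated SVM regression error.
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


starts = recommend_start_points((0.12, 0.88), past_problems, k=2)
results = [local_search(toy_error, s) for s in starts]
best_params, best_error = min(results, key=lambda r: r[1])
```

Because the search starts from configurations that worked on similar problems, it begins with a low error and needs fewer evaluations to reach an adequate solution, which is the effect the abstract describes.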

Relevance:

100.00%

Publisher:

Abstract:

When classifying a signal, ideally we want our classifier to trigger a large response when it encounters a positive example and to give little or no response for all other examples. Unfortunately, in practice this does not occur: responses fluctuate, often causing false alarms. There are many reasons why this is the case, most notably the failure to incorporate the dynamics of the signal into the classification. In facial expression recognition, this has been highlighted as one major research question. In this paper we present a novel technique that incorporates the dynamics of the signal, producing a strong response when the peak expression is found and suppressing all other responses as much as possible. The ability to automatically and accurately recognize the facial expressions of drivers is highly relevant to automotive applications. For example, the early recognition of “surprise” could indicate that an accident is about to occur, and various safeguards could immediately be deployed to avoid or minimize injury and damage. We conducted preliminary experiments on the extended Cohn-Kanade (CK+) database, which show the benefits of the technique.

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we present an automatic system for precise urban road model reconstruction based on aerial images with high spatial resolution. The proposed approach consists of two steps: i) road surface detection and ii) road pavement marking extraction. In the first step, a support vector machine (SVM) is used to classify the images into two categories: road and non-road. In the second step, road lane markings are extracted from the generated road surface using 2D Gabor filters. Experiments using several pan-sharpened aerial images of Brisbane, Queensland, have validated the proposed method.
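For readers unfamiliar with 2D Gabor filters, the sketch below builds the real part of a Gabor kernel (a Gaussian envelope multiplied by a cosine carrier) using the standard textbook form. The parameter values and the four-orientation bank are assumptions for illustration; the abstract does not state the settings the authors used.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter as a size x size list of lists (size should be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the sinusoid varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr**2 + (gamma * yr) ** 2) / (2.0 * sigma**2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# Hypothetical parameter values: one kernel per orientation (0, 45, 90, 135 degrees).
bank = [gabor_kernel(9, wavelength=4.0, theta=k * math.pi / 4, sigma=2.0) for k in range(4)]
```

Convolving the detected road surface with each kernel in such a bank would give strong responses along thin, oriented structures, which is why Gabor filters suit lane markings.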

Relevance:

100.00%

Publisher:

Abstract:

The use of appropriate features to characterize an output class or object is critical for all classification problems. This paper evaluates the capability of several spectral and texture features for object-based vegetation classification at the species level using airborne high-resolution multispectral imagery. Image objects, the basic classification unit, were generated through image segmentation. Statistical moments extracted from the original spectral bands and from the vegetation index image are used as feature descriptors for image objects (i.e., tree crowns). Several state-of-the-art texture descriptors, such as the Gray-Level Co-Occurrence Matrix (GLCM), Local Binary Patterns (LBP) and its extensions, are also extracted for comparison. A Support Vector Machine (SVM) is employed for classification in the object-feature space. The experimental results showed that incorporating spectral vegetation indices improves classification accuracy over the original spectral bands alone, and that moments of the Ratio Vegetation Index gave the highest average classification accuracy in our experiment. The experiments also indicate that the spectral moment features outperform, or are at least comparable with, the state-of-the-art texture descriptors in terms of classification accuracy.
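As a rough illustration of the moment features, the sketch below computes the Ratio Vegetation Index (near-infrared divided by red) for the pixels of one image object and summarizes it with three moments. The choice of mean, variance, and skewness, the function names, and the pixel values are assumptions; the abstract does not say which moments were used.

```python
import math


def ratio_vegetation_index(nir, red, eps=1e-9):
    """Per-pixel Ratio Vegetation Index, NIR / Red, for the pixels of one image object."""
    # max(r, eps) guards against division by zero for dark red-band pixels.
    return [n / max(r, eps) for n, r in zip(nir, red)]


def statistical_moments(values):
    """Mean, variance, and skewness of the pixel values inside one image object."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    skewness = 0.0 if std == 0 else sum((v - mean) ** 3 for v in values) / (n * std**3)
    return mean, variance, skewness


# Hypothetical band values for a two-pixel tree crown, for illustration only.
rvi = ratio_vegetation_index([4.0, 6.0], [2.0, 2.0])
features = statistical_moments(rvi)
```

The resulting tuple would form part of the feature vector passed to the SVM for that image object.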

Relevance:

100.00%

Publisher:

Abstract:

The electrocardiogram (ECG) is an important bio-signal representing the sum total of millions of cardiac cell depolarization potentials. It contains important insight into the state of health of the heart and the nature of any disease afflicting it. Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. The HRV signal can be used as a base signal to observe the heart's functioning. These signals are non-linear and non-stationary in nature, so higher order spectral (HOS) analysis, which is more suitable for non-linear systems and is robust to noise, was used. An automated intelligent system for the identification of cardiac health is very useful in healthcare technology. In this work, we extracted seven features from the heart rate signals using HOS and fed them to a support vector machine (SVM) for classification. Our performance evaluation protocol uses 330 subjects covering five different kinds of cardiac disease conditions. We demonstrate a sensitivity of 90% for the classifier with a specificity of 87.93%. Our system is ready to run on larger data sets.
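For reference, sensitivity and specificity follow directly from confusion-matrix counts, as in the short sketch below. The counts are hypothetical and are not the study's results.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased subjects that the classifier correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of healthy subjects that the classifier correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
sens = sensitivity(true_pos=45, false_neg=5)
spec = specificity(true_neg=40, false_pos=10)
```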

Relevance:

100.00%

Publisher:

Abstract:

Because of the health impacts of exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction that combines a Support Vector Machine (SVM) as the predictor with Partial Least Squares (PLS) as a data selection tool, based on measured CO concentrations. The CO concentrations at the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, were used to test the effectiveness of this method. Hourly CO concentrations were predicted using the SVM and the hybrid PLS–SVM models. Similarly, daily CO concentrations were predicted based on the aforementioned four years of measured data. The results demonstrate that both models have good prediction ability; however, the hybrid PLS–SVM is more accurate. Statistical estimators, including relative mean errors, root mean squared errors, and the mean absolute relative error, were employed to compare the performance of the models. Errors decrease after size reduction, and coefficients of determination increase from 56–81% for the SVM model to 65–85% for the hybrid PLS–SVM model. The hybrid PLS–SVM model also required less computational time than the SVM model, as expected, supporting the more accurate and faster prediction ability of the hybrid PLS–SVM model.
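The error measures named above can be sketched as follows. These are the common textbook definitions and are an assumption here: the abstract does not give the exact formulas, and the coefficient of determination in particular is sometimes computed as a squared correlation instead. The concentration values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mean_absolute_relative_error(observed, predicted):
    """Mean of |observed - predicted| / |observed| (observed values must be non-zero)."""
    n = len(observed)
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / n


def coefficient_of_determination(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical CO concentrations, for illustration only.
observed = [2.0, 4.0, 6.0]
predicted = [2.0, 5.0, 6.0]
scores = (
    rmse(observed, predicted),
    mean_absolute_relative_error(observed, predicted),
    coefficient_of_determination(observed, predicted),
)
```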

Relevance:

100.00%

Publisher:

Abstract:

Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. HRV analysis is an important tool to observe the heart’s ability to respond to normal regulatory impulses that affect its rhythm. Like many bio-signals, HRV signals are non-linear in nature. Higher order spectral (HOS) analysis is known to be a good tool for the analysis of non-linear systems and provides good noise immunity. A computer-based arrhythmia detection system for cardiac states is very useful in diagnostics and disease management. In this work, we studied the identification of HRV signals using features derived from HOS. These features were fed to a support vector machine (SVM) for classification. Our proposed system can classify the normal class and four classes of arrhythmia with an average accuracy of more than 85%.

Relevance:

100.00%

Publisher:

Abstract:

Calls from 14 species of bat were classified to genus and species using discriminant function analysis (DFA), support vector machines (SVMs) and ensembles of neural networks (ENNs). Both SVMs and ENNs outperformed DFA for every species, while ENNs (mean identification rate 97%) consistently outperformed SVMs (mean identification rate 87%). Correct classification rates produced by the ENNs varied from 91% to 100%; calls from six species were correctly identified with 100% accuracy. Calls from the five species of Myotis, a genus whose species are considered difficult to distinguish acoustically, had correct identification rates that varied from 91% to 100%. Five parameters were most important for classifying calls correctly, while seven others contributed little to classification performance.

Relevance:

100.00%

Publisher:

Abstract:

This paper proposes a highly reliable fault diagnosis approach for low-speed bearings. The proposed approach first extracts wavelet-based fault features that represent diverse symptoms of multiple low-speed bearing defects. The most useful fault features for diagnosis are then selected by a genetic algorithm (GA)-based kernel discriminative feature analysis working together with one-against-all multicategory support vector machines (OAA MCSVMs). Finally, each support vector machine is individually trained with its own feature vector that includes the most discriminative fault features, offering the highest classification performance. In this study, the effectiveness of the proposed GA-based kernel discriminative feature analysis and the classification ability of individually trained OAA MCSVMs are assessed in terms of average classification accuracy. In addition, the proposed GA-based kernel discriminative feature analysis is compared with four other state-of-the-art feature analysis approaches. Experimental results indicate that the proposed approach is superior to other feature analysis methodologies, yielding an average classification accuracy of 98.06% and 94.49% under rotational speeds of 50 revolutions per minute (RPM) and 80 RPM, respectively. Furthermore, the individually trained MCSVMs with their own optimal fault features based on the proposed GA-based kernel discriminative feature analysis outperform the standard OAA MCSVMs, showing an average accuracy of 98.66% and 95.01% for bearings under rotational speeds of 50 RPM and 80 RPM, respectively.
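The one-against-all decision rule with a separate feature subset per classifier can be sketched as below. This is only an illustration of the structure: the linear scorers stand in for trained binary SVM decision functions, and the class labels, weights, and feature indices are all hypothetical.

```python
def make_linear_scorer(weights, bias, feature_indices):
    """Hypothetical stand-in for a trained binary SVM that sees only its own feature subset."""
    def score(sample):
        return sum(w * sample[i] for w, i in zip(weights, feature_indices)) + bias
    return score


def one_against_all_predict(sample, binary_scorers):
    """Return the label whose 'this class vs. rest' scorer gives the highest score."""
    return max(binary_scorers, key=lambda label: binary_scorers[label](sample))


# Hypothetical classes, weights, and feature subsets, for illustration only.
scorers = {
    "normal": make_linear_scorer([1.0, -1.0], 0.0, [0, 1]),
    "inner_race_fault": make_linear_scorer([2.0], -1.0, [2]),
    "outer_race_fault": make_linear_scorer([1.5, 0.5], -0.5, [1, 3]),
}
sample = [0.2, 0.1, 0.9, 0.3]
label = one_against_all_predict(sample, scorers)
```

Giving each binary classifier its own selected features, rather than one shared feature vector, is the difference between the individually trained variant and the standard OAA setup described in the abstract.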

Relevance:

100.00%

Publisher:

Abstract:

In this paper, downscaling models are developed using a support vector machine (SVM) for obtaining projections of monthly mean maximum and minimum temperatures (T-max and T-min) at river-basin scale. The effectiveness of the model is demonstrated by downscaling the predictands for the catchment of the Malaprabha reservoir in India, which is considered to be a climatically sensitive region. The probable predictor variables are extracted from (1) the National Centers for Environmental Prediction (NCEP) reanalysis dataset for the period 1978-2000, and (2) the simulations from the third-generation Canadian Coupled Global Climate Model (CGCM3) for emission scenarios A1B, A2, B1 and COMMIT for the period 1978-2100. The predictor variables are classified into three groups, namely A, B and C. Large-scale atmospheric variables such as air temperature and zonal and meridional wind velocities at 925 mb, which are often used for downscaling temperature, are considered as predictors in Group A. Surface flux variables such as latent heat (LH), sensible heat, shortwave radiation and longwave radiation fluxes, which control the temperature of the Earth's surface, are tried as plausible predictors in Group B. Group C comprises all the predictor variables in Groups A and B. Scatter plots and cross-correlations are used to verify the reliability of the simulation of the predictor variables by the CGCM3 and to study the predictor-predictand relationships. The impact of trends in the predictor variables on downscaled temperature was studied. The predictor air temperature at 925 mb showed an increasing trend, while the rest of the predictors showed no trend. The performance of the SVM models that were developed, one for each combination of predictor group, predictand, calibration period and location-based stratification (land, land and ocean) of climate variables, was evaluated. In general, using predictor variables pertaining to the land surface improved the performance of the SVM models for downscaling T-max and T-min.