348 resultados para SVM-RFE


Relevância:

10.00% 10.00%

Publicador:

Resumo:

The present research deals with an important public health threat, which is the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology as being closely associated to indoor radon. This association was indeed observed for the Swiss data but not probed to be the sole determinant for the spatial modeling. The statistical analysis of data, both at univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of Quan-tité Morisita Index (QMI) as a procedure to evaluate data clustering in function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-to-down resolution approach, from regional to local lev¬els in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with stationary conditions of geostatistical models. Common methods of spatial modeling such as Κ Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied for a par-ticular dataset. A bottom to top method complexity approach was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation was tested with the purpose of reducing noise at local scale (the CVMF). At the end of the chapter, a series of test for data consistency and methods robustness were performed. This lead to conclude about the importance of data splitting and the limitation of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the indoor radon pollution data uncertainty into consideration. The catego-rization transform was presented as a solution for extreme values modeling through clas-sification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. The sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Within the classification methods, probabilistic neural networks (PNN) show to be better adapted for modeling of high threshold categorization and for automation. Support vector machines (SVM) on the contrary performed well under balanced category conditions. In general, it was concluded that a particular prediction or estimation method is not better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to accomplish an efficient indoor radon decision making.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Although cross-sectional diffusion tensor imaging (DTI) studies revealed significant white matter changes in mild cognitive impairment (MCI), the utility of this technique in predicting further cognitive decline is debated. Thirty-five healthy controls (HC) and 67 MCI subjects with DTI baseline data were neuropsychologically assessed at one year. Among them, there were 40 stable (sMCI; 9 single domain amnestic, 7 single domain frontal, 24 multiple domain) and 27 were progressive (pMCI; 7 single domain amnestic, 4 single domain frontal, 16 multiple domain). Fractional anisotropy (FA) and longitudinal, radial, and mean diffusivity were measured using Tract-Based Spatial Statistics. Statistics included group comparisons and individual classification of MCI cases using support vector machines (SVM). FA was significantly higher in HC compared to MCI in a distributed network including the ventral part of the corpus callosum, right temporal and frontal pathways. There were no significant group-level differences between sMCI versus pMCI or between MCI subtypes after correction for multiple comparisons. However, SVM analysis allowed for an individual classification with accuracies up to 91.4% (HC versus MCI) and 98.4% (sMCI versus pMCI). When considering the MCI subgroups separately, the minimum SVM classification accuracy for stable versus progressive cognitive decline was 97.5% in the multiple domain MCI group. SVM analysis of DTI data provided highly accurate individual classification of stable versus progressive MCI regardless of MCI subtype, indicating that this method may become an easily applicable tool for early individual detection of MCI subjects evolving to dementia.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Due to the advances in sensor networks and remote sensing technologies, the acquisition and storage rates of meteorological and climatological data increases every day and ask for novel and efficient processing algorithms. A fundamental problem of data analysis and modeling is the spatial prediction of meteorological variables in complex orography, which serves among others to extended climatological analyses, for the assimilation of data into numerical weather prediction models, for preparing inputs to hydrological models and for real time monitoring and short-term forecasting of weather.In this thesis, a new framework for spatial estimation is proposed by taking advantage of a class of algorithms emerging from the statistical learning theory. Nonparametric kernel-based methods for nonlinear data classification, regression and target detection, known as support vector machines (SVM), are adapted for mapping of meteorological variables in complex orography.With the advent of high resolution digital elevation models, the field of spatial prediction met new horizons. In fact, by exploiting image processing tools along with physical heuristics, an incredible number of terrain features which account for the topographic conditions at multiple spatial scales can be extracted. Such features are highly relevant for the mapping of meteorological variables because they control a considerable part of the spatial variability of meteorological fields in the complex Alpine orography. For instance, patterns of orographic rainfall, wind speed and cold air pools are known to be correlated with particular terrain forms, e.g. convex/concave surfaces and upwind sides of mountain slopes.Kernel-based methods are employed to learn the nonlinear statistical dependence which links the multidimensional space of geographical and topographic explanatory variables to the variable of interest, that is the wind speed as measured at the weather stations or the occurrence of orographic rainfall patterns as extracted from sequences of radar images. Compared to low dimensional models integrating only the geographical coordinates, the proposed framework opens a way to regionalize meteorological variables which are multidimensional in nature and rarely show spatial auto-correlation in the original space making the use of classical geostatistics tangled.The challenges which are explored during the thesis are manifolds. First, the complexity of models is optimized to impose appropriate smoothness properties and reduce the impact of noisy measurements. Secondly, a multiple kernel extension of SVM is considered to select the multiscale features which explain most of the spatial variability of wind speed. Then, SVM target detection methods are implemented to describe the orographic conditions which cause persistent and stationary rainfall patterns. Finally, the optimal splitting of the data is studied to estimate realistic performances and confidence intervals characterizing the uncertainty of predictions.The resulting maps of average wind speeds find applications within renewable resources assessment and opens a route to decrease the temporal scale of analysis to meet hydrological requirements. Furthermore, the maps depicting the susceptibility to orographic rainfall enhancement can be used to improve current radar-based quantitative precipitation estimation and forecasting systems and to generate stochastic ensembles of precipitation fields conditioned upon the orography.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this work we present a simulation of a recognition process with perimeter characterization of a simple plant leaves as a unique discriminating parameter. Data coding allowing for independence of leaves size and orientation may penalize performance recognition for some varieties. Border description sequences are then used, and Principal Component Analysis (PCA) is applied in order to study which is the best number of components for the classification task, implemented by means of a Support Vector Machine (SVM) System. Obtained results are satisfactory, and compared with [4] our system improves the recognition success, diminishing the variance at the same time.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

We investigate the relevance of morphological operators for the classification of land use in urban scenes using submetric panchromatic imagery. A support vector machine is used for the classification. Six types of filters have been employed: opening and closing, opening and closing by reconstruction, and opening and closing top hat. The type and scale of the filters are discussed, and a feature selection algorithm called recursive feature elimination is applied to decrease the dimensionality of the input data. The analysis performed on two QuickBird panchromatic images showed that simple opening and closing operators are the most relevant for classification at such a high spatial resolution. Moreover, mixed sets combining simple and reconstruction filters provided the best performance. Tests performed on both images, having areas characterized by different architectural styles, yielded similar results for both feature selection and classification accuracy, suggesting the generalization of the feature sets highlighted.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Spatial data analysis mapping and visualization is of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations, methods of geostatistics: the family of kriging estimators (Deutsch and Journel, 1997), machine learning algorithms such as artificial neural networks (ANN) of different architectures, hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated/corrupted by noise, and often with noise of unknown nature. That's one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as the realization of some spatial randomn process. To obtain the estimation with kriging one has to model the spatial structure of the data: spatial correlation function or (semi-)variogram. This task can be complicated if there is not sufficient number of measurements and variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from the number of reasons. of a special type ? multiplayer perceptrons ? are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignank, 2004). Therefore, development and adaptation of the method that would be nonlinear and robust to noise in measurements, would deal with the small empirical datasets and which has solid mathematical background is of great importance. The present paper deals with such model, based on Statistical Learning Theory (SLT) - Support Vector Regression. SLT is a general mathematical framework devoted to the problem of estimation of the dependencies from empirical data (Hastie et al, 2004; Vapnik, 1998). SLT models for classification - Support Vector Machines - have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al, 2002). The properties of SVM for regression - Support Vector Regression (SVR) are less studied. First results of the application of SVR for spatial mapping of physical quantities were obtained by the authorsin for mapping of medium porosity (Kanevski et al, 1999), and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of SVR model for spatial data analysis and mapping. Detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996) and basic equations for the nonlinear modeling are given in section 2. Section 3 discusses the application of SVR for spatial data mapping on the real case study - soil pollution by Cs137 radionuclide. Section 4 discusses the properties of the modelapplied to noised data or data with outliers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Raman spectroscopy combined with chemometrics has recently become a widespread technique for the analysis of pharmaceutical solid forms. The application presented in this paper is the investigation of counterfeit medicines. This increasingly serious issue involves networks that are an integral part of industrialized organized crime. Efficient analytical tools are consequently required to fight against it. Quick and reliable authentication means are needed to allow the deployment of measures from the company and the authorities. For this purpose a method in two steps has been implemented here. The first step enables the identification of pharmaceutical tablets and capsules and the detection of their counterfeits. A nonlinear classification method, the Support Vector Machines (SVM), is computed together with a correlation with the database and the detection of Active Pharmaceutical Ingredient (API) peaks in the suspect product. If a counterfeit is detected, the second step allows its chemical profiling among former counterfeits in a forensic intelligence perspective. For this second step a classification based on Principal Component Analysis (PCA) and correlation distance measurements is applied to the Raman spectra of the counterfeits.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The decision-making process regarding drug dose, regularly used in everyday medical practice, is critical to patients' health and recovery. It is a challenging process, especially for a drug with narrow therapeutic ranges, in which a medical doctor decides the quantity (dose amount) and frequency (dose interval) on the basis of a set of available patient features and doctor's clinical experience (a priori adaptation). Computer support in drug dose administration makes the prescription procedure faster, more accurate, objective, and less expensive, with a tendency to reduce the number of invasive procedures. This paper presents an advanced integrated Drug Administration Decision Support System (DADSS) to help clinicians/patients with the dose computing. Based on a support vector machine (SVM) algorithm, enhanced with the random sample consensus technique, this system is able to predict the drug concentration values and computes the ideal dose amount and dose interval for a new patient. With an extension to combine the SVM method and the explicit analytical model, the advanced integrated DADSS system is able to compute drug concentration-to-time curves for a patient under different conditions. A feedback loop is enabled to update the curve with a new measured concentration value to make it more personalized (a posteriori adaptation).

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El principal objectiu d’aquest projecte és aconseguir classificar diferents vídeos d’esports segons la seva categoria. Els cercadors de text creen un vocabulari segons el significat de les diferents paraules per tal de poder identificar un document. En aquest projecte es va fer el mateix però mitjançant paraules visuals. Per exemple, es van intentar englobar com a una única paraula les diferents rodes que apareixien en els cotxes de rally. A partir de la freqüència amb què apareixien les paraules dels diferents grups dins d’una imatge vàrem crear histogrames de vocabulari que ens permetien tenir una descripció de la imatge. Per classificar un vídeo es van utilitzar els histogrames que descrivien els seus fotogrames. Com que cada histograma es podia considerar un vector de valors enters vàrem optar per utilitzar una màquina classificadora de vectors: una Support vector machine o SVM

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Resumo: O objetivo deste trabalho foi avaliar o desempenho dos classificadores digitais SVM e K-NN para a classificação orientada a objeto em imagens Landsat-8, aplicados ao mapeamento de uso e cobertura do solo da Alta Bacia do Rio Piracicaba-Jaguari, MG. A etapa de pré-processamento contou com a conversão radiométrica e a minimização dos efeitos atmosféricos. Em seguida, foi feita a fusão das bandas multiespectrais (30 m) com a banda pancromática (15 m). Com base em composições RGB e inspeções de campo, definiram-se 15 classes de uso e cobertura do solo. Para a segmentação de bordas, aplicaram-se os limiares 10 e 60 para as configurações de segmentação e união no aplicativo ENVI. A classificação foi feita usando SVM e K-NN. Ambos os classificadores apresentaram elevados valores de índice Kappa (k): 0,92 para SVM e 0,86 para K-NN, significativamente diferentes entre si a 95% de probabilidade. Uma significativa melhoria foi observada para SVM, na classificação correta de diferentes tipologias florestais. A classificação orientada a objetos é amplamente aplicada em imagens de alta resolução espacial; no entanto, os resultados obtidos no presente trabalho mostram a robustez do método também para imagens de média resolução espacial.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The main objective of this study was todo a statistical analysis of ecological type from optical satellite data, using Tipping's sparse Bayesian algorithm. This thesis uses "the Relevence Vector Machine" algorithm in ecological classification betweenforestland and wetland. Further this bi-classification technique was used to do classification of many other different species of trees and produces hierarchical classification of entire subclasses given as a target class. Also, we carried out an attempt to use airborne image of same forest area. Combining it with image analysis, using different image processing operation, we tried to extract good features and later used them to perform classification of forestland and wetland.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper presents a novel image classification scheme for benthic coral reef images that can be applied to both single image and composite mosaic datasets. The proposed method can be configured to the characteristics (e.g., the size of the dataset, number of classes, resolution of the samples, color information availability, class types, etc.) of individual datasets. The proposed method uses completed local binary pattern (CLBP), grey level co-occurrence matrix (GLCM), Gabor filter response, and opponent angle and hue channel color histograms as feature descriptors. For classification, either k-nearest neighbor (KNN), neural network (NN), support vector machine (SVM) or probability density weighted mean distance (PDWMD) is used. The combination of features and classifiers that attains the best results is presented together with the guidelines for selection. The accuracy and efficiency of our proposed method are compared with other state-of-the-art techniques using three benthic and three texture datasets. The proposed method achieves the highest overall classification accuracy of any of the tested methods and has moderate execution time. Finally, the proposed classification scheme is applied to a large-scale image mosaic of the Red Sea to create a completely classified thematic map of the reef benthos

Relevância:

10.00% 10.00%

Publicador:

Resumo:

OBJECTIVES: The aim of this study was to investigate pathological mechanisms underlying brain tissue alterations in mild cognitive impairment (MCI) using multi-contrast 3 T magnetic resonance imaging (MRI). METHODS: Forty-two MCI patients and 77 healthy controls (HC) underwent T1/T2* relaxometry as well as Magnetization Transfer (MT) MRI. Between-groups comparisons in MRI metrics were performed using permutation-based tests. Using MRI data, a generalized linear model (GLM) was computed to predict clinical performance and a support-vector machine (SVM) classification was used to classify MCI and HC subjects. RESULTS: Multi-parametric MRI data showed microstructural brain alterations in MCI patients vs HC that might be interpreted as: (i) a broad loss of myelin/cellular proteins and tissue microstructure in the hippocampus (p ≤ 0.01) and global white matter (p < 0.05); and (ii) iron accumulation in the pallidus nucleus (p ≤ 0.05). MRI metrics accurately predicted memory and executive performances in patients (p ≤ 0.005). SVM classification reached an accuracy of 75% to separate MCI and HC, and performed best using both volumes and T1/T2*/MT metrics. CONCLUSION: Multi-contrast MRI appears to be a promising approach to infer pathophysiological mechanisms leading to brain tissue alterations in MCI. Likewise, parametric MRI data provide powerful correlates of cognitive deficits and improve automatic disease classification based on morphometric features.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this present work, we are proposing a characteristics reduction system for a facial biometric identification system, using transformed domains such as discrete cosine transformed (DCT) and discrete wavelets transformed (DWT) as parameterization; and Support Vector Machines (SVM) and Neural Network (NN) as classifiers. The size reduction has been done with Principal Component Analysis (PCA) and with Independent Component Analysis (ICA). This system presents a similar success results for both DWT-SVM system and DWT-PCA-SVM system, about 98%. The computational load is improved on training mode due to the decreasing of input’s size and less complexity of the classifier.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Abstract This work studies the multi-label classification of turns in simple English Wikipedia talk pages into dialog acts. The treated dataset was created and multi-labeled by (Ferschke et al., 2012). The first part analyses dependences between labels, in order to examine the annotation coherence and to determine a classification method. Then, a multi-label classification is computed, after transforming the problem into binary relevance. Regarding features, whereas (Ferschke et al., 2012) use features such as uni-, bi-, and trigrams, time distance between turns or the indentation level of the turn, other features are considered here: lemmas, part-of-speech tags and the meaning of verbs (according to WordNet). The dataset authors applied approaches such as Naive Bayes or Support Vector Machines. The present paper proposes, as an alternative, to use Schoenberg transformations which, following the example of kernel methods, transform original Euclidean distances into other Euclidean distances, in a space of high dimensionality. Résumé Ce travail étudie la classification supervisée multi-étiquette en actes de dialogue des tours de parole des contributeurs aux pages de discussion de Simple English Wikipedia (Wikipédia en anglais simple). Le jeu de données considéré a été créé et multi-étiqueté par (Ferschke et al., 2012). Une première partie analyse les relations entre les étiquettes pour examiner la cohérence des annotations et pour déterminer une méthode de classification. Ensuite, une classification supervisée multi-étiquette est effectuée, après recodage binaire des étiquettes. Concernant les variables, alors que (Ferschke et al., 2012) utilisent des caractéristiques telles que les uni-, bi- et trigrammes, le temps entre les tours de parole ou l'indentation d'un tour de parole, d'autres descripteurs sont considérés ici : les lemmes, les catégories morphosyntaxiques et le sens des verbes (selon WordNet). Les auteurs du jeu de données ont employé des approches telles que le Naive Bayes ou les Séparateurs à Vastes Marges (SVM) pour la classification. Cet article propose, de façon alternative, d'utiliser et d'étendre l'analyse discriminante linéaire aux transformations de Schoenberg qui, à l'instar des méthodes à noyau, transforment les distances euclidiennes originales en d'autres distances euclidiennes, dans un espace de haute dimensionnalité.