873 results for Support Vector Machines and Naive Bayes Classifier
Abstract:
In recent years there has been an explosive growth in the development of adaptive and data-driven methods. One efficient data-driven approach is based on statistical learning theory (SLT) (Vapnik 1998). The theory is based on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we try not only to reduce the training error, that is, to fit the available data with a model, but also to reduce the complexity of the model and to reduce the generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM are finding new areas of application (www.kernel-machines.org). SVM produce robust, nonlinear data models with excellent generalisation abilities, which is very important for both monitoring and forecasting. SVM are extremely good when the input space is high dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. This opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
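As a reminder of how the trade-off between training error and model complexity is usually expressed, the textbook soft-margin SVM training problem can be written as below. This is the standard formulation, not one quoted from the abstract; the constant C (chosen by the user) and the slack variables ξ_i are the usual notation and are assumptions here.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1 - \xi_i,\qquad \xi_i \ge 0,\quad i = 1,\dots,n.
```

The first term penalises model complexity and the second penalises training error. In the solution, only the training points with non-zero Lagrange multipliers (the support vectors) enter the decision function.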
Abstract:
This article presents an experimental study of the classification ability of several classifiers for the multi-class classification of cannabis seedlings. As the cultivation of drug type cannabis is forbidden in Switzerland, law enforcement authorities regularly ask forensic laboratories to determine the chemotype of a seized cannabis plant and then to conclude whether the plantation is legal or not. This classification is mainly performed when the plant is mature, as required by the EU official protocol, and the classification of cannabis seedlings is therefore a time consuming and costly procedure. A previous study by the authors investigated this problem [1] and showed that it is possible to differentiate between drug type (illegal) and fibre type (legal) cannabis at an early stage of growth using gas chromatography interfaced with mass spectrometry (GC-MS), based on the relative proportions of eight major leaf compounds. The aims of the present work are, on the one hand, to continue the former work and optimize the methodology for the discrimination of drug- and fibre type cannabis developed in the previous study and, on the other hand, to investigate the possibility of predicting illegal cannabis varieties. Seven classifiers for differentiating between cannabis seedlings are evaluated in this paper, namely Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Nearest Neighbour Classification (NNC), Learning Vector Quantization (LVQ), Radial Basis Function Support Vector Machines (RBF SVMs), Random Forest (RF) and Artificial Neural Networks (ANN). The performance of each method was assessed using the same analytical dataset, which consists of 861 samples split into drug- and fibre type cannabis, with drug type cannabis being made up of 12 varieties (i.e. 12 classes). The results show that linear classifiers are not able to manage the distribution of classes in which some overlap areas exist for both classification problems. Unlike linear classifiers, NNC and RBF SVMs best differentiate cannabis samples both for 2-class and 12-class classifications, with average classification results up to 99% and 98%, respectively. Furthermore, RBF SVMs correctly classified as drug type cannabis the independent validation set, which consists of cannabis plants coming from police seizures. For forensic casework, this study shows that discrimination between cannabis samples at an early stage of growth is possible, with fairly high classification performance for discriminating between cannabis chemotypes or between drug type cannabis varieties.
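To make the nearest-neighbour idea in the abstract above concrete, here is a minimal sketch of 1-nearest-neighbour classification on relative compound proportions. The peak areas, the use of three compounds instead of eight, and the function names are all hypothetical; this is not the authors' pipeline.

```python
import math

def to_proportions(peak_areas):
    """Convert raw peak areas to relative proportions that sum to 1."""
    total = sum(peak_areas)
    if total <= 0:
        raise ValueError("peak areas must have a positive sum")
    return [a / total for a in peak_areas]

def nearest_neighbour(train, query):
    """Return the label of the training sample closest to `query` (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical peak areas for three compounds (the study used eight).
training_set = [
    (to_proportions([80.0, 15.0, 5.0]), "drug"),
    (to_proportions([70.0, 20.0, 10.0]), "drug"),
    (to_proportions([10.0, 30.0, 60.0]), "fibre"),
    (to_proportions([5.0, 25.0, 70.0]), "fibre"),
]

# Classify a new, equally hypothetical seedling profile.
predicted = nearest_neighbour(training_set, to_proportions([150.0, 40.0, 10.0]))
```

Working with proportions rather than raw areas removes the dependence on the total amount of material injected, which is why the sketch normalises each sample before computing distances.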
Abstract:
Multisensory and sensorimotor integrations are usually considered to occur in superior colliculus and cerebral cortex, but few studies proposed the thalamus as being involved in these integrative processes. We investigated whether the organization of the thalamocortical (TC) systems for different modalities partly overlap, representing an anatomical support for multisensory and sensorimotor interplay in thalamus. In 2 macaque monkeys, 6 neuroanatomical tracers were injected in the rostral and caudal auditory cortex, posterior parietal cortex (PE/PEa in area 5), and dorsal and ventral premotor cortical areas (PMd, PMv), demonstrating the existence of overlapping territories of thalamic projections to areas of different modalities (sensory and motor). TC projections, distinct from the ones arising from specific unimodal sensory nuclei, were observed from motor thalamus to PE/PEa or auditory cortex and from sensory thalamus to PMd/PMv. The central lateral nucleus and the mediodorsal nucleus project to all injected areas, but the most significant overlap across modalities was found in the medial pulvinar nucleus. The present results demonstrate the presence of thalamic territories integrating different sensory modalities with motor attributes. Based on the divergent/convergent pattern of TC and corticothalamic projections, 4 distinct mechanisms of multisensory and sensorimotor interplay are proposed.
Abstract:
Avalanche forecasting is a complex process involving the assimilation of multiple data sources to make predictions over varying spatial and temporal resolutions. Numerically assisted forecasting often uses nearest neighbour methods (NN), which are known to have limitations when dealing with high dimensional data. We apply Support Vector Machines to a dataset from Lochaber, Scotland to assess their applicability in avalanche forecasting. Support Vector Machines (SVMs) belong to a family of theoretically based techniques from machine learning and are designed to deal with high dimensional data. Initial experiments showed that SVMs gave results which were comparable with NN for categorical and probabilistic forecasts. Experiments utilising the ability of SVMs to deal with high dimensionality in producing a spatial forecast show promise, but require further work.
Abstract:
Due to advances in sensor networks and remote sensing technologies, the acquisition and storage rates of meteorological and climatological data increase every day and call for novel and efficient processing algorithms. A fundamental problem of data analysis and modeling is the spatial prediction of meteorological variables in complex orography, which serves, among other purposes, to extend climatological analyses, to assimilate data into numerical weather prediction models, to prepare inputs to hydrological models, and to support real time monitoring and short-term forecasting of weather. In this thesis, a new framework for spatial estimation is proposed by taking advantage of a class of algorithms emerging from statistical learning theory. Nonparametric kernel-based methods for nonlinear data classification, regression and target detection, known as support vector machines (SVM), are adapted for the mapping of meteorological variables in complex orography. With the advent of high resolution digital elevation models, the field of spatial prediction met new horizons. In fact, by exploiting image processing tools along with physical heuristics, a very large number of terrain features which account for the topographic conditions at multiple spatial scales can be extracted. Such features are highly relevant for the mapping of meteorological variables because they control a considerable part of the spatial variability of meteorological fields in the complex Alpine orography. For instance, patterns of orographic rainfall, wind speed and cold air pools are known to be correlated with particular terrain forms, e.g. convex/concave surfaces and upwind sides of mountain slopes. Kernel-based methods are employed to learn the nonlinear statistical dependence which links the multidimensional space of geographical and topographic explanatory variables to the variable of interest, that is, the wind speed as measured at the weather stations or the occurrence of orographic rainfall patterns as extracted from sequences of radar images. Compared to low dimensional models integrating only the geographical coordinates, the proposed framework opens a way to regionalize meteorological variables which are multidimensional in nature and rarely show spatial auto-correlation in the original space, making the use of classical geostatistics difficult. The challenges explored in the thesis are manifold. First, the complexity of models is optimized to impose appropriate smoothness properties and reduce the impact of noisy measurements. Secondly, a multiple kernel extension of SVM is considered to select the multiscale features which explain most of the spatial variability of wind speed. Then, SVM target detection methods are implemented to describe the orographic conditions which cause persistent and stationary rainfall patterns. Finally, the optimal splitting of the data is studied to estimate realistic performances and confidence intervals characterizing the uncertainty of predictions. The resulting maps of average wind speeds find applications within renewable resources assessment and open a route to decreasing the temporal scale of analysis to meet hydrological requirements. Furthermore, the maps depicting the susceptibility to orographic rainfall enhancement can be used to improve current radar-based quantitative precipitation estimation and forecasting systems and to generate stochastic ensembles of precipitation fields conditioned upon the orography.
Abstract:
This letter presents advanced classification methods for very high resolution images. Efficient multisource information, both spectral and spatial, is exploited through the use of composite kernels in support vector machines. Weighted summations of kernels accounting for separate sources of spectral and spatial information are analyzed and compared to classical approaches such as pure spectral classification or stacked approaches using all the features in a single vector. Model selection problems are addressed, as well as the importance of the different kernels in the weighted summation.
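The weighted summation of kernels described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the letter's implementation: the RBF form of both kernels, the weight name `mu`, the `gamma` defaults and the pixel values are all hypothetical.

```python
import math

def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def composite_kernel(sample_a, sample_b, mu, gamma_spectral=1.0, gamma_spatial=1.0):
    """Weighted sum of a spectral kernel and a spatial kernel, with 0 <= mu <= 1.

    Each sample is a pair (spectral_features, spatial_features).
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    spec_a, spat_a = sample_a
    spec_b, spat_b = sample_b
    k_spectral = rbf_kernel(spec_a, spec_b, gamma_spectral)
    k_spatial = rbf_kernel(spat_a, spat_b, gamma_spatial)
    return mu * k_spectral + (1.0 - mu) * k_spatial

# Hypothetical pixels: (spectral bands, spatial features).
pixel_1 = ([0.2, 0.4, 0.1], [0.5, 0.3])
pixel_2 = ([0.3, 0.1, 0.2], [0.4, 0.9])
k_12 = composite_kernel(pixel_1, pixel_2, mu=0.6)
```

Because a non-negative weighted sum of valid (Mercer) kernels is itself a valid kernel, the combined function can be passed to any kernel machine, and the weight `mu` indicates how much each information source contributes.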
Abstract:
The present study deals with the analysis and mapping of Swiss franc interest rates. Interest rates depend on time and maturity, which together define the term structure of the interest rate curves (IRC). In the present study IRC are considered in a two-dimensional feature space of time and maturity. Exploratory data analysis includes a variety of tools widely used in econophysics and geostatistics. Geostatistical models and machine learning algorithms (multilayer perceptron and Support Vector Machines) were applied to produce interest rate maps. IR maps can be used for visualisation and pattern perception purposes, to develop and explore economic hypotheses, to produce dynamic asset-liability simulations and for financial risk assessments. The feasibility of applying the interest rate mapping approach to IRC forecasting is considered as well. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
In this paper, mixed spectral-structural kernel machines are proposed for the classification of very-high resolution images. The simultaneous use of multispectral and structural features (computed using morphological filters) allows a significant increase in the classification accuracy of remote sensing images. Subsequently, weighted summation kernel support vector machines are proposed and applied in order to take into account the multiscale nature of the scene considered. Such classifiers use the Mercer property of kernel matrices to compute a new kernel matrix accounting simultaneously for two scale parameters. Tests on a Zurich QuickBird image show the relevance of the proposed method: using the mixed spectral-structural features, the classification accuracy increases by about 5%, achieving a Kappa index of 0.97. The proposed multikernel approach provides an overall accuracy of 98.90% with a related Kappa index of 0.985.
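For readers unfamiliar with the two figures of merit quoted in the abstract above, the sketch below computes overall accuracy and Cohen's kappa from a confusion matrix. The function name and the example matrix are hypothetical and unrelated to the QuickBird experiment.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix.
example = [[45, 5],
           [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Kappa discounts the agreement expected by chance, so it is always at most equal to the overall accuracy and is the more conservative of the two numbers.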
Abstract:
The quality of environmental data analysis and the propagation of errors are heavily affected by the representativity of the initial sampling design [CRE 93, DEU 97, KAN 04a, LEN 06, MUL 07]. Geostatistical methods such as kriging are related to field samples, whose spatial distribution is crucial for the correct detection of the phenomena. Literature about the design of environmental monitoring networks (MN) is widespread and several interesting books have recently been published [GRU 06, LEN 06, MUL 07] in order to clarify the basic principles of spatial sampling design. An approach to spatial sampling design (monitoring network optimization) based on Support Vector Machines has also been proposed. Nonetheless, modelers often receive real data coming from environmental monitoring networks that suffer from problems of non-homogeneity (clustering). Clustering can be related to preferential sampling or to the impossibility of reaching certain regions.
Abstract:
This thesis presents an alternative approach to the analytical design of surface-mounted axial-flux permanent-magnet machines. Emphasis has been placed on the design of axial-flux machines with a one-rotor-two-stators configuration. The design model developed in this study covers both the electromagnetic design and the thermal design of the machine and takes into consideration the complexity of the permanent-magnet shapes, which is a typical requirement for the design of high-performance permanent-magnet motors. A prototype machine with a rated output power of 5 kW at a rotation speed of 300 min⁻¹ has been designed and constructed in order to verify the results obtained from the analytical design model. A comparative study of low-speed axial-flux and low-speed radial-flux permanent-magnet machines is presented. The comparative study concentrates on 55 kW machines with rotation speeds of 150 min⁻¹, 300 min⁻¹ and 600 min⁻¹ and is based on calculated designs. A novel comparison method is introduced. The method takes into account the mechanical constraints of the machine and enables comparison of the designed machines with respect to the volume, efficiency and cost aspects of each machine. It is shown that an axial-flux permanent-magnet machine with a one-rotor-two-stators configuration generally has a lower efficiency than a radial-flux permanent-magnet machine if the same electric loading, air-gap flux density and current density are applied to all designs. On the other hand, axial-flux machines are usually smaller in volume, especially when compared to radial-flux machines for which the length ratio (axial length of stator stack vs. air-gap diameter) is below 0.5. The comparison results also show that radial-flux machines with a low number of pole pairs, p < 4, outperform the corresponding axial-flux machines.
Abstract:
Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared (NIR) spectroscopy. Excellent models were built using LS-SVM, as assessed by their R², RMSECV and RMSEP values. LS-SVM shows superior performance for quantifying starch, whey and sucrose in powdered milk samples compared with PLSR. This study shows that it is possible to determine precisely the amount of one or two common adulterants simultaneously in powdered milk samples using LS-SVM and NIR spectra.
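The figures of merit named in the abstract above share simple definitions. The sketch below shows how a root mean square error and R² could be computed. RMSECV and RMSEP use the same formula, applied to cross-validation predictions and to an independent prediction set respectively. The concentration values are made up for illustration.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot

# Hypothetical adulterant concentrations (reference method vs. model prediction).
reference_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.1, 1.9, 3.2, 3.8]

error = rmse(reference_values, predicted_values)
fit = r_squared(reference_values, predicted_values)
```

A low RMSEP on samples not used for calibration is the more demanding test of the two error measures, since RMSECV is still computed on the calibration set.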
Abstract:
Multivariate models were developed using Artificial Neural Networks (ANN) and Least Squares Support Vector Machines (LS-SVM) for estimating the lignin syringyl/guaiacyl ratio and the contents of cellulose, hemicelluloses and lignin in eucalyptus wood by pyrolysis coupled to gas chromatography and mass spectrometry (Py-GC/MS). The results obtained by the two calibration methods were in agreement with those of the reference methods. However, a comparison indicated that the LS-SVM model had better predictive capacity for the cellulose and lignin contents, while the ANN model was more adequate for estimating the hemicelluloses content and the lignin syringyl/guaiacyl ratio.
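For context on the LS-SVM calibration method named in the abstract above, the standard LS-SVM regression formulation replaces the quadratic programme of a classical SVM with a single linear system. The notation below is the usual textbook one and an assumption here, not taken from the abstract: γ is the regularisation parameter, K the chosen kernel, α the dual coefficients and b the bias.

```latex
\begin{bmatrix} 0 & \mathbf{1}^{\top} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix}
\begin{bmatrix} b \\ \alpha \end{bmatrix}
=
\begin{bmatrix} 0 \\ y \end{bmatrix},
\qquad \Omega_{ij} = K(x_i, x_j),
\qquad \hat{y}(x) = \sum_{i=1}^{n} \alpha_i K(x, x_i) + b .
```

Solving this system once gives the whole model, which is part of why LS-SVM is convenient for calibration problems with a modest number of samples.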
Abstract:
The influence of metal loading and support surface functional groups (SFG) on methane dry reforming (MDR) over Ni catalysts supported on pine-sawdust derived activated carbon was studied. Using pine sawdust as the catalyst support precursor, the smallest variety and lowest concentration of SFG led to the best Ni dispersion and the highest catalytic activity, which increased with Ni loading up to 3 Ni atoms nm⁻². At higher Ni loading, the formation of large metal aggregates was observed, consistent with a lower "apparent" surface area and a decrease in catalytic activity. The H2/CO ratio rose with increasing reaction temperature, indicating that increasingly important side reactions were taking place in addition to MDR.
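The reasoning behind the last sentence of the abstract above can be made explicit with the overall MDR reaction. This is the standard stoichiometry, not quoted from the abstract. On its own it yields hydrogen and carbon monoxide in equal molar amounts:

```latex
\mathrm{CH_4} + \mathrm{CO_2} \;\longrightarrow\; 2\,\mathrm{CO} + 2\,\mathrm{H_2},
\qquad \frac{n_{\mathrm{H_2}}}{n_{\mathrm{CO}}} = 1 .
```

A measured H2/CO ratio that shifts with temperature therefore points to additional reactions that produce or consume H2 or CO. The abstract does not say which ones.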