118 resultados para Support Vector Machines and Naive Bayes Classifier


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates the use of ensemble of predictors in order to improve the performance of spatial prediction methods. Support vector regression (SVR), a popular method from the field of statistical machine learning, is used. Several instances of SVR are combined using different data sampling schemes (bagging and boosting). Bagging shows good performance, and proves to be more computationally efficient than training a single SVR model while reducing error. Boosting, however, does not improve results on this specific problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The book presents the state of the art in machine learning algorithms (artificial neural networks of different architectures, support vector machines, etc.) as applied to the classification and mapping of spatially distributed environmental data. Basic geostatistical algorithms are presented as well. New trends in machine learning and their application to spatial data are given, and real case studies based on environmental and pollution data are carried out. The book provides a CD-ROM with the Machine Learning Office software, including sample sets of data, that will allow both students and researchers to put the concepts rapidly to practice.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Spatial data analysis mapping and visualization is of great importance in various fields: environment, pollution, natural hazards and risks, epidemiology, spatial econometrics, etc. A basic task of spatial mapping is to make predictions based on some empirical data (measurements). A number of state-of-the-art methods can be used for the task: deterministic interpolations, methods of geostatistics: the family of kriging estimators (Deutsch and Journel, 1997), machine learning algorithms such as artificial neural networks (ANN) of different architectures, hybrid ANN-geostatistics models (Kanevski and Maignan, 2004; Kanevski et al., 1996), etc. All the methods mentioned above can be used for solving the problem of spatial data mapping. Environmental empirical data are always contaminated/corrupted by noise, and often with noise of unknown nature. That's one of the reasons why deterministic models can be inconsistent, since they treat the measurements as values of some unknown function that should be interpolated. Kriging estimators treat the measurements as the realization of some spatial randomn process. To obtain the estimation with kriging one has to model the spatial structure of the data: spatial correlation function or (semi-)variogram. This task can be complicated if there is not sufficient number of measurements and variogram is sensitive to outliers and extremes. ANN is a powerful tool, but it also suffers from the number of reasons. of a special type ? multiplayer perceptrons ? are often used as a detrending tool in hybrid (ANN+geostatistics) models (Kanevski and Maignank, 2004). Therefore, development and adaptation of the method that would be nonlinear and robust to noise in measurements, would deal with the small empirical datasets and which has solid mathematical background is of great importance. The present paper deals with such model, based on Statistical Learning Theory (SLT) - Support Vector Regression. SLT is a general mathematical framework devoted to the problem of estimation of the dependencies from empirical data (Hastie et al, 2004; Vapnik, 1998). SLT models for classification - Support Vector Machines - have shown good results on different machine learning tasks. The results of SVM classification of spatial data are also promising (Kanevski et al, 2002). The properties of SVM for regression - Support Vector Regression (SVR) are less studied. First results of the application of SVR for spatial mapping of physical quantities were obtained by the authorsin for mapping of medium porosity (Kanevski et al, 1999), and for mapping of radioactively contaminated territories (Kanevski and Canu, 2000). The present paper is devoted to further understanding of the properties of SVR model for spatial data analysis and mapping. Detailed description of the SVR theory can be found in (Cristianini and Shawe-Taylor, 2000; Smola, 1996) and basic equations for the nonlinear modeling are given in section 2. Section 3 discusses the application of SVR for spatial data mapping on the real case study - soil pollution by Cs137 radionuclide. Section 4 discusses the properties of the modelapplied to noised data or data with outliers.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Raman spectroscopy combined with chemometrics has recently become a widespread technique for the analysis of pharmaceutical solid forms. The application presented in this paper is the investigation of counterfeit medicines. This increasingly serious issue involves networks that are an integral part of industrialized organized crime. Efficient analytical tools are consequently required to fight against it. Quick and reliable authentication means are needed to allow the deployment of measures from the company and the authorities. For this purpose a method in two steps has been implemented here. The first step enables the identification of pharmaceutical tablets and capsules and the detection of their counterfeits. A nonlinear classification method, the Support Vector Machines (SVM), is computed together with a correlation with the database and the detection of Active Pharmaceutical Ingredient (API) peaks in the suspect product. If a counterfeit is detected, the second step allows its chemical profiling among former counterfeits in a forensic intelligence perspective. For this second step a classification based on Principal Component Analysis (PCA) and correlation distance measurements is applied to the Raman spectra of the counterfeits.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The decision-making process regarding drug dose, regularly used in everyday medical practice, is critical to patients' health and recovery. It is a challenging process, especially for a drug with narrow therapeutic ranges, in which a medical doctor decides the quantity (dose amount) and frequency (dose interval) on the basis of a set of available patient features and doctor's clinical experience (a priori adaptation). Computer support in drug dose administration makes the prescription procedure faster, more accurate, objective, and less expensive, with a tendency to reduce the number of invasive procedures. This paper presents an advanced integrated Drug Administration Decision Support System (DADSS) to help clinicians/patients with the dose computing. Based on a support vector machine (SVM) algorithm, enhanced with the random sample consensus technique, this system is able to predict the drug concentration values and computes the ideal dose amount and dose interval for a new patient. With an extension to combine the SVM method and the explicit analytical model, the advanced integrated DADSS system is able to compute drug concentration-to-time curves for a patient under different conditions. A feedback loop is enabled to update the curve with a new measured concentration value to make it more personalized (a posteriori adaptation).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Difficult tracheal intubation assessment is an important research topic in anesthesia as failed intubations are important causes of mortality in anesthetic practice. The modified Mallampati score is widely used, alone or in conjunction with other criteria, to predict the difficulty of intubation. This work presents an automatic method to assess the modified Mallampati score from an image of a patient with the mouth wide open. For this purpose we propose an active appearance models (AAM) based method and use linear support vector machines (SVM) to select a subset of relevant features obtained using the AAM. This feature selection step proves to be essential as it improves drastically the performance of classification, which is obtained using SVM with RBF kernel and majority voting. We test our method on images of 100 patients undergoing elective surgery and achieve 97.9% accuracy in the leave-one-out crossvalidation test and provide a key element to an automatic difficult intubation assessment system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Self-consciousness has mostly been approached by philosophical enquiry and not by empirical neuroscientific study, leading to an overabundance of diverging theories and an absence of data-driven theories. Using robotic technology, we achieved specific bodily conflicts and induced predictable changes in a fundamental aspect of self-consciousness by altering where healthy subjects experienced themselves to be (self-location). Functional magnetic resonance imaging revealed that temporo-parietal junction (TPJ) activity reflected experimental changes in self-location that also depended on the first-person perspective due to visuo-tactile and visuo-vestibular conflicts. Moreover, in a large lesion analysis study of neurological patients with a well-defined state of abnormal self-location, brain damage was also localized at TPJ, providing causal evidence that TPJ encodes self-location. Our findings reveal that multisensory integration at the TPJ reflects one of the most fundamental subjective feelings of humans: the feeling of being an entity localized at a position in space and perceiving the world from this position and perspective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An active learning method is proposed for the semi-automatic selection of training sets in remote sensing image classification. The method adds iteratively to the current training set the unlabeled pixels for which the prediction of an ensemble of classifiers based on bagged training sets show maximum entropy. This way, the algorithm selects the pixels that are the most uncertain and that will improve the model if added in the training set. The user is asked to label such pixels at each iteration. Experiments using support vector machines (SVM) on an 8 classes QuickBird image show the excellent performances of the methods, that equals accuracies of both a model trained with ten times more pixels and a model whose training set has been built using a state-of-the-art SVM specific active learning method

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents general problems and approaches for the spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of the machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most of the machines learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in a high dimensional geo-feature spaces, when the dimension of space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of models concerns considering of real space constrains like geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve modelling of environmental phenomena taking into account on geo-manifolds. An important part of the study deals with the analysis of relevant variables and models' inputs. This problem is approached by using different feature selection/feature extraction nonlinear tools. To demonstrate the application of machine learning algorithms several interesting case studies are considered: digital soil mapping using SVM, automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides), assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The 2008 Data Fusion Contest organized by the IEEE Geoscience and Remote Sensing Data Fusion Technical Committee deals with the classification of high-resolution hyperspectral data from an urban area. Unlike in the previous issues of the contest, the goal was not only to identify the best algorithm but also to provide a collaborative effort: The decision fusion of the best individual algorithms was aiming at further improving the classification performances, and the best algorithms were ranked according to their relative contribution to the decision fusion. This paper presents the five awarded algorithms and the conclusions of the contest, stressing the importance of decision fusion, dimension reduction, and supervised classification methods, such as neural networks and support vector machines.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recently, kernel-based Machine Learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing etc. The paper describes the use of kernel methods to approach the processing of large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of environmental and pollution continuous information (pollution of soil by radionuclides), mapping with auxiliary information (climatic data from Aral Sea region). The promising developments, such as automatic emergency hot spot detection and monitoring network optimization are discussed as well.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To be diagnostically useful, structural MRI must reliably distinguish Alzheimer's disease (AD) from normal aging in individual scans. Recent advances in statistical learning theory have led to the application of support vector machines to MRI for detection of a variety of disease states. The aims of this study were to assess how successfully support vector machines assigned individual diagnoses and to determine whether data-sets combined from multiple scanners and different centres could be used to obtain effective classification of scans. We used linear support vector machines to classify the grey matter segment of T1-weighted MR scans from pathologically proven AD patients and cognitively normal elderly individuals obtained from two centres with different scanning equipment. Because the clinical diagnosis of mild AD is difficult we also tested the ability of support vector machines to differentiate control scans from patients without post-mortem confirmation. Finally we sought to use these methods to differentiate scans between patients suffering from AD from those with frontotemporal lobar degeneration. Up to 96% of pathologically verified AD patients were correctly classified using whole brain images. Data from different centres were successfully combined achieving comparable results from the separate analyses. Importantly, data from one centre could be used to train a support vector machine to accurately differentiate AD and normal ageing scans obtained from another centre with different subjects and different scanner equipment. Patients with mild, clinically probable AD and age/sex matched controls were correctly separated in 89% of cases which is compatible with published diagnosis rates in the best clinical centres. This method correctly assigned 89% of patients with post-mortem confirmed diagnosis of either AD or frontotemporal lobar degeneration to their respective group. Our study leads to three conclusions: Firstly, support vector machines successfully separate patients with AD from healthy aging subjects. Secondly, they perform well in the differential diagnosis of two different forms of dementia. Thirdly, the method is robust and can be generalized across different centres. This suggests an important role for computer based diagnostic image analysis for clinical practice.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In recent years there has been an explosive growth in the development of adaptive and data driven methods. One of the efficient and data-driven approaches is based on statistical learning theory (Vapnik 1998). The theory is based on Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce training error ? to fit the available data with a model, but also to reduce the complexity of the model and to reduce generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust and non linear data models with excellent generalisation abilities that is very important both for monitoring and forecasting. SVM are extremely good when input space is high dimensional and training data set i not big enough to develop corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. It opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy etc. Presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).