102 results for Support Vector Machine
Abstract:
Many shallow landslides are triggered by heavy rainfall on hill slopes, resulting in enormous casualties and huge economic losses in mountainous regions. Hill slope failure usually occurs as soil resistance deteriorates under acting stress that develops for a number of reasons, such as increased soil moisture content or a change in land use that causes slope instability. Landslides triggered by rainfall can possibly be foreseen in real time by jointly using rainfall intensity-duration and information related to land surface susceptibility. Terrain analysis applications using spatial data such as aspect, slope, flow direction, compound topographic index, etc., along with information derived from remotely sensed data such as land cover / land use maps, permit us to quantify and characterise the physical processes governing the landslide occurrence phenomenon. In this work, the probable landslide prone areas are predicted using two different algorithms – GARP (Genetic Algorithm for Rule-set Prediction) and Support Vector Machine (SVM) – in a free and open source software package, openModeller. Several environmental layers such as aspect, digital elevation data, flow accumulation, flow direction, slope, land cover, compound topographic index, and precipitation data were used in modelling. A comparison of the simulated outputs, validated by overlaying the actual landslide occurrence points, showed 92% accuracy with GARP and 96% accuracy with SVM in predicting landslide prone areas considering precipitation in the wettest month, whereas 91% and 94% accuracy were obtained from GARP and SVM considering precipitation in the wettest quarter of the year.
Abstract:
A geometric and non-parametric procedure for testing whether two finite sets of points are linearly separable is proposed. The Linear Separability Test is equivalent to a test that determines whether a strictly positive point h > 0 exists in the range of a matrix A (related to the points in the two finite sets). The algorithm proposed in the paper iteratively checks whether a strictly positive point exists in a subspace by projecting a strictly positive vector with equal coordinates (p) onto the subspace. At the end of each iteration, the subspace is reduced to a lower-dimensional subspace. The test is completed within r ≤ min(n, d + 1) steps, for both linearly separable and non-separable problems (r is the rank of A, n is the number of points and d is the dimension of the space containing the points). The worst-case time complexity of the algorithm is O(nr³) and its space complexity is O(nd). A short review of some of the prominent algorithms and their time complexities is included. The worst-case computational complexity of our algorithm is lower than that of the Simplex, Perceptron, Support Vector Machine and Convex Hull algorithms, if d
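The reformulation in this abstract (two point sets are linearly separable exactly when the range of a matrix A contains a strictly positive point) can be illustrated with a small sketch. The code below is not the projection algorithm of the paper. It builds one plausible A, whose rows are [x, 1] for the first set and -[x, 1] for the second, and then searches for w with Aw > 0 using the classical perceptron, one of the baselines the abstract mentions. The function names and the epoch cap are assumptions made for illustration. A perceptron cannot prove non-separability, so a `None` result only means that no solution was found within the cap.

```python
from typing import List, Optional, Sequence


def build_matrix(set_a: Sequence[Sequence[float]],
                 set_b: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rows are [x, 1] for points in set_a and -[x, 1] for points in set_b."""
    rows = [list(p) + [1.0] for p in set_a]
    rows += [[-v for v in list(q) + [1.0]] for q in set_b]
    return rows


def find_positive_point(A: List[List[float]],
                        max_epochs: int = 1000) -> Optional[List[float]]:
    """Perceptron search for w such that every component of A w is > 0.

    Returns w if found, or None if the (assumed) epoch cap is reached.
    """
    w = [0.0] * len(A[0])
    for _ in range(max_epochs):
        updated = False
        for row in A:
            if sum(a * b for a, b in zip(row, w)) <= 0:
                # Misclassified row: move w towards it.
                w = [wi + ri for wi, ri in zip(w, row)]
                updated = True
        if not updated:
            return w
    return None


if __name__ == "__main__":
    separable = build_matrix([(2, 2), (3, 3)], [(0, 0), (1, 0)])
    xor_like = build_matrix([(0, 0), (1, 1)], [(0, 1), (1, 0)])
    print(find_positive_point(separable) is not None)
    print(find_positive_point(xor_like) is not None)
```

For the XOR-like input the rows of A sum to the zero vector, so no w can make every component of Aw positive, and the search correctly gives up.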
Abstract:
Many downscaling techniques have been developed in the past few years for projection of station-scale hydrological variables from large-scale atmospheric variables simulated by general circulation models (GCMs) to assess the hydrological impacts of climate change. This article compares the performances of three downscaling methods, viz. conditional random field (CRF), K-nearest neighbour (KNN) and support vector machine (SVM) methods in downscaling precipitation in the Punjab region of India, belonging to the monsoon regime. The CRF model is a recently developed method for downscaling hydrological variables in a probabilistic framework, while the SVM model is a popular machine learning tool useful in terms of its ability to generalize and capture nonlinear relationships between predictors and predictand. The KNN model is an analogue-type method that queries days similar to a given feature vector from the training data and classifies future days by random sampling from a weighted set of K closest training examples. The models are applied for downscaling monsoon (June to September) daily precipitation at six locations in Punjab. Model performances with respect to reproduction of various statistics such as dry and wet spell length distributions, daily rainfall distribution, and intersite correlations are examined. It is found that the CRF and KNN models perform slightly better than the SVM model in reproducing most daily rainfall statistics. These models are then used to project future precipitation at the six locations. Output from the Canadian global climate model (CGCM3) GCM for three scenarios, viz. A1B, A2, and B1 is used for projection of future precipitation. The projections show a change in probability density functions of daily rainfall amount and changes in the wet and dry spell distributions of daily precipitation. Copyright (C) 2011 John Wiley & Sons, Ltd.
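The KNN analogue step described in this abstract (find the K training days closest to a feature vector, then draw one of them at random with weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the configuration used in the article: the Euclidean distance, the rank-based weights 1/j and the function name `knn_resample` are all choices made here for the example.

```python
import math
import random
from typing import Optional, Sequence


def knn_resample(query: Sequence[float],
                 train_features: Sequence[Sequence[float]],
                 train_rain: Sequence[float],
                 k: int = 3,
                 rng: Optional[random.Random] = None) -> float:
    """Return the rainfall of one training day sampled from the k nearest."""
    if rng is None:
        rng = random.Random(0)
    # Sort training days by Euclidean distance to the query vector.
    ranked = sorted(range(len(train_features)),
                    key=lambda i: math.dist(query, train_features[i]))
    nearest = ranked[:k]
    # Assumed kernel: the j-th nearest neighbour gets weight 1/j.
    weights = [1.0 / (rank + 1) for rank in range(len(nearest))]
    chosen = rng.choices(nearest, weights=weights, k=1)[0]
    return train_rain[chosen]


if __name__ == "__main__":
    features = [(0.0,), (1.0,), (2.0,), (10.0,)]
    rainfall = [0.0, 5.0, 2.0, 50.0]
    print(knn_resample((0.9,), features, rainfall, k=2))
```

Because the output is a resampled historical value, repeated draws with different random states yield an ensemble of plausible daily sequences rather than a single deterministic prediction.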
Abstract:
A two-stage methodology is developed to obtain future projections of daily relative humidity (RH) in a river basin for climate change scenarios. In the first stage, Support Vector Machine (SVM) models are developed to downscale nine sets of predictor variables (large-scale atmospheric variables) for Intergovernmental Panel on Climate Change Special Report on Emissions Scenarios (SRES) (A1B, A2, B1, and COMMIT) to RH in a river basin at monthly scale. Uncertainty in the future projections of RH is studied for combinations of SRES scenarios and selected predictors. Subsequently, in the second stage, the monthly sequences of RH are disaggregated to daily scale using the k-nearest neighbor method. The effectiveness of the developed methodology is demonstrated through application to the catchment of Malaprabha reservoir in India. For downscaling, the probable predictor variables are extracted from (1) the National Centers for Environmental Prediction reanalysis data set for the period 1978-2000 and (2) simulations of the third-generation Canadian Coupled Global Climate Model for the period 1978-2100. The performance of the downscaling and disaggregation models is evaluated by split sample validation. Results show that among the SVM models, the model developed using predictors pertaining to only land location performed better. RH is projected to increase in the future for A1B and A2 scenarios, while no trend is discerned for B1 and COMMIT.
Abstract:
Genetic Algorithm for Rule-set Prediction (GARP) and Support Vector Machine (SVM), as implemented in the free and open source software (FOSS) openModeller, were used to model the probable landslide occurrence points. Environmental layers such as aspect, digital elevation, flow accumulation, flow direction, slope, land cover, compound topographic index and precipitation have been used in modeling. The simulated output of these techniques was validated against the actual landslide occurrence points, showing 92% (GARP) and 96% (SVM) accuracy considering precipitation in the wettest month, and 91% and 94% accuracy considering precipitation in the wettest quarter of the year.
Abstract:
This paper describes a new method of color text localization from generic scene images containing text of different scripts and with arbitrary orientations. A representative set of colors is first identified using the edge information to initiate an unsupervised clustering algorithm. Text components are identified from each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from the geometric, boundary, stroke and gradient information. Experiments on camera-captured images that contain variable fonts, size, color, irregular layout, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields precision and recall of 0.8 and 0.86 respectively on a database of 100 images. The method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset.
Abstract:
This paper presents an efficient approach to the modeling and classification of vehicles using the magnetic signature of the vehicle. A database was created using the magnetic signatures collected over a wide range of vehicles (cars). A vehicle is modeled as an array of magnetic dipoles. The strength of the magnetic dipoles and the separation between them vary for different vehicles and depend on the metallic composition and configuration of the vehicle. Based on the magnetic dipole data model, we present a novel method to extract a feature vector from the magnetic signature. In the classification of vehicles, a linear support vector machine configuration is used to classify the vehicles based on the obtained feature vectors.
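The final step named in this abstract, a linear support vector machine applied to extracted feature vectors, can be sketched in a few lines. The abstract does not say how the SVM was trained, so the code below is an assumption: a Pegasos-style stochastic subgradient descent on the regularised hinge loss, with the bias folded in as a constant feature. The function names, the hyperparameters and the two-dimensional toy features are hypothetical and do not come from the paper.

```python
import random
from typing import List, Sequence


def _augment(x: Sequence[float]) -> List[float]:
    """Append a constant 1 so the bias is learned as an ordinary weight."""
    return list(x) + [1.0]


def train_linear_svm(X: Sequence[Sequence[float]], y: Sequence[int],
                     lam: float = 0.01, epochs: int = 200,
                     seed: int = 0) -> List[float]:
    """Pegasos-style training of a linear SVM; labels must be +1 or -1."""
    rng = random.Random(seed)
    data = [_augment(x) for x in X]
    w = [0.0] * len(data[0])
    order = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y[i] * sum(a * b for a, b in zip(w, data[i]))
            shrink = 1.0 - 1.0 / t         # equals 1 - eta * lam
            w = [shrink * wj for wj in w]  # gradient of the L2 penalty
            if margin < 1:                 # hinge loss is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, data[i])]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    """Classify by the sign of the linear score."""
    score = sum(a * b for a, b in zip(w, _augment(x)))
    return 1 if score > 0 else -1


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for two vehicle classes.
    X = [(3, 3), (4, 3), (3, 4), (-3, -3), (-4, -3), (-3, -4)]
    y = [1, 1, 1, -1, -1, -1]
    w = train_linear_svm(X, y)
    print([predict(w, x) for x in X])
```

A linear model of this kind keeps prediction to a single dot product per vehicle, which is one plausible reason to prefer it when the feature vector is already discriminative.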
Abstract:
There are many popular models available for the classification of documents, such as the Naïve Bayes classifier, k-Nearest Neighbors and the Support Vector Machine. In all these cases, the representation is based on the “Bag of words” model. This model doesn't capture the actual semantic meaning of a word in a particular document. Semantics are better captured by the proximity of words and their occurrence in the document. We propose a new “Bag of Phrases” model to capture this discriminative power of phrases for text classification. We present a novel algorithm to extract phrases from the corpus using the well-known topic model, Latent Dirichlet Allocation (LDA), and to integrate them into the vector space model for classification. Experiments show that classifiers perform better with the new Bag of Phrases model than with related representation models.
Abstract:
This paper presents an efficient approach to the modeling and classification of vehicles using the magnetic signature of the vehicle. A database was created using the magnetic signatures collected over a wide range of vehicles (cars). A sensor-dependent approach called the Magnetic Field Angle Model is proposed for modeling the obtained magnetic signature. Based on the data model, we present a novel method to extract the feature vector from the magnetic signature. In the classification of vehicles, a linear support vector machine configuration is used to classify the vehicles based on the obtained feature vectors.
Abstract:
In this paper, we present a machine learning approach for subject-independent human action recognition using a depth camera, emphasizing the importance of depth in the recognition of actions. The proposed approach uses the flow information of all 3 dimensions to classify an action. In our approach, we obtain the 2-D optical flow and use it along with the depth image to obtain the depth flow (Z motion vectors). The obtained flow captures the dynamics of the actions in space-time. Feature vectors are obtained by averaging the 3-D motion over a grid laid over the silhouette in a hierarchical fashion. These hierarchical fine-to-coarse windows capture the motion dynamics of the object at various scales. The extracted features are used to train a Meta-cognitive Radial Basis Function Network (McRBFN) that uses a Projection Based Learning (PBL) algorithm, referred to as PBL-McRBFN henceforth. PBL-McRBFN begins with zero hidden neurons and builds the network based on the best human learning strategy, namely, self-regulated learning in a meta-cognitive environment. When a sample is used for learning, PBL-McRBFN uses the sample overlapping conditions and a projection based learning algorithm to estimate the parameters of the network. The performance of PBL-McRBFN is compared to that of Support Vector Machine (SVM) and Extreme Learning Machine (ELM) classifiers, with every person and action represented in the training and testing datasets. The performance study shows that PBL-McRBFN outperforms these classifiers in recognizing actions in 3-D. Further, a subject-independent study is conducted using a leave-one-subject-out strategy, and its generalization performance is tested. It is observed from the subject-independent study that McRBFN is capable of generalizing actions accurately. The performance of the proposed approach is benchmarked with the Video Analytics Lab (VAL) dataset and the Berkeley Multimodal Human Action Database (MHAD). (C) 2013 Elsevier Ltd. All rights reserved.
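The feature extraction described in this abstract, averaging 3-D motion over fine-to-coarse grids, can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the flow field is taken to be already cropped to the silhouette bounding box, the grid sizes 4, 2 and 1 are arbitrary, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Flow = Sequence[Sequence[Tuple[float, float, float]]]  # rows of (dx, dy, dz)


def grid_average(flow: Flow, cells: int) -> List[float]:
    """Mean (dx, dy, dz) in each cell of a cells x cells grid, row-major.

    Assumes the flow field has at least `cells` rows and columns.
    """
    height, width = len(flow), len(flow[0])
    features: List[float] = []
    for gy in range(cells):
        for gx in range(cells):
            y0, y1 = gy * height // cells, (gy + 1) * height // cells
            x0, x1 = gx * width // cells, (gx + 1) * width // cells
            block = [flow[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            for axis in range(3):
                features.append(sum(v[axis] for v in block) / len(block))
    return features


def hierarchical_features(flow: Flow,
                          levels: Sequence[int] = (4, 2, 1)) -> List[float]:
    """Concatenate grid averages from the finest to the coarsest level."""
    features: List[float] = []
    for cells in levels:
        features.extend(grid_average(flow, cells))
    return features


if __name__ == "__main__":
    # Hypothetical 4 x 4 flow field with the same motion everywhere.
    flow = [[(1.0, 2.0, 3.0)] * 4 for _ in range(4)]
    print(len(hierarchical_features(flow)))
```

With levels 4, 2 and 1 the vector has (16 + 4 + 1) × 3 components, so the coarse cells summarise whole-body motion while the fine cells retain limb-level detail.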
Abstract:
Multi-task learning solves multiple related learning problems simultaneously by sharing some common structure for improved generalization performance of each task. We propose a novel approach to multi-task learning which captures task similarity through a shared basis vector set. The variability across tasks is captured through task specific basis vector set. We use sparse support vector machine (SVM) algorithm to select the basis vector sets for the tasks. The approach results in a sparse model where the prediction is done using very few examples. The effectiveness of our approach is demonstrated through experiments on synthetic and real multi-task datasets.
Abstract:
In this paper we establish that the Lovász theta function on a graph can be restated as a kernel learning problem. We introduce the notion of SVM-theta graphs, on which the Lovász theta function can be approximated well by a support vector machine (SVM). We show that Erdős–Rényi random G(n, p) graphs are SVM-theta graphs for log⁴(n)/n ≤ p < 1. Even if we embed a large clique of size Θ(√(np/(1 − p))) in a G(n, p) graph, the resultant graph still remains an SVM-theta graph. This immediately suggests an SVM-based algorithm for recovering a large planted clique in random graphs. Associated with the theta function is the notion of orthogonal labellings. We introduce common orthogonal labellings, which extend the idea of orthogonal labellings to multiple graphs. This allows us to propose a Multiple Kernel Learning (MKL) based solution which is capable of identifying a large common dense subgraph in multiple graphs. In both the planted clique case and the common subgraph detection problem, the proposed solutions beat the state of the art by an order of magnitude.
Abstract:
A variety of methods are available to estimate future solar radiation (SR) scenarios at spatial scales that are appropriate for local climate change impact assessment. However, there are no clear guidelines available in the literature to decide which methodologies are most suitable for different applications. Three methodologies to guide the estimation of SR are discussed in this study, namely: Case 1: SR is measured, Case 2: SR is measured but sparse and Case 3: SR is not measured. In Case 1, future SR scenarios are derived using several downscaling methodologies that transfer the simulated large-scale information of global climate models to a local scale (measurements). In Case 2, the SR was first estimated at the local scale for a longer time period using sparse measured records, and then future scenarios were derived using several downscaling methodologies. In Case 3, the SR was first estimated at a regional scale for a longer time period using complete or sparse measured records of SR, from which SR at the local scale was estimated. Finally, the future scenarios were derived using several downscaling methodologies. The lack of observed SR data, especially in developing countries, has hindered various climate change impact studies. Hence, this was further elaborated by applying the Case 3 methodology to the semi-arid Malaprabha reservoir catchment in southern India. A support vector machine was used in downscaling SR. Future monthly scenarios of SR were estimated from simulations of the third-generation Canadian General Circulation Model (CGCM3) for various SRES emission scenarios (A1B, A2, B1, and COMMIT). Results indicated a projected decrease of 0.4 to 12.2 W m⁻² yr⁻¹ in SR during the period 2001-2100 across the 4 scenarios. SR was calculated using the modified Hargreaves method. The decreasing trends for the future were in agreement with the simulations of SR from the CGCM3 model directly obtained for the 4 scenarios.
Abstract:
In this article, we aim at reducing the error rate of the online Tamil symbol recognition system by employing multiple experts to reevaluate certain decisions of the primary support vector machine classifier. Motivated by the relatively high percentage of occurrence of base consonants in the script, a reevaluation technique has been proposed to correct any ambiguities arising in the base consonants. Secondly, a dynamic time-warping method is proposed to automatically extract the discriminative regions for each set of confused characters. Class-specific features derived from these regions aid in reducing the degree of confusion. Thirdly, statistics of specific features are proposed for resolving any confusions in vowel modifiers. The reevaluation approaches are tested on two databases (a) the isolated Tamil symbols in the IWFHR test set, and (b) the symbols segmented from a set of 10,000 Tamil words. The recognition rate of the isolated test symbols of the IWFHR database improves by 1.9 %. For the word database, the incorporation of the reevaluation step improves the symbol recognition rate by 3.5 % (from 88.4 to 91.9 %). This, in turn, boosts the word recognition rate by 11.9 % (from 65.0 to 76.9 %). The reduction in the word error rate has been achieved using a generic approach, without the incorporation of language models.
Abstract:
Several statistical downscaling models have been developed in the past couple of decades to assess the hydrologic impacts of climate change by projecting the station-scale hydrological variables from large-scale atmospheric variables simulated by general circulation models (GCMs). This paper presents and compares different statistical downscaling models that use multiple linear regression (MLR), positive coefficient regression (PCR), stepwise regression (SR), and support vector machine (SVM) techniques for estimating monthly rainfall amounts in the state of Florida. Mean sea level pressure, air temperature, geopotential height, specific humidity, U wind, and V wind are used as the explanatory variables/predictors in the downscaling models. Data for these variables are obtained from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis dataset and the Canadian Centre for Climate Modelling and Analysis (CCCma) Coupled Global Climate Model, version 3 (CGCM3) GCM simulations. The principal component analysis (PCA) and fuzzy c-means clustering method (FCM) are used as part of downscaling model to reduce the dimensionality of the dataset and identify the clusters in the data, respectively. Evaluation of the performances of the models using different error and statistical measures indicates that the SVM-based model performed better than all the other models in reproducing most monthly rainfall statistics at 18 sites. Output from the third-generation CGCM3 GCM for the A1B scenario was used for future projections. For the projection period 2001-10, MLR was used to relate variables at the GCM and NCEP grid scales. Use of MLR in linking the predictor variables at the GCM and NCEP grid scales yielded better reproduction of monthly rainfall statistics at most of the stations (12 out of 18) compared to those by spatial interpolation technique used in earlier studies.