216 results for Image classification
Abstract:
In the design of practical web page classification systems, one often encounters a situation in which the labeled training set is created by choosing some examples from each class, but the class proportions in this set are not the same as those in the test distribution to which the classifier will actually be applied. The problem is made worse when the amount of training data is also small. In this paper we explore and adapt binary SVM methods that make use of unlabeled data from the test distribution, viz., Transductive SVMs (TSVMs) and expectation regularization/constraint (ER/EC) methods, to deal with this situation. We empirically show that when the labeled training set is small, a TSVM designed using the class ratio tuned by minimizing the loss on the labeled set yields the best performance; its performance is good even when the deviation between the class ratios of the labeled training set and the test set is quite large. When the labeled training set is sufficiently large, an unsupervised Gaussian mixture model can be used to get a very good estimate of the class ratio in the test set; when this estimate is used, both TSVM and EC/ER give their best possible performance, with TSVM coming out superior. The ideas in the paper can be easily extended to multi-class SVMs and MaxEnt models.
Abstract:
The present approach uses stopwords and the gaps that occur between successive stopwords (formed by content words) as features for sentiment classification.
Abstract:
Time series classification deals with the problem of classification of data that is multivariate in nature. This means that one or more of the attributes is in the form of a sequence. The notion of similarity or distance used for time series data is significant and affects the accuracy, time, and space complexity of the classification algorithm. There exist numerous similarity measures for time series data, but each of them has its own disadvantages. Instead of relying upon a single similarity measure, our aim is to find a near-optimal solution to the classification problem by combining different similarity measures. In this work, we use genetic algorithms to combine the similarity measures so as to get the best performance. The weight given to each similarity measure evolves over a number of generations so as to get the best combination. We test our approach on a number of benchmark time series datasets and present promising results.
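The combination step described in this abstract can be illustrated with a minimal sketch. It assumes two similarity measures (Euclidean distance and dynamic time warping), a weighted sum as the combination rule, and leave-one-out 1-NN accuracy as the fitness a genetic algorithm would maximize. None of these choices is confirmed by the abstract; the function names and the toy data are hypothetical.

```python
import math

def euclidean(a, b):
    """Point-by-point Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw(a, b):
    """Dynamic time warping distance with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Assumed pool of measures; the abstract does not list which ones were combined.
MEASURES = (euclidean, dtw)

def combined_distance(a, b, weights):
    """Weighted sum of the individual measures (one weight per measure)."""
    return sum(w * f(a, b) for w, f in zip(weights, MEASURES))

def predict_1nn(query, train, weights):
    """Label of the training series closest to `query` under the combined distance."""
    nearest = min(train, key=lambda item: combined_distance(query, item[0], weights))
    return nearest[1]

def loo_accuracy(train, weights):
    """Leave-one-out 1-NN accuracy: a possible fitness value for one weight vector."""
    hits = 0
    for i, (series, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        hits += predict_1nn(series, rest, weights) == label
    return hits / len(train)

# Hypothetical toy data: two shifted "bump" series and two nearly flat ones.
TRAIN = [
    ([0, 0, 1, 2, 1, 0], "bump"),
    ([0, 1, 2, 1, 0, 0], "bump"),
    ([0, 0, 0, 0, 0, 0], "flat"),
    ([0.1, 0, 0.1, 0, 0.1, 0], "flat"),
]
FITNESS = loo_accuracy(TRAIN, (0.5, 0.5))
```

In a genetic algorithm, each individual would be one weight vector and `loo_accuracy` (or a held-out equivalent) would score it; selection, crossover, and mutation are omitted here.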
Abstract:
This paper investigates a new approach for point matching in multi-sensor satellite images. The feature points are matched using multi-objective optimization (angle criterion and distance condition) based on a Genetic Algorithm (GA). This optimization process is more efficient as it considers both the angle criterion and the distance condition to incorporate multi-objective switching in the fitness function. It helps in matching three corresponding corner points detected in the reference and sensed images; using the affine transformation, the sensed image is then aligned with the reference image. From the results obtained, the performance of the image registration is evaluated and it is concluded that the proposed approach is efficient.
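Three non-collinear point correspondences are enough to fix the six parameters of a 2D affine transformation, which is why the matching step above looks for three corner points. The following is a minimal sketch of that final step only, solving the two 3x3 linear systems with Cramer's rule; the function names and the example coordinates are hypothetical, and the GA-based matching itself is not shown.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve m @ x = rhs for a 3x3 system using Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; the affine transform is not unique")
    solution = []
    for col in range(3):
        replaced = [list(row) for row in m]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution

def affine_from_three_points(src, dst):
    """Return ((a, b, c), (d, e, f)) with x' = a*x + b*y + c and y' = d*x + e*y + f."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [p[0] for p in dst])
    row_y = solve3(m, [p[1] for p in dst])
    return tuple(row_x), tuple(row_y)

def apply_affine(transform, point):
    """Map one (x, y) point through the affine transform."""
    (a, b, c), (d, e, f) = transform
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Hypothetical matched corners: sensed-image points and their reference-image partners.
SENSED = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
REFERENCE = [(2.0, 3.0), (2.0, 5.0), (1.0, 3.0)]
TRANSFORM = affine_from_three_points(SENSED, REFERENCE)
```

Applying `apply_affine(TRANSFORM, p)` to every pixel coordinate of the sensed image (with interpolation) would bring it into the reference frame. With more than three matches, a least-squares fit would be the usual alternative.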
Abstract:
This paper presents an improved hierarchical clustering algorithm for the land cover mapping problem using a quasi-random distribution. Initially, Niche Particle Swarm Optimization (NPSO) with a pseudo/quasi-random distribution is used for splitting the data into a number of cluster centers by satisfying the Bayesian Information Criterion (BIC). The main objective is to search for and locate the best possible number of clusters and their centers. NPSO, which depends heavily on the initial distribution of particles in the search space, has not been exploited to its full potential. In this study, we compare the more uniformly distributed quasi-random distribution with the pseudo-random distribution in NPSO for splitting the data set. The Faure method is used to generate the quasi-random distribution. The performance of previously proposed methods, namely K-means, Mean Shift Clustering (MSC) and NPSO with a pseudo-random distribution, is compared with that of the proposed approach, NPSO with a quasi-random distribution (Faure). These algorithms are applied to a synthetic data set and a multi-spectral satellite image (Landsat 7 thematic mapper). From the results obtained, we conclude that using a quasi-random sequence with NPSO for the hierarchical clustering algorithm results in a more accurate data classification.
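For readers unfamiliar with the Faure method, the sketch below generates points of the textbook (unscrambled) Faure sequence, which could serve as initial particle positions in the unit hypercube. It is an illustration of the standard construction, not the implementation used in the paper; the function names are hypothetical.

```python
from math import comb

def smallest_prime_at_least(n):
    """Smallest prime p with p >= n."""
    def is_prime(k):
        return k >= 2 and all(k % p for p in range(2, int(k ** 0.5) + 1))
    while not is_prime(n):
        n += 1
    return n

def faure_point(index, dim):
    """Return the `index`-th point (index >= 1) of the Faure sequence in `dim` dimensions.

    The base is the smallest prime >= dim. Coordinate d applies the d-th power
    of the Pascal matrix (mod base) to the base-`base` digits of `index`.
    """
    base = smallest_prime_at_least(max(dim, 2))
    digits = []  # least-significant digit first
    n = index
    while n > 0:
        n, r = divmod(n, base)
        digits.append(r)
    point = []
    for d in range(dim):
        x = 0.0
        for j in range(len(digits)):
            # pow(0, 0) == 1 in Python, so d = 0 reduces to the van der Corput sequence.
            y = sum(comb(i, j) * pow(d, i - j) * digits[i]
                    for i in range(j, len(digits))) % base
            x += y / base ** (j + 1)
        point.append(x)
    return point

def faure_sequence(count, dim):
    """First `count` Faure points, skipping the all-zero point at index 0."""
    return [faure_point(i, dim) for i in range(1, count + 1)]

# Example: three 2-D starting positions in the unit square.
POINTS = faure_sequence(3, 2)
```

Scaling each coordinate from [0, 1) to the range of the corresponding spectral band would give initial positions in the data space; a pseudo-random initialization would instead draw each coordinate independently from a uniform generator.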
Abstract:
This paper discusses an approach for river mapping and flood evaluation based on multi-temporal time-series analysis of satellite images, utilizing pixel spectral information for image clustering and region-based segmentation for extracting water-covered regions. MODIS satellite images are analyzed at two stages: before the flood and during the flood. Multi-temporal MODIS images are processed in two steps. In the first step, clustering algorithms such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are used to distinguish the water regions from the non-water regions based on spectral information. These algorithms are chosen since they are quite efficient in solving multi-modal optimization problems. In the second step, the classified images are segmented using spatial features of the water region to extract the river. From the results obtained, we evaluate the performance of the methods and conclude that incorporating region-based image segmentation along with clustering algorithms provides an accurate and reliable approach for the extraction of water-covered regions.
Abstract:
Western Blot analysis is an analytical technique used in Molecular Biology, Biochemistry, Immunogenetics and other Molecular Biology studies to separate proteins by electrophoresis. The procedure results in images containing nearly rectangular-shaped blots. In this paper, we address the problem of quantitation of the blots using automated image processing techniques. We formulate a special active contour (or snake) called Oblong, which locks on to rectangular shaped objects. Oblongs depend on five free parameters, which is also the minimum number of parameters required for a unique characterization. Unlike many snake formulations, Oblongs do not require explicit gradient computations and therefore the optimization is carried out fast. The performance of Oblongs is assessed on synthesized data and Western Blot Analysis images.
Abstract:
A new multi-sensor image registration technique is proposed based on detecting the feature corner points using a modified Harris Corner Detector (HCD). These feature points are matched using multi-objective optimization (distance condition and angle criterion) based on Discrete Particle Swarm Optimization (DPSO). This optimization process is more efficient as it considers both the distance and angle criteria to incorporate multi-objective switching in the fitness function. It helps in picking three corresponding corner points detected in the sensed and base images; using the affine transformation, the sensed image is then aligned with the base image. Further, the results show that the new approach can provide a new dimension in solving multi-sensor image registration problems. From the results obtained, the performance of the image registration is evaluated and it is concluded that the proposed approach is efficient.
Abstract:
Subsurface lithology and seismic site classification of the Lucknow urban center, located in the central part of the Indo-Gangetic Basin (IGB), are presented based on detailed shallow subsurface investigations and borehole analysis. These are done by carrying out 47 seismic surface wave tests using multichannel analysis of surface waves (MASW) and 23 boreholes drilled up to 30 m with standard penetration test (SPT) N values. Subsurface lithology profiles drawn from the drilled boreholes show low- to medium-compressibility clay and silty to poorly graded sand down to a depth of 30 m. In addition, reports of deeper boreholes (depth >150 m) were collected from the Lucknow Jal Nigam (Water Corporation), Government of Uttar Pradesh, to understand the deeper subsoil stratification. These reports show the presence of clay mixed with sand and Kankar at some locations down to a depth of 150 m, followed by layers of sand, clay, and Kankar up to 400 m. Based on the available details, shallow and deeper cross-sections through Lucknow are presented. Shear wave velocity (SWV) and N-SPT values were measured for the study area using MASW and SPT testing. Measured SWV and N-SPT values for the same locations were found to be comparable. These values were used to estimate 30 m average values of N-SPT (N-30) and SWV (V-s(30)) for seismic site classification of the study area as per the National Earthquake Hazards Reduction Program (NEHRP) soil classification system. Based on the NEHRP classification, the entire study area is classified into site classes C and D based on V-s(30) and site classes D and E based on N-30. The issue of larger amplification during future seismic events is highlighted for a major part of the study area, which comes under site classes D and E. Also, the mismatch of site classes based on N-30 and V-s(30) raises the question of the suitability of the NEHRP classification system for the study region. Further, 17 sets of SPT and SWV data are used to develop a correlation between N-SPT and SWV. This represents a first attempt at seismic site classification and correlation between N-SPT and SWV in the Indo-Gangetic Basin.
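The 30 m averages used for NEHRP site classification are thickness-weighted harmonic means over the layers of a profile, for example V-s(30) = 30 / sum(d_i / v_i). The sketch below computes both averages and maps them to site classes using the commonly tabulated NEHRP boundaries (360 and 180 m/s for classes C/D and D/E by velocity; 50 and 15 blows for the same boundaries by N value). The layer profile is hypothetical and is not taken from the study; it is chosen only to show how the two criteria can disagree.

```python
def average_over_depth(thicknesses_m, values, depth_m=30.0):
    """Harmonic average depth / sum(d_i / value_i) over the top `depth_m` metres.

    `values` are layer shear wave velocities (m/s) or SPT N values, top layer first.
    """
    remaining = depth_m
    total = 0.0
    for thickness, value in zip(thicknesses_m, values):
        used = min(thickness, remaining)
        total += used / value
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("profile is shallower than the averaging depth")
    return depth_m / total

def site_class_from_vs30(vs30):
    """NEHRP site class from the 30 m average shear wave velocity (m/s)."""
    if vs30 > 1500:
        return "A"
    if vs30 > 760:
        return "B"
    if vs30 > 360:
        return "C"
    if vs30 >= 180:
        return "D"
    return "E"

def site_class_from_n30(n30):
    """NEHRP site class from the 30 m average SPT N value.

    N values only separate classes C, D and E; rock classes A and B are
    defined by velocity alone.
    """
    if n30 > 50:
        return "C"
    if n30 >= 15:
        return "D"
    return "E"

# Hypothetical three-layer profile down to 30 m (not data from the paper).
THICKNESS_M = [5.0, 10.0, 15.0]
VS_M_PER_S = [150.0, 250.0, 400.0]
N_SPT = [8.0, 12.0, 20.0]

VS30 = average_over_depth(THICKNESS_M, VS_M_PER_S)
N30 = average_over_depth(THICKNESS_M, N_SPT)
CLASS_BY_VS = site_class_from_vs30(VS30)
CLASS_BY_N = site_class_from_n30(N30)
```

For this made-up profile the velocity criterion gives class D while the N-value criterion gives class E, the same kind of mismatch between the two criteria that the abstract reports for the study area.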
Abstract:
This paper presents an efficient approach to the modeling and classification of vehicles using the magnetic signature of the vehicle. A database was created using the magnetic signatures collected over a wide range of vehicles (cars). A vehicle is modeled as an array of magnetic dipoles. The strength of the magnetic dipoles and the separation between them vary for different vehicles and depend on the metallic composition and configuration of the vehicle. Based on the magnetic dipole data model, we present a novel method to extract a feature vector from the magnetic signature. A linear support vector machine is then used to classify the vehicles based on the obtained feature vectors.
Abstract:
Effective conservation and management of natural resources requires up-to-date information on the land cover (LC) types and their dynamics. The LC dynamics are being captured using multi-resolution remote sensing (RS) data with appropriate classification strategies. RS data combined with important environmental layers (either remotely acquired or derived from ground measurements) would, however, be more effective in addressing LC dynamics and associated changes. These ancillary layers provide additional information for delineating the decision boundaries of LC classes compared to conventional classification techniques. This communication ascertains the possibility of improved classification accuracy of RS data with ancillary and derived geographical layers such as vegetation index, temperature, digital elevation model (DEM), aspect, slope and texture. This has been implemented in three terrains of varying topography. The study would help in the selection of appropriate ancillary data depending on the terrain for better classified information.
Abstract:
The mode I fracture toughness of concrete can be experimentally determined using a three-point bend beam in conjunction with digital image correlation (DIC). Beams of three different, geometrically similar sizes are cast for this study. To study the influence of fly ash and silica fume on the fracture toughness of self-compacting concrete (SCC), three SCC mixes are prepared with and without mineral additions. Scanning electron microscope (SEM) images of the fractured surface are taken to add information on the fracture process in SCC. From this study, it is concluded that the fracture toughness of SCC with mineral addition is higher than that of SCC without mineral addition.
Abstract:
The assembly of aerospace and automotive structures in recent years is increasingly carried out using adhesives. Adhesive joints have advantages of uniform stress distribution and less stress concentration in the bonded region. Nevertheless, they may suffer due to the presence of defects in bond line and at the interface or due to improper curing process. While defects like voids, cracks and delaminations present in the adhesive bond line may be detected using different NDE methods, interfacial defects in the form of kissing bond may go undetected. Attempts using advanced ultrasonic methods like nonlinear ultrasound and guided wave inspection to detect kissing bond have met with limited success stressing the need for alternate methods. This paper concerns the preliminary studies carried out on detectability of dry contact kissing bonds in adhesive joints using the Digital Image Correlation (DIC) technique. In this attempt, adhesive joint samples containing varied area of kissing bond were prepared using the glass fiber reinforced composite (GFRP) as substrates and epoxy resin as the adhesive layer joining them. The samples were also subjected to conventional and high power ultrasonic inspection. Further, these samples were loaded till failure to determine the bond strength during which digital images were recorded and analyzed using the DIC method. This noncontact method could indicate the existence of kissing bonds at less than 50% failure load. Finite element studies carried out showed a similar trend. Results obtained from these preliminary studies are encouraging and further tests need to be done on a larger set of samples to study experimental uncertainties and scatter associated with the method. (C) 2013 Elsevier Ltd. All rights reserved.
Abstract:
Medical image segmentation finds application in computer-aided diagnosis, computer-guided surgery, measuring tissue volumes, locating tumors, and pathologies. One approach to segmentation is to use active contours or snakes. Active contours start from an initialization (often manually specified) and are guided by image-dependent forces to the object boundary. Snakes may also be guided by gradient vector fields associated with an image. The first main result in this direction is that of Xu and Prince, who proposed the notion of gradient vector flow (GVF), which is computed iteratively. We propose a new formalism to compute the vector flow based on the notion of bilateral filtering of the gradient field associated with the edge map - we refer to it as the bilateral vector flow (BVF). The range kernel definition that we employ is different from the one employed in the standard Gaussian bilateral filter. The advantage of the BVF formalism is that smooth gradient vector flow fields with enhanced edge information can be computed noniteratively. The quality of image segmentation turned out to be on par with that obtained using the GVF and in some cases better than the GVF.
Abstract:
Scenic word images undergo degradations due to motion blur, uneven illumination, shadows and defocusing, which lead to difficulty in segmentation. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique in which we choose the middle row of the image as a sub-image and segment it first. Then, the labels from this segmented sub-image are propagated to the other pixels in the image. This approach, which is distinct from the existing methods, results in improved segmentation. Bayesian classification and Max-flow methods have been independently used for label propagation. This midline-based approach limits the impact of the degradations that affect the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and also of methods that competed in the Robust Reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.