31 results for Computer vision teaching

at Indian Institute of Science - Bangalore - India


Relevance:

80.00% 80.00%

Publisher:

Abstract:

Color displays used in image processing systems consist of a refresh memory buffer storing digital image data which are converted into analog signals to display an image by driving the primary color channels (red, green, and blue) of a color television monitor. The color cathode ray tube (CRT) of the monitor is unable to reproduce colors exactly due to phosphor limitations, exponential luminance response of the tube to the applied signal, and limitations imposed by the digital-to-analog conversion. In this paper we describe some computer simulation studies (using the U*V*W* color space) carried out to measure these reproduction errors. Further, a procedure to correct for color reproduction error due to the exponential luminance response (gamma) of the picture tube is proposed, using a video-lookup-table and a higher resolution digital-to-analog converter. It is found, on the basis of computer simulation studies, that the proposed gamma correction scheme is effective and robust with respect to variations in the assumed value of the gamma.
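
The lookup-table correction described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the gamma value of 2.2, the 8-bit input, the 10-bit digital-to-analog converter, and the function names `build_gamma_lut` and `displayed_luminance` are all assumptions chosen for the example.

```python
def build_gamma_lut(gamma=2.2, in_bits=8, out_bits=10):
    """Build a video lookup table that pre-distorts pixel codes.

    Assumes the tube's luminance is (drive signal) ** gamma, so the table
    stores drive = intended ** (1 / gamma), quantized to the DAC resolution.
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    lut = []
    for code in range(in_max + 1):
        intended = code / in_max           # desired relative luminance
        drive = intended ** (1.0 / gamma)  # inverse of the tube response
        lut.append(round(drive * out_max))
    return lut


def displayed_luminance(dac_code, gamma=2.2, out_bits=10):
    """Relative luminance produced by the assumed exponential tube model."""
    out_max = (1 << out_bits) - 1
    return (dac_code / out_max) ** gamma


lut = build_gamma_lut()
# Largest deviation from a perfectly linear response over all input codes.
worst = max(abs(displayed_luminance(lut[c]) - c / 255) for c in range(256))
```

The extra output bits matter because the inverse-gamma curve is steep near black: with an 8-bit converter, several dark input codes would collapse onto the same drive level.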

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Visual tracking has been a challenging problem in computer vision for decades. Its applications are far-reaching, ranging from surveillance and monitoring to smart rooms. The mean-shift (MS) tracker, which has gained attention recently, is known for its low computational complexity and its ability to track objects in cluttered environments. The major problem with the histogram-based MS tracker is its inability to track rapidly moving objects. To track fast-moving objects, we propose a new robust mean-shift tracker that uses both a spatial similarity measure and a color histogram-based similarity measure. The inability of the MS tracker to handle large displacements is circumvented by the spatial similarity-based tracking module, which on its own lacks robustness to changes in the object's appearance. The proposed tracker follows fast-moving objects more accurately than either individual tracker.
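
The large-displacement limitation of mean shift can be seen in a toy version of the iteration. The sketch below assumes a precomputed 2-D weight map (for example, a color-histogram back-projection) and a square window. The function name `mean_shift` and its parameters are hypothetical, and the spatial similarity module proposed in the abstract is not reproduced.

```python
def mean_shift(weights, center, half, iters=20):
    """Move a square window to the weighted centroid of the pixels it covers.

    weights: 2-D list of non-negative target likelihoods (assumed given).
    center:  (row, col) starting position of the window.
    half:    half-width of the window in pixels.
    """
    rows, cols = len(weights), len(weights[0])
    cy, cx = center
    for _ in range(iters):
        total = sy = sx = 0.0
        for y in range(max(0, cy - half), min(rows, cy + half + 1)):
            for x in range(max(0, cx - half), min(cols, cx + half + 1)):
                w = weights[y][x]
                total += w
                sy += w * y
                sx += w * x
        if total == 0:
            break  # no target evidence inside the window: the tracker is stuck
        ny, nx = round(sy / total), round(sx / total)
        if (ny, nx) == (cy, cx):
            break  # converged
        cy, cx = ny, nx
    return cy, cx


# Toy likelihood map with all the target evidence at pixel (6, 6).
grid = [[0.0] * 9 for _ in range(9)]
grid[6][6] = 1.0
near = mean_shift(grid, (4, 4), half=2)  # window overlaps the target
far = mean_shift(grid, (1, 1), half=2)   # target moved beyond the window
```

When the starting window still overlaps the target, the iteration converges on it. When the target has jumped outside the window, the window sees no evidence and stays where it was, which is the failure mode the abstract attributes to histogram-based mean shift.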

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We describe a novel method for human activity segmentation and interpretation in surveillance applications based on Gabor filter-bank features. A complex human activity is modeled as a sequence of elementary human actions such as walking, running, jogging, boxing, and hand-waving. Since a human silhouette can be modeled by a set of rectangles, the elementary human actions can be modeled as a sequence of sets of rectangles with different orientations and scales. The activity segmentation is based on Gabor filter-bank features and normalized spectral clustering. The feature trajectories of an action category are learnt from training example videos using dynamic time warping. The combined segmentation and recognition processes are very efficient, as both algorithms share the same framework and the Gabor features computed for the former can be reused for the latter. We also propose a simple shadow detection technique to extract good silhouettes, which are necessary for accurate action recognition.
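
Dynamic time warping, which the abstract uses to compare feature trajectories, can be illustrated with the textbook recurrence. This sketch assumes scalar features and an absolute-difference cost. The paper's trajectories are built from Gabor features, so a vector distance would take the place of `abs` there. The name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances while b repeats an element
                cost[i][j - 1],      # b advances while a repeats an element
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same trajectory performed at a different speed aligns at zero cost.
same_action = dtw_distance([1, 2, 3], [1, 2, 2, 3])
different_action = dtw_distance([0, 0], [1, 1])
```

Because the alignment may stretch or compress time, two executions of the same action at different speeds remain close, which is why this measure suits action trajectories.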

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper, we present a growing and pruning radial basis function (GAP-RBF) based no-reference (NR) image quality model for JPEG-coded images. The quality of the images is estimated without referring to their original images. The features for predicting the perceived image quality are extracted by considering key human visual sensitivity factors such as edge amplitude, edge length, background activity and background luminance. Image quality estimation involves computing the functional relationship between HVS features and subjective test scores. Here, the problem of quality estimation is transformed into a function approximation problem and solved using a GAP-RBF network. The GAP-RBF network uses a sequential learning algorithm to approximate the functional relationship. The GAP-RBF algorithm has lower computational complexity and memory requirements than batch learning algorithms. Also, the GAP-RBF algorithm finds a compact image quality model and does not require retraining when new image samples are presented. Experimental results show that the GAP-RBF image quality model emulates the mean opinion score (MOS). The proposed metric is compared on subjective test results with the JPEG no-reference image quality index as well as the full-reference structural similarity image quality index, and it is observed to outperform both.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

We propose two texture-based approaches, one involving Gabor filters and the other employing log-polar wavelets, for separating text from non-text elements in a document image. Both proposed algorithms compute local energy at information-rich points, which are marked by the Harris corner detector. The advantage of this approach is that the local energy is calculated only at selected points rather than throughout the image, which saves considerable computation time. The algorithms have been tested on a large set of scanned text pages, and the results are better than those of existing algorithms. Of the two proposed schemes, the Gabor filter based scheme marginally outperforms the wavelet based scheme.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Computer vision has seen a resurgence of parts-based representations for objects over the past few years. The parts are usually annotated beforehand for training. We present an annotation-free parts-based representation for pedestrians using Non-Negative Matrix Factorization (NMF). We show that NMF is able to capture the wide range of poses and clothing of pedestrians. We use a modified form of NMF, namely NMF with sparsity constraints on the factored matrices. We also use a Riemannian distance metric for similarity measurements in NMF space, as the basis vectors generated by NMF are not orthogonal. We show that, for a 1% drop in accuracy compared to the Histogram of Oriented Gradients (HOG) representation, we can achieve robustness to partial occlusion.
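
For readers unfamiliar with NMF, the sketch below factors a small non-negative matrix V into W and H using the standard multiplicative updates for the squared error. It is a plain-NMF illustration only: the sparsity constraints and the Riemannian distance mentioned in the abstract are omitted, and the helper names, the toy matrix, and the iteration count are assumptions.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Plain NMF by multiplicative updates (no sparsity term)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: rows 3 and 4 are non-negative combinations of rows 1 and 2.
V = [[1, 0, 2], [0, 1, 1], [2, 0, 4], [1, 1, 3]]
W0, H0 = nmf(V, rank=2, iters=0)  # random initial factors
W, H = nmf(V, rank=2)             # factors after the updates
```

Since every update multiplies by a non-negative ratio, W and H stay non-negative throughout. That additive, parts-like structure is what makes the rows of H interpretable as parts.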

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper introduces a scheme for classification of online handwritten characters based on polynomial regression of the sampled points of the sub-strokes in a character. The segmentation is based on the velocity profile of the written character, which requires smoothing of the velocity profile. We propose a novel scheme for smoothing the velocity profile curve and identifying the critical points at which to segment the character. We also propose another method for segmentation based on human eye perception. We then extract two sets of features for recognition of handwritten characters. Each sub-stroke is a simple curve, a part of the character, and is represented by the distance of each point from the first point. This forms the first feature vector for each character. The second feature vector consists of the coefficients obtained from the B-splines fitted to the control knots obtained from the segmentation algorithm. The feature vector is fed to an SVM classifier, which yields an efficiency of 68% using the polynomial regression technique and 74% using the spline fitting method.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

3D face recognition has been an active area of research for the past several years. A 3D face recognition system needs an accurate yet low-cost setup for constructing a 3D face model. In this paper, we use a profilometry approach to obtain a 3D face model. This method gives a low-cost solution to the problem of acquiring 3D data, and the 3D face models it generates are sufficiently accurate. We also develop an algorithm that uses the 3D face model generated by this method for recognition.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Image and video filtering is a key image-processing task in computer vision, especially in noisy environments. In most cases the noise source is unknown, which poses a major difficulty for the filtering operation. In this paper we present an error-correction based learning approach for iterative filtering. A new FIR filter is designed in which the filter coefficients are updated based on the Widrow-Hoff rule. Unlike the standard filter, the proposed filter can remove noise without a priori knowledge of the noise. Experimental results show that the proposed filter efficiently removes the noise and preserves the edges in the image. We demonstrate the capability of the proposed algorithm by testing it on standard images corrupted by Gaussian noise and on a real-time video containing inherent noise. Experimental results show that the proposed filter is better than some of the existing standard filters.
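
The Widrow-Hoff (LMS) rule adjusts each filter coefficient in proportion to the current error times the corresponding input sample. The abstract does not say how the error signal is formed, so the sketch below makes an assumption: a one-step linear predictor on a 1-D signal, where the error is the difference between the observed sample and its prediction from the preceding samples. The name `lms_filter`, the tap count, and the step size `mu` are illustrative choices.

```python
def lms_filter(signal, taps=4, mu=0.05):
    """Adaptive FIR predictor trained with the Widrow-Hoff (LMS) rule.

    Returns the predicted (filtered) sequence and the final coefficients.
    """
    w = [0.0] * taps
    out = []
    for n in range(len(signal)):
        # The most recent `taps` past samples, zero-padded at the start.
        x = [signal[n - 1 - k] if n - 1 - k >= 0 else 0.0 for k in range(taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))  # FIR output
        e = signal[n] - y                         # prediction error
        # Widrow-Hoff update: w <- w + mu * e * x
        w = [wk + mu * e * xk for wk, xk in zip(w, x)]
        out.append(y)
    return out, w


# On a constant input the prediction converges to the signal level.
predicted, coeffs = lms_filter([1.0] * 200)
```

The step size must be small relative to the input power for the update to remain stable. In this toy case `mu` times the squared norm of the input window is 0.2, well inside the stable range.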

Relevance:

80.00% 80.00%

Publisher:

Abstract:

A cooperative integration of stereopsis and shape-from-shading is presented. The integration makes the process of 3D surface reconstruction better constrained and more reliable. It also obviates the need for surface boundary conditions and for explicit information about the surface albedo and the light source direction, which can now be estimated in an iterative manner.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In many real-world prediction problems the output is a structured object such as a sequence, a tree, or a graph. Such problems range from natural language processing to computational biology and computer vision, and have been tackled using algorithms referred to as structured output learning algorithms. We consider the problem of structured classification. In the last few years, large margin classifiers like support vector machines (SVMs) have shown much promise for structured output learning. The related optimization problem is a convex quadratic program (QP) with a large number of constraints, which makes the problem intractable for large data sets. This paper proposes a fast sequential dual method (SDM) for structural SVMs. The method makes repeated passes over the training set and optimizes the dual variables associated with one example at a time. The use of additional heuristics makes the proposed method more efficient. We present an extensive empirical evaluation of the proposed method on several sequence learning problems. Our experiments on large data sets demonstrate that the proposed method is an order of magnitude faster than state-of-the-art methods like the cutting-plane method and the stochastic gradient descent method (SGD). Further, SDM reaches steady-state generalization performance faster than the SGD method. The proposed SDM is thus a useful alternative for large-scale structured output learning.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Scenic word images undergo degradations due to motion blur, uneven illumination, shadows and defocusing, which make segmentation difficult. As a result, the recognition results reported on the scenic word image datasets of ICDAR have been low. We introduce a novel technique in which we choose the middle row of the image as a sub-image and segment it first. The labels from this segmented sub-image are then propagated to the other pixels in the image. This approach, which is distinct from the existing methods, results in improved segmentation. Bayesian classification and max-flow methods have been independently used for label propagation. This midline-based approach limits the impact of the degradations that affect the image. The segmented text image is recognized using the trial version of Omnipage OCR. We have tested our method on the ICDAR 2003 and ICDAR 2011 datasets. Our word recognition results of 64.5% and 71.6% are better than those of methods in the literature and of methods that competed in the Robust Reading competition. Our method makes an implicit assumption that degradation is not present in the middle row.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Multi-view head-pose estimation in low-resolution, dynamic scenes is difficult due to blurred facial appearance and perspective changes as targets move around freely in the environment. Under these conditions, acquiring sufficient training examples to learn the dynamic relationship between position, face appearance and head-pose can be very expensive. Instead, a transfer learning approach is proposed in this work. After learning a weighted-distance function from many examples where the target position is fixed, we adapt these weights to the scenario where target positions vary. The adaptation framework incorporates the reliability of the different face regions for pose estimation under positional variation by transforming the target appearance to a canonical appearance corresponding to a reference scene location. Experimental results confirm the effectiveness of the proposed approach, which outperforms the state of the art by 9.5% under relevant conditions. To aid further research on this topic, we also make DPOSE, a dynamic, multi-view head-pose dataset with ground truth, publicly available with this paper.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Real-time object tracking is a critical task in many computer vision applications. Achieving rapid and robust tracking while handling changes in object pose and size, varying illumination and partial occlusion is challenging given limited computational resources. In this paper we propose a real-time object tracker in an ℓ1 framework that addresses these issues. In the proposed approach, dictionaries containing templates of overlapping object fragments are created. The candidate fragments are sparsely represented in the dictionary fragment space by solving the ℓ1-regularized least squares problem. The nonzero coefficients indicate the relative motion between the target and candidate fragments, along with a fidelity measure. The final object motion is obtained by fusing the reliable motion information. The dictionary is updated based on the object likelihood map. The proposed tracking algorithm is tested on various challenging videos and found to outperform earlier approaches.
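
The sparse-coding step can be illustrated with iterative soft thresholding (ISTA), one common way to solve the ℓ1-regularized least squares problem min 0.5*||Ax - b||^2 + lam*||x||_1. The abstract does not name the solver used, so ISTA is an assumption here, as are the function names and the toy dictionary.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, b, lam, step, iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by iterative soft thresholding.

    A is a list of rows (dictionary atoms are its columns). `step` should not
    exceed 1 / (largest eigenvalue of A^T A) for the iteration to converge.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = Ax - b and gradient g = A^T r of the quadratic term.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by the l1 proximal (shrinkage) step.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


# With an identity dictionary the solution is the soft-thresholded observation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs = ista(identity, [3.0, 0.2, -1.0], lam=0.5, step=1.0)
```

The shrinkage step is what produces exact zeros. Small coefficients are removed entirely, so only the dictionary fragments that genuinely explain the candidate keep nonzero weights.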