986 results for Computer vision teaching


Relevance:

80.00%

Publisher:

Abstract:

Human action recognition has attracted considerable interest from computer vision researchers because of its many promising applications. In this paper, we employ the Pyramid Histogram of Orientation Gradient (PHOG) to characterize human figures for action recognition. Compared with silhouette-based features, the PHOG descriptor does not require extraction of human silhouettes or contours. Two state-space models, i.e., the Hidden Markov Model (HMM) and the Conditional Random Field (CRF), are adopted to model the dynamic human movement. The proposed PHOG descriptor and the state-space models are tested with different parameter settings on a standard dataset. We also test the robustness of the method with respect to various unconstrained conditions and viewpoints. Promising experimental results demonstrate the effectiveness and robustness of our proposed method.

Relevance:

80.00%

Publisher:

Abstract:

Offline handwriting recognition is an important automated process in the fields of pattern recognition and computer vision. This paper presents a polar coordinate-based handwriting recognition system that uses Support Vector Machine (SVM) classification to achieve high recognition performance. We compare and evaluate zoning feature extraction methods applied in the polar system. The proposed system was trained and tested using SVM on a set of 650 handwritten character images. All the input images are segmented (isolated) handwritten characters. Compared with a Cartesian-based handwriting recognition system, the recognition rate is more stable and improved up to 86.63%.
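The abstract does not spell out its zoning scheme, so the following is only a minimal sketch of what zoning in polar coordinates could look like: the character is split into rings and angular sectors around its centroid, and each zone contributes the fraction of foreground pixels it contains. The function name, the ring and sector counts, and the choice of the centroid as origin are all assumptions made for illustration, not details taken from the paper.

```python
import math


def polar_zoning_features(image, num_rings=3, num_sectors=8):
    """Return one foreground-density value per (ring, sector) zone.

    `image` is a list of rows of 0/1 values (1 = ink). Zones are laid out
    around the ink centroid; this layout is a hypothetical choice.
    """
    pixels = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    num_zones = num_rings * num_sectors
    if not pixels:
        return [0.0] * num_zones

    # Centroid of the ink pixels acts as the pole of the polar system.
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    max_radius = max(math.hypot(x - cx, y - cy) for x, y in pixels) or 1.0

    counts = [0] * num_zones
    for x, y in pixels:
        radius = math.hypot(x - cx, y - cy)
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        # Clamp so the outermost pixel and angle wrap-around stay in range.
        ring = min(int(radius / max_radius * num_rings), num_rings - 1)
        sector = min(int(angle / (2 * math.pi) * num_sectors), num_sectors - 1)
        counts[ring * num_sectors + sector] += 1

    # Normalise so the feature vector sums to 1 regardless of stroke width.
    return [count / len(pixels) for count in counts]
```

A feature vector produced this way could then be passed to any SVM implementation; the classifier itself is outside the scope of this sketch.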

Relevance:

80.00%

Publisher:

Abstract:

Automated tracking of objects through a sequence of images has remained one of the difficult problems in computer vision. Numerous algorithms and techniques have been proposed for this task. Some algorithms perform well in restricted environments, such as tracking using stationary cameras, but a general solution is not currently available. A frequent problem is that when an algorithm is refined for one application, it becomes unsuitable for other applications. This paper proposes a general tracking system based on a different approach. Rather than refining one algorithm for a specific tracking task, two tracking algorithms are employed and used to correct each other during the tracking task. By choosing the two algorithms such that they have complementary failure modes, a robust algorithm is created without increased specialisation.

Relevance:

80.00%

Publisher:

Abstract:

In this paper we describe a new technique for 1-D and 2-D edge feature extraction to subpixel accuracy using edge models and the local energy approach. A candidate edge is modeled as one of a number of parametric edge models, and the fit is refined by a least-squares error fitting technique.

Relevance:

80.00%

Publisher:

Abstract:

This paper describes a low-cost interactive active monocular range finder and illustrates the effect of introducing interactivity to the range acquisition process. The range finder consists of only one camera and a laser pointer, to which three LEDs are attached. When a user scans the laser along surfaces of objects, the camera captures the image of spots (one from the laser, and the others from LEDs), and triangulation is carried out using the camera's viewing direction and the optical axis of the laser. The user interaction allows the range finder to acquire range data in which the sampling rate varies across the object depending on the underlying surface structures. Moreover, the processes of separating objects from the background and/or finding parts in the object can be achieved using the operator's knowledge of the objects.

Relevance:

80.00%

Publisher:

Abstract:

Interpretation of video information is a difficult task for computer vision and machine intelligence. In this paper we examine the utility of a non-image based source of information about video contents, namely the shot list, and study its use in aiding image interpretation. We show how the shot list may be analysed to produce a simple summary of the 'who and where' of a documentary or interview video. In order to detect the subject of a video we use the notion of a 'shot syntax' of a particular genre to isolate actual interview sections.

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we present techniques for inverting sparse, symmetric and positive definite matrices on parallel and distributed computers. We propose two algorithms, one for SIMD implementation and the other for MIMD implementation. These algorithms are modified versions of Gaussian elimination and they take into account the sparseness of the matrix. Our algorithms perform better than the general parallel Gaussian elimination algorithm. In order to demonstrate the usefulness of our technique, we implemented the snake problem using our sparse matrix algorithm. Our studies reveal that the proposed sparse matrix inversion algorithm significantly reduces the time taken for obtaining the solution of the snake problem. In this paper, we present the results of our experimental work.

Relevance:

80.00%

Publisher:

Abstract:

One of the possible models of the human visual system (HVS) in the computer vision literature has a high-resolution fovea and a periphery whose resolution decreases exponentially. The high-resolution fovea is used to extract the information needed to solve a vision task, and the periphery may be used to detect motion. To obtain the desired information, the fovea is guided by the contents of the scene and other knowledge so that it is positioned over areas of interest. These eye movements are called saccades and corrective saccades. A two-stage process has been implemented as a mechanism for changing foveation in log-polar space. Initially, the open-loop stage roughly foveates on the best interest feature, and then the closed-loop stage is invoked to converge iteratively and accurately onto the foveation point. The open-loop stage developed for the foveation algorithm is applied to saccadic eye movements and a tracking system. Log-polar space is preferred over Cartesian space because: (1) it simultaneously provides high resolution and a wide viewing angle; and (2) feature invariance occurs in the fovea, which simplifies the foveation process.
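The invariance mentioned in point (2) can be illustrated with the standard log-polar coordinate change. This is a minimal sketch and not the authors' foveation algorithm: a point at distance r and angle θ from the fixation point maps to (log r, θ), so scaling the scene about the fixation point only shifts the first coordinate and rotating it only shifts the second. The function names and the default fixation point at the origin are assumptions for illustration.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (rho, theta) about the fixation point (cx, cy).

    rho is the natural log of the distance; theta is the angle in radians.
    """
    dx, dy = x - cx, y - cy
    radius = math.hypot(dx, dy)
    if radius == 0.0:
        raise ValueError("the fixation point itself has no log-polar image")
    return math.log(radius), math.atan2(dy, dx)


def from_log_polar(rho, theta, cx=0.0, cy=0.0):
    """Inverse mapping back to Cartesian coordinates."""
    radius = math.exp(rho)
    return cx + radius * math.cos(theta), cy + radius * math.sin(theta)
```

For example, doubling the distance of a point from the fixation point leaves theta unchanged and adds log 2 to rho, which is why a feature that grows or shrinks about the fovea merely slides along one axis of the log-polar image.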

Relevance:

80.00%

Publisher:

Abstract:

This paper describes a general-purpose, flexible technique which uses physical modelling to determine the features of a 3D object that are visible from any predefined view. Physical modelling techniques are used to determine which of many different types of features are visible from a complete set of viewpoints. The power of this technique lies in its ability to detect and parameterise object features, regardless of object complexity. Raytracing is used to simulate the physical process by which object features are visible so that surface properties (e.g., specularity, transparency) as well as object boundaries can be used in the recognition process. Using this technique, occluding and non-occluding edge-based features are extracted using image processing techniques and then parameterised. Features caused by specularity are also extracted, and qualitative descriptions for these are defined.

Relevance:

80.00%

Publisher:

Abstract:

Recently, a simple yet powerful branch-and-bound method called Efficient Subwindow Search (ESS) was developed to speed up sliding window search in object detection. A major drawback of ESS is that its computational complexity varies widely from O(n^2) to O(n^4) for n × n matrices. Our experimental experience shows that the performance of ESS is highly related to the optimal confidence levels, which indicate the probability of the object's presence. In particular, when the object is not in the image, the optimal subwindow scores low and ESS may take a large number of iterations to converge to the optimal solution and so perform very slowly. Addressing this problem, we present two significantly faster methods based on the linear-time Kadane's Algorithm for 1D maximum subarray search. The first algorithm is a novel, computationally superior branch-and-bound method where the worst-case complexity is reduced to O(n^3). Experiments on the PASCAL VOC 2006 data set demonstrate that this method is significantly and consistently faster (approximately 30 times faster on average) than the original ESS. Our second algorithm is an approximate algorithm based on alternating search, whose computational complexity is typically O(n^2). Experiments show that (on average) it is 30 times faster again than our first algorithm, or 900 times faster than ESS. It is thus well suited for real-time object detection.
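The linear-time 1D building block named in the abstract can be sketched as follows. This is a minimal illustration of Kadane's algorithm together with its textbook O(n^3) extension to a 2D score grid (fix a pair of rows, collapse the columns between them, run the 1D search). It is not the authors' branch-and-bound or alternating-search method, and the function names are hypothetical.

```python
def max_subarray(values):
    """Kadane's algorithm: return (best_sum, start, end), end inclusive."""
    best_sum, best_start, best_end = values[0], 0, 0
    current_sum, current_start = values[0], 0
    for i in range(1, len(values)):
        if current_sum < 0:
            # A negative running sum can only hurt, so restart at i.
            current_sum, current_start = values[i], i
        else:
            current_sum += values[i]
        if current_sum > best_sum:
            best_sum, best_start, best_end = current_sum, current_start, i
    return best_sum, best_start, best_end


def max_subwindow(scores):
    """Best-scoring rectangle in a 2D grid of per-pixel scores.

    Returns (best_sum, top, left, bottom, right) with inclusive bounds.
    For an n x n grid this takes O(n^3) time: O(n^2) row pairs, each
    followed by one O(n) Kadane pass over the collapsed columns.
    """
    rows, cols = len(scores), len(scores[0])
    best = None
    for top in range(rows):
        column_sums = [0] * cols
        for bottom in range(top, rows):
            # Extend the collapsed strip by one more row.
            for c in range(cols):
                column_sums[c] += scores[bottom][c]
            total, left, right = max_subarray(column_sums)
            if best is None or total > best[0]:
                best = (total, top, left, bottom, right)
    return best
```

In a detection setting the grid would hold signed per-pixel classifier contributions, so the maximum-sum rectangle is the highest-scoring subwindow; that interpretation follows the ESS setting described above, while the exhaustive row-pair loop is simply the classical baseline.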

Relevance:

80.00%

Publisher:

Abstract:

We propose a joint representation and classification framework that achieves the dual goal of finding the most discriminative sparse overcomplete encoding and optimal classifier parameters. Formulating an optimization problem that combines the objective function of the classification with the representation error of both labeled and unlabeled data, constrained by sparsity, we propose an algorithm that alternates between solving for subsets of parameters, whilst preserving the sparsity. The method is then evaluated over two important classification problems in computer vision: object categorization of natural images using the Caltech 101 database and face recognition using the Extended Yale B face database. The results show that the proposed method is competitive against other recently proposed sparse overcomplete counterparts and considerably outperforms many recently proposed face recognition techniques when the number of training samples is small.

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we present novel ridge regression (RR) and kernel ridge regression (KRR) techniques for multivariate labels and apply the methods to the problem of face recognition. Motivated by the fact that the regular simplex vertices are separate points with the highest degree of symmetry, we choose such vertices as the targets for the distinct individuals in recognition and apply RR or KRR to map the training face images into a face subspace where the training images from each individual lie near their individual targets. We identify a new face image by mapping it into this face subspace and comparing its distance to all individual targets. An efficient cross-validation algorithm is also provided for selecting the regularization and kernel parameters. Experiments were conducted on two face databases and the results demonstrate that the proposed algorithm significantly outperforms the three popular linear face recognition techniques (Eigenfaces, Fisherfaces and Laplacianfaces) and also performs comparably with the recently developed Orthogonal Laplacianfaces with the advantage of computational speed. Experimental results also demonstrate that KRR outperforms RR, as expected, since KRR can utilize the nonlinear structure of the face images. Although we concentrate on face recognition in this paper, the proposed method is general and may be applied to other multi-category classification problems.
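The pipeline described above (simplex-vertex targets, a ridge-regression map, nearest-target classification) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the simplex is built as centred unit vectors in c dimensions rather than in a (c-1)-dimensional space, the regulariser `lam` is an arbitrary value, no bias term or kernel is used, and all function names are hypothetical.

```python
import math


def simplex_targets(num_classes):
    """One regular-simplex vertex per class: unit vector e_i minus the mean."""
    c = num_classes
    return [[(1.0 if i == j else 0.0) - 1.0 / c for j in range(c)]
            for i in range(c)]


def solve(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_ridge(features, labels, num_classes, lam=1e-3):
    """Return W solving (X^T X + lam I) W = X^T T, where row k of T is the
    simplex vertex of sample k's class."""
    targets = simplex_targets(num_classes)
    dim = len(features[0])
    gram = [[sum(x[i] * x[j] for x in features) + (lam if i == j else 0.0)
             for j in range(dim)] for i in range(dim)]
    rhs = [[sum(x[i] * targets[y][k] for x, y in zip(features, labels))
            for k in range(num_classes)] for i in range(dim)]
    return solve(gram, rhs)


def predict(weights, x, num_classes):
    """Map x through W and return the class whose vertex is nearest."""
    targets = simplex_targets(num_classes)
    mapped = [sum(x[i] * weights[i][k] for i in range(len(x)))
              for k in range(num_classes)]
    return min(range(num_classes), key=lambda c: math.dist(mapped, targets[c]))
```

Because every pair of vertices is the same distance apart, no class is favoured by the geometry of the targets, which is the symmetry argument the abstract appeals to.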

Relevance:

80.00%

Publisher:

Abstract:

Activity recognition is an important issue in building intelligent monitoring systems. We address the recognition of multilevel activities in this paper via a conditional Markov random field (MRF), known as the dynamic conditional random field (DCRF). Parameter estimation in general MRFs using maximum likelihood is known to be computationally challenging (except for extreme cases), and thus we propose an efficient boosting-based algorithm, AdaBoost.MRF, for this task. Distinct from most existing work, our algorithm can handle hidden variables (missing labels) and is particularly attractive for smart-house domains where reliable labels are often sparsely observed. Furthermore, our method works exclusively on trees and thus is guaranteed to converge. We apply the AdaBoost.MRF algorithm to a home video surveillance application and demonstrate its efficacy.

Relevance:

80.00%

Publisher:

Abstract:

Traditional methods of object recognition are reliant on shape and so are very difficult to apply in cluttered, wide-angle and low-detail views such as surveillance scenes. To address this, a method of indirect object recognition is proposed, where human activity is used to infer both the location and identity of objects. No shape analysis is necessary. The concept is dubbed 'interaction signatures', since the premise is that a human will interact with objects in ways characteristic of the function of that object: for example, a person sits in a chair and drinks from a cup. The human-centred approach means that recognition is possible in low-detail views and is largely invariant to the shape of objects within the same functional class. This paper implements a Bayesian network for classifying region patches with object labels, building upon our previous work in automatically segmenting and recognising a human's interactions with the objects. Experiments show that interaction signatures can successfully find and label objects in low-detail views and are equally effective at recognising test objects that differ markedly in appearance from the training objects.