71 results for 280208 Computer Vision


Relevance:

80.00% 80.00%

Publisher:

Abstract:

This thesis focuses on the enhancement of differential optical flow techniques. The framework of differential optical flow is extended to improve object motion estimation within a video stream or image sequence. This augmentation takes the form of a combined optical flow and object state estimation method (SEOS).

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Correspondence estimation is one of the most active research areas in computer vision, and a number of techniques have been proposed, each with advantages and shortcomings. Among the techniques reported, stereo correspondence estimation based on multiresolution analysis has gained a lot of research attention in recent years. Although wavelet and multiwavelet bases are the most widely employed medium for multiresolution analysis, relatively little work has been reported in this context. In this work we address some of the issues in the existing work in this domain and its inherited shortcomings. In light of these shortcomings, we propose a new technique to overcome some of the flaws that could significantly affect algorithm performance and have not been addressed in earlier proposals. The proposed algorithm uses multiresolution analysis enforced with wavelet/multiwavelet transform modulus maxima to establish correspondences between the stereo pair of images. A variety of wavelet and multiwavelet bases, possessing distinct properties such as orthogonality, approximation order, short support and shape, are employed to analyse their effect on the performance of correspondence estimation. The aim is to provide a knowledge base for understanding and establishing relationships between wavelet and multiwavelet properties and their effect on the quality of stereo correspondence estimation.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Recognizing human actions efficiently and effectively from videos captured by modern cameras is a challenge in real applications. Traditional methods, which need professional analysts, are facing a bottleneck because of their shortcomings. To cope with this disadvantage, methods based on computer vision techniques, requiring little or no human intervention, have been proposed to analyse human actions in videos automatically. This paper provides a method combining the three-dimensional Scale Invariant Feature Transform (SIFT) detector and the Latent Dirichlet Allocation (LDA) model for human motion analysis. To represent videos effectively and robustly, we extract the 3D SIFT descriptor around each interest point, which is sampled densely from 3D space-time video volumes. After obtaining the representation of each video frame, the LDA model is adopted to discover the underlying structure, namely the categorization of human actions in the collection of videos. Publicly available standard datasets are used to test our method. The concluding part discusses the research challenges and future directions.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Human action recognition has attracted a lot of interest from computer vision researchers due to its various promising applications. In this paper, we employ the Pyramid Histogram of Orientation Gradient (PHOG) to characterize human figures for action recognition. Compared with silhouette-based features, the PHOG descriptor does not require extraction of human silhouettes or contours. Two state-space models, namely the Hidden Markov Model (HMM) and the Conditional Random Field (CRF), are adopted to model the dynamic human movement. The proposed PHOG descriptor and the state-space models are tested with different parameter settings on a standard dataset. We also test the robustness of the method with respect to various unconstrained conditions and viewpoints. Promising experimental results demonstrate the effectiveness and robustness of our proposed method.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Offline handwriting recognition is an important automated process in the fields of pattern recognition and computer vision. This paper presents a polar coordinate-based handwriting recognition system that uses Support Vector Machine (SVM) classification to achieve high recognition performance. We compare and evaluate zoning feature extraction methods applied in the polar system. The proposed system was trained and tested using an SVM on a set of 650 handwritten character images. All the input images are segmented (isolated) handwritten characters. Compared with a Cartesian-based handwriting recognition system, the recognition rate is more stable and improved, reaching up to 86.63%.
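
The abstract does not say how its polar zones are defined, so the following is only a minimal sketch of one plausible reading: the character's ink pixels are binned into rings and angular sectors around the ink centroid, and the fraction of ink per zone becomes the feature vector. The function name `polar_zoning_features` and the parameters `n_rings` and `n_sectors` are hypothetical and are not taken from the paper.

```python
import math


def polar_zoning_features(image, n_rings=3, n_sectors=8):
    """Fraction of ink pixels in each (ring, sector) zone around the ink centroid.

    `image` is a list of rows of 0/1 values, where 1 marks an ink pixel.
    The zone layout is an assumption made for illustration only.
    """
    ink = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not ink:
        return [0.0] * (n_rings * n_sectors)

    # Centre the polar system on the centroid of the ink pixels.
    cx = sum(x for x, _ in ink) / len(ink)
    cy = sum(y for _, y in ink) / len(ink)
    # Largest radius defines the outer ring; fall back to 1.0 for a single pixel.
    r_max = max(math.hypot(x - cx, y - cy) for x, y in ink) or 1.0

    counts = [0] * (n_rings * n_sectors)
    for x, y in ink:
        r = math.hypot(x - cx, y - cy)
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
        # Clamp so that points on the outer boundary fall in the last zone.
        ring = min(int(r / r_max * n_rings), n_rings - 1)
        sector = min(int(theta / (2 * math.pi) * n_sectors), n_sectors - 1)
        counts[ring * n_sectors + sector] += 1

    return [c / len(ink) for c in counts]
```

A vector like this could then be passed to any SVM implementation; the choice of three rings and eight sectors is arbitrary here.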

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Automated tracking of objects through a sequence of images has remained one of the difficult problems in computer vision. Numerous algorithms and techniques have been proposed for this task. Some algorithms perform well in restricted environments, such as tracking using stationary cameras, but a general solution is not currently available. A frequent problem is that when an algorithm is refined for one application, it becomes unsuitable for other applications. This paper proposes a general tracking system based on a different approach. Rather than refine one algorithm for a specific tracking task, two tracking algorithms are employed, and used to correct each other during the tracking task. By choosing the two algorithms such that they have complementary failure modes, a robust algorithm is created without increased specialisation.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper we describe a new technique for 1-D and 2-D edge feature extraction to subpixel accuracy using edge models and the local energy approach. A candidate edge is modeled as one of a number of parametric edge models, and the fit is refined by a least-squares error fitting technique.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper describes a low-cost interactive active monocular range finder and illustrates the effect of introducing interactivity to the range acquisition process. The range finder consists of only one camera and a laser pointer, to which three LEDs are attached. When a user scans the laser along surfaces of objects, the camera captures the image of spots (one from the laser, and the others from LEDs), and triangulation is carried out using the camera's viewing direction and the optical axis of the laser. The user interaction allows the range finder to acquire range data in which the sampling rate varies across the object depending on the underlying surface structures. Moreover, the processes of separating objects from the background and/or finding parts in the object can be achieved using the operator's knowledge of the objects.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Interpretation of video information is a difficult task for computer vision and machine intelligence. In this paper we examine the utility of a non-image based source of information about video contents, namely the shot list, and study its use in aiding image interpretation. We show how the shot list may be analysed to produce a simple summary of the 'who and where' of a documentary or interview video. In order to detect the subject of a video we use the notion of a 'shot syntax' of a particular genre to isolate actual interview sections.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper, we present techniques for inverting sparse, symmetric and positive definite matrices on parallel and distributed computers. We propose two algorithms, one for SIMD implementation and the other for MIMD implementation. These algorithms are modified versions of Gaussian elimination and they take into account the sparseness of the matrix. Our algorithms perform better than the general parallel Gaussian elimination algorithm. In order to demonstrate the usefulness of our technique, we implemented the snake problem using our sparse matrix algorithm. Our studies reveal that the proposed sparse matrix inversion algorithm significantly reduces the time taken for obtaining the solution of the snake problem. In this paper, we present the results of our experimental work.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

One of the possible models of the human visual system (HVS) in the computer vision literature has a high-resolution fovea and a periphery whose resolution decreases exponentially. The high-resolution fovea is used to extract the information needed to solve a vision task, and the periphery may be used to detect motion. To obtain the desired information, the fovea is guided by the contents of the scene and other knowledge so that it is positioned over areas of interest. These eye movements are called saccades and corrective saccades. A two-stage process has been implemented as a mechanism for changing foveation in log polar space. Initially, the open loop stage roughly foveates on the best interest feature, and then the closed loop stage is invoked to converge iteratively and accurately onto the foveation point. The open loop stage developed for the foveation algorithm is applied to saccadic eye movements and a tracking system. Log polar space is preferred over Cartesian space because: (1) it simultaneously provides high resolution and a wide viewing angle; and (2) feature invariance occurs in the fovea, which simplifies the foveation process.
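
The abstract does not give its log polar mapping, so the sketch below assumes the standard one: a point is described by the logarithm of its distance from the fixation point and by its angle. The function name `to_log_polar` and the fixation parameters `cx` and `cy` are hypothetical. Under this assumed mapping, scaling the scene about the fixation point only shifts the first coordinate, which is one common reading of the invariance property mentioned above.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a Cartesian point to (log-radius, angle) about the fixation point (cx, cy).

    This is the textbook log-polar transform, assumed here for illustration;
    the paper's exact mapping is not stated in the abstract.
    """
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        # log(0) is undefined, so the fixation point has no log-polar image.
        raise ValueError("the fixation point itself has no log-polar image")
    return math.log(r), math.atan2(dy, dx)


# Doubling the distance from the fixation point shifts the log-radius by log(2)
# and leaves the angle unchanged.
rho_near, theta_near = to_log_polar(3.0, 4.0)
rho_far, theta_far = to_log_polar(6.0, 8.0)
```

Because equal steps in log-radius cover ever larger distances, a fixed-size log polar image samples the centre densely and the periphery coarsely, which matches the fovea and periphery model described in the abstract.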

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper describes a general-purpose, flexible technique that uses physical modelling to determine the features of a 3D object that are visible from any predefined view. Physical modelling techniques are used to determine which of many different types of features are visible from a complete set of viewpoints. The power of this technique lies in its ability to detect and parameterise object features, regardless of object complexity. Raytracing is used to simulate the physical process by which object features are visible, so that surface properties (e.g. specularity, transparency) as well as object boundaries can be used in the recognition process. Using this technique, occluding and non-occluding edge-based features are extracted using image processing techniques and then parameterised. Features caused by specularity are also extracted, and qualitative descriptions for these are defined.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Recently, a simple yet powerful branch-and-bound method called Efficient Subwindow Search (ESS) was developed to speed up sliding window search in object detection. A major drawback of ESS is that its computational complexity varies widely from O(n^2) to O(n^4) for n × n matrices. Our experimental experience shows that the performance of ESS is highly related to the optimal confidence levels, which indicate the probability of the object's presence. In particular, when the object is not in the image, the optimal subwindow scores low and ESS may take a large number of iterations to converge to the optimal solution and so perform very slowly. Addressing this problem, we present two significantly faster methods based on the linear-time Kadane's Algorithm for 1D maximum subarray search. The first algorithm is a novel, computationally superior branch-and-bound method where the worst case complexity is reduced to O(n^3). Experiments on the PASCAL VOC 2006 data set demonstrate that this method is significantly and consistently faster (approximately 30 times faster on average) than the original ESS. Our second algorithm is an approximate algorithm based on alternating search, whose computational complexity is typically O(n^2). Experiments show that (on average) it is 30 times faster again than our first algorithm, or 900 times faster than ESS. It is thus well suited for real-time object detection.
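
For readers unfamiliar with the building block named above, here is a minimal sketch of the classical Kadane's algorithm for the 1D maximum subarray problem. It is the textbook version only; how the authors extend it to 2D subwindow search is not described in the abstract and is not attempted here. The function name `kadane` and its return convention are choices made for this example.

```python
def kadane(scores):
    """Return (best_sum, start, end) for the maximum-sum contiguous run.

    `start` and `end` are inclusive indices. Runs in a single linear pass.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    best_sum = current_sum = scores[0]
    best_start = best_end = current_start = 0

    for i in range(1, len(scores)):
        if current_sum < 0:
            # A negative running sum can only hurt, so restart the run here.
            current_sum = scores[i]
            current_start = i
        else:
            current_sum += scores[i]

        if current_sum > best_sum:
            best_sum = current_sum
            best_start, best_end = current_start, i

    return best_sum, best_start, best_end


print(kadane([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # prints (6, 3, 6)
```

The single pass keeps only a running sum and the best run seen so far, which is why the 1D search is linear in the number of scores.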