994 results for video sequence matching


Relevance:

40.00% 40.00%

Publisher:

Abstract:

Given a set of events and a set of robots, the dispatch problem is to allocate one robot for each event to visit it. In a single round, each robot may be allowed to visit only one event (matching dispatch), or several events in a sequence (sequence dispatch). In a distributed setting, each event is discovered by a sensor and reported to a robot. Here, we present novel algorithms aimed at overcoming the shortcomings of several existing solutions. We propose a pairwise distance-based matching algorithm (PDM) that eliminates long edges through pairwise exchanges between matching pairs. Our sequence dispatch algorithm (SQD) iteratively finds the closest event-robot pair, adds the event to the dispatch schedule of the selected robot, and updates the robot's position accordingly. When event-robot distances are multiplied by robot resistance (the inverse of the remaining energy), the corresponding energy-balanced variants are obtained. We also present generalizations that handle multiple visits and timing constraints. Our localized algorithm MAD is based on an information mesh infrastructure and local auctions within the robot network for obtaining the optimal dispatch schedule for each robot. The simulations conducted confirm the advantages of our algorithms over other existing solutions in terms of average robot-event distance and lifetime.
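The greedy loop that the abstract describes for SQD can be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation: the function name `sequence_dispatch`, the 2-D Euclidean distance, and the optional `resistance` mapping are assumptions made for the example.

```python
import math

def sequence_dispatch(robots, events, resistance=None):
    """Greedy sketch of sequence dispatch (hypothetical helper).

    robots: dict robot id -> (x, y) start position
    events: dict event id -> (x, y) location
    resistance: optional dict robot id -> multiplier (e.g. 1 / remaining energy)
    """
    resistance = resistance or {}
    positions = dict(robots)                  # current position of each robot
    schedules = {rid: [] for rid in robots}   # ordered events per robot
    pending = dict(events)                    # events not yet assigned
    while pending:
        # Pick the (robot, event) pair with the smallest weighted distance.
        rid, eid = min(
            ((r, e) for r in positions for e in pending),
            key=lambda p: math.dist(positions[p[0]], pending[p[1]])
            * resistance.get(p[0], 1.0),
        )
        schedules[rid].append(eid)
        positions[rid] = pending.pop(eid)     # the robot moves to that event
    return schedules

plan = sequence_dispatch(
    {"r1": (0, 0), "r2": (10, 0)},
    {"a": (1, 0), "b": (3, 0), "c": (9.5, 0)},
)
```

Passing a `resistance` mapping gives the energy-balanced flavour mentioned in the abstract: a robot with little remaining energy sees every distance scaled up and is chosen less often.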

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Identifying an individual from surveillance video is a difficult, time-consuming and labour-intensive process. The proposed system aims to streamline this process by filtering out unwanted scenes and enhancing an individual's face through super-resolution. An automatic face recognition system is then used to identify the subject or present the human operator with likely matches from a database. A person tracker is used to speed up the subject detection and super-resolution process by tracking moving subjects and cropping a region of interest around the subject's face, reducing the number and size of the image frames to be super-resolved. In this paper, experiments have been conducted to demonstrate how the optical flow super-resolution method used improves surveillance imagery for visual inspection as well as automatic face recognition on an Eigenface and Elastic Bunch Graph Matching system. The optical flow based method has also been benchmarked against the "hallucination" algorithm, interpolation methods and the original low-resolution images. Results show that both super-resolution algorithms improved recognition rates significantly. Although the hallucination method resulted in slightly higher recognition rates, the optical flow method produced fewer artifacts and more visually correct images suitable for human consumption.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Visual recording devices such as video cameras, CCTVs, or webcams have been broadly used to facilitate work progress or safety monitoring on construction sites. Without human intervention, however, both real-time reasoning about captured scenes and interpretation of recorded images are challenging tasks. This article presents an exploratory method for automated object identification using standard video cameras on construction sites. The proposed method supports real-time detection and classification of mobile heavy equipment and workers. The background subtraction algorithm extracts motion pixels from an image sequence, the pixels are then grouped into regions to represent moving objects, and finally the regions are identified as a certain object using classifiers. For evaluating the method, the formulated computer-aided process was implemented on actual construction sites, and promising results were obtained. This article is expected to contribute to future applications of automated monitoring systems of work zone safety or productivity.
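The first two stages of the pipeline described above (extracting motion pixels, then grouping them into regions) might look like the sketch below. It assumes a static reference background, a fixed intensity threshold, and 4-connected grouping; the abstract does not say which background model or grouping rule the article uses, and the final classifier stage is left out. The name `moving_regions` is hypothetical.

```python
def moving_regions(background, frame, threshold=20):
    """Return bounding boxes (top, left, bottom, right) of moving regions.

    background, frame: 2-D lists of grey-level intensities of equal size.
    """
    h, w = len(frame), len(frame[0])
    # Stage 1: motion mask from simple background differencing.
    mask = [[abs(frame[y][x] - background[y][x]) > threshold for x in range(w)]
            for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    # Stage 2: group motion pixels into 4-connected regions (flood fill).
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack, ys, xs = [(y, x)], [], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                ys.append(cy)
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((min(ys), min(xs), max(ys), max(xs)))
    return boxes

background = [[0] * 5 for _ in range(4)]
frame = [row[:] for row in background]
frame[0][0] = frame[0][1] = frame[3][4] = 100   # two separate moving blobs
boxes = moving_regions(background, frame)
```

Each box would then be cropped and handed to a classifier to decide whether it is a worker or a piece of equipment.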

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study examined the effect that temporal order within the entrepreneurial discovery-exploitation process has on the outcomes of venture creation. Consistent with sequential theories of discovery-exploitation, the general flow of venture creation was found to be directed from discovery toward exploitation in a random sample of nascent ventures. However, venture creation attempts that specifically follow this sequence derive poor outcomes. Moreover, simultaneous discovery-exploitation was the most prevalent temporal order observed, and venture attempts that proceed in this manner are more likely to become operational. These findings suggest that venture creation is a multi-scale phenomenon that is at once directional in time and simultaneously driven by symbiotically coupled discovery and exploitation.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper we present a novel algorithm for localization during navigation that performs matching over local image sequences. Instead of calculating the single location most likely to correspond to a current visual scene, the approach finds candidate matching locations within every section (subroute) of all learned routes. Through this approach, we reduce the demands upon the image processing front-end, requiring it only to pick the best matching image correctly from within a short local image sequence, rather than globally. We applied this algorithm to a challenging downhill mountain biking visual dataset where there was significant perceptual or environmental change between repeated traverses of the environment, and compared its performance with that of the feature-based algorithm FAB-MAP. The results demonstrate the potential for localization using visual sequences, even when there are no visual features that can be reliably detected.
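The candidate-selection step described above, picking the best match inside every subroute rather than one global best, can be illustrated with a small sketch. The sum-of-absolute-differences score, the fixed subroute length, and the function names are assumptions for the example; the abstract does not specify the image comparison used, and the later step that reasons over sequences of candidates is omitted.

```python
def sad(a, b):
    """Sum of absolute differences between two flattened images."""
    return sum(abs(x - y) for x, y in zip(a, b))

def candidates_per_subroute(query, route, subroute_len):
    """Index of the best-matching stored image within each subroute."""
    picks = []
    for start in range(0, len(route), subroute_len):
        section = route[start:start + subroute_len]
        best = min(range(len(section)), key=lambda i: sad(query, section[i]))
        picks.append(start + best)   # convert local index to route index
    return picks

# Six stored "images" (two pixels each), split into two subroutes of three.
route = [[0, 0], [5, 5], [9, 9], [1, 1], [4, 4], [8, 8]]
picks = candidates_per_subroute([4, 5], route, subroute_len=3)
```

Because the comparison is only ever local, the front-end never has to rank an image against the whole map, which is the relaxation the abstract highlights.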

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The time consuming and labour intensive task of identifying individuals in surveillance video is often challenged by poor resolution and the sheer volume of stored video. Faces or identifying marks such as tattoos are often too coarse for direct matching by machine or human vision. Object tracking and super-resolution can then be combined to facilitate the automated detection and enhancement of areas of interest. The object tracking process enables the automatic detection of people of interest, greatly reducing the amount of data for super-resolution. Smaller regions such as faces can also be tracked. A number of instances of such regions can then be utilized to obtain a super-resolved version for matching. Performance improvement from super-resolution is demonstrated using a face verification task. It is shown that there is a consistent improvement of approximately 7% in verification accuracy, using both Eigenface and Elastic Bunch Graph Matching approaches for automatic face verification, starting from faces with an eye to eye distance of 14 pixels. Visual improvement in image fidelity from super-resolved images over low-resolution and interpolated images is demonstrated on a small database. Current research and future directions in this area are also summarized.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion can exert a profound influence on robustness, while a suitable feature descriptor largely determines the performance. Most attention in FER to date has been paid to addressing variations in pose and illumination. No approach has been reported for handling face localization errors and relatively few for overcoming facial occlusions, although the significant impact of these two variations on performance has been demonstrated and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and examine the performance improvement arising from fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid to recognizing naturalistic facial expressions in real-world data. This thesis investigates techniques for building robust and high performance FER systems based on a number of established feature sets. It comprises contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions. An approach is proposed to handle face localization errors and facial occlusions using Gabor based templates. Template extraction algorithms are designed to collect a pool of local template features, and template matching is then performed to convert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance of different features and different feature selection algorithms, and to examine the performance improvement arising from fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied to discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry make different contributions to the performance of discrete and dimensional expression recognition, as well as posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing the robustness and achieving high performance of FER systems, and putting them into real-world applications.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

From a law enforcement standpoint, the ability to search for a person matching a semantic description (e.g. 1.8m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

An algorithm for computing dense correspondences between images of a stereo pair or image sequence is presented. The algorithm can make use of both standard matching metrics and the rank and census filters, two filters based on order statistics which have been applied to the image matching problem. Their advantages include robustness to radiometric distortion and amenability to hardware implementation. Results obtained using both real stereo pairs and a synthetic stereo pair with ground truth were compared. The rank and census filters were shown to significantly improve performance in the case of radiometric distortion. In all cases, the results obtained were comparable to, if not better than, those obtained using standard matching metrics. Furthermore, the rank and census filters have the additional advantage that their computational overhead is lower than that of these metrics. For all techniques tested, the difference between the results obtained for the synthetic stereo pair and the ground truth results was small.
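As a rough illustration of why the rank and census filters tolerate radiometric distortion, the sketch below computes both for a single interior pixel using a 3x3 window. The window size, the strict less-than comparison, and the bit ordering are assumptions for the example; the abstract does not give these details.

```python
def rank_and_census(img, y, x):
    """3x3 rank value and census bit string for interior pixel (y, x).

    rank   = number of neighbours darker than the centre pixel
    census = one bit per neighbour (1 if darker), packed row by row
    """
    centre = img[y][x]
    rank, census = 0, 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            bit = 1 if img[y + dy][x + dx] < centre else 0
            rank += bit
            census = (census << 1) | bit
    return rank, census

def hamming(a, b):
    """Census values are compared by counting differing bits."""
    return bin(a ^ b).count("1")

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Same scene under a gain and offset change (a simple radiometric distortion).
distorted = [[2 * v + 10 for v in row] for row in img]
original = rank_and_census(img, 1, 1)
shifted = rank_and_census(distorted, 1, 1)
```

Both filters depend only on the ordering of intensities within the window, so any monotonically increasing change in brightness leaves their output unchanged, whereas a raw intensity difference would not be.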

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The location of previously unseen and unregistered individuals in complex camera networks from semantic descriptions is a time-consuming and often inaccurate process carried out by human operators, or security staff on the ground. To promote the development and evaluation of automated semantic description based localisation systems, we present a new, publicly available, unconstrained 110 sequence database, collected from 6 stationary cameras. Each sequence contains detailed semantic information for a single search subject who appears in the clip (gender, age, height, build, hair and skin colour, clothing type, texture and colour), and between 21 and 290 frames for each clip are annotated with the target subject location (over 11,000 frames are annotated in total). A novel approach for localising a person given a semantic query is also proposed and demonstrated on this database. The proposed approach incorporates clothing colour and type (for clothing worn below the waist), as well as height and build to detect people. A method to assess the quality of candidate regions, as well as a symmetry driven approach to aid in modelling clothing on the lower half of the body, is proposed within this approach. An evaluation on the proposed dataset shows that a relative improvement in localisation accuracy of up to 21% is achieved over the baseline technique.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszár f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, evidence the benefits of our approach over the state-of-the-art image-set matching methods.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Many forms of formative feedback are used in dance training to refine the dancer’s spatial and kinaesthetic awareness in order that the dancer’s sensorimotor intentions and observable danced outcomes might converge. This paper documents the use of smartphones to record and playback movement sequences in ballet and contemporary technique classes. Peers in pairs took turns filming one another and then analysing the playback. This provided immediate visual feedback of the movement sequence as performed by each dancer. This immediacy facilitated the dancer’s capacity to associate what they felt as they were dancing with what they looked like during the dance. The often-dissonant realities of self-perception and perception by others were thus guided towards harmony, generating improved performance and knowledge relating to dance technique. An approach is offered for potential development of peer review activities to support summative progressive assessment in dance technique training.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present a new method for establishing correlation between deuterium and its attached carbon in a deuterated liquid crystal. The method is based on transfer of polarization using the DAPT pulse sequence, originally proposed for two spin-1/2 nuclei and now extended to a pair consisting of a spin-1 and a spin-1/2 nucleus. DAPT utilizes the evolution of magnetization of the spin pair under two blocks of phase-shifted BLEW-12 pulses on one of the spins, separated by a 90 degree pulse on the other spin. The method is easy to implement and, unlike Hartmann-Hahn cross-polarization, does not need to satisfy matching conditions. The experimental results presented demonstrate the efficacy of the method.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Scalable video coding (SVC) is an emerging standard built on the success of the advanced video coding standard (H.264/AVC) by the Joint Video Team (JVT). Motion compensated temporal filtering (MCTF) and closed-loop hierarchical B pictures (CHBP) are two important coding methods proposed during the initial stages of standardization. Either method, MCTF or CHBP, may perform better depending on the noise content and characteristics of the sequence. This work identifies other characteristics of the sequences for which the performance of MCTF is superior to that of CHBP and presents a method to adaptively select either MCTF or CHBP at the GOP level. This method, referred to as "Adaptive Decomposition", is shown to provide better R-D performance than using MCTF or CHBP alone. Further, this method is extended to non-scalable coders.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Our study of a novel technique for adaptive image sequence coding is reported. The number of reference frames and the intervals between them are adjusted to improve the temporal compensability of the input video. The bits are distributed more efficiently on different frame types according to temporal and spatial complexity of the image scene. Experimental results show that this dynamic group-of-picture (GOP) structure coding scheme is not only feasible but also better than the conventional fixed GOP method in terms of perceptual quality and SNR. (C) 1996 Society of Photo-Optical Instrumentation Engineers.