938 results for computer vision, facial expression recognition, swig, red5, actionscript, ruby on rails, html5


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents practical vision-based collision avoidance for objects approximating a single point feature. Using a spherical camera model, a visual predictive control scheme guides the aircraft around the object along a conical spiral trajectory. Visibility, state and control constraints are considered explicitly in the controller design by combining image and vehicle dynamics in the process model, and solving the nonlinear optimization problem over the resulting state space. Importantly, range is not required. Instead, the principles of conical spiral motion are used to design an objective function that simultaneously guides the aircraft along the avoidance trajectory, whilst providing an indication of the appropriate point to stop the spiral behaviour. Our approach is aimed at providing a potential solution to the See and Avoid problem for unmanned aircraft and is demonstrated through a series.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents two algorithms to automate the detection of marine species in aerial imagery. An algorithm from an initial pilot study is presented in which morphology operations and colour analysis formed the basis of its working principle. A second approach is presented in which saturation channel and histogram-based shape profiling were used. We report on performance for both algorithms using datasets collected from an unmanned aerial system at an altitude of 1000 ft. Early results have demonstrated recall values of 48.57% and 51.4%, and precision values of 4.01% and 4.97%.
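
For readers unfamiliar with the two reported metrics, the sketch below shows how recall and precision are conventionally computed from detection counts. The function name and the counts are hypothetical and are not taken from the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from raw detection counts."""
    detected = true_positives + false_positives   # everything the detector flagged
    actual = true_positives + false_negatives     # everything that was really there
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not the paper's data).
precision, recall = precision_recall(
    true_positives=5, false_positives=95, false_negatives=5
)
```

A detector with these counts finds half of the animals present (recall) while only one in twenty of its detections is correct (precision), which is the same qualitative pattern as the figures quoted in the abstract.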

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Monitoring and estimation of marine populations is of paramount importance for the conservation and management of sea species. Regular surveys are used for this purpose, often followed by a manual counting process. This paper proposes an algorithm for automatic detection of dugongs from imagery taken in aerial surveys. Our algorithm exploits the fact that dugongs are rare in most images; we therefore determine regions of interest partially based on color rarity. This simple observation makes the system robust to changes in illumination. We also show that by applying the extended-maxima transform on red-ratio images, submerged dugongs with very fuzzy edges can be detected. Performance figures obtained here are promising in terms of degree of confidence in the detection of marine species, but more importantly our approach represents a significant step in automating this type of survey.
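
One way to picture the color-rarity idea is to score each pixel by how infrequent its quantized color is within the image. The sketch below is an assumption about how such a score could be built (histogram quantization plus self-information); it is not the paper's algorithm, and `rarity_scores` and its `bins` parameter are hypothetical names.

```python
import math
from collections import Counter


def rarity_scores(pixels, bins=4):
    """Score each RGB pixel by the rarity of its quantized colour.

    Assumption for illustration: rarity is the self-information
    -log(frequency) of the pixel's colour bin within this image.
    """
    step = 256 // bins
    keys = [(r // step, g // step, b // step) for r, g, b in pixels]
    counts = Counter(keys)
    total = len(keys)
    return [-math.log(counts[key] / total) for key in keys]


# Hypothetical image: nine water-coloured pixels and one differently coloured pixel.
pixels = [(10, 80, 120)] * 9 + [(200, 180, 150)]
scores = rarity_scores(pixels)
candidate = scores.index(max(scores))  # index of the rarest pixel
```

Because the score depends only on relative frequencies within one image, a uniform brightness shift that moves all pixels between bins together leaves the ranking largely unchanged, which is consistent with the robustness to illumination the abstract describes.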

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Stereo visual odometry has received little investigation in high-altitude applications due to the generally poor performance of rigid stereo rigs at extremely small baseline-to-depth ratios. Without additional sensing, metric scale is considered lost and odometry is seen as effective only for monocular perspectives. This paper presents a novel modification to stereo-based visual odometry that allows accurate, metric pose estimation from high altitudes, even in the presence of poor calibration and without additional sensor inputs. By relaxing the (typically fixed) stereo transform during bundle adjustment and reducing the dependence on the fixed geometry for triangulation, metrically scaled visual odometry can be obtained in situations where high altitude and structural deformation from vibration would cause traditional algorithms to fail. This is achieved through the use of a novel constrained bundle adjustment routine and an accurately scaled pose initializer. We present visual odometry results demonstrating the technique on a short-baseline stereo pair inside a fixed-wing UAV flying at significant height (~30-100 m).

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Achieving a robust, accurately scaled pose estimate in long-range stereo presents significant challenges. For large scene depths, triangulation from a single stereo pair is inadequate and noisy. Additionally, vibration and flexible rigs in airborne applications mean accurate calibrations are often compromised. This paper presents a technique for accurately initializing a long-range stereo VO algorithm at large scene depth, with accurate scale, without explicitly computing structure from rigidly fixed camera pairs. By performing a monocular pose estimate over a window of frames from a single camera, followed by adding the secondary camera frames in a modified bundle adjustment, an accurate, metrically scaled pose estimate can be found. To achieve this, the scale of the stereo pair is included in the optimization as an additional parameter. Results are presented on both simulated and field-gathered data from a fixed-wing UAV flying at significant altitude, where the epipolar geometry is inaccurate due to structural deformation and triangulation from a single pair is insufficient. Comparisons are made with more conventional VO techniques where the scale is not explicitly optimized, and demonstrated over repeated trials to indicate robustness.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Collisions between pedestrians and vehicles continue to be a major problem throughout the world. Pedestrians trying to cross roads and railway tracks without any caution are often highly susceptible to collisions with vehicles and trains. Continuous financial, human and other losses have prompted transport-related organizations to come up with various solutions addressing this issue. However, the quest for new and significant improvements in this area is still ongoing. This work addresses this issue by building a general framework using computer vision techniques to automatically monitor pedestrian movements in such high-risk areas to enable better analysis of activity, and the creation of future alerting strategies. As a result of rapid development in the electronics and semiconductor industry, there is extensive deployment of CCTV cameras in public places to capture video footage. This footage can then be used to analyse crowd activities in those particular places. This work seeks to identify the abnormal behaviour of individuals in video footage. In this work we propose using a Semi-2D Hidden Markov Model (HMM), Full-2D HMM and Spatial HMM to model the normal activities of people. The outliers of the model (i.e. those observations with insufficient likelihood) are identified as abnormal activities. Location features, flow features and optical flow textures are used as the features for the model. The proposed approaches are evaluated using the publicly available UCSD datasets, and we demonstrate improved performance using a Semi-2D Hidden Markov Model compared to other state-of-the-art methods. Further, we illustrate how our proposed methods can be applied to detect anomalous events at rail level crossings.
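
The idea of flagging observations with insufficient likelihood can be sketched with an ordinary one-dimensional discrete HMM, which is a simpler stand-in for the Semi-2D, Full-2D and Spatial HMMs named in the abstract. The model parameters, the threshold, and the function names below are all hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    Assumes obs is a non-empty sequence of symbol indices.
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate state beliefs one step, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def is_abnormal(obs, model, threshold):
    """Flag a sequence whose average per-observation log-likelihood is too low."""
    return log_likelihood(obs, *model) / len(obs) < threshold


# Hypothetical model of "normal" activity: symbols 0 and 1 are common, 2 is rare.
model = (
    [0.5, 0.5],                               # start probabilities
    [[0.9, 0.1], [0.1, 0.9]],                 # transition probabilities
    [[0.8, 0.19, 0.01], [0.19, 0.8, 0.01]],   # emission probabilities
)
normal_flag = is_abnormal([0, 0, 0, 1, 1, 1], model, threshold=-2.0)
odd_flag = is_abnormal([2, 2, 2, 2, 2, 2], model, threshold=-2.0)
```

Normalizing by sequence length keeps the threshold comparable across clips of different duration; in practice the threshold would be chosen on held-out normal footage rather than fixed by hand as it is here.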

Relevance:

100.00% 100.00%

Publisher:

Abstract:

After first observing a person, the task of person re-identification involves recognising an individual at different locations across a network of cameras at a later time. Traditionally, this task has been performed by first extracting appearance features of an individual and then matching these features to the previous observation. However, identifying an individual based solely on appearance can be ambiguous, particularly when people wear similar clothing (e.g. people dressed in uniforms in sporting and school settings). This task is made more difficult when the resolution of the input image is low, as is typically the case in multi-camera networks. To circumvent these issues, we need to use other contextual cues. In this paper, we use "group" information as our contextual feature to aid in the re-identification of a person, which is heavily motivated by the fact that people generally move together as a collective group. To encode group context, we learn a linear mapping function to assign each person to a "role" or position within the group structure. We then combine the appearance and group context cues using a weighted summation. We demonstrate how this improves performance of person re-identification in a sports environment over appearance-based features.
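
The weighted summation of the two cues can be illustrated as follows. This is a minimal sketch that assumes each cue yields one similarity score per gallery candidate and that the weights sum to one; the function name, candidate labels, scores, and weight are hypothetical.

```python
def fuse_scores(appearance, group, weight=0.5):
    """Combine two per-candidate similarity scores by weighted summation.

    Assumption: weight is applied to the appearance cue and
    (1 - weight) to the group-context cue.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return {
        candidate: weight * appearance[candidate] + (1.0 - weight) * group[candidate]
        for candidate in appearance
    }


# Hypothetical scores: two players in the same uniform look almost identical,
# but only one occupies the expected role within the group.
appearance = {"player_a": 0.80, "player_b": 0.82}
group = {"player_a": 0.90, "player_b": 0.30}

fused = fuse_scores(appearance, group, weight=0.5)
best_by_appearance = max(appearance, key=appearance.get)
best_fused = max(fused, key=fused.get)
```

With these illustrative numbers, appearance alone narrowly prefers the wrong candidate, while the fused score prefers the candidate whose group role matches.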

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Active Appearance Models (AAMs) employ a paradigm of inverting a synthesis model of how an object can vary in terms of shape and appearance. As a result, the ability of AAMs to register an unseen object image is intrinsically linked to two factors. The first is how well the synthesis model can reconstruct the object image. The second is the number of degrees of freedom in the model: fewer degrees of freedom yield a higher likelihood of good fitting performance. In this paper we look at how these seemingly contrasting factors can complement one another for the problem of AAM fitting of an ensemble of images stemming from a constrained set (e.g. an ensemble of face images of the same person).