883 resultados para vision-based place recognition
Resumo:
The following paper proposes a novel application of Skid-to-Turn maneuvers for fixed wing Unmanned Aerial Vehicles (UAVs) inspecting locally linear infrastructure. Fixed wing UAVs, following the design of manned aircraft, traditionally employ Bank-to-Turn maneuvers to change heading and thus direction of travel. Commonly overlooked is the effect these maneuvers have on downward facing body fixed sensors, which as a result of bank, point away from the feature during turns. By adopting Skid-to-Turn maneuvers, the aircraft is able change heading whilst maintaining wings level flight, thus allowing body fixed sensors to maintain a downward facing orientation. Eliminating roll also helps to improve data quality, as sensors are no longer subjected to the swinging motion induced as they pivot about an axis perpendicular to their line of sight. Traditional tracking controllers that apply an indirect approach of capturing ground based data by flying directly overhead can also see the feature off center due to steady state pitch and roll required to stay on course. An Image Based Visual Servo controller is developed to address this issue, allowing features to be directly tracked within the image plane. Performance of the proposed controller is tested against that of a Bank-to-Turn tracking controller driven by GPS derived cross track error in a simulation environment developed to simulate the field of view of a body fixed camera.
Resumo:
In this paper we present a novel algorithm for localization during navigation that performs matching over local image sequences. Instead of calculating the single location most likely to correspond to a current visual scene, the approach finds candidate matching locations within every section (subroute) of all learned routes. Through this approach, we reduce the demands upon the image processing front-end, requiring it to only be able to correctly pick the best matching image from within a short local image sequence, rather than globally. We applied this algorithm to a challenging downhill mountainbiking visual dataset where there was significant perceptual or environment change between repeated traverses of the environment, and compared performance to applying the feature-based algorithm FAB-MAP. The results demonstrate the potential for localization using visual sequences, even when there are no visual features that can be reliably detected.
Resumo:
We address the problem of face recognition on video by employing the recently proposed probabilistic linear discrimi-nant analysis (PLDA). The PLDA has been shown to be robust against pose and expression in image-based face recognition. In this research, the method is extended and applied to video where image set to image set matching is performed. We investigate two approaches of computing similarities between image sets using the PLDA: the closest pair approach and the holistic sets approach. To better model face appearances in video, we also propose the heteroscedastic version of the PLDA which learns the within-class covariance of each individual separately. Our experi-ments on the VidTIMIT and Honda datasets show that the combination of the heteroscedastic PLDA and the closest pair approach achieves the best performance.
Resumo:
This paper presents a survey of previously presented vision based aircraft detection flight test, and then presents new flight test results examining the impact of camera field-of view choice on the detection range and false alarm rate characteristics of a vision-based aircraft detection technique. Using data collected from approaching aircraft, we examine the impact of camera fieldof-view choice and confirm that, when aiming for similar levels of detection confidence, an improvement in detection range can be obtained by choosing a smaller effective field-of-view (in terms of degrees per pixel).
Resumo:
This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification system in real world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance, however, the robustness of HTPLDA to the limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regards to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.
Resumo:
The quick detection of abrupt (unknown) parameter changes in an observed hidden Markov model (HMM) is important in several applications. Motivated by the recent application of relative entropy concepts in the robust sequential change detection problem (and the related model selection problem), this paper proposes a sequential unknown change detection algorithm based on a relative entropy based HMM parameter estimator. Our proposed approach is able to overcome the lack of knowledge of post-change parameters, and is illustrated to have similar performance to the popular cumulative sum (CUSUM) algorithm (which requires knowledge of the post-change parameter values) when examined, on both simulated and real data, in a vision-based aircraft manoeuvre detection problem.
Resumo:
For many years, computer vision has lured researchers with promises of a low-cost, passive, lightweight and information-rich sensor suitable for navigation purposes. The prime difficulty in vision-based navigation is that the navigation solution will continually drift with time unless external information is available, whether it be cues from the appearance of the scene, a map of features (whether built online or known a priori), or from an externally-referenced sensor. It is not merely position that is of interest in the navigation problem. Attitude (i.e. the angular orientation of a body with respect to a reference frame) is integral to a visionbased navigation solution and is often of interest in its own right (e.g. flight control). This thesis examines vision-based attitude estimation in an aerospace environment, and two methods are proposed for constraining drift in the attitude solution; one through a novel integration of optical flow and the detection of the sky horizon, and the other through a loosely-coupled integration of Visual Odometry and GPS position measurements. In the first method, roll angle, pitch angle and the three aircraft body rates are recovered though a novel method of tracking the horizon over time and integrating the horizonderived attitude information with optical flow. An image processing front-end is used to select several candidate lines in a image that may or may not correspond to the true horizon, and the optical flow is calculated for each candidate line. Using an Extended Kalman Filter (EKF), the previously estimated aircraft state is propagated using a motion model and a candidate horizon line is associated using a statistical test based on the optical flow measurements and location of the horizon in the image. Once associated, the selected horizon line, along with the associated optical flow, is used as a measurement to the EKF. To evaluate the accuracy of the algorithm, two flights were conducted, one using a highly dynamic Uninhabited Airborne Vehicle (UAV) in clear flight conditions and the other in a human-piloted Cessna 172 in conditions where the horizon was partially obscured by terrain, haze and smoke. The UAV flight resulted in pitch and roll error standard deviations of 0.42° and 0.71° respectively when compared with a truth attitude source. The Cessna 172 flight resulted in pitch and roll error standard deviations of 1.79° and 1.75° respectively. In the second method for estimating attitude, a novel integrated GPS/Visual Odometry (GPS/VO) navigation filter is proposed, using a structure similar to a classic looselycoupled GPS/INS error-state navigation filter. Under such an arrangement, the error dynamics of the system are derived and a Kalman Filter is developed for estimating the errors in position and attitude. Through similar analysis to the GPS/INS problem, it is shown that the proposed filter is capable of recovering the complete attitude (i.e. pitch, roll and yaw) of the platform when subjected to acceleration not parallel to velocity for both the monocular and stereo variants of the filter. Furthermore, it is shown that under general straight line motion (e.g. constant velocity), only the component of attitude in the direction of motion is unobservable. Numerical simulations are performed to demonstrate the observability properties of the GPS/VO filter in both the monocular and stereo camera configurations. Furthermore, the proposed filter is tested on imagery collected using a Cessna 172 to demonstrate the observability properties on real-world data. The proposed GPS/VO filter does not require additional restrictions or assumptions such as platform-specific dynamics, map-matching, feature-tracking, visual loop-closing, gravity vector or additional sensors such as an IMU or magnetic compass. Since no platformspecific dynamics are required, the proposed filter is not limited to the aerospace domain and has the potential to be deployed in other platforms such as ground robots or mobile phones.
Resumo:
This paper investigates the use of mel-frequency deltaphase (MFDP) features in comparison to, and in fusion with, traditional mel-frequency cepstral coefficient (MFCC) features within joint factor analysis (JFA) speaker verification. MFCC features, commonly used in speaker recognition systems, are derived purely from the magnitude spectrum, with the phase spectrum completely discarded. In this paper, we investigate if features derived from the phase spectrum can provide additional speaker discriminant information to the traditional MFCC approach in a JFA based speaker verification system. Results are presented which provide a comparison of MFCC-only, MFDPonly and score fusion of the two approaches within a JFA speaker verification approach. Based upon the results presented using the NIST 2008 Speaker Recognition Evaluation (SRE) dataset, we believe that, while MFDP features alone cannot compete with MFCC features, MFDP can provide complementary information that result in improved speaker verification performance when both approaches are combined in score fusion, particularly in the case of shorter utterances.
Resumo:
The future emergence of many types of airborne vehicles and unpiloted aircraft in the national airspace means collision avoidance is of primary concern in an uncooperative airspace environment. The ability to replicate a pilot’s see and avoid capability using cameras coupled with vision based avoidance control is an important part of an overall collision avoidance strategy. But unfortunately without range collision avoidance has no direct way to guarantee a level of safety. Collision scenario flight tests with two aircraft and a monocular camera threat detection and tracking system were used to study the accuracy of image-derived angle measurements. The effect of image-derived angle errors on reactive vision-based avoidance performance was then studied by simulation. The results show that whilst large angle measurement errors can significantly affect minimum ranging characteristics across a variety of initial conditions and closing speeds, the minimum range is always bounded and a collision never occurs.
Resumo:
This paper investigates advanced channel compensation techniques for the purpose of improving i-vector speaker verification performance in the presence of high intersession variability using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA), and; (d) source-normalized WLDA (SN-WLDA) have been investigated. We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system is shown to provide over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, to provide over 8% improvement in DCF over the best single approach, (SN-WLDA), for NIST 2008 interview/ telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification.
Resumo:
This work presents a collision avoidance approach based on omnidirectional cameras that does not require the estimation of range between two platforms to resolve a collision encounter. Our method achieves minimum separation between the two vehicles involved by maximising the view-angle given by the omnidirectional sensor. Only visual information is used to achieve avoidance under a bearing- only visual servoing approach. We provide theoretical problem formulation, as well as results from real flights using small quadrotors
Resumo:
In this paper, we present a monocular vision based autonomous navigation system for Micro Aerial Vehicles (MAVs) in GPS-denied environments. The major drawback of monocular systems is that the depth scale of the scene can not be determined without prior knowledge or other sensors. To address this problem, we minimize a cost function consisting of a drift-free altitude measurement and up-to-scale position estimate obtained using the visual sensor. We evaluate the scale estimator, state estimator and controller performance by comparing with ground truth data acquired using a motion capture system. All resources including source code, tutorial documentation and system models are available online.
Resumo:
The problem of estimating pseudobearing rate information of an airborne target based on measurements from a vision sensor is considered. Novel image speed and heading angle estimators are presented that exploit image morphology, hidden Markov model (HMM) filtering, and relative entropy rate (RER) concepts to allow pseudobearing rate information to be determined before (or whilst) the target track is being estimated from vision information.
Resumo:
Machine vision is emerging as a viable sensing approach for mid-air collision avoidance (particularly for small to medium aircraft such as unmanned aerial vehicles). In this paper, using relative entropy rate concepts, we propose and investigate a new change detection approach that uses hidden Markov model filters to sequentially detect aircraft manoeuvres from morphologically processed image sequences. Experiments using simulated and airborne image sequences illustrate the performance of our proposed algorithm in comparison to other sequential change detection approaches applied to this application.
Resumo:
This paper presents a new multi-scale place recognition system inspired by the recent discovery of overlapping, multi-scale spatial maps stored in the rodent brain. By training a set of Support Vector Machines to recognize places at varying levels of spatial specificity, we are able to validate spatially specific place recognition hypotheses against broader place recognition hypotheses without sacrificing localization accuracy. We evaluate the system in a range of experiments using cameras mounted on a motorbike and a human in two different environments. At 100% precision, the multiscale approach results in a 56% average improvement in recall rate across both datasets. We analyse the results and then discuss future work that may lead to improvements in both robotic mapping and our understanding of sensory processing and encoding in the mammalian brain.