201 results for Visual Speaker Recognition, Visual Speech Recognition, Cascading Appearance-Based Features
Abstract:
PURPOSE: This study investigated the effects of simulated visual impairment on nighttime driving performance and pedestrian recognition under real-road conditions. METHODS: Closed road nighttime driving performance was measured for 20 young visually normal participants (M = 27.5 +/- 6.1 years) under three visual conditions: normal vision, simulated cataracts, and refractive blur that were incorporated in modified goggles. The visual acuity levels for the cataract and blur conditions were matched for each participant. Driving measures included sign recognition, avoidance of low contrast road hazards, time to complete the course, and lane keeping. Pedestrian recognition was measured for pedestrians wearing either black clothing or black clothing with retroreflective markings on the moveable joints to create the perception of biological motion ("biomotion"). RESULTS: Simulated visual impairment significantly reduced participants' ability to recognize road signs and avoid road hazards, and increased the time taken to complete the driving course (p < 0.05); the effect was greatest for the cataract condition, even though the cataract and blur conditions were matched for visual acuity. Although visual impairment also significantly reduced the ability to recognize the pedestrian wearing black clothing, the pedestrian wearing "biomotion" was seen 80% of the time. CONCLUSIONS: Driving performance under nighttime conditions was significantly degraded by modest visual impairment; these effects were greatest for the cataract condition. Pedestrian recognition was greatly enhanced by marking limb joints in the pattern of "biomotion," which was relatively robust to the effects of visual impairment.
Abstract:
Inspection of solder joints has been a critical process in the electronic manufacturing industry to reduce manufacturing cost, improve yield, and ensure product quality and reliability. This paper proposes two inspection modules for an automatic solder joint classification system. The “front-end” inspection system includes illumination normalisation, localisation and segmentation. The “back-end” inspection involves the classification of solder joints using the Log Gabor filter and classifier fusion. Five different levels of solder quality with respect to the amount of solder paste have been defined. The Log Gabor filter has been demonstrated to achieve high recognition rates and is resistant to misalignment. The proposed system does not need any special illumination system, and the images are acquired by an ordinary digital camera. This system could contribute to the development of automated non-contact, non-destructive and low-cost solder joint quality inspection systems.
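For readers unfamiliar with the Log Gabor filter named above, the following is a minimal sketch of its standard radial frequency response, G(f) = exp(-(ln(f/f0))^2 / (2 (ln(sigma_ratio))^2)). The abstract does not state the parameters used, so the function name `log_gabor`, the centre frequency `f0` and the bandwidth term `sigma_ratio` are illustrative assumptions, not the authors' settings.

```python
import math


def log_gabor(f: float, f0: float = 0.1, sigma_ratio: float = 0.55) -> float:
    """Radial response of a log-Gabor filter at spatial frequency f.

    f0 and sigma_ratio are hypothetical example values; the paper's
    actual parameters are not given in the abstract.
    """
    if f0 <= 0 or sigma_ratio <= 0 or sigma_ratio == 1:
        raise ValueError("f0 must be > 0 and sigma_ratio in (0, 1) or > 1")
    if f <= 0:
        # The log-Gabor response is defined as zero at DC.
        return 0.0
    return math.exp(-(math.log(f / f0) ** 2) / (2 * math.log(sigma_ratio) ** 2))
```

The response peaks at 1 when `f == f0`, is zero at DC, and is symmetric on a logarithmic frequency axis, so `log_gabor(2 * f0)` and `log_gabor(f0 / 2)` agree.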
Abstract:
Characteristics of surveillance video generally include low resolution and poor quality due to environmental, storage and processing limitations. It is extremely difficult for computers and human operators to identify individuals from these videos. To overcome this problem, super-resolution can be used in conjunction with an automated face recognition system to enhance the spatial resolution of video frames containing the subject and narrow down the number of manual verifications performed by the human operator by presenting a list of most likely candidates from the database. As the super-resolution reconstruction process is ill-posed, visual artifacts are often generated as a result. These artifacts can be visually distracting to humans and/or affect machine recognition algorithms. While it is intuitive that higher resolution should lead to improved recognition accuracy, the effects of super-resolution and such artifacts on face recognition performance have not been systematically studied. This paper aims to address this gap while illustrating that super-resolution allows more accurate identification of individuals from low-resolution surveillance footage. The proposed optical flow-based super-resolution method is benchmarked against Baker et al.’s hallucination and Schultz et al.’s super-resolution techniques on images from the Terrascope and XM2VTS databases. Ground truth and interpolated images were also tested to provide a baseline for comparison. Results show that a suitable super-resolution system can improve the discriminability of surveillance video and enhance face recognition accuracy. The experiments also show that Schultz et al.’s method fails when dealing with surveillance footage due to its assumption of rigid objects in the scene. The hallucination and optical flow-based methods performed comparably, with the optical flow-based method producing fewer visually distracting artifacts that interfered with human recognition.
Abstract:
In this paper we present a novel algorithm for localization during navigation that performs matching over local image sequences. Instead of calculating the single location most likely to correspond to a current visual scene, the approach finds candidate matching locations within every section (subroute) of all learned routes. Through this approach, we reduce the demands upon the image processing front-end, requiring it to only be able to correctly pick the best matching image from within a short local image sequence, rather than globally. We applied this algorithm to a challenging downhill mountain biking visual dataset where there was significant perceptual or environmental change between repeated traverses of the environment, and compared performance to applying the feature-based algorithm FAB-MAP. The results demonstrate the potential for localization using visual sequences, even when there are no visual features that can be reliably detected.
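The idea of picking the best match only within each subroute, then judging candidates over a short sequence, can be sketched roughly as below. This is not the authors' algorithm: the descriptor type, the sum-of-absolute-differences comparison, the fixed `section_len`, and the coherence score (fraction of consecutive picks that advance by exactly one frame, which assumes equal traversal speed) are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]  # hypothetical whole-image descriptor


def sad(a: Descriptor, b: Descriptor) -> float:
    """Sum of absolute differences between two descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_in_section(query: Descriptor, section: List[Descriptor]) -> int:
    """Index of the best match for query inside one subroute only."""
    return min(range(len(section)), key=lambda i: sad(query, section[i]))


def localize(queries: List[Descriptor], route: List[Descriptor],
             section_len: int) -> Tuple[int, float]:
    """Return (route index of the latest query frame, coherence score)."""
    best = (-1, -1.0)
    for start in range(0, len(route), section_len):
        section = route[start:start + section_len]
        # One local candidate per recent query frame within this subroute.
        picks = [best_in_section(q, section) for q in queries]
        steps = [b - a for a, b in zip(picks, picks[1:])]
        # A subroute is plausible if its picks advance frame by frame.
        score = sum(1 for s in steps if s == 1) / len(steps) if steps else 0.0
        if score > best[1]:
            best = (start + picks[-1], score)
    return best
```

The front-end here only has to rank images inside a short section; the decision between sections rests on whether the local picks form a consistent progression.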
Abstract:
Purpose: To determine the effect of moderate levels of refractive blur and simulated cataracts on nighttime pedestrian conspicuity in the presence and absence of headlamp glare. Methods: The ability to recognize pedestrians at night was measured in 28 young adults (M=27.6 years) under three visual conditions: normal vision, refractive blur and simulated cataracts; mean acuity was 20/40 or better in all conditions. Pedestrian recognition distances were recorded while participants drove an instrumented vehicle along a closed road course at night. Pedestrians wore one of three clothing conditions and oncoming headlamps were present for 16 participants and absent for 12 participants. Results: Simulated visual impairment and glare significantly reduced the frequency with which drivers recognized pedestrians and the distance at which the drivers first recognized them. Simulated cataracts were significantly more disruptive than blur even though photopic visual acuity levels were matched. With normal vision, drivers responded to pedestrians at 3.6x and 5.5x longer distances on average than for the blur or cataract conditions, respectively. Even in the presence of visual impairment and glare, pedestrians were recognized more often and at longer distances when they wore a “biological motion” reflective clothing configuration than when they wore a reflective vest or black clothing. Conclusions: Drivers’ ability to recognize pedestrians at night is degraded by common visual impairments even when the drivers’ mean visual acuity meets licensing requirements. To maximize drivers’ ability to see pedestrians, drivers should wear their optimum optical correction, and cataract surgery should be performed early enough to avoid potentially dangerous reductions in visual performance.
Abstract:
While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically and a framework is developed such that results from super-resolution techniques can be compared. Three super-resolution techniques are compared with the Eigenface and Elastic Bunch Graph Matching face recognition engines. Parameter ranges over which these techniques provide better recognition performance than interpolated images are determined.
Abstract:
Introduction: Delirium is a serious issue associated with high morbidity and mortality in older hospitalised people. Early recognition enables diagnosis and treatment of underlying cause/s, which can lead to improved patient outcomes. However, research shows that nurses' knowledge and accurate recognition of delirium are poor, and lack of education appears to be a key issue related to this problem. Thus, the purpose of this randomised controlled trial (RCT) was to evaluate, in a sample of registered nurses, the usability and effectiveness of a web-based learning site, designed using constructivist learning principles, to improve acute care nurse knowledge and recognition of delirium. Prior to undertaking the RCT, preliminary phases involving validation of vignettes, videotaping of five of the validated vignettes, website development and pilot testing were completed. Methods: The cluster RCT involved consenting registered nurse participants (N = 175) from twelve clinical areas within three acute health care facilities in Queensland, Australia. Data were collected through a variety of measures and instruments. Primary outcomes were improved ability of nurses to recognise delirium using written validated vignettes and improved knowledge of delirium using a delirium knowledge questionnaire. The secondary outcomes were aimed at determining nurse satisfaction and usability of the website. Primary outcome measures were taken at baseline (T1), directly after the intervention (T2) and two months later (T3). The secondary outcomes were measured at T2 by participants in the intervention group. Following baseline data collection, remaining participants were assigned to either the intervention (n=75) or control (n=72) group. Participants in the intervention group were given access to the learning intervention, while the control group continued to work in their clinical area and, at that time, did not receive access to the learning intervention.
Data from the primary outcome measures were examined in mixed model analyses. Results: Overall, the effect of the online learning intervention over time, comparing the intervention group and the control group, was positive. The intervention group's scores were higher and the change over time results were statistically significant [T3 and T1 (t=3.78, p<0.001) and T2 and T1 baseline (t=5.83, p<0.001)]. Statistically significant improvements were also seen for delirium recognition when comparing T2 and T1 results (t=2.58, p=0.012) between the control and intervention group, but not for changes in delirium recognition scores between the two groups from T3 and T1 (t=1.80, p=0.074). The majority of the participants rated the website highly on the visual, functional and content elements. Additionally, nearly 80% of the participants liked the overall website features and there were self-reported improvements in delirium knowledge and recognition by the registered nurses in the intervention group. Discussion: Findings from this study support the concept that online learning is an effective and satisfying method of information delivery. Embedded within a constructivist learning environment, the site produced a high level of satisfaction and usability for the registered nurse end-users. Additionally, the results showed that the website significantly improved delirium knowledge and recognition scores, and the improvement in delirium knowledge was retained at a two-month follow-up. Given the strong effect of the intervention, the online delirium intervention should be utilised as a way of providing information to registered nurses. It is envisaged that this knowledge would lead to improved recognition of delirium as well as improvement in patient outcomes; however, translation of this knowledge attainment into clinical practice was outside the scope of this study.
A critical next step is demonstrating the effect of the intervention in changing clinical behaviour, and improving patient health outcomes.
Abstract:
In this paper, we present SMART (Sequence Matching Across Route Traversals): a vision-based place recognition system that uses whole image matching techniques and odometry information to improve the precision-recall performance, latency and general applicability of the SeqSLAM algorithm. We evaluate the system’s performance on challenging day and night journeys over several kilometres at widely varying vehicle velocities from 0 to 60 km/h, compare performance to the current state-of-the-art SeqSLAM algorithm, and provide parameter studies that evaluate the effectiveness of each system component. Using 30-metre sequences, SMART achieves place recognition performance of 81% recall at 100% precision, outperforming SeqSLAM, and is robust to significant degradations in odometry.
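A figure such as "81% recall at 100% precision" is commonly obtained by raising the match-acceptance threshold until no false matches remain and then counting the correct matches still accepted. The sketch below illustrates that metric only; the function name, the `(score, is_correct)` input format and `n_true` are hypothetical and are not taken from the SMART paper.

```python
from typing import List, Tuple


def recall_at_full_precision(matches: List[Tuple[float, bool]],
                             n_true: int) -> float:
    """Highest recall reachable with zero false positives.

    matches: (confidence score, is_correct) for every reported match.
    n_true:  number of query places that actually have a true match.
    """
    if n_true <= 0:
        raise ValueError("n_true must be positive")
    # The threshold must sit above the best-scoring false match.
    worst_fp = max((s for s, ok in matches if not ok), default=float("-inf"))
    accepted = sum(1 for s, ok in matches if ok and s > worst_fp)
    return accepted / n_true
```

Correct matches that only tie with the strongest false match are not counted, because no threshold could accept them without also accepting the false one.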
Abstract:
Delirium is a significant problem for older hospitalized people and is associated with poor outcomes. It is poorly recognized and evidence suggests that a major reason is lack of education. Nurses who are educated about delirium can play a significant role in improving delirium recognition. This study evaluated the impact of a delirium-specific educational website. A cluster randomized controlled trial, with a pretest/post-test time series design, was conducted to measure delirium knowledge (DK) and delirium recognition (DR) over three time-points. Statistically significant differences were found between the intervention and non-intervention group. The intervention group's DK scores were higher and the change over time results were statistically significant [T3 and T1 (t=3.78, p<0.001) and T2 and T1 baseline (t=5.83, p<0.001)]. Statistically significant improvements were also seen for DR when comparing T2 and T1 results (t=2.56, p=0.011) between both groups, but not for changes in DR scores between T3 and T1 (t=1.80, p=0.074). Participants rated the website highly on the visual, functional and content elements. This study supports the concept that web-based delirium learning is an effective and satisfying method of information delivery for registered nurses. Future research is required to investigate clinical outcomes as a result of this web-based education.
Abstract:
This work aims at developing a planetary rover capable of acting as an assistant astrobiologist: making a preliminary analysis of the collected visual images that will help to make better use of the scientists' time by pointing out the most interesting pieces of data. This paper focuses on the problem of detecting and recognising particular types of stromatolites. Inspired by the processes actual astrobiologists go through in the field when identifying stromatolites, the processes we investigate focus on recognising characteristics associated with biogenicity. The extraction of these characteristics is based on the analysis of geometrical structure, enhanced by passing the images of stromatolites through an edge-detection filter and computing its Fourier transform, revealing typical spatial frequency patterns. The proposed analysis is performed on both simulated images of stromatolite structures and images of real stromatolites taken in the field by astrobiologists.
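The pipeline of edge filtering followed by Fourier analysis can be illustrated in one dimension, as a sketch only. It assumes a 1-D intensity profile sampled across the layering, a central-difference edge filter, and that the Fourier transform is taken of the filtered output; the abstract specifies none of these details, and all function names are hypothetical.

```python
import cmath
import math
from typing import List


def edge_filter(profile: List[float]) -> List[float]:
    """Central-difference edge response (drops the two border samples)."""
    return [profile[i + 1] - profile[i - 1] for i in range(1, len(profile) - 1)]


def dft_magnitudes(signal: List[float]) -> List[float]:
    """Magnitudes of DFT bins 0..n//2 of a real signal (naive O(n^2))."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(profile: List[float]) -> int:
    """Non-DC DFT bin with the largest magnitude in the edge response."""
    mags = dft_magnitudes(edge_filter(profile))
    return max(range(1, len(mags)), key=lambda k: mags[k])
```

A strongly periodic layering would show up as a single pronounced peak in the returned bin, whereas an irregular texture would spread energy across many bins.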
Abstract:
The integration of separate, yet complementary, cortical pathways appears to play a role in visual perception and action when intercepting objects. The ventral system is responsible for object recognition and identification, while the dorsal system facilitates continuous regulation of action. This dual-system model implies that empirically manipulating different visual information sources during performance of an interceptive action might lead to the emergence of distinct gaze and movement pattern profiles. To test this idea, we recorded hand kinematics and eye movements of participants as they attempted to catch balls projected from a novel apparatus that synchronised or de-synchronised accompanying video images of a throwing action and ball trajectory. Results revealed that ball catching performance was less successful when patterns of hand movements and gaze behaviours were constrained by the absence of advance perceptual information from the thrower's actions. Under these task constraints, participants began tracking the ball later, followed less of its trajectory, and adapted their actions by initiating movements later and moving the hand faster. There were no performance differences when the throwing action image and ball speed were synchronised or de-synchronised since hand movements were closely linked to information from ball trajectory. Results are interpreted relative to the two-visual system hypothesis, demonstrating that accurate interception requires integration of advance visual information from kinematics of the throwing action and from ball flight trajectory.
Abstract:
Purpose To quantify the effects of driver age on night-time pedestrian conspicuity, and to determine whether individual differences in visual performance can predict drivers' ability to recognise pedestrians at night. Methods Participants were 32 visually normal drivers (20 younger: M = 24.4 years ± 6.4 years; 12 older: M = 72.0 years ± 5.0 years). Visual performance was measured in a laboratory-based testing session including visual acuity, contrast sensitivity, motion sensitivity and the useful field of view. Night-time pedestrian recognition distances were recorded while participants drove an instrumented vehicle along a closed road course at night; to increase the workload of drivers, auditory and visual distracter tasks were presented for some of the laps. Pedestrians walked in place, sideways to the oncoming vehicles, and wore either a standard high visibility reflective vest or reflective tape positioned on the movable joints (biological motion). Results Driver age and pedestrian clothing significantly (p < 0.05) affected the distance at which the drivers first responded to the pedestrians. Older drivers recognised pedestrians at approximately half the distance of the younger drivers and pedestrians were recognised more often and at longer distances when they wore a biological motion reflective clothing configuration than when they wore a reflective vest. Motion sensitivity was an independent predictor of pedestrian recognition distance, even when controlling for driver age. Conclusions The night-time pedestrian recognition capacity of older drivers was significantly worse than that of younger drivers. The distance at which drivers first recognised pedestrians at night was best predicted by a test of motion sensitivity.
Abstract:
Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating noncritical in-car systems. Under such conditions, however, speech recognition accuracy degrades significantly, and techniques such as speech enhancement are required to improve these accuracies. Likelihood-maximizing (LIMA) frameworks optimize speech enhancement algorithms based on recognized state sequences rather than traditional signal-level criteria such as maximizing signal-to-noise ratio. LIMA frameworks typically require calibration utterances to generate optimized enhancement parameters that are used for all subsequent utterances. Under such a scheme, suboptimal recognition performance occurs in noise conditions that are significantly different from those present during the calibration session, a serious problem in rapidly changing noise environments out on the open road. In this chapter, we propose a dialog-based design that allows regular optimization iterations in order to track the ever-changing noise conditions. Experiments using Mel-filterbank noise subtraction (MFNS) are performed to determine the optimization requirements for vehicular environments and show that minimal optimization is required to improve speech recognition, avoid over-optimization, and ultimately assist with semi-real-time operation. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session only.