939 resultados para audio equipment
Resumo:
Acoustically, car cabins are extremely noisy and as a consequence audio-only, in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy in circumventing this problem by using audio visual automatic speech recognition (AVASR). However, implementing AVASR requires a system being able to accurately locate and track the drivers face and lip area in real-time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola- Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose for audio-visual speech recognition system.
Resumo:
Acoustically, car cabins are extremely noisy and as a consequence, existing audio-only speech recognition systems, for voice-based control of vehicle functions such as the GPS based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (eg: lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Continuous research in AVASR field has been ongoing for the past twenty-five years with notable progress being made. However, the practical deployment of AVASR systems for use in a variety of real-world applications has not yet emerged. The main reason is due to most research to date neglecting to address variabilities in the visual domain such as illumination and viewpoint in the design of the visual front-end of the AVSR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is publicly available in-car database and we show that the use of visual speech conjunction with the audio modality is a better approach to improve the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Resumo:
Managing livestock movement in extensive systems has environmental and production benefits. Currently permanent wire fencing is used to control cattle; this is both expensive and inflexible. Cattle are known to respond to auditory and visual cues and we investigated whether these can be used to manipulate their behaviour. Twenty-five Belmont Red steers with a mean live weight of 270kg were each randomly assigned to one of five treatments. Treatments consisted of a combination of cues (audio, tactile and visual stimuli) and consequence (electrical stimulation). The treatments were electrical stimulation alone, audio plus electrical stimulation, vibration plus electrical stimulation, light plus electrical stimulation and electrified electric fence (6kV) plus electrical stimulation. Cue stimuli were administered for 3s followed immediately by electrical stimulation (consequence) of 1kV for 1s. The experiment tested the operational efficacy of an on-animal control or virtual fencing system. A collar-halter device was designed to carry the electronics, batteries and equipment providing the stimuli, including audio, vibration, light and electrical of a prototype virtual fencing device. Cattle were allowed to travel along a 40m alley to a group of peers and feed while their rate of travel and response to the stimuli were recorded. The prototype virtual fencing system was successful in modifying the behaviour of the cattle. The rate of travel of cattle along the alley demonstrated the large variability in behavioural response associated with tactile, visual and audible cues. The experiment demonstrated virtual fencing has potential for controlling cattle in extensive grazing systems. However, larger numbers of cattle need to be tested to derive a better understanding of the behavioural variance. Further controlled experimental work is also necessary to quantify the interaction between cues, consequences and cattle learning.
Resumo:
Draglines are massive machines commonly used in surface mining to strip overburden, revealing the targeted minerals for extraction. Automating some or all of the phases of operation of these machines offers the potential for significant productivity and maintenance benefits. The mining industry has a history of slow uptake of automation systems due to the challenges contained in the harsh, complex, three-dimensional (3D), dynamically changing mine operating environment. Robotics as a discipline is finally starting to gain acceptance as a technology with the potential to assist mining operations. This article examines the evolution of robotic technologies applied to draglines in the form of machine embedded intelligent systems. Results from this work include a production trial in which 250,000 tons of material was moved autonomously, experiments demonstrating steps towards full autonomy, and teleexcavation experiments in which a dragline in Australia was tasked by an operator in the United States.
Resumo:
The cascading appearance-based (CAB) feature extraction technique has established itself as the state-of-the-art in extracting dynamic visual speech features for speech recognition. In this paper, we will focus on investigating the effectiveness of this technique for the related speaker verification application. By investigating the speaker verification ability of each stage of the cascade we will demonstrate that the same steps taken to reduce static speaker and environmental information for the visual speech recognition application also provide similar improvements for visual speaker recognition. A further study is conducted comparing synchronous HMM (SHMM) based fusion of CAB visual features and traditional perceptual linear predictive (PLP) acoustic features to show that higher complexity inherit in the SHMM approach does not appear to provide any improvement in the final audio-visual speaker verification system over simpler utterance level score fusion.
Resumo:
Interacting with technology within a vehicle environment using a voice interface can greatly reduce the effects of driver distraction. Most current approaches to this problem only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to circumvent this is to use the visual modality in addition. However, capturing, storing and distributing audio-visual data in a vehicle environment is very costly and difficult. One current dataset available for such research is the AVICAR [1] database. Unfortunately this database is largely unusable due to timing mismatch between the two streams and in addition, no protocol is available. We have overcome this problem by re-synchronising the streams on the phone-number portion of the dataset and established a protocol for further research. This paper presents the first audio-visual results on this dataset for speaker-independent speech recognition. We hope this will serve as a catalyst for future research in this area.
Resumo:
The modern society has come to expect the electrical energy on demand, while many of the facilities in power systems are aging beyond repair and maintenance. The risk of failure is increasing with the aging equipments and can pose serious consequences for continuity of electricity supply. As the equipments used in high voltage power networks are very expensive, economically it may not be feasible to purchase and store spares in a warehouse for extended periods of time. On the other hand, there is normally a significant time before receiving equipment once it is ordered. This situation has created a considerable interest in the evaluation and application of probability methods for aging plant and provisions of spares in bulk supply networks, and can be of particular importance for substations. Quantitative adequacy assessment of substation and sub-transmission power systems is generally done using a contingency enumeration approach which includes the evaluation of contingencies, classification of the contingencies based on selected failure criteria. The problem is very complex because of the need to include detailed modelling and operation of substation and sub-transmission equipment using network flow evaluation and to consider multiple levels of component failures. In this thesis a new model associated with aging equipment is developed to combine the standard tools of random failures, as well as specific model for aging failures. This technique is applied in this thesis to include and examine the impact of aging equipments on system reliability of bulk supply loads and consumers in distribution network for defined range of planning years. The power system risk indices depend on many factors such as the actual physical network configuration and operation, aging conditions of the equipment, and the relevant constraints. The impact and importance of equipment reliability on power system risk indices in a network with aging facilities contains valuable information for utilities to better understand network performance and the weak links in the system. In this thesis, algorithms are developed to measure the contribution of individual equipment to the power system risk indices, as part of the novel risk analysis tool. A new cost worth approach was developed in this thesis that can make an early decision in planning for replacement activities concerning non-repairable aging components, in order to maintain a system reliability performance which economically is acceptable. The concepts, techniques and procedures developed in this thesis are illustrated numerically using published test systems. It is believed that the methods and approaches presented, substantially improve the accuracy of risk predictions by explicit consideration of the effect of equipment entering a period of increased risk of a non-repairable failure.
Resumo:
This paper presents a preliminary crash avoidance framework for heavy equipment control systems. Safe equipment operation is a major concern on construction sites since fatal on-site injuries are an industry-wide problem. The proposed framework has potential for effecting active safety for equipment operation. The framework contains algorithms for spatial modeling, object tracking, and path planning. Beyond generating spatial models in fractions of seconds, these algorithms can successfully track objects in an environment and produce a collision-free 3D motion trajectory for equipment.