157 results for auditory EEG
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Research in the AVSR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVSR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVSR system. In this paper we present an AVSR system in a real-world car environment using the AVICAR database [1], which is a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality is a better approach to improving the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
Gabor representations have been widely used in facial analysis (face recognition, face detection and facial expression detection) due to their biological relevance and computational properties. Two popular Gabor representations used in the literature are: 1) Log-Gabor and 2) Gabor energy filters. Even though these representations are somewhat similar, they also have distinct differences: the Log-Gabor filters mimic the simple cells in the visual cortex while the Gabor energy filters emulate the complex cells, which causes subtle differences in the responses. In this paper, we analyze the difference between these two Gabor representations and quantify these differences on the task of facial action unit (AU) detection. In our experiments conducted on the Cohn-Kanade dataset, we report an average area underneath the ROC curve (A′) of 92.60% across 17 AUs for the Gabor energy filters, while the Log-Gabor representation achieved an average A′ of 96.11%. This result suggests that the small spatial differences that the Log-Gabor filters pick up on are more useful for AU detection than the differences in contours and edges that the Gabor energy filters extract.
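As an illustration of the evaluation metric named in this abstract, the area under the ROC curve can be computed from detector scores with the rank (Mann-Whitney) formulation. This is a minimal sketch, not the authors' evaluation code: the function name `area_under_roc` and the toy scores and labels are assumptions made for the example.

```python
def area_under_roc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive example scores
    higher than a randomly chosen negative one, with ties counted as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical detector scores for one action unit (not data from the paper).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(area_under_roc(scores, labels))
```

An average over AUs, as reported in the abstract, would simply be the mean of this value computed separately for each action unit.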
Abstract:
The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), analyse each stage of the cascade, and quantify the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database, and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.
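The first two stages of the cascade named in this abstract (2D-DCT and feature mean normalisation) can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' pipeline: the block size, the number of coefficients kept, the helper names `dct2`, `frame_features` and `mean_normalise`, and the toy frames are all hypothetical, and the interstep LDA stage is omitted.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def frame_features(block, keep=2):
    """Keep the top-left keep x keep low-frequency DCT coefficients."""
    coeffs = dct2(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def mean_normalise(features):
    """Subtract the per-dimension mean taken over all frames of a sequence."""
    dims = len(features[0])
    means = [sum(f[d] for f in features) / len(features) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in features]


# Two toy 4x4 "mouth region" frames (assumed data, for illustration only).
flat = [[1.0] * 4 for _ in range(4)]
ramp = [[float(x) for _ in range(4)] for x in range(4)]  # intensity varies by row
features = mean_normalise([frame_features(flat), frame_features(ramp)])
print(features)
```

Mean normalisation removes the static component that is constant across frames, so what remains reflects frame-to-frame change rather than the average appearance of the region.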
Abstract:
Managing livestock movement in extensive systems has environmental and production benefits. Currently, permanent wire fencing is used to control cattle; this is both expensive and inflexible. Cattle are known to respond to auditory and visual cues, and we investigated whether these can be used to manipulate their behaviour. Twenty-five Belmont Red steers with a mean live weight of 270 kg were each randomly assigned to one of five treatments. Treatments consisted of a combination of cues (audio, tactile and visual stimuli) and consequence (electrical stimulation). The treatments were electrical stimulation alone, audio plus electrical stimulation, vibration plus electrical stimulation, light plus electrical stimulation, and electrified fence (6 kV) plus electrical stimulation. Cue stimuli were administered for 3 s, followed immediately by electrical stimulation (consequence) of 1 kV for 1 s. The experiment tested the operational efficacy of an on-animal control or virtual fencing system. A collar-halter device was designed to carry the electronics, batteries and equipment of a prototype virtual fencing device providing the audio, vibration, light and electrical stimuli. Cattle were allowed to travel along a 40 m alley to a group of peers and feed while their rate of travel and response to the stimuli were recorded. The prototype virtual fencing system was successful in modifying the behaviour of the cattle. The rate of travel of cattle along the alley demonstrated the large variability in behavioural response associated with tactile, visual and audible cues. The experiment demonstrated that virtual fencing has potential for controlling cattle in extensive grazing systems. However, larger numbers of cattle need to be tested to derive a better understanding of the behavioural variance. Further controlled experimental work is also necessary to quantify the interaction between cues, consequences and cattle learning.
Abstract:
To date, studies have focused on the acquisition of alphabetic second languages (L2s) in alphabetic first language (L1) users, demonstrating significant transfer effects. The present study examined the process from a reverse perspective, comparing logographic (Mandarin-Chinese) and alphabetic (English) L1 users in the acquisition of an artificial logographic script, in order to determine whether similar language-specific advantageous transfer effects occurred. English monolinguals, English-French bilinguals and Chinese-English bilinguals learned a small set of symbols in an artificial logographic script and were subsequently tested on their ability to process this script in regard to three main perspectives: L2 reading, L2 working memory (WM), and inner processing strategies. In terms of L2 reading, a lexical decision task on the artificial symbols revealed markedly faster response times in the Chinese-English bilinguals, indicating a logographic transfer effect suggestive of a visual processing advantage. A syntactic decision task evaluated the degree to which the new language was mastered beyond the single word level. No L1-specific transfer effects were found for artificial language strings. In order to investigate visual processing of the artificial logographs further, a series of WM experiments were conducted. Artificial logographs were recalled under concurrent auditory and visuo-spatial suppression conditions to disrupt phonological and visual processing, respectively. No L1-specific transfer effects were found, indicating no visual processing advantage of the Chinese-English bilinguals. However, a bilingual processing advantage was found indicative of a superior ability to control executive functions. In terms of L1 WM, the Chinese-English bilinguals outperformed the alphabetic L1 users when processing L1 words, indicating a language experience-specific advantage. 
Questionnaire data on the cognitive strategies deployed during the acquisition and processing of the artificial logographic script revealed that the Chinese-English bilinguals rated their inner speech lower than the alphabetic L1 users did, suggesting that they were transferring their phonological processing skill set to the acquisition and use of an artificial script. Overall, evidence was found to indicate that language learners transfer specific L1 orthographic processing skills to L2 logographic processing. Evidence was also found indicating that a bilingual history enhances cognitive performance in L2.
Abstract:
Monotony has been identified as a contributing factor to road crashes. Drivers’ ability to react to unpredictable events deteriorates when exposed to highly predictable and uneventful driving tasks, such as driving on Australian rural roads, many of which are monotonous by nature. Highway design in particular attempts to reduce the driver’s task to a merely lane-keeping one. Such a task provides little stimulation and is monotonous, thus affecting the driver’s attention, which is no longer directed towards the road. Inattention contributes to crashes, especially for professional drivers. Monotony has been studied mainly from the endogenous perspective (for instance through sleep deprivation) without taking into account the influence of the task itself (repetitiveness) or the surrounding environment. The aim and novelty of this thesis is to develop a methodology (mathematical framework) able to predict driver lapses of vigilance under monotonous environments in real time, using endogenous and exogenous data collected from the driver, the vehicle and the environment. Existing approaches have tended to neglect the specificity of task monotony, leaving the question of the existence of a “monotonous state” unanswered. Furthermore, the issue of detecting vigilance decrement before it occurs (prediction) has not been investigated in the literature, let alone in real time. A multidisciplinary approach is necessary to explain how vigilance evolves in monotonous conditions. Such an approach needs to draw on psychology, physiology, road safety, computer science and mathematics. The systemic approach proposed in this study is unique in its predictive dimension and allows us to define, in real time, the impacts of monotony on the driver’s ability to drive. This methodology is based on mathematical models relating data available in vehicles to the vigilance state of the driver during a monotonous driving task in various environments.
The model integrates different data measuring the driver’s endogenous and exogenous factors (related to the driver, the vehicle and the surrounding environment). Electroencephalography (EEG) is used to measure driver vigilance since it has been shown to be the most reliable real-time method for assessing vigilance level. A variety of mathematical models are suitable to provide a framework for predictions; to find the most accurate, a collection of mathematical models was trained in this thesis and the most reliable was identified. The methodology developed in this research is first applied to a theoretically sound measure of sustained attention called the Sustained Attention to Response Task (SART), as adapted by Michael (2010) and Michael and Meuter (2006, 2007). This experiment induced impairments due to monotony during a vigilance task. Analyses performed in this thesis confirm and extend findings from Michael (2010) that monotony leads to a substantial vigilance impairment independent of fatigue. This thesis is also the first to show that monotony changes the dynamics of vigilance evolution and tends to create a “monotonous state” characterised by reduced vigilance. Personality traits such as being a low sensation seeker can mitigate this vigilance decrement. It is also evident that lapses in vigilance can be predicted accurately with Bayesian modelling and Neural Networks. This framework was then applied to the driving task by designing a simulated monotonous driving task. The design of such a task requires multidisciplinary knowledge and involved psychologist Rebecca Michael. Monotony was varied through both the road design and the road environment variables. This experiment demonstrated that road monotony can lead to driving impairment. In particular, monotonous road scenery was shown to have more impact than monotonous road design.
Next, this study identified a variety of surrogate measures that are correlated with vigilance levels obtained from the EEG. Such vigilance states can be predicted with these surrogate measures. This means that vigilance decrement can be detected in a car without the use of an EEG device. Amongst the different mathematical models tested in this thesis, only Neural Networks predicted the vigilance levels accurately. The results of both these experiments provide valuable information about the methodology to predict vigilance decrement. Such an issue is quite complex and requires modelling that can adapt to high inter-individual differences. Only Neural Networks proved accurate in both studies, suggesting that these models are the most likely to be accurate when used on real roads or for further research on vigilance modelling. This research provides a better understanding of the driving task under monotonous conditions. Results demonstrate that mathematical modelling can be used to determine the driver’s vigilance state while driving, using the surrogate measures identified during this study. This research has opened up avenues for future research and could result in the development of an in-vehicle device predicting driver vigilance decrement. Such a device could contribute to a reduction in crashes and therefore improve road safety.
Abstract:
Schizophrenia is a mental disorder affecting 1-2% of the population and it is estimated 12-16% of hospital beds in Australia are occupied by patients with psychosis. The suicide rate for patients with this diagnosis is higher than that of the general population. Any technique which enhances training and treatment of this disorder will have a significant societal and economic impact. A significant research project using Virtual Reality (VR), in which both visual and auditory hallucinations are simulated, is currently being undertaken at the University of Queensland. The virtual environments created by the new software are expected to enhance the experiential learning outcomes of medical students by enabling them to experience the inner world of a patient with psychosis. In addition the Virtual Environment has the potential to provide a technologically advanced therapeutic setting where behavioral, exposure therapies can be conducted with exactly controlled exposure stimuli with an expected reduction in risk of harm. This paper reports on the current work of the project, previous stages of software development and future educational and clinical applications of the Virtual Environments. (C) 2004 Elsevier Ltd. All rights reserved.
Abstract:
The term ‘public discourses’ describes a range of texts or signifiers that inform the conditions of audience reception. Public discourses include myriad written, visual, spatial, auditory and sensory texts experienced by an audience at a particular theatrical event. Ric Knowles first introduced this term in his recent work Reading the Material Theatre. Whereas Knowles was interested in how public discourses modified the conditions of reception, my broader research explores how these public discourses become texts in themselves. This paper will discuss one public discourse, the theatre programme, as it related to a staging of Maxwell Anderson’s Anne of the Thousand Days at the Brisbane Powerhouse in June 2006. The significance of the programme was explored at symposiums held after the performances. Audiences generally view programmes both before and after a performance, and the programme’s significance as a written text changes between the two. The programme became a sign vehicle that worked to expound and explicate the meaning of the play for the audience. This public discourse became a significant written text contributing to the textual whole of the theatrical event.
Abstract:
The artwork I created depicts the loss of face-to-face communication in this digital information era. Social network technologies have enhanced people-to-people communication and enriched interactions. Yet these pervasive communication media have shifted people’s preferences towards communication through visually driven interfaces. This has reduced people’s communication skills, including listening. Surprisingly, it was reported that 70 percent of the younger generation are non-auditory learners, influenced by the visual nature of communication (McCrindle, 2006). As a result, they are defined as a pragmatic generation focussed on outcomes and not processes. This serious societal issue was drawn in a somewhat violent and aggressive form, yet its pop art style should enable the audience to approach the theme in a satirical and light way.
Abstract:
The theory of nonlinear dynamic systems provides some new methods to handle complex systems. Chaos theory offers new concepts, algorithms and methods for processing, enhancing and analyzing measured signals. In recent years, researchers have been applying concepts from this theory to bio-signal analysis. In this work, the complex dynamics of bio-signals such as the electrocardiogram (ECG) and electroencephalogram (EEG) are analyzed using the tools of nonlinear systems theory. In modern industrialized countries, several hundred thousand people die every year due to sudden cardiac death. The electrocardiogram (ECG) is an important bio-signal representing the sum total of millions of cardiac cell depolarization potentials. It contains important insight into the state of health and the nature of the disease afflicting the heart. Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. Heart rate variability analysis is an important tool to observe the heart's ability to respond to normal regulatory impulses that affect its rhythm. A computer-based intelligent system for analysis of cardiac states is very useful in diagnostics and disease management. Like many bio-signals, HRV signals are non-linear in nature. Higher order spectral analysis (HOS) is known to be a good tool for the analysis of non-linear systems and provides good noise immunity. In this work, we studied the HOS of the HRV signals of normal heartbeat and four classes of arrhythmia. This thesis presents some general characteristics for each of these classes of HRV signals in the bispectrum and bicoherence plots. Several features were extracted from the HOS and subjected to an Analysis of Variance (ANOVA) test. The results are very promising for cardiac arrhythmia classification, with a number of features yielding a p-value < 0.02 in the ANOVA test.
An automated intelligent system for the identification of cardiac health is very useful in healthcare technology. In this work, seven features were extracted from the heart rate signals using HOS and fed to a support vector machine (SVM) for classification. The performance evaluation protocol in this thesis uses 330 subjects consisting of five different kinds of cardiac disease conditions. The classifier achieved a sensitivity of 90% and a specificity of 89%. This system is ready to run on larger data sets. In EEG analysis, the search for hidden information for identification of seizures has a long history. Epilepsy is a pathological condition characterized by spontaneous and unforeseeable occurrence of seizures, during which the perception or behavior of patients is disturbed. An automatic early detection of the seizure onsets would help the patients and observers to take appropriate precautions. Various methods have been proposed to predict the onset of seizures based on EEG recordings. The use of nonlinear features motivated by the higher order spectra (HOS) has been reported to be a promising approach to differentiate between normal, background (pre-ictal) and epileptic EEG signals. In this work, these features are used to train both a Gaussian mixture model (GMM) classifier and a Support Vector Machine (SVM) classifier. Results show that the classifiers were able to achieve 93.11% and 92.67% classification accuracy, respectively, with selected HOS based features. About 2 hours of EEG recordings from 10 patients were used in this study. This thesis introduces unique bispectrum and bicoherence plots for various cardiac conditions and for normal, background and epileptic EEG signals. These plots reveal distinct patterns. The patterns are useful for visual interpretation by those without a deep understanding of spectral analysis such as medical practitioners. 
It includes original contributions in extracting features from HRV and EEG signals using HOS and entropy, in analyzing the statistical properties of such features on real data and in automated classification using these features with GMM and SVM classifiers.
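To make the higher order spectra mentioned in this abstract concrete, a direct bispectrum estimate averages the triple product X(f1) X(f2) X*(f1+f2) over signal segments. The sketch below is an assumption-laden illustration, not the thesis's feature extraction: the helper names `dft`, `bispectrum` and `make_segment`, the segment length, and the synthetic cosine signals are all hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain discrete Fourier transform (O(n^2), adequate for short segments)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def bispectrum(segments, f1, f2):
    """Direct estimate: mean of X(f1) * X(f2) * conj(X(f1 + f2)) over segments."""
    acc = 0j
    for seg in segments:
        spec = dft(seg)
        acc += spec[f1] * spec[f2] * spec[(f1 + f2) % len(seg)].conjugate()
    return acc / len(segments)


def make_segment(n, f1, f2, phi1, phi2, phi3):
    """Three cosines at bins f1, f2 and f1 + f2 with the given phases."""
    return [math.cos(2 * math.pi * f1 * t / n + phi1)
            + math.cos(2 * math.pi * f2 * t / n + phi2)
            + math.cos(2 * math.pi * (f1 + f2) * t / n + phi3)
            for t in range(n)]


N, F1, F2, K = 32, 3, 5, 8
# Phase-coupled: the third phase is always the sum of the first two.
coupled = [make_segment(N, F1, F2, 0.3 * k, 0.7 * k, 0.3 * k + 0.7 * k)
           for k in range(K)]
# Uncoupled: the third phase sweeps independently of the first two.
uncoupled = [make_segment(N, F1, F2, 0.0, 0.0, 2 * math.pi * k / K)
             for k in range(K)]

print(abs(bispectrum(coupled, F1, F2)))    # large: phases add up consistently
print(abs(bispectrum(uncoupled, F1, F2)))  # near zero: phases average out
```

The power spectrum of the two synthetic signals is identical; only the bispectrum separates them, which is the kind of nonlinear phase relationship that HOS-based features are meant to capture.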
Abstract:
OBJECTIVES: To investigate the effects of hearing impairment and distractibility on older people's driving ability, assessed under real-world conditions. DESIGN: Experimental cross-sectional study. SETTING: University laboratory setting and an on-road driving test. PARTICIPANTS: One hundred seven community-living adults aged 62 to 88. Fifty-five percent had normal hearing, 26% had a mild hearing impairment, and 19% had a moderate or greater impairment. MEASUREMENTS: Hearing was assessed using objective impairment measures (pure-tone audiometry, speech perception testing) and a self-report measure (Hearing Handicap Inventory for the Elderly). Driving was assessed on a closed road circuit under three conditions: no distracters, auditory distracters, and visual distracters. RESULTS: There was a significant interaction between hearing impairment and distracters, such that people with moderate to severe hearing impairment had significantly poorer driving performance in the presence of distracters than those with normal or mild hearing impairment. CONCLUSION: Older adults with poor hearing have greater difficulty with driving in the presence of distracters than older adults with good hearing.