942 results for dynamic visual noise
Abstract:
The performance of automatic speech recognition systems deteriorates in the presence of noise. One known solution is to incorporate video information with an existing acoustic speech recognition system. We investigate the performance of the individual acoustic and visual sub-systems and then examine different ways in which the integration of the two systems may be performed. The system is to be implemented in real time on a Texas Instruments' TMS320C80 DSP.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
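The abstract does not describe how the modality weighting is determined, so the following is only a minimal sketch of how such a weight might be applied once it is known. It assumes a simple linear late fusion of two per-modality match scores; the names `fuse_scores`, `verify`, `alpha` and `threshold` are hypothetical and do not come from the paper.

```python
def fuse_scores(speech_score, lip_score, alpha):
    """Linear late fusion of two match scores.

    alpha (assumed to lie in [0, 1]) is the weight given to the speech
    modality; the lip modality receives the remaining 1 - alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * speech_score + (1.0 - alpha) * lip_score


def verify(speech_score, lip_score, alpha, threshold):
    """Accept the identity claim when the fused score reaches the threshold."""
    return fuse_scores(speech_score, lip_score, alpha) >= threshold
```

Under this assumption, lowering `alpha` when the audio is noisy shifts the decision towards the lip score, which is the kind of behaviour the abstract attributes to a correct weighting.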
Abstract:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMMs), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, it is not known whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction or multi-channel beamforming. This is an important question to answer, especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset, and results show that synchronous HMM-based audio-visual fusion can outperform traditional single-channel as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
Abstract:
This chapter focuses on the interactions and roles between delays and intrinsic noise effects within cellular pathways and regulatory networks. We address these aspects by focusing on genetic regulatory networks that share a common network motif, namely the negative feedback loop, leading to oscillatory gene expression and protein levels. In this context, we discuss computational simulation algorithms for addressing the interplay of delays and noise within the signaling pathways based on biological data. We address implementational issues associated with efficiency and robustness. In a molecular biology setting we present two case studies of temporal models for the Hes1 gene (Monk, 2003; Hirata et al., 2002), known to act as a molecular clock, and the Her1/Her7 regulatory system controlling the periodic somite segmentation in vertebrate embryos (Giudicelli and Lewis, 2004; Horikawa et al., 2006).
Abstract:
One of the fundamental motivations underlying computational cell biology is to gain insight into the complicated dynamical processes taking place, for example, on the plasma membrane or in the cytosol of a cell. These processes are often so complicated that purely temporal mathematical models cannot adequately capture the complex chemical kinetics and transport processes of, for example, proteins or vesicles. On the other hand, spatial models such as Monte Carlo approaches can have very large computational overheads. This chapter gives an overview of the state of the art in the development of stochastic simulation techniques for the spatial modelling of dynamic processes in a living cell.
Abstract:
Delays are an important feature in temporal models of genetic regulation due to slow biochemical processes, such as transcription and translation. In this paper, we show how to model intrinsic noise effects in a delayed setting by either using a delay stochastic simulation algorithm (DSSA) or, for larger and more complex systems, a generalized Binomial τ-leap method (Bτ-DSSA). As a particular application, we apply these ideas to modeling somite segmentation in zebrafish across a number of cells in which two linked oscillatory genes (her1 and her7) are synchronized via Notch signaling between the cells.
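To illustrate the general idea of a delay stochastic simulation algorithm, here is a minimal sketch for a toy system, not the her1/her7 model of the paper. It assumes a single species that is produced at a constant rate with a fixed delay `tau` and degraded without delay; the function name `delay_ssa` and all parameters are hypothetical. It uses the common rejection-style scheme: if a pending delayed reaction completes before the next sampled event, the sampled waiting time is discarded and the state is updated at the completion time instead.

```python
import heapq
import random


def delay_ssa(x0, k_prod, k_deg, tau, t_end, seed=0):
    """Simulate delayed production and immediate degradation of one species.

    Production is initiated with propensity k_prod and adds one molecule
    tau time units later; degradation has propensity k_deg * x.
    Returns a list of (time, copy_number) pairs.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    pending = []  # min-heap of completion times of initiated productions
    history = [(t, x)]
    while t < t_end:
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        # Waiting time to the next non-delayed event (infinite if none can fire).
        dt = rng.expovariate(a_total) if a_total > 0 else float("inf")
        if pending and pending[0] <= t + dt:
            # A delayed production finishes first: jump there and resample.
            t = heapq.heappop(pending)
            if t > t_end:
                break
            x += 1
        else:
            t += dt
            if t > t_end:
                break
            if rng.random() * a_total < a_prod:
                # Initiation only schedules the product; the state is unchanged.
                heapq.heappush(pending, t + tau)
            else:
                x -= 1
        history.append((t, x))
    return history
```

For example, `delay_ssa(0, 5.0, 0.1, 2.0, 50.0)` returns a trajectory in which no molecule appears before time 2.0, because every production is held back by the delay.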
Abstract:
Micro aerial vehicles (MAVs) are a rapidly growing area of research and development in robotics. For autonomous robot operations, localization has typically been calculated using GPS, external camera arrays, or onboard range or vision sensing. In cluttered indoor or outdoor environments, onboard sensing is the only viable option. In this paper we present an appearance-based approach to visual SLAM on a flying MAV using only low quality vision. Our approach consists of a visual place recognition algorithm that operates on 1000 pixel images, a lightweight visual odometry algorithm, and a visual expectation algorithm that improves the recall of place sequences and the precision with which they are recalled as the robot flies along a similar path. Using data gathered from outdoor datasets, we show that the system is able to perform visual recognition with low quality, intermittent visual sensory data. By combining the visual algorithms with the RatSLAM system, we also demonstrate how the algorithms enable successful SLAM.
Abstract:
This study investigated whether conceptual development is greater if students learning senior chemistry first receive teacher explanations and other traditional teaching approaches and then see computer-based visualizations, or vice versa. Five Canadian chemistry classes, taught by three different teachers, studied the topics of Le Chatelier’s Principle and dynamic chemical equilibria using scientific visualizations, with the explanation and visualizations in different orders. Conceptual development was measured using a 12-item test based on the Chemistry Concepts Inventory. Data were obtained about the students’ abilities, learning styles (auditory, visual or kinesthetic) and sex, and the relationships between these factors and conceptual development due to the teaching sequences were investigated. It was found that teaching sequence is not important in terms of students’ conceptual learning gains, across the whole cohort or for any of the three subgroups.
Abstract:
Diabetes is an increasingly prevalent disease worldwide. Providing early management of the complications can prevent morbidity and mortality in this population. Peripheral neuropathy, a significant complication of diabetes, is the major cause of foot ulceration and amputation in diabetes. Delay in attending to complications of the disease contributes to significant medical expenses for diabetic patients and the community. Early structural changes to the neural components of the retina have been demonstrated to occur prior to the clinically visible retinal vasculature complication of diabetic retinopathy. Additionally, visual function loss has been shown to exist before the ophthalmoscopic manifestations of vasculature damage. The purpose of this thesis was to evaluate the relationship between diabetic peripheral neuropathy and both retinal structure and visual function. The key question was whether diabetic peripheral neuropathy is the potential underlying factor responsible for retinal anatomical change and visual functional loss in people with diabetes. This study was conducted on a cohort with type 2 diabetes. Retinal nerve fibre layer (RNFL) thickness was assessed by means of Optical Coherence Tomography (OCT). Visual function was assessed using two different methods: Standard Automated Perimetry (SAP) and flicker perimetry were performed within the central 30 degrees of fixation. The level of diabetic peripheral neuropathy (DPN) was assessed using two techniques: Quantitative Sensory Testing (QST) and the Neuropathy Disability Score (NDS). These techniques are known to be capable of detecting DPN at very early stages. NDS has also been shown to be a gold standard for detecting 'risk of foot ulceration'. Findings reported in this thesis showed that RNFL thickness, particularly in the inferior quadrant, has a significant association with severity of DPN when the condition has been assessed using NDS.
More specifically, it was observed that inferior RNFL thickness has the ability to differentiate individuals who are at higher risk of foot ulceration from those who are at lower risk, indicating that RNFL thickness can predict late-stage DPN. Investigating the association between RNFL and QST did not show any meaningful interaction, which indicates that RNFL thickness for this cohort was not as predictive of neuropathy status as NDS. In both of these studies, control participants did not have different results from the type 2 cohort who did not have DPN, suggesting that RNFL thickness is not a marker for diagnosing DPN at early stages. The latter finding also indicated that diabetes per se is unlikely to affect the RNFL thickness. Visual function as measured by SAP and flicker perimetry was found to be associated with severity of peripheral neuropathy as measured by NDS. These findings were also capable of differentiating individuals at higher risk of foot ulceration; however, visual function also proved not to be a marker for early diagnosis of DPN. It was found that neither SAP nor flicker sensitivity has meaningful associations with DPN when neuropathy status was measured using QST. Importantly, diabetic retinopathy did not explain any of the findings in these experiments. The work described here is valuable as no other research to date has investigated the association between diabetic peripheral neuropathy and either retinal structure or visual function.
Abstract:
Modelling events in densely crowded environments remains challenging, due to the diversity of events and the noise in the scene. We propose a novel approach for anomalous event detection in crowded scenes using dynamic textures described by the Local Binary Patterns from Three Orthogonal Planes (LBP-TOP) descriptor. The scene is divided into spatio-temporal patches where LBP-TOP based dynamic textures are extracted. We apply hierarchical Bayesian models to detect the patches containing unusual events. Our method is an unsupervised approach, and it does not rely on object tracking or background subtraction. We show that our approach outperforms existing state-of-the-art algorithms for anomalous event detection on the UCSD dataset.
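As a rough illustration of what an LBP-TOP descriptor for one spatio-temporal patch can look like, the sketch below computes basic 8-neighbour LBP codes on the XY, XT and YT slices of a small video volume and concatenates the three normalised histograms. This is a simplified variant written for illustration, assuming a NumPy array of shape (T, H, W) with every dimension at least 3; it is not the authors' implementation, and the function names are hypothetical.

```python
import numpy as np


def lbp_2d(img):
    """Basic 8-neighbour LBP codes for the interior pixels of a 2-D array."""
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(centre.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes += (neighbour >= centre) * (1 << bit)
    return codes


def lbp_top(volume):
    """Concatenate normalised LBP histograms from the XY, XT and YT planes."""
    vol = np.asarray(volume)
    plane_sets = (
        [vol[t, :, :] for t in range(vol.shape[0])],  # XY: one slice per frame
        [vol[:, y, :] for y in range(vol.shape[1])],  # XT: one slice per row
        [vol[:, :, x] for x in range(vol.shape[2])],  # YT: one slice per column
    )
    hists = []
    for slices in plane_sets:
        codes = np.concatenate([lbp_2d(s).ravel() for s in slices])
        hist = np.bincount(codes, minlength=256).astype(float)
        hists.append(hist / hist.sum())
    return np.concatenate(hists)  # 3 * 256 = 768 values
```

In a pipeline like the one the abstract outlines, a vector of this kind would be computed per patch and passed to the anomaly model; the hierarchical Bayesian stage is not sketched here.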