37 results for Visual Object Recognition

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance:

100.00%

Publisher:

Abstract:

We present results of a study into the performance of a variety of different image transform-based feature types for speaker-independent visual speech recognition of isolated digits. This includes the first reported use of features extracted using a discrete curvelet transform. The study will show a comparison of some methods for selecting features of each feature type and show the relative benefits of both static and dynamic visual features. The performance of the features will be tested on both clean video data and also video data corrupted in a variety of ways to assess each feature type's robustness to potential real-world conditions. One of the test conditions involves a novel form of video corruption we call jitter which simulates camera and/or head movement during recording.
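
The distinction between static and dynamic visual features can be illustrated with a minimal sketch. It assumes that dynamic features are first-order differences between neighbouring frames, which is a common convention but is not stated in the abstract; the function name `compute_deltas` and the edge handling (repeating the first and last frame) are illustrative choices only.

```python
def compute_deltas(frames):
    """Return first-order dynamic (delta) features for a sequence of static vectors.

    Assumption: delta[t] = (frame[t+1] - frame[t-1]) / 2, with the first and
    last frames repeated at the sequence edges.
    """
    n = len(frames)
    deltas = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]
        next_frame = frames[min(t + 1, n - 1)]
        deltas.append([(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)])
    return deltas


# Hypothetical one-dimensional static features for three video frames.
static = [[0.0], [1.0], [3.0]]
dynamic = compute_deltas(static)
```

A recogniser could then be given the static vectors, the dynamic vectors, or both concatenated per frame.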

Relevance:

100.00%

Publisher:

Abstract:

In this paper we present the application of Hidden Conditional Random Fields (HCRFs) to modelling speech for visual speech recognition. HCRFs may be easily adapted to model long-range dependencies across an observation sequence. As a result, visual word recognition performance can be improved, as the model is able to take a more contextual approach to generating state sequences. Results are presented from a speaker-dependent, isolated-digit visual speech recognition task, using comparisons with a baseline HMM system. First, we show that word recognition rates on clean video using HCRFs can be improved by increasing the number of past and future observations taken into account by each state. Second, we compare model performance under various levels of video compression on the test set. As far as we are aware, this is the first attempted use of HCRFs for visual speech recognition.
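
The idea of letting each state see more past and future observations can be sketched as a simple context window over the observation sequence. This is not an HCRF and is not the authors' implementation; it only shows, under assumed names (`stack_context`, `window`), how neighbouring observations might be gathered for each time step.

```python
def stack_context(observations, window):
    """Concatenate each observation with `window` past and future neighbours.

    Assumption: indices outside the sequence are clamped to the first or
    last observation.
    """
    n = len(observations)
    stacked = []
    for t in range(n):
        row = []
        for offset in range(-window, window + 1):
            idx = min(max(t + offset, 0), n - 1)
            row.extend(observations[idx])
        stacked.append(row)
    return stacked


# Hypothetical one-dimensional observations with one frame of context each side.
context = stack_context([[1], [2], [3]], 1)
```

Increasing `window` widens the span of observations available at each time step, at the cost of a longer feature vector.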

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we present a new approach to visual speech recognition which improves contextual modelling by combining Inter-Frame Dependent and Hidden Markov Models. This approach captures contextual information in visual speech that may be lost using a Hidden Markov Model alone. We apply contextual modelling to a large speaker-independent isolated digit recognition task, and compare our approach to two commonly adopted feature-based techniques for incorporating speech dynamics. Results are presented from baseline feature-based systems and the combined modelling technique. We illustrate that both of these techniques achieve similar levels of performance when used independently. However, significant improvements in performance can be achieved through a combination of the two. In particular, we report an improvement in excess of 17% relative Word Error Rate in comparison to our best baseline system.
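
For readers unfamiliar with relative Word Error Rate figures, the following sketch shows how a relative improvement is usually computed. The WER values are hypothetical and chosen only to yield 17%; the abstract does not report the absolute error rates.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative reduction in word error rate, as a fraction of the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers: a baseline WER of 10.0% reduced to 8.3%.
reduction = relative_wer_reduction(10.0, 8.3)
print(f"{reduction:.0%}")
```

A 17% relative improvement therefore means the error rate fell by 17% of its baseline value, not by 17 percentage points.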

Relevance:

100.00%

Publisher:

Abstract:

This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to either or both of the video and audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach and also compared to any fixed-weighted integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
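
Frame-by-frame weight selection of this kind can be sketched as follows. This is one plausible reading of the phrase "maximum weighted stream posterior", not the authors' formulation: it assumes per-class log-likelihoods from each stream, a log-linear combination with weight `w` on audio and `1 - w` on video, and selection of the weight whose combined posterior has the highest peak. All names and numbers are hypothetical.

```python
import math


def softmax(log_scores):
    """Normalise log scores into a posterior distribution."""
    peak = max(log_scores)
    exps = [math.exp(s - peak) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_frame_weight(audio_loglik, video_loglik, candidate_weights):
    """Pick the stream weight whose combined posterior is most confident.

    Returns (weight, peak posterior, index of the winning class).
    No measurement of signal quality is used, only the stream scores.
    """
    best = None
    for w in candidate_weights:
        combined = [w * a + (1.0 - w) * v
                    for a, v in zip(audio_loglik, video_loglik)]
        posterior = softmax(combined)
        peak = max(posterior)
        if best is None or peak > best[1]:
            best = (w, peak, posterior.index(peak))
    return best


# Hypothetical frame: the audio stream is uninformative, the video stream is not.
weight, confidence, winner = select_frame_weight(
    [-1.0, -1.0], [-0.1, -5.0], [0.0, 0.5, 1.0]
)
```

In this toy frame the selection falls on the video-only weight, because the audio scores do not separate the two classes.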

Relevance:

100.00%

Publisher:

Abstract:

In this work, we propose a biologically inspired appearance model for robust visual tracking. Motivated in part by the success of the hierarchical organization of the primary visual cortex (area V1), we establish an architecture consisting of five layers: whitening, rectification, normalization, coding and pooling. The first three layers stem from the models developed for object recognition. In this paper, our attention focuses on the coding and pooling layers. In particular, we use a discriminative sparse coding method in the coding layer along with spatial pyramid representation in the pooling layer, which makes it easier to distinguish the target to be tracked from its background in the presence of appearance variations. An extensive experimental study shows that the proposed method has higher tracking accuracy than several state-of-the-art trackers.

Relevance:

80.00%

Publisher:

Abstract:

A new high-performance, programmable image processing chip targeted at video and HDTV applications is described. This was initially developed for small object recognition in images but has much broader functional application, including 1D and 2D FIR filtering as well as neural network computation. The core of the circuit is made up of an array of twenty-one multiplication-accumulation cells based on a systolic architecture. Devices can be cascaded to increase the order of the filter both vertically and horizontally. The chip has been fabricated in a 0.6 µm, low-power CMOS technology and operates on 10-bit input data at over 54 Megasamples per second. The introduction gives some background to the chip design and highlights that there are few other comparable devices. Section 2 gives a brief introduction to small object detection. The chip architecture and the chip design will be described in detail in the later sections.
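
The role of multiplication-accumulation cells in FIR filtering can be illustrated in software. The sketch below is a plain 1D FIR filter in which each coefficient plays the part of one multiply-accumulate step; it does not model the chip's systolic data flow, timing or 10-bit arithmetic, and the function name and sample values are hypothetical.

```python
def fir_filter(samples, coefficients):
    """Causal 1D FIR filter: y[n] = sum over k of h[k] * x[n - k].

    Each (coefficient, sample) product added to `acc` corresponds to one
    multiply-accumulate operation. Samples before the start of the input
    are treated as zero.
    """
    output = []
    for n in range(len(samples)):
        acc = 0
        for k, coeff in enumerate(coefficients):
            if n - k >= 0:
                acc += coeff * samples[n - k]
        output.append(acc)
    return output


# Hypothetical 21-tap moving-sum filter, one tap per multiply-accumulate cell.
taps = [1] * 21
filtered = fir_filter(list(range(30)), taps)
```

With twenty-one coefficients, each output sample requires up to twenty-one multiply-accumulate operations, which is the work a twenty-one-cell array could share out across its cells.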

Relevance:

80.00%

Publisher:

Abstract:

β-amyloid1-42 (Aβ1-42) is a major endogenous pathogen underlying the aetiology of Alzheimer's disease (AD). Recent evidence indicates that soluble Aβ oligomers, rather than plaques, are the major cause of synaptic dysfunction and neurodegeneration. Small molecules that suppress Aβ aggregation, reduce oligomer stability or promote off-pathway non-toxic oligomerization represent a promising alternative strategy for neuroprotection in AD. MRZ-99030 was recently identified as a dipeptide that modulates Aβ1-42 aggregation by triggering a non-amyloidogenic aggregation pathway, thereby reducing the amount of intermediate toxic soluble oligomeric Aβ species. The present study evaluated the relevance of these promising results with MRZ-99030 under pathophysiological conditions, i.e. against the synaptotoxic effects of Aβ oligomers on hippocampal long-term potentiation (LTP) and in two different memory tasks. Aβ1-42 interferes with the glutamatergic system and with neuronal Ca2+ signalling and abolishes the induction of LTP. Here we demonstrate that MRZ-99030 (100–500 nM) at a 10:1 stoichiometric excess to Aβ clearly reversed the synaptotoxic effects of Aβ1-42 oligomers on CA1-LTP in murine hippocampal slices. Co-application of MRZ-99030 also prevented the two-fold increase in resting Ca2+ levels in pyramidal neuron dendrites and spines triggered by Aβ1-42 oligomers. In anaesthetized rats, pre-administration of MRZ-99030 (50 mg/kg s.c.) protected against deficits in hippocampal LTP following i.c.v. injection of oligomeric Aβ1-42. Furthermore, similar treatment significantly ameliorated cognitive deficits in an object recognition task and under an alternating lever cyclic ratio schedule after the i.c.v. application of Aβ1-42 and 7PA2 conditioned medium, respectively. Altogether, these results demonstrate the potential therapeutic benefit of MRZ-99030 in AD.