12 results for Robust speech recognition

in CentAUR: Central Archive University of Reading - UK


Relevance:

80.00%

Publisher:

Abstract:

A number of experiments have shown that neural networks can be used to build a phonetic typewriter. The underlying algorithms can be viewed as producing self-organizing feature maps whose regions correspond to phonemes. In Chinese, the utterance of a character consists of a very simple string of phonemes. With this as a starting point, a neural network feature map for Chinese phonemes can be built up. In this paper, feature map structures for Chinese phonemes are discussed and tested. This research on a Chinese phonetic feature map is important both for Chinese speech recognition and for building a Chinese phonetic typewriter.
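
As a rough illustration of the self-organizing feature map idea mentioned in this abstract, the following is a minimal sketch of a one-dimensional Kohonen-style map in plain Python. It is not the paper's implementation: the two-dimensional feature vectors, the class names `CLASS_A` and `CLASS_B`, the map size, and the decay schedules are all hypothetical choices made for the example.

```python
import math

def bmu_index(weights, x):
    """Return the index of the best-matching unit (closest weight vector to x)."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )

def train_som(data, n_units=4, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organizing map on a list of equal-length feature vectors."""
    dim = len(data[0])
    # Deterministic initialisation (an assumption): units spread along the diagonal.
    weights = [[i / (n_units - 1)] * dim for i in range(n_units)]
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            frac = step / total
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(0.1, sigma0 * (1.0 - frac))   # neighbourhood width shrinks
            b = bmu_index(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood around the winning unit on the 1-D map
                h = math.exp(-((i - b) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights

# Hypothetical, normalised feature vectors for two phoneme-like classes.
CLASS_A = [(0.15, 0.20), (0.20, 0.25), (0.25, 0.15)]
CLASS_B = [(0.75, 0.80), (0.80, 0.85), (0.85, 0.75)]
# Interleave the classes so the map sees both throughout training.
SAMPLES = [p for pair in zip(CLASS_A, CLASS_B) for p in pair]

weights = train_som(SAMPLES)
units_a = {bmu_index(weights, x) for x in CLASS_A}
units_b = {bmu_index(weights, x) for x in CLASS_B}
print("units for class A:", units_a, "units for class B:", units_b)
```

After training, samples from the two classes should be won by different map units, which is the sense in which regions of the map come to "correspond to" phoneme classes.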

Relevance:

30.00%

Publisher:

Abstract:

In this work the $G_A^0$ distribution is assumed as the universal model for amplitude Synthetic Aperture Radar (SAR) imagery data under the Multiplicative Model. The observed data are therefore assumed to obey a $G_A^0(\alpha, \gamma, n)$ law, where the parameter $n$ is related to the speckle noise, and $(\alpha, \gamma)$ are related to the ground truth, giving information about the background. Maps generated by estimating $(\alpha, \gamma)$ at each coordinate can therefore be used as the input for classification methods. Maximum likelihood estimators are derived and used to form estimated parameter maps. This estimation can be hampered by the presence of corner reflectors, man-made objects used to calibrate SAR images that produce large return values. In order to alleviate this contamination, robust (M) estimators are also derived for the universal model. Gaussian Maximum Likelihood classification is used to obtain maps using hard-to-deal-with simulated data, and the superiority of robust estimation is quantitatively assessed.

Relevance:

30.00%

Publisher:

Abstract:

Garment information tracking is required for clean room garment management. In this paper, we present a robust camera-based system that uses Optical Character Recognition (OCR) techniques to perform garment label recognition. In the system, a camera is used for image capturing; an adaptive thresholding algorithm is employed to generate binary images; Connected Component Labelling (CCL) is then adopted for object detection in the binary image as a part of finding the ROI (Region of Interest); Artificial Neural Networks (ANNs) with the BP (Back Propagation) learning algorithm are used for digit recognition; and finally the result is verified against a system database. The system has been tested. The results show that it is capable of coping with variance of lighting, digit twisting, background complexity, and font orientations. The digit recognition rate has met the design requirement. The system achieved real-time and error-free garment information tracking during the testing.
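
The first two stages of the pipeline described in this abstract (adaptive thresholding, then connected component labelling) can be sketched as below. This is an illustrative sketch only: the local-mean thresholding rule, the `radius` and `offset` parameters, the 4-connectivity choice, and the tiny synthetic image are assumptions, and the ANN digit recognition stage is not shown.

```python
from collections import deque

def adaptive_threshold(img, radius=1, offset=10):
    """Mark a pixel as foreground if it exceeds its local window mean by more than `offset`."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Window is clipped at the image borders.
            window = [
                img[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            if img[r][c] > sum(window) / len(window) + offset:
                out[r][c] = 1
    return out

def label_components(binary):
    """Label 4-connected foreground regions; return (label image, number of regions)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    i, j = queue.popleft()
                    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                        if (0 <= ni < rows and 0 <= nj < cols
                                and binary[ni][nj] and not labels[ni][nj]):
                            labels[ni][nj] = count
                            queue.append((ni, nj))
    return labels, count

# Synthetic 5x8 image: a left-to-right brightness gradient (uneven lighting)
# with two brighter blobs standing in for label strokes.
image = [[10 * c for c in range(8)] for _ in range(5)]
for r, c in ((1, 1), (2, 1), (2, 5), (2, 6)):
    image[r][c] += 80

binary = adaptive_threshold(image)
labels, n_regions = label_components(binary)
print("regions found:", n_regions)
```

Because the threshold is relative to each pixel's neighbourhood, the smooth lighting gradient is not marked as foreground, and each labelled region gives a candidate ROI for a later recognition stage.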

Relevance:

30.00%

Publisher:

Abstract:

This paper describes a real-time multi-camera surveillance system that can be applied to a range of application domains. This integrated system is designed to observe crowded scenes and has mechanisms to improve tracking of objects that are in close proximity. The four component modules described in this paper are (i) motion detection using a layered background model, (ii) object tracking based on local appearance, (iii) hierarchical object recognition, and (iv) fused multisensor object tracking using multiple features and geometric constraints. This integrated approach to complex scene tracking is validated against a number of representative real-world scenarios to show that robust, real-time analysis can be performed. Copyright (C) 2007 Hindawi Publishing Corporation. All rights reserved.

Relevance:

30.00%

Publisher:

Abstract:

This paper investigates the robustness of a hybrid analog/digital feedback active noise cancellation (ANC) headset system. Digital ANC systems using the filtered-x least-mean-square (FXLMS) algorithm require accurate estimation of the secondary path for the stability and convergence of the algorithm. This poses a great challenge for ANC headset design because the secondary path may fluctuate dramatically, for example when the user adjusts the position of the ear-cup. In this paper, we analytically show that adding an analog feedback loop into the digital ANC system can effectively reduce the plant fluctuation, thus achieving a more robust system. The method for designing the analog controller is highlighted. A practical hybrid analog/digital feedback ANC headset has been built and used to conduct experiments, and the experimental results show that the hybrid headset system is more robust under large plant fluctuation and achieves satisfactory noise cancellation for both narrowband and broadband noise.
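
To make the role of the secondary-path estimate concrete, here is a minimal sketch of the FXLMS weight update in plain Python. It uses a simple feedforward arrangement with a known reference signal rather than the hybrid feedback headset studied in the paper, and every numerical value (the tone, the path coefficients `s`, the step size `mu`, the filter length) is a hypothetical choice for illustration.

```python
import math

def fxlms(x, d, s, s_hat, n_taps=8, mu=0.05):
    """Run FXLMS and return the residual error at each sample.

    x: reference noise, d: noise at the error microphone,
    s: true secondary path (FIR), s_hat: its estimate used to filter x.
    """
    w = [0.0] * n_taps                           # adaptive controller weights
    x_buf = [0.0] * max(n_taps, len(s_hat))      # recent reference samples, newest first
    y_buf = [0.0] * len(s)                       # recent controller outputs
    fx_buf = [0.0] * n_taps                      # recent filtered-reference samples
    errors = []
    for xn, dn in zip(x, d):
        x_buf = [xn] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))          # controller output
        y_buf = [y] + y_buf[:-1]
        y_at_mic = sum(si * yi for si, yi in zip(s, y_buf))   # after the secondary path
        e = dn - y_at_mic                                     # residual noise
        fx = sum(si * xi for si, xi in zip(s_hat, x_buf))     # reference filtered by s_hat
        fx_buf = [fx] + fx_buf[:-1]
        w = [wi + mu * e * fi for wi, fi in zip(w, fx_buf)]   # LMS step on filtered reference
        errors.append(e)
    return errors

n = 2000
x = [math.sin(2 * math.pi * k / 8) for k in range(n)]   # tone at 1/8 of the sample rate
d = [0.0] * 3 + [0.8 * v for v in x[:-3]]               # assumed primary path: delay and gain
s = [0.0, 0.5, 0.25]                                    # assumed secondary path
errors = fxlms(x, d, s, s_hat=s)                        # perfect estimate in this sketch

head = sum(e * e for e in errors[:50]) / 50
tail = sum(e * e for e in errors[-200:]) / 200
print("mean squared error, first 50 samples:", head)
print("mean squared error, last 200 samples:", tail)
```

Here `s_hat` equals `s`, so the update direction is correct and the residual decays. Passing a deliberately wrong `s_hat` is a simple way to explore the sensitivity to secondary-path mismatch that the abstract refers to.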

Relevance:

30.00%

Publisher:

Abstract:

Spoken word recognition, during gating, appears intact in specific language impairment (SLI). This study used gating to investigate the process in adolescents with autism spectrum disorders plus language impairment (ALI). Adolescents with ALI, SLI, and typical language development (TLD), matched on nonverbal IQ, listened to gated words that varied in frequency (low/high) and number of phonological onset neighbors (low/high density). Adolescents with ALI required more speech input to initially identify low-frequency words with low competitor density than those with SLI and those with TLD, who did not differ. These differences may be due to less well specified word form representations in ALI.

Relevance:

30.00%

Publisher:

Abstract:

Background and aims: In addition to the well-known linguistic processing impairments in aphasia, oro-motor skills and articulatory implementation of speech segments are reported to be compromised to some degree in most types of aphasia. This study aimed to identify differences in the characteristics and coordination of lip movements in the production of a bilabial closure gesture between speech-like and nonspeech tasks in individuals with aphasia and healthy control subjects.
Method and procedure: Upper and lower lip movement data were collected for a speech-like and a nonspeech task using an AG 100 EMMA system from five individuals with aphasia and five age- and gender-matched control subjects. Each task was produced at two rate conditions (normal and fast), and in a familiar and a less-familiar manner. Single articulator kinematic parameters (peak velocity, amplitude, duration, and cyclic spatio-temporal index) and multi-articulator coordination indices (average relative phase and variability of relative phase) were measured to characterize lip movements.
Outcome and results: The results showed that when the two lips had similar task goals (bilabial closure) in the speech-like versus nonspeech tasks, kinematic and coordination characteristics were not found to be different. However, when changes in rate were imposed on the bilabial gesture, only the speech-like task showed functional adaptations, indicated by a greater decrease in amplitude and duration at fast rates. In terms of group differences, individuals with aphasia showed smaller amplitudes and longer movement durations for the upper lip, higher spatio-temporal variability for both lips, and higher variability in lip coordination than the control speakers. Rate was an important factor in distinguishing the two groups, and individuals with aphasia were limited in implementing the rate changes.
Conclusion and implications: The findings support the notion of subtle but robust differences in motor control characteristics between individuals with aphasia and the control participants, even in the context of producing bilabial closing gestures for a relatively simple speech-like task. The findings also highlight the functional differences between speech-like and nonspeech tasks, despite a common movement coordination goal for bilabial closure.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a video surveillance framework that robustly and efficiently detects abandoned objects in surveillance scenes. The framework is based on a novel threat assessment algorithm which combines the concept of ownership with automatic understanding of social relations in order to infer abandonment of objects. Implementation is achieved through development of a logic-based inference engine based on Prolog. Threat detection performance is evaluated by testing against a range of datasets describing realistic situations, and the results demonstrate a reduction in the number of false alarms generated. The proposed system represents the approach employed in the EU SUBITO project (Surveillance of Unattended Baggage and the Identification and Tracking of the Owner).

Relevance:

30.00%

Publisher:

Abstract:

Bowen and colleagues’ methods and conclusions raise concerns [1]. At best, the trial evaluates the variability in current practice. In no way is it a robust test of treatment. Two communication impairments (aphasia and dysarthria) were included. In the post-acute stage spontaneous recovery is highly unpredictable, and changes in the profile of impairment during this time are common [2]. Both impairments manifest in different forms [3], which may be more or less responsive to treatment. A third kind of impairment, apraxia of speech, was not excluded but was not targeted in therapy. All three impairments can and do co-occur. Whether randomised controlled trial designs can effectively cope with such complex disorders has been discussed elsewhere [4]. Treatment was defined within terms of current practice but was unconstrained. Therefore, the treatment group would have received a variety of therapeutic approaches and protocols, some of which may indeed be ineffective. Only 53% of the contact time with a speech and language therapist was direct (one to one), the rest was impairment based therapy. In contrast, all of the visitors’ time was direct contact, usually in conversation. In both groups, the frequency and length of contact time varied. We already know that the transfer from impairment based therapy to functional communication can be limited and varies across individuals [5]. However, it is not possible to conclude from this trial that one to one impairment based therapy should be replaced. For that, a well defined impairment therapy protocol must be directly compared with a similarly well defined functional communication therapy, with an attention control.

Relevance:

30.00%

Publisher:

Abstract:

A new, healable, supramolecular nanocomposite material has been developed and evaluated. The material comprises a blend of three components: a pyrene-functionalized polyamide, a polydiimide and pyrene-functionalized gold nanoparticles (P-AuNPs). The polymeric components interact by forming well-defined π–π stacked complexes between π-electron rich pyrenyl residues and π-electron deficient polydiimide residues. Solution studies in the mixed solvent chloroform–hexafluoroisopropanol (6 : 1, v/v) show that mixing the three components (each of which is soluble in isolation) results in the precipitation of a supramolecular polymer nanocomposite network. The precipitate thus formed can be re-dissolved on heating, with the thermoreversible dissolution/precipitation procedure repeatable over at least 5 cycles. Robust, self-supporting composite films containing up to 15 wt% P-AuNPs could be cast from 2,2,2-trichloroethanol. Addition of as little as 1.25 wt% P-AuNPs resulted in significantly enhanced mechanical properties compared to the supramolecular blend without nanoparticles. The nanocomposites showed a linear increase in both tensile moduli and ultimate tensile strength with increasing P-AuNP content. All compositions up to 10 wt% P-AuNPs exhibited essentially quantitative healing efficiencies. Control experiments on an analogous nanocomposite material containing dodecylamine-functionalized AuNPs (5 wt%) exhibited a tensile modulus approximately half that of the corresponding nanocomposite that incorporated 5 wt% pyrene-functionalized AuNPs, clearly demonstrating the importance of the designed interactions between the gold filler and the supramolecular polymer matrix.

Relevance:

30.00%

Publisher:

Abstract:

Anti-spoofing is attracting growing interest in biometrics, considering the variety of fake materials and new means to attack biometric recognition systems. New unseen materials continuously challenge state-of-the-art spoofing detectors, suggesting the need for additional systematic approaches to anti-spoofing. By incorporating liveness scores into the biometric fusion process, recognition accuracy can be enhanced, but traditional sum-rule based fusion algorithms are known to be highly sensitive to single spoofed instances. This paper investigates 1-median filtering as a spoofing-resistant generalised alternative to the sum-rule, targeting the problem of partial multibiometric spoofing where m out of n biometric sources to be combined are attacked. Augmenting previous work, this paper investigates the dynamic detection and rejection of liveness-recognition pair outliers for spoofed samples in a true multi-modal configuration with its inherent challenge of normalisation. As a further contribution, a bootstrap aggregating (bagging) classifier for fingerprint spoof detection is presented. Experiments on the latest face video databases (Idiap Replay-Attack Database and CASIA Face Anti-Spoofing Database) and a fingerprint spoofing database (Fingerprint Liveness Detection Competition 2013) illustrate the efficiency of the proposed techniques.
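
The sensitivity of sum-rule fusion to a single spoofed source, and the robustness of a median-based alternative, can be illustrated with a small sketch. This is a simplified stand-in rather than the paper's method: it fuses scalar match scores only (for scalars, the 1-median is the ordinary median), omits liveness scores and normalisation, and the score values are hypothetical.

```python
from statistics import median

def sum_rule(scores):
    """Sum-rule fusion, normalised here to the mean so it stays in the score range."""
    return sum(scores) / len(scores)

def median_rule(scores):
    """Median fusion: for scalar scores the 1-median reduces to the ordinary median."""
    return median(scores)

# Hypothetical normalised match scores in [0, 1] from n = 5 biometric sources
# for an impostor attempt (low scores mean "no match").
impostor = [0.10, 0.15, 0.12, 0.08, 0.11]
# The same attempt with m = 1 source successfully spoofed.
spoofed = [0.10, 0.15, 0.12, 0.08, 0.99]

print("sum rule:   ", sum_rule(impostor), "->", sum_rule(spoofed))
print("median rule:", median_rule(impostor), "->", median_rule(spoofed))
```

With one of five sources spoofed, the mean-based fused score rises substantially while the median barely moves, which is the behaviour that makes median-type fusion attractive when only m out of n sources are attacked.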