994 results for Religious Speech
Improved speech recognition using adaptive audio-visual fusion via a stochastic secondary classifier
Abstract:
Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating non-critical in-car systems. Likelihood-maximising (LIMA) frameworks optimise speech enhancement algorithms based on recognised state sequences rather than traditional signal-level criteria such as maximising signal-to-noise ratio. Previously presented LIMA frameworks require calibration utterances to generate optimised enhancement parameters, which are then used for all subsequent utterances. Sub-optimal recognition performance occurs in noise conditions that differ significantly from those present during the calibration session, a serious problem in rapidly changing noise environments. We propose a dialog-based design which allows regular optimisation iterations in order to track the changing noise conditions. Experiments using Mel-filterbank spectral subtraction are performed to determine the optimisation requirements for vehicular environments and show that minimal optimisation assists real-time operation with improved speech recognition accuracy. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session.
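As a rough illustration of the enhancement step named in the abstract above, the sketch below applies spectral subtraction to one frame of Mel-filterbank energies. The function name and the parameters `alpha` (over-subtraction factor) and `beta` (spectral floor) are assumptions introduced here. They stand in for the kind of enhancement parameters a LIMA framework might tune; the abstract does not specify them.

```python
def spectral_subtract(noisy_energies, noise_estimate, alpha=1.0, beta=0.01):
    """Subtract a scaled noise estimate from each filterbank channel.

    noisy_energies : per-channel energies of one noisy frame
    noise_estimate : per-channel noise energies (e.g. from a non-speech segment)
    alpha          : hypothetical over-subtraction factor
    beta           : hypothetical spectral floor, as a fraction of the noisy energy
    """
    if len(noisy_energies) != len(noise_estimate):
        raise ValueError("frame and noise estimate must have the same length")
    enhanced = []
    for y, n in zip(noisy_energies, noise_estimate):
        cleaned = y - alpha * n   # remove the estimated noise energy
        floor = beta * y          # never fall below a small fraction of the input
        enhanced.append(max(cleaned, floor))
    return enhanced


# Toy three-channel frame; the values are illustrative only.
frame = [10.0, 4.0, 1.0]
noise = [2.0, 2.0, 2.0]
enhanced_frame = spectral_subtract(frame, noise, alpha=1.5, beta=0.1)
```

In a dialog-based design of the kind proposed, values such as `alpha` and `beta` would be re-optimised at intervals rather than fixed after a single calibration session.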
Abstract:
In this chapter, the policy speech that John Howard, then Australian Prime Minister, delivered to The Sydney Institute, a conservative think tank, on October 11, 2007 is analysed within the frame of discourse analysis to make visible how the speech works in old ways to dress up neoliberal policy as new and reformist. Taking centre stage, Howard pointed to concrete steps undertaken to achieve what he called a “new reconciliation.” This cynical manoeuvre, which put reconciliation back onto the election agenda (after it was earlier derided for its divisive and muddle-headed symbolism), constituted a “neoliberal quickstep” (Reiger, 2006) or quick fix of sorts. The speech was also used as a place to reintroduce the Northern Territory Intervention, which at the time was purported to be a response to child abuse and Indigenous community dysfunction.
Abstract:
The progress of a nationally representative sample of 3632 children was followed from early childhood through to primary school, using data from the Longitudinal Study of Australian Children (LSAC). The aim was to examine the predictive effects of different aspects of communicative ability, and of early vs. sustained identification of speech and language impairment, on children's achievement and adjustment at school. Four indicators identified speech and language impairment: parent-rated expressive language concern; parent-rated receptive language concern; use of speech-language pathology services; and below-average scores on the adapted Peabody Picture Vocabulary Test-III. School outcomes were assessed by teachers' ratings of language/literacy ability, numeracy/mathematical thinking and approaches to learning. Comparison of group differences, using ANOVA, provided clear evidence that children who were identified as having speech and language impairment in their early childhood years did not perform as well at school, two years later, as their non-impaired peers on all three outcomes: Language and Literacy, Mathematical Thinking, and Approaches to Learning. The effects of early speech and language status on literacy, numeracy, and approaches to learning outcomes were similar in magnitude to the effect of family socio-economic factors, after controlling for child characteristics. Additionally, early identification of speech and language impairment (at age 4-5) was found to be a better predictor of school outcomes than sustained identification (at ages 4-5 and 6-7 years). Parent reports of speech and language impairment in early childhood are useful in foreshadowing later difficulties with school and in providing early intervention and targeted support from speech-language pathologists and specialist teachers.
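For readers unfamiliar with the group comparison mentioned in the abstract above, the following is a minimal sketch of a one-way ANOVA F statistic. The function name and the toy scores are assumptions made for illustration. They are not LSAC data and do not reproduce the study's analysis, which also controlled for child characteristics.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more scores than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of scores around their own group mean
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical teacher ratings for three made-up groups of children.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_statistic = one_way_anova_f(toy_groups)
```

A larger F value indicates that the differences between group means are large relative to the spread within groups. Judging significance additionally requires the F distribution with (k - 1, N - k) degrees of freedom.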
Abstract:
In this paper, cognitive load analysis via acoustic- and CAN-Bus-based driver performance metrics is employed to assess two different commercial speech dialog systems (SDS) during in-vehicle use. Several metrics are proposed to measure increases in stress, distraction and cognitive load and we compare these measures with statistical analysis of the speech recognition component of each SDS. It is found that care must be taken when designing an SDS as it may increase cognitive load which can be observed through increased speech response delay (SRD), changes in speech production due to negative emotion towards the SDS, and decreased driving performance on lateral control tasks. From this study, guidelines are presented for designing systems which are to be used in vehicular environments.
Abstract:
The Autistic Behavioural Indicators Instrument (ABII) is an 18-item instrument developed to identify children with Autistic Disorder (AD) based on the presence of unique autistic behavioural indicators. The ABII was administered to 20 children with AD, 20 children with speech and language impairment (SLI) and 20 typically developing (TD) children aged 2-6 years. Results indicated that the ABII discriminated children diagnosed with AD from those diagnosed with SLI and those who were TD, based on the presence of specific social attention, sensory, and behavioural symptoms. A combination of symptomology across these domains correctly classified 100% of children with and without AD. The paper concludes that the ABII shows considerable promise as an instrument for the early identification of AD.
Abstract:
In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
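The core operation of dual-microphone delay-and-sum beamforming can be sketched in a few lines. The version below is a floating-point illustration only, assuming an integer steering delay in samples. The function name and the test signal are hypothetical, and the sketch says nothing about how the FPGA design in the abstract handles fixed-point arithmetic or fractional delays.

```python
def delay_and_sum(mic_a, mic_b, delay_samples):
    """Delay mic_b by delay_samples, then average it with mic_a.

    Assumes the target speech reaches mic_b first and mic_a
    delay_samples later, so delaying mic_b re-aligns the two channels.
    """
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    length = min(len(mic_a), len(mic_b))
    output = []
    for i in range(length):
        j = i - delay_samples
        delayed_b = mic_b[j] if j >= 0 else 0.0  # zero-pad before the signal starts
        output.append(0.5 * (mic_a[i] + delayed_b))
    return output


# Toy example: the same pulse arrives two samples later at mic_a.
mic_b = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
mic_a = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
aligned = delay_and_sum(mic_a, mic_b, delay_samples=2)
```

With the correct delay the speech components add coherently, while noise arriving from other directions is misaligned between the channels and is partly averaged out.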
Abstract:
Secondary tasks such as cell phone calls or interaction with automated speech dialog systems (SDSs) increase the driver’s cognitive load as well as the probability of driving errors. This study analyzes speech production variations due to cognitive load and emotional state of drivers in real driving conditions. Speech samples were acquired from 24 female and 17 male subjects (approximately 8.5 h of data) while talking to a co-driver and communicating with two automated call centers, with emotional states (neutral, negative) and the number of necessary SDS query repetitions also labeled. A consistent shift in a number of speech production parameters (pitch, first formant center frequency, spectral center of gravity, spectral energy spread, and duration of voiced segments) was observed when comparing SDS interaction against co-driver interaction; further increases were observed when considering negative emotion segments and the number of requested SDS query repetitions. A mel frequency cepstral coefficient based Gaussian mixture classifier trained on 10 male and 10 female sessions provided 91% accuracy in the open test set task of distinguishing co-driver interactions from SDS interactions, suggesting, together with the acoustic analysis, that it is possible to monitor the level of driver distraction directly from their speech.
Abstract:
Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory only condition and two visual conditions: normal vision and simulated cataracts. The light scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentences under the auditory only condition. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias. Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated cataracts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.
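To make the -16 dB figure in the abstract above concrete, the sketch below scales a noise signal so that the speech-to-noise ratio of a digital mixture hits a chosen target. It is an illustration under stated assumptions: the study used a live speaker and prerecorded babble, so its levels were presumably set acoustically, and the function names and sample values here are hypothetical.

```python
import math


def mean_power(signal):
    """Average squared amplitude of a sampled signal."""
    return sum(x * x for x in signal) / len(signal)


def snr_db(speech, noise):
    """Speech-to-noise ratio in decibels: 10 * log10(P_speech / P_noise)."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return a scaled copy of noise so that snr_db(speech, result) == target_snr_db."""
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    target_ratio = 10 ** (target_snr_db / 10)  # desired P_speech / P_noise
    gain = math.sqrt(p_speech / (p_noise * target_ratio))
    return [gain * x for x in noise]


# Toy signals; real use would involve full recordings of equal length.
speech = [0.5, -0.5, 0.5, -0.5]
babble = [0.1, -0.2, 0.3, -0.1]
scaled_babble = scale_noise_to_snr(speech, babble, target_snr_db=-16)
```

At -16 dB the noise power is about 40 times the speech power (10^1.6 ≈ 39.8), which is why visual cues have so much room to help at this level.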