Digital Library

988 results for Speech Rate

Assessment of Speech Dialog Systems using Multi-Modal Cognitive Load Analysis and Driving Performance Metrics

Relevance:

20.00%

Publisher:

Abstract:

In this paper, cognitive load analysis via acoustic- and CAN-Bus-based driver performance metrics is employed to assess two different commercial speech dialog systems (SDS) during in-vehicle use. Several metrics are proposed to measure increases in stress, distraction and cognitive load and we compare these measures with statistical analysis of the speech recognition component of each SDS. It is found that care must be taken when designing an SDS as it may increase cognitive load which can be observed through increased speech response delay (SRD), changes in speech production due to negative emotion towards the SDS, and decreased driving performance on lateral control tasks. From this study, guidelines are presented for designing systems which are to be used in vehicular environments.

National consumer credit laws, financial exclusion and interest rate caps : The case for diversity within a centralised framework

Relevance:

20.00%

Publisher:

Abstract:

Australia is going through a major reform of consumer credit regulation, with the implementation of a proposal to transfer regulatory responsibility from the State and Territory Governments to the Commonwealth Government. While the broad policy approach is supported, the reform process has missed a significant opportunity to engage directly with issues of financial exclusion and with the potential role of regulation to reduce financial exclusion. The imposition of an interest rate cap can limit the impact of financial exclusion. However, the future of the existing interest rate caps is uncertain, given the diversity of approaches and the heated debate that surrounds this issue. In the absence of support for regulatory initiatives to increase the availability of low cost, small loans, permitting regulatory diversity on this issue of interest rate caps, within an otherwise centralised regulatory framework, can minimise the impact of financial exclusion on consumers.

Controlled Rate Thermal analysis of sepiolite

Relevance:

20.00%

Publisher:

Abstract:

Controlled rate thermal analysis (CRTA) technology offers better resolution and a more detailed interpretation of the decomposition processes of a clay mineral such as sepiolite via approaching equilibrium conditions of decomposition through the elimination of the slow transfer of heat to the sample as a controlling parameter on the process of decomposition. Constant-rate decomposition processes of non-isothermal nature reveal changes in the sepiolite as the sepiolite is converted to an anhydride. In the dynamic experiment two dehydration steps are observed over the ~20–170 and 170–350 °C temperature ranges. In the dynamic experiment three dehydroxylation steps are observed over the temperature ranges 201–337, 337–638 and 638–982 °C. The CRTA technology enables the separation of the thermal decomposition steps.

Unsupervised speaker adaptation for telephone call transcription

Relevance:

20.00%

Publisher:

Abstract:

The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.
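As a hedged illustration of the figures quoted in this abstract (not code from the paper), the sketch below computes word error rate with a standard word-level edit distance and works out the supervised gain implied by the reported numbers. The function name word_error_rate and the variable names are hypothetical, and the supervised gain is inferred here from 6.3 / 0.70; the abstract does not state it directly.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Figures quoted in the abstract.
unsupervised_gain = 6.3        # absolute gain, in percentage points
fraction_of_supervised = 0.70  # unsupervised gain as a share of the supervised gain

# Derived here, not reported in the abstract: about 9.0 percentage points.
implied_supervised_gain = unsupervised_gain / fraction_of_supervised
```

For example, word_error_rate("the cat sat on the mat", "the cat sat on mat") counts one deletion against six reference words.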

The Autistic Behavioural Indicators Instrument (ABII) : development and instrument utility in discriminating autistic disorder from speech and language impairment and typical development

Relevance:

20.00%

Publisher:

Abstract:

The Autistic Behavioural Indicators Instrument (ABII) is an 18-item instrument developed to identify children with Autistic Disorder (AD) based on the presence of unique autistic behavioural indicators. The ABII was administered to 20 children with AD, 20 children with speech and language impairment (SLI) and 20 typically developing (TD) children aged 2-6 years. Results indicated that the ABII discriminated children diagnosed with AD from those diagnosed with SLI and those who were TD, based on the presence of specific social attention, sensory, and behavioural symptoms. A combination of symptomology across these domains correctly classified 100% of children with and without AD. The paper concludes that the ABII shows considerable promise as an instrument for the early identification of AD.

Visual speech recognition across multiple views

Relevance:

20.00%

Publisher:

Impact of programming and application-specific knowledge on maintenance effort: A hazard rate model

Relevance:

20.00%

Publisher:

Speech endpoint detection using gradient based edge detection techniques

Relevance:

20.00%

Publisher:

Cardiac state diagnosis using higher order spectra of heart rate variability

Relevance:

20.00%

Publisher:

Abstract:

Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. Heart rate variability analysis is an important tool to observe the heart's ability to respond to normal regulatory impulses that affect its rhythm. A computer-based intelligent system for analysis of cardiac states is very useful in diagnostics and disease management. Like many bio-signals, HRV signals are nonlinear in nature. Higher order spectral analysis (HOS) is known to be a good tool for the analysis of nonlinear systems and provides good noise immunity. In this work, we studied the HOS of the HRV signals of normal heartbeat and seven classes of arrhythmia. We present some general characteristics for each of these classes of HRV signals in the bispectrum and bicoherence plots. We also extracted features from the HOS and performed an analysis of variance (ANOVA) test. The results are very promising for cardiac arrhythmia classification with a number of features yielding a p-value < 0.02 in the ANOVA test.

FPGA implementation of dual-microphone delay-and-sum beamforming for in-car speech enhancement and recognition

Relevance:

20.00%

Publisher:

Abstract:

In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
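The delay-and-sum idea named in this abstract can be sketched in a few lines of floating-point Python. This is only an illustration under simplifying assumptions, not the paper's FPGA design: the inter-microphone delay is assumed to be a known, non-negative whole number of samples, and the names delay_and_sum, mic_a, mic_b and delay_b are hypothetical.

```python
import math


def delay_and_sum(mic_a, mic_b, delay_b):
    """Two-microphone delay-and-sum beamformer with an integer sample delay.

    Assumes the target speech reaches microphone B `delay_b` samples later
    than microphone A. Channel B is advanced by that amount so the speech
    lines up in both channels, then the two channels are averaged.
    """
    if delay_b < 0:
        raise ValueError("delay_b must be a non-negative number of samples")
    usable = max(0, min(len(mic_a), len(mic_b) - delay_b))
    return [0.5 * (mic_a[i] + mic_b[i + delay_b]) for i in range(usable)]


# Toy input: the same sinusoid at both microphones, arriving 3 samples late at B.
speech = [math.sin(2 * math.pi * 0.05 * i) for i in range(40)]
delay = 3
mic_a = speech
mic_b = [0.0] * delay + speech

aligned = delay_and_sum(mic_a, mic_b, delay)
```

With the delay compensated, the speech component adds coherently, so in this noise-free toy case the output reproduces the original sinusoid. Components that are not aligned between the two channels are only averaged, which is where the enhancement comes from.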

Impact of cognitive load and frustration on drivers’ speech [Abstract]

Relevance:

20.00%

Publisher:

Abstract:

Secondary tasks such as cell phone calls or interaction with automated speech dialog systems (SDSs) increase the driver’s cognitive load as well as the probability of driving errors. This study analyzes speech production variations due to cognitive load and emotional state of drivers in real driving conditions. Speech samples were acquired from 24 female and 17 male subjects (approximately 8.5 h of data) while talking to a co-driver and communicating with two automated call centers, with emotional states (neutral, negative) and the number of necessary SDS query repetitions also labeled. A consistent shift in a number of speech production parameters (pitch, first formant center frequency, spectral center of gravity, spectral energy spread, and duration of voiced segments) was observed when comparing SDS interaction against co-driver interaction; further increases were observed when considering negative emotion segments and the number of requested SDS query repetitions. A mel frequency cepstral coefficient based Gaussian mixture classifier trained on 10 male and 10 female sessions provided 91% accuracy in the open test set task of distinguishing co-driver interactions from SDS interactions, suggesting, together with the acoustic analysis, that it is possible to monitor the level of driver distraction directly from their speech.

Contrasting scenarios : embracing speech recognition

Relevance:

20.00%

Publisher:

Abstract:

The purpose of this chapter is to describe the use of caricatured contrasting scenarios (Bødker, 2000) and how they can be used to consider potential designs for disruptive technologies. The disruptive technology in this case is Automatic Speech Recognition (ASR) software in workplace settings. The particular workplace is the Magistrates Court of the Australian Capital Territory.

Caricatured contrasting scenarios are ideally suited to exploring how ASR might be implemented in a particular setting because they allow potential implementations to be “sketched” quickly and with little effort. This sketching of potential interactions and the emphasis of both positive and negative outcomes allows the benefits and pitfalls of design decisions to become apparent.

A brief description of the Court is given, describing the reasons for choosing the Court for this case study. The work of the Court is framed as taking place in two modes: front of house, where the courtroom itself is, and backstage, where documents are processed and the business of the court is recorded and encoded into various systems.

Caricatured contrasting scenarios describing the introduction of ASR to the front of house are presented and then analysed. These scenarios show that the introduction of ASR to the court would be highly problematic.

The final section describes how ASR could be re-imagined in order to make it useful for the court. A final scenario is presented that describes how this re-imagined ASR could be integrated into both the front of house and backstage of the court in a way that could strengthen both processes.

The rate-limiting mechanism for the heterogeneous burning of cylindrical iron rods

Relevance:

20.00%

Publisher:

Abstract:

This paper presents the findings of an investigation into the rate-limiting mechanism for the heterogeneous burning in oxygen under normal gravity and microgravity of cylindrical iron rods. The original objective of the work was to determine why the observed melting rate for burning 3.2-mm diameter iron rods is significantly higher in microgravity than in normal gravity. This work, however, also provided fundamental insight into the rate-limiting mechanism for heterogeneous burning. The paper includes a summary of normal-gravity and microgravity experimental results, heat transfer analysis and post-test microanalysis of quenched samples. These results are then used to show that heat transfer across the solid/liquid interface is the rate-limiting mechanism for melting and burning, limited by the interfacial surface area between the molten drop and solid rod. In normal gravity, the work improves the understanding of trends reported during standard flammability testing for metallic materials, such as variations in melting rates between test specimens with the same cross-sectional area but different cross-sectional shape. The work also provides insight into the effects of configuration and orientation, leading to an improved application of standard test results in the design of oxygen system components. For microgravity applications, the work enables the development of improved methods for lower cost metallic material flammability testing programs. In these ways, the work provides fundamental insight into the heterogeneous burning process and contributes to improved fire safety for oxygen systems in applications involving both normal-gravity and microgravity environments.

A proposed qualitative framework for heterogeneous burning of metallic materials : the 'melting rate triangle'

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a proposed qualitative framework to discuss the heterogeneous burning of metallic materials, through parameters and factors that influence the melting rate of the solid metallic fuel (either in a standard test or in service). During burning, the melting rate is related to the burning rate and is therefore an important parameter for describing and understanding the burning process, especially since the melting rate is commonly recorded during standard flammability testing for metallic materials and is incorporated into many relative flammability ranking schemes. However, whilst the factors that influence melting rate (such as oxygen pressure or specimen diameter) have been well characterized, there is a need for an improved understanding of how these parameters interact as part of the overall melting and burning of the system. Proposed here is the ‘Melting Rate Triangle’, which aims to provide this focus through a conceptual framework for understanding how the melting rate (of solid fuel) is determined and regulated during heterogeneous burning. In the paper, the proposed conceptual model is shown to be both (a) consistent with known trends and previously observed results, and (b) capable of being expanded to incorporate new data. Also shown are examples of how the Melting Rate Triangle can improve the interpretation of flammability test results. Slusser and Miller previously published an ‘Extended Fire Triangle’ as a useful conceptual model of ignition and the factors affecting ignition, providing industry with a framework for discussion. In this paper it is shown that a ‘Melting Rate Triangle’ provides a similar qualitative framework for burning, leading to an improved understanding of the factors affecting fire propagation and extinguishment.

Robust speech recognition using speech enhancement

Relevance:

20.00%

Publisher:

Abstract:

Automatic Speech Recognition (ASR) has matured into a technology which is becoming more common in our everyday lives, and is emerging as a necessity to minimise driver distraction when operating in-car systems such as navigation and infotainment. In “noise-free” environments, word recognition performance of these systems has been shown to approach 100%; however, this performance degrades rapidly as the level of background noise is increased. Speech enhancement is a popular method for making ASR systems more robust. Single-channel spectral subtraction was originally designed to improve human speech intelligibility and many attempts have been made to optimise this algorithm in terms of signal-based metrics such as maximised Signal-to-Noise Ratio (SNR) or minimised speech distortion. Such metrics are used to assess enhancement performance for intelligibility, not speech recognition, therefore making them sub-optimal for ASR applications. This research investigates two methods for closely coupling subtractive-type enhancement algorithms with ASR: (a) a computationally-efficient Mel-filterbank noise subtraction technique based on likelihood-maximisation (LIMA), and (b) introducing phase spectrum information to enable spectral subtraction in the complex frequency domain. Likelihood-maximisation uses gradient-descent to optimise parameters of the enhancement algorithm to best fit the acoustic speech model given a word sequence known a priori. Whilst this technique is shown to improve the ASR word accuracy performance, it is also identified to be particularly sensitive to non-noise mismatches between the training and testing data. Phase information has long been ignored in spectral subtraction as it is deemed to have little effect on human intelligibility. In this work it is shown that phase information is important in obtaining highly accurate estimates of clean speech magnitudes which are typically used in ASR feature extraction. Phase Estimation via Delay Projection is proposed based on the stationarity of sinusoidal signals, and demonstrates the potential to produce improvements in ASR word accuracy in a wide range of SNR. Throughout the dissertation, consideration is given to practical implementation in vehicular environments, which resulted in two novel contributions: a LIMA framework which takes advantage of the grounding procedure common to speech dialogue systems, and a resource-saving formulation of frequency-domain spectral subtraction for realisation in field-programmable gate array hardware. The techniques proposed in this dissertation were evaluated using the Australian English In-Car Speech Corpus which was collected as part of this work. This database is the first of its kind within Australia and captures real in-car speech of 50 native Australian speakers in seven driving conditions common to Australian environments.
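For orientation only, the sketch below shows the textbook form of single-channel magnitude spectral subtraction that this abstract takes as its starting point. It is not the LIMA or phase-aware method proposed in the dissertation. The function name spectral_subtract and the parameters alpha (over-subtraction factor) and floor (spectral floor) are hypothetical, and the inputs are assumed to be per-bin magnitude spectra of one frame, already computed elsewhere.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Basic magnitude spectral subtraction for one frame.

    For each frequency bin, subtract the scaled noise magnitude estimate
    from the noisy magnitude, but never let the result fall below a small
    fraction (`floor`) of the noisy magnitude, so no bin goes negative.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("noisy_mag and noise_mag must have the same number of bins")
    return [max(y - alpha * n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


# Toy three-bin frame: the last bin is dominated by noise and hits the floor.
enhanced = spectral_subtract([1.0, 0.5, 0.2], [0.2, 0.2, 0.3])
```

The enhanced magnitudes would normally be recombined with the noisy phase before resynthesis, which is the step the abstract's phase-estimation work revisits.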
