255 results for Speech act


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Keyword Spotting is the task of detecting keywords of interest within continuous speech. The applications of this technology range from call centre dialogue systems to covert speech surveillance devices. Keyword spotting is particularly well suited to data mining tasks such as real-time keyword monitoring and unrestricted vocabulary audio document indexing. However, to date, many keyword spotting approaches have suffered from poor detection rates, high false alarm rates, or slow execution times, thus reducing their commercial viability. This work investigates the application of keyword spotting to data mining tasks. The thesis makes a number of major contributions to the field of keyword spotting. The first major contribution is the development of a novel keyword verification method named Cohort Word Verification. This method combines high level linguistic information with cohort-based verification techniques to obtain dramatic improvements in verification performance, in particular for the problematic short duration target word class. The second major contribution is the development of a novel audio document indexing technique named Dynamic Match Lattice Spotting. This technique augments lattice-based audio indexing principles with dynamic sequence matching techniques to provide robustness to erroneous lattice realisations. The resulting algorithm obtains significant improvement in detection rate over lattice-based audio document indexing while still maintaining extremely fast search speeds. The third major contribution is the study of multiple verifier fusion for the task of keyword verification. The reported experiments demonstrate that substantial improvements in verification performance can be obtained through the fusion of multiple keyword verifiers. The research focuses on combinations of speech background model based verifiers and cohort word verifiers.
The final major contribution is a comprehensive study of the effects of limited training data for keyword spotting. This study is performed with consideration as to how these effects impact the immediate development and deployment of speech technologies for non-English languages.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Automatic spoken Language Identification (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identification systems provide. A prominent application arises in call centers dealing with speakers speaking different languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable features for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Phonetic speech information is extracted using existing speech recognition technology. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by different speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
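
The abstract above mentions a linear data fusion technique for combining information extracted from speech. As a minimal sketch only, assuming each subsystem emits one score per candidate language and that the fusion is a weighted sum, the idea can be illustrated as follows. The function names, the score dictionaries, and the weights are hypothetical and are not taken from the thesis.

```python
def fuse_scores(system_scores, weights):
    """Linearly fuse per-language scores from several subsystems.

    system_scores: list of dicts mapping language -> score (hypothetical format).
    weights: one weight per subsystem (assumed to be tuned elsewhere).
    """
    languages = system_scores[0].keys()
    return {
        lang: sum(w * scores[lang] for w, scores in zip(weights, system_scores))
        for lang in languages
    }


def identify(system_scores, weights):
    """Return the language with the highest fused score."""
    fused = fuse_scores(system_scores, weights)
    return max(fused, key=fused.get)


# Illustrative scores for an acoustic and a phonetic subsystem (made-up values).
acoustic = {"en": 0.6, "de": 0.4}
phonetic = {"en": 0.3, "de": 0.7}
decision = identify([acoustic, phonetic], [0.5, 0.5])
```

With equal weights the phonetic evidence dominates in this toy case; shifting weight towards the acoustic subsystem changes the decision, which is why the weights matter.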

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we propose a new method for utilising phase information by complementing it with traditional magnitude-only spectral subtraction speech enhancement through Complex Spectrum Subtraction (CSS). The proposed approach has the following advantages over traditional magnitude-only spectral subtraction: (a) it introduces complementary information to the enhancement algorithm; (b) it reduces the total number of algorithmic parameters; and (c) it is designed for improving clean speech magnitude spectra and is therefore suitable for both automatic speech recognition (ASR) and speech perception applications. Oracle-based ASR experiments verify this approach, showing an average of 20% relative word accuracy improvements when accurate estimates of the phase spectrum are available. Based on sinusoidal analysis and assuming stationarity between observations (which is shown to be better approximated as the frame rate is increased), this paper also proposes a novel method for acquiring the phase information called Phase Estimation via Delay Projection (PEDEP). Further oracle ASR experiments validate the potential for the proposed PEDEP technique in ideal conditions. Realistic implementation of CSS with PEDEP shows performance comparable to state-of-the-art spectral subtraction techniques in a range of 15-20 dB signal-to-noise ratio environments. These results clearly demonstrate the potential for using phase spectra in spectral subtractive enhancement applications, and at the same time highlight the need for deriving more accurate phase estimates in a wider range of noise conditions.
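
To make the contrast in the abstract above concrete, the following sketch compares generic magnitude-only spectral subtraction, which reuses the noisy phase, with a subtraction in the complex domain under the idealised (oracle) assumption that the complex noise spectrum is known exactly. It is an illustration of why phase matters, not the paper's CSS or PEDEP implementation; the function names and the toy spectra are hypothetical.

```python
import cmath


def magnitude_subtract(noisy, noise_mag, floor=0.0):
    """Magnitude-only spectral subtraction for one frame.

    noisy: list of complex spectral bins.
    noise_mag: estimated noise magnitude per bin.
    The noisy phase is kept unchanged; magnitudes are floored at `floor`.
    """
    enhanced = []
    for y, n in zip(noisy, noise_mag):
        mag = max(abs(y) - n, floor)
        enhanced.append(cmath.rect(mag, cmath.phase(y)))
    return enhanced


def complex_subtract(noisy, noise):
    """Subtract a complex noise spectrum bin by bin (oracle assumption)."""
    return [y - n for y, n in zip(noisy, noise)]


# Toy two-bin frame: noisy = clean + noise (made-up values).
clean = [1 + 0j, 0 + 2j]
noise = [0 + 1j, 1 + 0j]
noisy = [c + n for c, n in zip(clean, noise)]

mag_only = magnitude_subtract(noisy, [abs(n) for n in noise])
oracle = complex_subtract(noisy, noise)
```

In this toy frame the complex subtraction recovers the clean bins exactly, while the magnitude-only result has both the wrong magnitude and the noisy phase, because magnitudes do not add linearly when signal and noise are out of phase.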

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we present a microphone array beamforming approach to blind speech separation. Unlike previous beamforming approaches, our system does not require a priori knowledge of the microphone placement and speaker location, making the system directly comparable to other blind source separation methods which require no prior knowledge of recording conditions. Microphone location is automatically estimated using an assumed noise field model, and speaker locations are estimated using cross-correlation-based methods. The system is evaluated on the data provided for the PASCAL Speech Separation Challenge 2 (SSC2), achieving a word error rate of 58% on the evaluation set.
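
The abstract above refers to cross-correlation-based location estimation. A common building block for such methods is estimating the time delay between two microphone signals from the peak of their cross-correlation. The sketch below shows that step only, in plain time-domain form and with integer sample lags; it is an assumption-laden illustration, not the paper's estimator, and the function name and test signals are hypothetical.

```python
def estimate_delay(ref, sig, max_lag):
    """Estimate the delay of `sig` relative to `ref` in samples.

    Searches integer lags in [-max_lag, max_lag] and returns the lag with the
    largest cross-correlation. A positive result means `sig` lags `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(sig):
                score += r * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# The second signal is the first one delayed by two samples (made-up data).
mic_a = [0, 1, 2, 1, 0, 0, 0, 0]
mic_b = [0, 0, 0, 1, 2, 1, 0, 0]
lag = estimate_delay(mic_a, mic_b, max_lag=3)
```

Given the microphone spacing and the speed of sound, such pairwise delays constrain the direction of a source; that geometric step is omitted here.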

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Voice recognition is one of the key enablers to reduce driver distraction as in-vehicle systems become more and more complex. With the integration of voice recognition in vehicles, safety and usability are improved as the driver’s eyes and hands are not required to operate system controls. Whilst speaker-independent voice recognition is well developed, performance in high noise environments (e.g. vehicles) is still limited. La Trobe University and Queensland University of Technology have developed a low-cost hardware-based speech enhancement system for automotive environments based on spectral subtraction and delay–sum beamforming techniques. The enhancement algorithms have been optimised using authentic Australian English collected under typical driving conditions. Performance tests conducted using speech data collected under a variety of vehicle noise conditions demonstrate a word recognition rate improvement in the order of 10% or more under the noisiest conditions. The system is currently developed to a proof-of-concept stage, and there is potential for even greater performance improvement.
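
As a rough illustration of the delay–sum beamforming mentioned in the abstract above, the sketch below aligns each microphone channel by a known integer sample delay and averages the aligned channels. It assumes the per-channel delays are already available and ignores fractional delays and the hardware aspects of the described system; the function name and the sample data are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Delay-and-sum beamformer with integer sample delays.

    channels: list of equal-length sample lists, one per microphone.
    delays: per-channel delay in samples (how far each channel lags the
            reference); each channel is advanced by its delay before averaging.
    Samples shifted in from outside the frame are treated as zero.
    """
    n = len(channels[0])
    out = [0.0] * n
    for channel, delay in zip(channels, delays):
        for i in range(n):
            j = i + delay
            if 0 <= j < n:
                out[i] += channel[j]
    return [value / len(channels) for value in out]


# Two microphones; the second receives the same pulse one sample later.
mic_0 = [0, 1, 2, 1, 0, 0]
mic_1 = [0, 0, 1, 2, 1, 0]
beamformed = delay_and_sum([mic_0, mic_1], [0, 1])
```

After alignment the speech component adds coherently across channels, while noise that is uncorrelated between microphones is attenuated by the averaging.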

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Emotions play a central role in mediation as they help to define the scope and direction of a conflict. When a party to mediation expresses (and hence entrusts) their emotions to those present in a mediation, a mediator must do more than simply listen - they must attend to these emotions. Mediator empathy is an essential skill for communicating to a party that their feelings have been heard and understood, but it can lead mediators into trouble. Whilst there might exist a theoretical divide between the notions of empathy and sympathy, the very best characteristics of mediators (caring and compassionate nature) may see empathy and sympathy merge - resulting in challenges to mediator neutrality. This article first outlines the semantic difference between empathy and sympathy and the role that intrapsychic conflict can play in the convergence of these behavioural phenomena. It then defines emotional intelligence in the context of a mediation, suggesting that only the most emotionally intelligent mediators are able to emotionally connect with the parties, but maintain an impression of impartiality – the quality of remaining ‘attached yet detached’ to the process. It is argued that these emotionally intelligent mediators have the common qualities of strong self-awareness and emotional self-regulation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This book is designed with undergraduate university students in mind, with the aim of teaching you the importance of being an effective communicator.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Interacting with technology within a vehicle environment using a voice interface can greatly reduce the effects of driver distraction. Most current approaches to this problem only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to circumvent this is to use the visual modality in addition. However, capturing, storing and distributing audio-visual data in a vehicle environment is very costly and difficult. One current dataset available for such research is the AVICAR [1] database. Unfortunately, this database is largely unusable due to a timing mismatch between the two streams; in addition, no protocol is available. We have overcome this problem by re-synchronising the streams on the phone-number portion of the dataset and establishing a protocol for further research. This paper presents the first audio-visual results on this dataset for speaker-independent speech recognition. We hope this will serve as a catalyst for future research in this area.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

At common law, a duty of care may be owed to a claimant who suffers nervous shock or pure mental harm due to witnessing, or hearing about, physical injury caused to another due to a defendant’s negligence. “Pure mental harm” is the ‘impairment of a person’s mental condition’ that is not suffered as a consequence of any other kind of personal injury to them. However, as many accidents have the potential to create a wide circle of mental suffering to bystanders, family members or others not physically injured themselves, it has traditionally been ‘thought impolitic that everybody so affected should be able to recover damages from the tortfeasor.’ ‘To allow such extended recovery would stretch liability too far.’ Nevertheless, whilst adopting a restrictive approach to liability, the common law courts have recognised that a defendant might owe a duty in relation to the pure mental harm suffered by one who foreseeably attends an accident scene to rescue another from a situation created by the defendant’s negligence.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Under a Services Agreement dated 16th April 2010 the Australian Capital Territory (ACT) engaged Knowledge Consulting Pty Ltd to conduct an independent review of operations at the Alexander Maconochie Centre (AMC) in the ACT. The Review was commissioned following a motion passed in the ACT Legislative Assembly as follows: “That this Assembly: (1) notes: (a) concerns regarding the operation of the AMC; (b) the unanimous findings of the Standing Committee on Justice and Community Safety report, Inquiry into the delay in the commencement of operations at the Alexander Maconochie Centre; and (c) the Government’s intention to have a review into the operation of the AMC after its first year of operation; and (2) calls on the Government to: (a) commission an independent reviewer to conduct the one year review into the AMC; (b) ensure that the review be open and transparent and public, and include input from community and non-government groups with an interest or involvement in the AMC, including on the terms of reference for the review; (c) ensure the review is completed in a timely manner and be tabled in the Legislative Assembly immediately upon completion; and (d) report upon the progress of the review in August 2010;”

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The Tourism, Racing and Fair Trading (Miscellaneous Provisions) Act 2002 (“the Act”), which was passed on 18 April 2002, contains a number of significant amendments relevant to the operation of the Property Agents and Motor Dealers Act 2000. The main changes relevant to property transactions are: (i) Changes to the process for appointment of a real estate agent and consolidation of the appointment forms; (ii) Additions to the disclosure obligation of agents and property developers; (iii) Simplification of the process for commencing the cooling off period; (iv) Alteration of the common law position concerning when the parties are bound by a contract; (v) Removal of the requirement for a seller’s signature on the warning statement to be witnessed; (vi) Retrospective amendment of s 170 of the Body Corporate and Community Management Act 1997; (vii) Inclusion of a new power to allow inspectors to enter the place of business of a licensee or a marketeer without consent and without a warrant; and (viii) Inclusion of a new power for inspectors to require documents to be produced by marketeers. The majority of the amendments are effective from the date of assent, 24 April 2002; however, some of the amendments do not commence until a date fixed by proclamation. No proclamation has been made at the time of writing (2 May 2002). Where the amendments have not commenced, this will be noted in the article. Before providing clients with advice, practitioners should carefully check proclamation details.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The Property Agents and Motor Dealers Act 2000 commenced on 1 July 2001. Significant changes have now been made to the Act by the Property Agents and Motor Dealers Amendment Act 2001 (“the amending Act”). The amending Act contains two distinct parts. First, ss 11-19 of the amending Act provide for increased disclosure obligations on real estate agents, property developers and lawyers together with an extension of the 5 business day cooling-off period imposed by the original Act to all residential property (other than contracts formed on a sale by auction). These provisions commenced on 29 October 2001. The remaining provisions of the amending Act provide for increased jurisdiction and powers to the Property Agents and Motor Dealers Tribunal (“the Tribunal”) enabling the Tribunal to deal with claims against marketeers. These provisions commenced on the date of assent, 21 September 2001.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Despite many arguments to the contrary, the three-act story structure, as propounded and refined by Hollywood, continues to dominate the blockbuster and independent film markets. Recent successes in post-modern cinema could indicate new directions and opportunities for low-budget national cinemas.