989 results for Spoken dialogue systems


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Spoken term detection (STD) popularly involves performing word or sub-word level speech recognition and indexing the result. This work challenges the assumption that improved speech recognition accuracy implies better indexing for STD. Using an index derived from phone lattices, this paper examines the effect of language model selection on the relationship between phone recognition accuracy and STD accuracy. Results suggest that language models usually improve phone recognition accuracy but their inclusion does not always translate to improved STD accuracy. The findings suggest that using phone recognition accuracy to measure the quality of an STD index can be problematic, and highlight the need for an alternative that is more closely aligned with the goals of the specific detection task.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

While spoken term detection (STD) systems based on word indices provide good accuracy, there are several practical applications where it is infeasible or too costly to employ an LVCSR engine. An STD system is presented, which is designed to incorporate a fast phonetic decoding front-end and be robust to decoding errors whilst still allowing for rapid search speeds. This goal is achieved through mono-phone open-loop decoding coupled with fast hierarchical phone lattice search. Results demonstrate that an STD system that is designed with the constraint of a fast and simple phonetic decoding front-end requires a compromise to be made between search speed and search accuracy.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection (STD). The FOM is a popular measure of STD accuracy, making it an ideal candidate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phone classifier to produce enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a non-linear gradient descent algorithm. Substantial FOM improvements of 11% relative are achieved on held-out evaluation data, demonstrating the generalisability of the approach.
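
As a rough illustration of the linear model described in the abstract above, the sketch below applies an affine transform to a single frame of phone log-posteriors. The function name `enhance_frame`, the weight matrix `W`, the bias `b`, and the toy values are assumptions made for illustration; the paper's actual parameterisation and its FOM-based training procedure are not reproduced here.

```python
import math

def enhance_frame(log_post, W, b):
    """Apply an affine transform y = W x + b to one frame of phone log-posteriors."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, log_post)) + b_i
        for row, b_i in zip(W, b)
    ]

def identity(n):
    """Return an n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Toy frame: log-posteriors for three hypothetical phones.
frame = [math.log(0.7), math.log(0.2), math.log(0.1)]

# Assumed starting point: the identity transform with zero bias,
# which leaves the features unchanged before any training.
W = identity(3)
b = [0.0, 0.0, 0.0]
enhanced = enhance_frame(frame, W, b)
```

Training would then adjust `W` and `b` so that the transformed features score better on the chosen objective; that step depends on details the abstract does not give.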

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Automatic spoken Language Identification (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identification systems provide. A prominent application arises in call centers dealing with speakers speaking different languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable features for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Phonetic speech information is extracted using existing speech recognition technology. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by different speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
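
The linear data fusion mentioned in the abstract above can be pictured as a weighted sum of per-language scores from separate subsystems. The sketch below is only an illustration under assumptions: the function name `fuse_scores`, the language labels, the score values, and the weights are all hypothetical, and the abstract does not say how the fusion weights were chosen.

```python
def fuse_scores(system_scores, weights):
    """Combine per-language scores from several systems by a weighted sum."""
    languages = system_scores[0].keys()
    return {
        lang: sum(w * scores[lang] for w, scores in zip(weights, system_scores))
        for lang in languages
    }

# Hypothetical log-likelihood-style scores from two subsystems.
acoustic = {"english": -1.2, "mandarin": -0.4, "spanish": -2.0}
phonetic = {"english": -0.9, "mandarin": -1.1, "spanish": -1.8}

# Assumed fusion weights; in practice these would be tuned on development data.
fused = fuse_scores([acoustic, phonetic], weights=[0.6, 0.4])

# The hypothesised language is the one with the highest fused score.
best = max(fused, key=fused.get)
```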

Relevance:

30.00% 30.00%

Publisher:

Abstract:

For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes, that is, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing. 
Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.
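
Approximate phone sequence matching with a phone error cost model, as described in the abstract above, can be illustrated with a cost-weighted edit distance. This is a minimal sketch under assumptions and not the DMLS implementation: the function name `match_cost`, the cost table `SUB_COST`, the phone symbols, and the default insertion and deletion costs are hypothetical.

```python
def match_cost(target, observed, sub_cost, ins_cost=1.0, del_cost=1.0):
    """Minimum cost of aligning an observed phone sequence to a target sequence."""
    rows, cols = len(target) + 1, len(observed) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + del_cost   # target phone missing from observation
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + ins_cost   # extra phone in observation
    for i in range(1, rows):
        for j in range(1, cols):
            t, o = target[i - 1], observed[j - 1]
            # Identical phones cost nothing; unknown pairs fall back to cost 1.0.
            sub = 0.0 if t == o else sub_cost.get((t, o), 1.0)
            d[i][j] = min(
                d[i - 1][j - 1] + sub,
                d[i - 1][j] + del_cost,
                d[i][j - 1] + ins_cost,
            )
    return d[-1][-1]

# Hypothetical costs: easily confused phone pairs are cheaper to substitute.
SUB_COST = {("s", "z"): 0.3, ("z", "s"): 0.3}

target = ["ch", "iy", "z"]  # illustrative pronunciation of a search term
exact = match_cost(target, ["ch", "iy", "z"], SUB_COST)
near = match_cost(target, ["ch", "iy", "s"], SUB_COST)
far = match_cost(target, ["ch", "iy", "k"], SUB_COST)
```

A search would report a detection when the cost of the best-matching indexed sequence falls below a threshold, so a likely recognition error such as `z` decoded as `s` is penalised less than an unlikely one.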

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This work proposes to improve spoken term detection (STD) accuracy by optimising the Figure of Merit (FOM). In this article, the index takes the form of a phonetic posterior-feature matrix. Accuracy is improved by formulating STD as a discriminative training problem and directly optimising the FOM, through its use as an objective function to train a transformation of the index. The outcome of indexing is then a matrix of enhanced posterior-features that are directly tailored for the STD task. The technique is shown to improve the FOM by up to 13% on held-out data. Additional analysis explores the effect of the technique on phone recognition accuracy, examines the actual values of the learned transform, and demonstrates that using an extended training data set results in further improvement in the FOM.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The final shape of the "Internet of Things" that ubiquitous computing promises relies on a cybernetic system of inputs (in the form of sensory information), computation or decision making (based on the prefiguration of rules, contexts, and user-generated or defined metadata), and outputs (associated action from ubiquitous computing devices). My interest in this paper lies in the computational intelligences that suture these positions together, and how positioning these intelligences as autonomous agents extends the dialogue between human-users and ubiquitous computing technology. Drawing specifically on the scenarios surrounding the employment of ubiquitous computing within aged care, I argue that agency is something that cannot be traded without serious consideration of the associated ethics.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The purpose of this paper is to demonstrate the efficacy of collaborative evidence based information practice (EBIP) as an organizational effectiveness model. Shared leadership, appreciative inquiry and knowledge creation theoretical frameworks provide the foundation for change toward the implementation of a collaborative EBIP workplace model. Collaborative EBIP reiterates the importance of gathering the best available evidence, but it differs by shifting decision-making authority from "library or employer centric" to "user or employee centric". University of Colorado Denver Auraria Library Technical Services department created a collaborative EBIP environment by flattening workplace hierarchies, distributing problem solving and encouraging reflective dialogue. By doing so, participants are empowered to identify problems, create solutions, and become valued and respected leaders and followers. In an environment where library budgets are in jeopardy, recruitment opportunities are limited and the workplace is in constant flux, the Auraria Library case study offers an approach that maximizes the capability of the current workforce and promotes agile responsiveness to industry and organizational challenges. Collaborative EBIP is an organizational model demonstrating a process focusing first on the individual and moving to the collective to develop a responsive and high performing business unit, and in turn, organization.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Informed Systems Approach offers models for advancing workplace learning within collaboratively designed systems that promote using information to learn through collegial exchange and reflective dialogue. This systemic approach integrates theoretical antecedents and process models, including the learning theories of Peter Checkland (Soft Systems Methodology), which advance systems design and informed action, and Christine Bruce (informed learning), which generate information experiences and professional practices. Ikujiro Nonaka’s systems ideas (SECI model) and Mary Crossan’s learning framework (4i framework) further animate workplace knowledge creation through learning relationships engaging individuals with ideas.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

It has been suggested that semantic information processing is modularized according to the input form (e.g., visual, verbal, non-verbal sound). A great deal of research has concentrated on detecting a separate verbal module. Also, it has traditionally been assumed in linguistics that the meaning of a single clause is computed before integration into a wider context. Recent research has called these views into question. The present study explored whether it is reasonable to assume separate verbal and nonverbal semantic systems in the light of the evidence from event-related potentials (ERPs). The study also provided information on whether the context influences processing of a single clause before the local meaning is computed. The focus was on an ERP called N400. Its amplitude is assumed to reflect the effort required to integrate an item into the preceding context. For instance, if a word is anomalous in its context, it will elicit a larger N400. N400 has been observed in experiments using both verbal and nonverbal stimuli. Contents of a single sentence were not hypothesized to influence the N400 amplitude. Only the combined contents of the sentence and the picture were hypothesized to influence the N400. The subjects (n = 17) viewed pictures on a computer screen while hearing sentences through headphones. Their task was to judge the congruency of the picture and the sentence. There were four conditions: 1) the picture and the sentence were congruent and sensible, 2) the sentence and the picture were congruent, but the sentence ended anomalously, 3) the picture and the sentence were incongruent but sensible, 4) the picture and the sentence were incongruent and anomalous. Stimuli from the four conditions were presented in a semi-randomized sequence. The subjects' electroencephalography was simultaneously recorded. ERPs were computed for the four conditions. The amplitude of the N400 effect was largest in the incongruent sentence-picture pairs. 
The anomalously ending sentences did not elicit a larger N400 than the sensible sentences. The results suggest that there is no separate verbal semantic system, and that the meaning of a single clause is not processed independent of the context.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The modern food system and sustainable development form a conceptual combination that suggests sustainability deficits in the ways we deal with food consumption and production - in terms of economic relations, environmental impacts and nutritional status of the western population. This study explores actors’ orientations towards sustainability by taking into account actors’ embedded positions within structures of the food system, actors’ economic relations and views about sustainability as well as their possibilities for progressive activities. The study looks particularly at social dynamics for sustainability within primary production and public consumption. If actors within these two worlds were to express converging orientations for sustainability, the system dynamics of the market would enable more sustainable growth in terms of production dictated by consumption. The study is based on a constructivist research approach with qualitative text analyses. The data consisted of three text corpora, the ‘local food corpus’, the ‘catering corpus’ and the ‘mixed corpus’. The local food actors were interviewed about their economic exchange relations. The caterers’ interviews dealt with their professional identity for sustainability. Finally, the mixed corpus assembled a dialogue as a participatory research approach, which was applied in order to enable researcher and caterer learning about the use of organic milk in public catering. The data were analysed for theoretically conceptualised relations, expressing behavioural patterns in actors’ everyday work as interpreted by the researcher. The findings were corroborated by the internal and external communities of food system actors. The interpretations have some validity, although they only present abstractions of everyday life and its rich, even opaque, fabric of meanings and aims. 
The key findings included primary producers’ social skilfulness, which enabled networking with other actors in very different paths of life, learning in order to promote one’s trade, and trusting reflectively in partners in order to extend business. These activities expanded the supply chain in a spiral fashion by horizontal and vertical forward integration, until large retailers were met for negotiations on a more equal or ‘other regarding’ basis. This kind of chain level coordination, typically building around the core of social and partnership relations, was coined as a socially overlaid network. It supported market access of local farmers, rooted in their farms, who were able to draw on local capital and labour in promotion of competitive business; the growth was endogenous. These kinds of chains – one conventional and one organic – were different from the strategic chain, which was more profit based and while highly competitive, presented exogenous growth as it depended on imported capital and local employees. However, the strategic chain offered learning opportunities and support for the local economy. The caterers exhibited more or less committed professional identity for sustainability within their reach. The facilitating and balanced approaches for professional identities dealt successfully with local and organic food in addition to domestic food, and also imported food. The co-operation with supply chains created innovative solutions and savings for the business parties to be shared. The rule-abiding approach for sustainability only made choices among organic supply chains without extending into co-operation with actors. 
There were also more complicated and troubled identities as juggling, critical and delimited approaches for sustainability, with less productive efforts due to restrictions such as absence of organisational sustainability strategy, weak presence of local and organic suppliers, limited understanding about sustainability and no organisational resources to develop changes towards a sustainable food system. Learning in the workplace about food system reality in terms of supply chain co-operation may prove to be a change engine that leads to advanced network operations and a more sustainable food system. The convergence between primary producers and caterers existed to an extent allowing suggestion that increased clarity about sustainable consumption and production by actors could be approached using advanced tools. The study looks for introduction of more profound environmental and socio-economic knowledge through participatory research with supply chain actors in order to promote more sustainable food systems.

Summary of original publications and the authors’ contribution

I Mikkola, M. & Seppänen, L. 2006. Farmers’ new participation in food chains: making horizontal and vertical progress by networking. In: Langeveld, H. & Röling N. (Eds.). Changing European farming systems for a better future. New visions for rural areas. Wageningen, The Netherlands. Wageningen Academic Publishers: 267–271.
II Mikkola, M. 2008. Coordinative structures and development of food supply chains. British Food Journal 110 (2): 189–205.
III Mikkola, M. 2009. Shaping professional identity for sustainability. Evidence in Finnish public catering. Appetite 53 (1): 56–65.
IV Mikkola, M. 2009. Catering for sustainability: building a dialogue on organic milk. Agronomy Research 7 (Special issue 2): 668–676.

Minna Mikkola has been responsible for developing the generic research frame, particular research questions, the planning and collection of the data, their qualitative analysis and writing the articles I, II, III and IV. Dr Laura Seppänen has contributed to the development of the generic research frame and article I by introducing the author to the basic concepts of economic sociology and by supporting the writing of article II with her critical comments. Articles are printed with permission from the publishers.