973 results for Automatic Animal Call Recognition


Relevance:

40.00%

Publisher:

Abstract:

Includes bibliographical references.

Relevance:

40.00%

Publisher:

Abstract:

Includes bibliographical references.

Relevance:

40.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

40.00%

Publisher:

Abstract:

The role of polarisation in late time complex resonance based target identification is investigated numerically for the case of an L-shaped wire. While repeated extraction of the resonances for varying polarisation allows for better signal-to-noise immunity, it is also found that there are preferred polarisations for each complex resonance. The first few of these polarisations are extracted for the sample target.

Relevance:

40.00%

Publisher:

Abstract:

This thesis addresses the viability of automatic speech recognition for control room systems; with careful system design, automatic speech recognition (ASR) devices can be a useful means of human-computer interaction in specific types of task. These tasks can be defined as complex verbal activities, such as command and control, and can be paired with spatial tasks, such as monitoring, without detriment. It is suggested that ASR use be confined to routine plant operation, as opposed to critical incidents, due to possible problems of stress on the operators' speech. It is proposed that using ASR will require operators to adapt a commonly used skill to cater for a novel use of speech. Before using the ASR device, new operators will require some form of training. It is shown that a demonstration by an experienced user of the device can lead to better performance than instructions. Thus, a relatively cheap and very efficient form of operator training can be supplied by demonstration by experienced ASR operators. From a series of studies into speech based interaction with computers, it is concluded that the interaction be designed to capitalise upon the tendency of operators to use short, succinct, task specific styles of speech. From studies comparing different types of feedback, it is concluded that operators be given screen based feedback, rather than auditory feedback, for control room operation. Feedback will take two forms: the use of the ASR device will require recognition feedback, which will be best supplied using text; the performance of a process control task will require task feedback integrated into the mimic display. This latter feedback can be either textual or symbolic, but it is suggested that symbolic feedback will be more beneficial. Related to both interaction style and feedback is the issue of handling recognition errors. These should be corrected by simple command repetition practices, rather than by error handling dialogues. This method of error correction is held to be non intrusive to primary command and control operations. This thesis also addresses some of the problems of user error in ASR use, and provides a number of recommendations for its reduction.

Relevance:

40.00%

Publisher:

Abstract:

Animal rights positions face the ‘predator problem’: the suggestion that if the rights of nonhuman animals are to be protected, then we are obliged to interfere in natural ecosystems to protect prey from predators. Generally, rather than embracing this conclusion, animal ethicists have rejected it, basing this objection on a number of different arguments. This paper considers but challenges three such arguments, before defending a fourth possibility. Rejected are Peter Singer’s suggestion that interference will lead to more harm than good, Sue Donaldson and Will Kymlicka’s suggestion that respect for nonhuman sovereignty necessitates non-interference in normal circumstances, and Alasdair Cochrane’s solution based on the claim that predators cannot survive without killing prey. The possibility defended builds upon Tom Regan’s suggestion that predators, as moral patients but not moral agents, cannot violate the rights of their prey, and so the rights of the prey, while they do exist, do not call for intervention. This idea is developed by a consideration of how moral agents can be more or less responsible for a given event, and defended against criticisms offered by thinkers including Alasdair Cochrane and Dale Jamieson.

Relevance:

40.00%

Publisher:

Abstract:

The purpose of this paper is to survey and assess the state-of-the-art in automatic target recognition for synthetic aperture radar imagery (SAR-ATR). The aim is not to develop an exhaustive survey of the voluminous literature, but rather to capture in one place the various approaches for implementing the SAR-ATR system. This paper is meant to be as self-contained as possible, and it approaches the SAR-ATR problem from a holistic end-to-end perspective. A brief overview of the breadth of the SAR-ATR challenges is given. This is couched in terms of a single-channel SAR, and it is extendable to multi-channel SAR systems. Stages pertinent to the basic SAR-ATR system structure are defined, and the motivations for the requirements and constraints on the system constituents are addressed. For each stage in the SAR-ATR processing chain, a taxonomization methodology for surveying the numerous methods published in the open literature is proposed. Carefully selected works from the literature are presented under the taxa proposed. Novel comparisons, discussions, and comments are highlighted throughout this paper. A two-fold benchmarking scheme for evaluating existing SAR-ATR systems and motivating new system designs is proposed. The scheme is applied to the works surveyed in this paper. Finally, a discussion is presented in which various interrelated issues, such as standard operating conditions, extended operating conditions, and target-model design, are addressed. This paper is a contribution toward fulfilling an objective of end-to-end SAR-ATR system design.

Relevance:

30.00%

Publisher:

Abstract:

The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.
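
The reported gains imply a supervised figure that the abstract does not state outright. The following is a minimal arithmetic sketch, assuming the 70% ratio is exact and that both gains are absolute, in percentage points; the variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract.
unsupervised_gain = 6.3        # absolute gain from unsupervised adaptation, in points
fraction_of_supervised = 0.70  # unsupervised gain as a share of the supervised gain

# Implied supervised gain: an inference from the two figures above,
# not a number reported in the abstract.
supervised_gain = unsupervised_gain / fraction_of_supervised

print(f"Implied supervised gain: {supervised_gain:.1f} points")
```

Under these assumptions the supervised case would correspond to a gain of roughly 9 points.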

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this chapter is to describe the use of caricatured contrasting scenarios (Bødker, 2000) and how they can be used to consider potential designs for disruptive technologies. The disruptive technology in this case is Automatic Speech Recognition (ASR) software in workplace settings. The particular workplace is the Magistrates Court of the Australian Capital Territory. Caricatured contrasting scenarios are ideally suited to exploring how ASR might be implemented in a particular setting because they allow potential implementations to be “sketched” quickly and with little effort. This sketching of potential interactions and the emphasis of both positive and negative outcomes allows the benefits and pitfalls of design decisions to become apparent. A brief description of the Court is given, describing the reasons for choosing the Court for this case study. The work of the Court is framed as taking place in two modes: Front of house, where the courtroom itself is, and backstage, where documents are processed and the business of the court is recorded and encoded into various systems. Caricatured contrasting scenarios describing the introduction of ASR to the front of house are presented and then analysed. These scenarios show that the introduction of ASR to the court would be highly problematic. The final section describes how ASR could be re-imagined in order to make it useful for the court. A final scenario is presented that describes how this re-imagined ASR could be integrated into both the front of house and backstage of the court in a way that could strengthen both processes.

Relevance:

30.00%

Publisher:

Abstract:

Automatic Speech Recognition (ASR) has matured into a technology which is becoming more common in our everyday lives, and is emerging as a necessity to minimise driver distraction when operating in-car systems such as navigation and infotainment. In “noise-free” environments, word recognition performance of these systems has been shown to approach 100%; however, this performance degrades rapidly as the level of background noise is increased. Speech enhancement is a popular method for making ASR systems more robust. Single-channel spectral subtraction was originally designed to improve human speech intelligibility and many attempts have been made to optimise this algorithm in terms of signal-based metrics such as maximised Signal-to-Noise Ratio (SNR) or minimised speech distortion. Such metrics are used to assess enhancement performance for intelligibility, not speech recognition, therefore making them sub-optimal for ASR applications. This research investigates two methods for closely coupling subtractive-type enhancement algorithms with ASR: (a) a computationally-efficient Mel-filterbank noise subtraction technique based on likelihood-maximisation (LIMA), and (b) introducing phase spectrum information to enable spectral subtraction in the complex frequency domain. Likelihood-maximisation uses gradient-descent to optimise parameters of the enhancement algorithm to best fit the acoustic speech model given a word sequence known a priori. Whilst this technique is shown to improve the ASR word accuracy performance, it is also identified to be particularly sensitive to non-noise mismatches between the training and testing data. Phase information has long been ignored in spectral subtraction as it is deemed to have little effect on human intelligibility. In this work it is shown that phase information is important in obtaining highly accurate estimates of clean speech magnitudes which are typically used in ASR feature extraction. Phase Estimation via Delay Projection is proposed based on the stationarity of sinusoidal signals, and demonstrates the potential to produce improvements in ASR word accuracy across a wide range of SNRs. Throughout the dissertation, consideration is given to practical implementation in vehicular environments, which resulted in two novel contributions: a LIMA framework which takes advantage of the grounding procedure common to speech dialogue systems, and a resource-saving formulation of frequency-domain spectral subtraction for realisation in field-programmable gate array hardware. The techniques proposed in this dissertation were evaluated using the Australian English In-Car Speech Corpus which was collected as part of this work. This database is the first of its kind within Australia and captures real in-car speech of 50 native Australian speakers in seven driving conditions common to Australian environments.
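
For readers unfamiliar with the baseline that this dissertation builds on, the sketch below illustrates plain magnitude-domain spectral subtraction with a spectral floor. It is not the LIMA or phase-aware method described in the abstract; the function name, the over-subtraction factor `alpha`, the floor fraction `beta`, and the toy spectra are all assumptions chosen for illustration.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, beta=0.01):
    """Subtract a noise magnitude estimate from a noisy magnitude spectrum.

    alpha is a hypothetical over-subtraction factor; beta sets a spectral
    floor as a fraction of the noisy magnitude, so that no bin goes negative.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        estimate = y - alpha * n   # raw subtraction for this frequency bin
        floor = beta * y           # smallest value allowed for this bin
        cleaned.append(max(estimate, floor))
    return cleaned


# Toy magnitudes for three frequency bins (illustrative values only).
noisy = [1.0, 0.5, 0.2]
noise = [0.3, 0.3, 0.3]
enhanced = spectral_subtract(noisy, noise)
print(enhanced)
```

In the third bin the noise estimate exceeds the noisy magnitude, so the floor takes over instead of a negative value. Because phase is ignored entirely here, this sketch corresponds to the conventional approach that the abstract argues is insufficient for accurate clean-speech magnitude estimates.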

Relevance:

30.00%

Publisher:

Abstract:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a person's identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel, wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMMs are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability, an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMM-based classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier. Further, the domains for the modelling of session variation were contrasted to find a number of common factors; however, the SVM domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several beneficial characteristics for the task of unsupervised model adaptation, prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate that the approach provides performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.
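
To make the generative GMM side of this comparison concrete, the sketch below scores a handful of features with a log-likelihood ratio between a speaker model and a background model, in the style commonly used for GMM-based verification. It is not the hybrid GMM supervector SVM system from the thesis; the one-dimensional features, the model parameters, and the function names are all invented for illustration.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of a scalar feature x under a 1-D Gaussian mixture."""
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        total += w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
    return math.log(total)


def average_llr(features, speaker, background):
    """Mean log-likelihood ratio of the features: speaker model vs background model."""
    ratios = [
        gmm_log_likelihood(x, *speaker) - gmm_log_likelihood(x, *background)
        for x in features
    ]
    return sum(ratios) / len(ratios)


# Hypothetical models given as (weights, means, variances); toy values only.
speaker_model = ([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
background_model = ([0.5, 0.5], [-3.0, 5.0], [4.0, 4.0])

target_score = average_llr([0.1, 1.9, 0.4], speaker_model, background_model)
impostor_score = average_llr([-3.0, 5.0], speaker_model, background_model)
print(target_score > 0, impostor_score < 0)
```

A positive score favours the claimed speaker and a negative one favours the background population. A discriminative SVM would instead learn a separating hyperplane directly from positive and negative examples, which is why the choice of impostor data matters so much in the third theme.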

Relevance:

30.00%

Publisher:

Abstract:

Calibration of movement tracking systems is a difficult problem faced by both animals and robots. The ability to continuously calibrate changing systems is essential for animals as they grow or are injured, and highly desirable for robot control or mapping systems due to the possibility of component wear, modification, damage and their deployment on varied robotic platforms. In this paper we use inspiration from the animal head direction tracking system to implement a self-calibrating, neurally-based robot orientation tracking system. Using real robot data we demonstrate how the system can remove tracking drift and learn to consistently track rotation over a large range of velocities. The neural tracking system provides the first steps towards a fully neural SLAM system with improved practical applicability through self-tuning and adaptation.