162 results for Telephone, Automatic
Abstract:
Automatic Call Recognition is vital for environmental monitoring. Pattern recognition has been applied in automatic species recognition for years. However, few studies have applied formal syntactic methods to species call structure analysis. This paper introduces a novel method that uses timed and probabilistic automata for automatic species recognition, with acoustic components as the primitives. We demonstrate the method on one Australian bird species, the Eastern Yellow Robin.
Abstract:
This paper presents a novel technique for segmenting an audio stream into homogeneous regions according to speaker identities, background noise, music, environmental and channel conditions. Audio segmentation is useful in audio diarization systems, which aim to annotate an input audio stream with information that attributes temporal regions of the audio to their specific sources. The segmentation method introduced in this paper uses the Generalized Likelihood Ratio (GLR), computed between two adjacent sliding windows over preprocessed speech. This approach is inspired by the popular segmentation method proposed in the pioneering work of Chen and Gopalakrishnan, which uses the Bayesian Information Criterion (BIC) with an expanding search window. This paper aims to identify and address the shortcomings associated with such an approach. The proposed segmentation strategy is evaluated on the 2002 Rich Transcription (RT-02) Evaluation dataset, achieving a miss rate of 19.47% and a false alarm rate of 16.94% at the optimal threshold.
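As an illustration of the GLR idea described in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes a one-dimensional feature and a single Gaussian per window (real systems typically use multivariate cepstral features); the function names `glr_distance`, `glr_curve` and `detect_change`, the window length and the threshold are all hypothetical choices made for this example.

```python
import math


def ml_variance(xs):
    """Maximum-likelihood variance, floored to avoid log(0)."""
    n = len(xs)
    mean = sum(xs) / n
    return max(sum((x - mean) ** 2 for x in xs) / n, 1e-12)


def glr_distance(left, right):
    """GLR distance between two windows under a 1-D Gaussian assumption.

    Compares "one Gaussian for both windows" against "one Gaussian per
    window"; larger values suggest the windows come from different sources.
    """
    both = left + right
    return 0.5 * (
        len(both) * math.log(ml_variance(both))
        - len(left) * math.log(ml_variance(left))
        - len(right) * math.log(ml_variance(right))
    )


def glr_curve(signal, win):
    """GLR distance at every boundary t between two adjacent windows."""
    return [
        glr_distance(signal[t - win:t], signal[t:t + win])
        for t in range(win, len(signal) - win + 1)
    ]


def detect_change(signal, win, threshold):
    """Return the index of the strongest boundary, or None if below threshold."""
    curve = glr_curve(signal, win)
    best = max(range(len(curve)), key=curve.__getitem__)
    return best + win if curve[best] > threshold else None


# Synthetic 1-D "feature" stream (hypothetical data): the mean jumps at index 50.
signal = [(0.0 if i < 50 else 5.0) + (0.1 if i % 2 == 0 else -0.1) for i in range(100)]
print(detect_change(signal, win=20, threshold=10.0))
```

In practice the threshold trades missed boundaries against false alarms, which is why the abstract reports both rates at a chosen operating point.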
Abstract:
Background: The final phase of a three-phase study analysing the implementation and impact of the nurse practitioner role in Australia (the Australian Nurse Practitioner Project, or AUSPRAC) was undertaken in 2009, requiring nurse telephone interviewers to gather information about health outcomes directly from patients and their treating nurse practitioners. A team of several registered nurses was recruited and trained as telephone interviewers. The aim of this paper is to report on the development and evaluation of the training process for telephone interviewers. Methods: The training process involved planning the content and methods to be used in the training session; delivering the session; testing the skills and understanding of interviewers post-training; collecting and analysing data to determine the degree to which the training process was successful in meeting objectives; and post-training follow-up. All aspects of the training process were informed by established educational principles. Results: Interrater reliability between interviewers was high for well-validated sections of the survey instrument, resulting in 100% agreement between interviewers. Other sections with unvalidated questions showed lower agreement (between 75% and 90%). Overall, the agreement between interviewers was 92%. Each interviewer was also measured against a specifically developed master script, or gold standard, and each interviewer achieved a percentage of correct answers of 94.7% or better. This equated to a Kappa value of 0.92 or better. Conclusion: The telephone interviewer training process was very effective and achieved high interrater reliability. We argue that the high reliability was due to the use of well-validated instruments and the carefully planned programme based on established educational principles.
There is limited published literature on how to successfully operationalise educational principles and tailor them for specific research studies; this report addresses this knowledge gap.
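The agreement statistics reported above can be illustrated with a short sketch. The abstract does not say which kappa statistic was used; the example below assumes Cohen's kappa for one interviewer scored against a master script. The function names and the toy answer lists are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters give the same answer."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical answers: a master script and one interviewer's coding.
master = ["yes", "yes", "no", "no"]
interviewer = ["yes", "yes", "no", "yes"]
print(percent_agreement(master, interviewer), cohen_kappa(master, interviewer))
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is why both figures are usually reported together.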
Speaker attribution of multiple telephone conversations using a complete-linkage clustering approach
Abstract:
In this paper we propose and evaluate a speaker attribution system using a complete-linkage clustering method. Speaker attribution refers to the annotation of a collection of spoken audio based on speaker identities. This can be achieved using diarization and speaker linking. The main challenge associated with attribution is achieving computational efficiency when dealing with large audio archives. Traditional agglomerative clustering methods with model merging and retraining are not feasible for this purpose. This has motivated the use of linkage clustering methods without retraining. We first propose a diarization system using complete-linkage clustering and show that it outperforms traditional agglomerative and single-linkage clustering based diarization systems with a relative improvement of 40% and 68%, respectively. We then propose a complete-linkage speaker linking system to achieve attribution and demonstrate a 26% relative improvement in attribution error rate (AER) over the single-linkage speaker linking approach.
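To make the clustering step concrete, here is a minimal sketch of complete-linkage agglomerative clustering over a precomputed distance matrix. It is an illustration only, not the system described in the abstract: the distance values, the stopping threshold and the function name `complete_linkage` are assumptions, and the pairwise distances between speaker segments are taken as given rather than computed from speaker models.

```python
def complete_linkage(dist, threshold):
    """Agglomerative clustering with complete linkage.

    dist is a symmetric matrix of pairwise distances. The distance between
    two clusters is the largest pairwise distance between their members, so
    no model is merged or retrained. Merging stops once the closest pair of
    clusters is further apart than threshold.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dist[a][b] for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical 1-D "segment embeddings"; distance is the absolute difference.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
print(complete_linkage(dist, threshold=3.0))
```

Because the cluster distance is the maximum over member pairs, a cluster only grows while all of its members stay close to one another, which avoids the chaining behaviour typical of single linkage.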
Abstract:
In the last decade, smartphones have gained widespread usage. Since the advent of online application stores, hundreds of thousands of applications have become instantly available to millions of smartphone users. Within the Android ecosystem, application security is governed by digital signatures and a list of coarse-grained permissions. However, this mechanism is not fine-grained enough to give the user sufficient control over applications' activities. The result is abuse of highly sensitive private information, such as phone numbers, without users' knowledge. We show that there is a high frequency of privacy leaks even among widely popular applications. Together with the fact that the majority of users are not proficient in computer security, this presents a challenge to the engineers developing security solutions for the platform. Our contribution is twofold: first, we propose a service which is able to assess Android Market applications via static analysis and provide detailed but readable reports to the user. Second, we describe a means to mitigate security and privacy threats by automated reverse-engineering and refactoring of binary application packages according to the users' security preferences.
Abstract:
Speaker diarization is the process of annotating an input audio with information that attributes temporal regions of the audio signal to their respective sources, which may include both speech and non-speech events. For speech regions, the diarization system also specifies the locations of speaker boundaries and assigns relative speaker labels to each homogeneous segment of speech. In short, speaker diarization systems effectively answer the question of ‘who spoke when’. There are several important applications for speaker diarization technology, such as facilitating speaker indexing systems to allow users to directly access the relevant segments of interest within a given audio, and assisting with other downstream processes such as summarizing and parsing. When combined with automatic speech recognition (ASR) systems, the metadata extracted from a speaker diarization system can provide complementary information for ASR transcripts, including the location of speaker turns and relative speaker segment labels, making the transcripts more readable. Speaker diarization output can also be used to localize the instances of specific speakers to pool data for model adaptation, which in turn boosts transcription accuracy. Speaker diarization therefore plays an important role as a preliminary step in automatic transcription of audio data. The aim of this work is to improve the usefulness and practicality of speaker diarization technology through the reduction of diarization error rates. In particular, this research is focused on the segmentation and clustering stages within a diarization system. Although particular emphasis is placed on the broadcast news audio domain, and systems developed throughout this work are also trained and tested on broadcast news data, the techniques proposed in this dissertation are also applicable to other domains, including telephone conversations and meeting audio.
Three main research themes were pursued: heuristic rules for speaker segmentation, modelling uncertainty in speaker model estimates, and modelling uncertainty in eigenvoice speaker modelling. The use of heuristic approaches for the speaker segmentation task was first investigated, with emphasis placed on minimizing missed boundary detections. A set of heuristic rules was proposed to govern the detection and heuristic selection of candidate speaker segment boundaries. A second pass, using the same heuristic algorithm with a smaller window, was also proposed with the aim of improving detection of boundaries around short speaker segments. Compared to single-threshold-based methods, the proposed heuristic approach was shown to provide improved segmentation performance, leading to a reduction in the overall diarization error rate. Methods to model the uncertainty in speaker model estimates were developed to address the difficulties associated with making segmentation and clustering decisions with limited data in the speaker segments. The Bayes factor, derived specifically for multivariate Gaussian speaker modelling, was introduced to account for the uncertainty of the speaker model estimates. The use of the Bayes factor also enabled the incorporation of prior information regarding the audio to aid segmentation and clustering decisions. The idea of modelling uncertainty in speaker model estimates was also extended to the eigenvoice speaker modelling framework for the speaker clustering task. Building on the application of Bayesian approaches to the speaker diarization problem, the proposed approach takes into account the uncertainty associated with the explicit estimation of the speaker factors. The proposed decision criteria, based on Bayesian theory, were shown to generally outperform their non-Bayesian counterparts.
Abstract:
Eco-driving instructions could reduce fuel consumption by up to 20% (EcoMove, 2010). Participants (N=13) drove an instrumented vehicle (a Toyota Camry 2007) with an automatic transmission. The participants' fuel consumption was compared before and after they received eco-driving instructions. Participants drove the same vehicle on the same urban route under similar traffic conditions. Results show that, on free-flow sections of the track, all participants drove slightly faster (on average, 0.7 km/h faster) during the lap for which they were instructed to drive in an eco-friendly manner than when they were not given the eco-driving instruction. Surprisingly, eco-driving instructions increased the RPM significantly in most cases. Fuel consumption decreased slightly (6%) after the eco-driving instructions. We found strong evidence that the fuel saving observed in our experiment (urban environment, automatic transmission) falls short of the 20% reduction claimed in other international trials.
Abstract:
This research makes a major contribution which enables efficient searching and indexing of large archives of spoken audio based on speaker identity. It introduces a novel technique dubbed “speaker attribution”, which is the task of automatically determining ‘who spoke when?’ in recordings and then automatically linking the unique speaker identities within each recording across multiple recordings. The outcome of the research will also have a significant impact on improving the performance of automatic speech recognition systems through the extracted speaker identities.
Abstract:
A user’s query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques ignore information about the dependencies that exist between words in natural language. However, more recent approaches have demonstrated that by explicitly modeling associations between terms significant improvements in retrieval effectiveness can be achieved over those that ignore these dependencies. State-of-the-art dependency-based approaches have been shown to primarily model syntagmatic associations. Syntagmatic associations infer a likelihood that two terms co-occur more often than by chance. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process will improve retrieval effectiveness. This article develops and evaluates a new query expansion technique that is based on a formal, corpus-based model of word meaning that models syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval it demonstrates significant improvements in retrieval effectiveness over a strong baseline system, based on a commercial search engine.
Abstract:
This study presents a segmentation pipeline that fuses colour and depth information to automatically separate objects of interest in video sequences captured from a quadcopter. Many approaches assume that cameras are static with known position, a condition which cannot be preserved in most outdoor robotic applications. In this study, the authors compute depth information and camera positions from a monocular video sequence using structure from motion and use this information as an additional cue to colour for accurate segmentation. The authors model the problem similarly to standard segmentation routines as a Markov random field and perform the segmentation using graph cuts optimisation. Manual intervention is minimised and is only required to determine pixel seeds in the first frame which are then automatically reprojected into the remaining frames of the sequence. The authors also describe an automated method to adjust the relative weights for colour and depth according to their discriminative properties in each frame. Experimental results are presented for two video sequences captured using a quadcopter. The quality of the segmentation is compared to a ground truth and other state-of-the-art methods with consistently accurate results.
Abstract:
This study presents a disturbance attenuation controller for horizontal position stabilisation for hover and automatic landings of a rotary-wing unmanned aerial vehicle (RUAV) operating close to the landing deck in rough seas. Based on a helicopter model representing aerodynamics during the landing phase, a non-linear state feedback H∞ controller is designed to achieve rapid horizontal position tracking in a gusty environment. Practical constraints including flapping dynamics, servo dynamics and time lag effect are considered. A high-fidelity closed-loop simulation using parameters of the Vario XLC gas-turbine helicopter verifies performance of the proposed horizontal position controller. The proposed controller not only increases the disturbance attenuation capability of the RUAV, but also enables rapid position response when gusts occur. Comparative studies show that the H∞ controller exhibits performance improvement and can be applied to ship/RUAV landing systems.
Abstract:
Objectives: Early childhood caries is a highly destructive dental disease which is compounded by the need for young children to be treated under general anaesthesia. In Australia, there are long waiting periods for treatment at public hospitals. In this paper, we examined the costs and patient outcomes of a prevention programme for early childhood caries to assess its value for government services. Design: Cost-effectiveness analysis using a Markov model. Setting: Public dental patients in a low socioeconomic, socially disadvantaged area in the State of Queensland, Australia. Participants: Children aged 6 months to 6 years received either a telephone prevention programme or usual care. Primary and secondary outcome measures: A mathematical model was used to assess caries incidence and public dental treatment costs for a cohort of children. Healthcare costs, treatment probabilities and caries incidence were modelled from 6 months to 6 years of age based on trial data from mothers and their children who received either a telephone prevention programme or usual care. Sensitivity analyses were used to assess the robustness of the findings to uncertainty in the model estimates. Results: By age 6 years, the telephone intervention programme had prevented an estimated 43 carious teeth and saved £69 984 in healthcare costs per 100 children. The results were sensitive to the cost of general anaesthesia (cost-savings range £36 043–£97 298) and the incidence of caries in the prevention group (cost-savings range £59 496–£83 368) and usual care (cost-savings range £46 833–£93 328), but there were cost savings in all scenarios. Conclusions: A telephone intervention that aims to prevent early childhood caries is likely to generate considerable and immediate patient benefits and cost savings to the public dental health service in disadvantaged communities.
Abstract:
This paper presents a novel and practical procedure for estimating the mean deck height to assist in automatic landing operations of a Rotorcraft Unmanned Aerial Vehicle (RUAV) in harsh sea environments. A modified Prony Analysis (PA) procedure is outlined to deal with real-time observations of deck displacement. It involves developing an appropriate dynamic model to approximate real deck motion, with parameters identified using the Forgetting Factor Recursive Least Squares (FFRLS) method. The model order is specified using an order-selection criterion based on minimizing the summation of accumulated estimation errors. In addition, a feasible threshold criterion is proposed to separate the dominant components of deck displacement, which results in an accurate instantaneous estimation of the mean deck position. Simulation results demonstrate that the proposed recursive procedure exhibits satisfactory estimation performance when applied to real-time deck displacement measurements, making it well suited for integration into ship-RUAV approach and landing guidance systems.
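The recursive identification step mentioned in this abstract can be sketched as follows. This is a generic forgetting-factor recursive least squares update, not the authors' modified Prony procedure: the abstract does not give the model form, so the example assumes a second-order autoregressive model fitted to a single noise-free sinusoid. The function names, the forgetting factor 0.98 and the initial covariance scale are all illustrative assumptions.

```python
import math


def ffrls_step(theta, P, phi, y, lam=0.98):
    """One forgetting-factor RLS update.

    theta: current parameter estimate, P: covariance-like matrix,
    phi: regressor vector, y: new measurement, lam: forgetting factor.
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    error = y - sum(phi[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * error for i in range(n)]
    # P <- (P - K phi^T P) / lam; phi^T P equals (P phi)^T because P is symmetric.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P


def identify_ar(samples, order=2, lam=0.98, delta=1e4):
    """Fit y[k] = a1*y[k-1] + ... + an*y[k-n] recursively, sample by sample."""
    theta = [0.0] * order
    P = [[delta if i == j else 0.0 for j in range(order)] for i in range(order)]
    for k in range(order, len(samples)):
        phi = [samples[k - 1 - i] for i in range(order)]
        theta, P = ffrls_step(theta, P, phi, samples[k], lam)
    return theta


# Hypothetical "deck displacement": one sinusoid, which satisfies
# y[k] = 2*cos(w)*y[k-1] - y[k-2] exactly.
w = 0.3
samples = [math.sin(w * k) for k in range(300)]
print(identify_ar(samples))
```

A forgetting factor below 1 discounts old samples, so the estimate can follow slowly changing sea conditions at the cost of more sensitivity to measurement noise.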