406 results for Speech-processing technologies
Abstract:
Several protocols for isolating mycobacteria from water exist, but there is no established standard method. This study compared methods of processing potable water samples for the isolation of Mycobacterium avium and Mycobacterium intracellulare, using spiked sterilized water and tap water decontaminated with 0.005% cetylpyridinium chloride (CPC). Samples were concentrated by centrifugation or filtration and inoculated onto Middlebrook 7H10 and 7H11 plates and Lowenstein-Jensen slants, and into mycobacterial growth indicator tubes with or without polymyxin, azlocillin, nalidixic acid, trimethoprim, and amphotericin B. The solid media were incubated at 32°C, at 35°C, and at 35°C with CO2, and read weekly. The results suggest that filtration is a more sensitive concentration method than centrifugation for the isolation of mycobacteria from water. The addition of sodium thiosulfate may not be necessary and may reduce the yield. Middlebrook 7H10 and 7H11 were equally sensitive culture media. CPC decontamination, while effective in reducing the growth of contaminants, also significantly reduces mycobacterial numbers. There was no difference between the incubation temperatures at 3 weeks.
Abstract:
Surveillance networks are typically monitored by a few people viewing several monitors that display the camera feeds, making it very difficult for a human operator to detect events effectively as they happen. Recently, computer vision research has begun to address ways to automatically process some of this data to assist human operators. Object tracking, event recognition, crowd analysis and human identification at a distance are being pursued as means to aid human operators and improve the security of areas such as transport hubs. The task of object tracking is key to the effective use of more advanced technologies: to recognize an event, people and objects must be tracked, and tracking also enhances the performance of tasks such as crowd analysis or human identification. Before an object can be tracked, it must be detected. Motion segmentation techniques, widely employed in tracking systems, produce a binary image in which objects can be located. However, these techniques are prone to errors caused by shadows and lighting changes. Detection routines often fail, either due to erroneous motion caused by noise and lighting effects, or because they are unable to split occluded regions into their component objects. Particle filters can be used as a self-contained tracking system, making it unnecessary to carry out detection separately except for an initial (often manual) detection to initialise the filter. Particle filters use one or more extracted features to evaluate the likelihood of an object existing at a given point in each frame. Such systems, however, do not easily allow multiple objects to be tracked robustly, and do not explicitly maintain the identity of tracked objects. This dissertation investigates improvements to the performance of object tracking algorithms through improved motion segmentation and the use of a particle filter. A novel hybrid motion segmentation / optical flow algorithm, capable of simultaneously extracting multiple layers of foreground and optical flow from surveillance video frames, is proposed. The algorithm is shown to perform well in the presence of adverse lighting conditions, and the optical flow is capable of extracting a moving object. The proposed algorithm is integrated within a tracking system and evaluated using the ETISEO (Évaluation du Traitement et de l'Interprétation de Séquences vidEO - Evaluation for video understanding) database, and significant improvement in detection and tracking performance is demonstrated when compared to a baseline system. A Scalable Condensation Filter (SCF), a particle filter designed to work within an existing tracking system, is also developed. The creation and deletion of modes and the maintenance of identity are handled by the underlying tracking system, while the tracking system benefits from the improved performance a particle filter provides under the uncertain conditions arising from occlusion and noise. This system is also evaluated using the ETISEO database. The dissertation then investigates fusion schemes for multi-spectral tracking systems. Four fusion schemes for combining a thermal and a visual colour modality are evaluated using the OTCBVS (Object Tracking and Classification in and Beyond the Visible Spectrum) database. It is shown that a middle fusion scheme yields the best results and demonstrates a significant improvement in performance when compared to a system using either modality individually.
Findings from the thesis contribute to improving the performance of semi-automated video processing and therefore to improving security in areas under surveillance.
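To illustrate the particle filter principle described above (evaluating a likelihood for each hypothesised object location in each frame, then resampling), here is a minimal single-object sketch in Python. The random-walk motion model, the brightness-based likelihood and all names are illustrative assumptions, not the SCF or any algorithm from the thesis.

```python
import numpy as np

def resample(particles, weights):
    """Systematic resampling: draw particles in proportion to their weights."""
    n = len(particles)
    positions = (np.arange(n) + np.random.rand()) / n
    indices = np.searchsorted(np.cumsum(weights), positions)
    return particles[np.minimum(indices, n - 1)]

def particle_filter_step(particles, weights, frame, likelihood_fn, motion_std=3.0):
    """One predict/update cycle: diffuse the particles with a random-walk motion
    model, reweight each hypothesis by an image likelihood, then resample."""
    particles = particles + np.random.normal(0.0, motion_std, particles.shape)
    weights = weights * np.array([likelihood_fn(frame, p) for p in particles])
    weights = weights / weights.sum()
    particles = resample(particles, weights)
    return particles, np.full(len(particles), 1.0 / len(particles))

def toy_likelihood(frame, p):
    """Stand-in appearance score: brighter pixels are more 'object-like'."""
    x = int(np.clip(p[0], 0, frame.shape[1] - 1))
    y = int(np.clip(p[1], 0, frame.shape[0] - 1))
    return frame[y, x] + 1e-6

frame = np.zeros((120, 160))
frame[50:70, 80:100] = 1.0                       # a bright "object"
particles = np.random.rand(500, 2) * [160, 120]  # (x, y) hypotheses
weights = np.full(500, 1.0 / 500)
for _ in range(10):
    particles, weights = particle_filter_step(particles, weights, frame, toy_likelihood)
print(particles.mean(axis=0))                    # estimate drifts toward the object
```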
Abstract:
In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information" (NGMI), which segments Chinese documents into n-character words or phrases using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort required to prepare and maintain manually segmented Chinese text for training, and to maintain ever-expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words; NGMI extends this approach to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.
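As a rough illustration of the statistic underlying this family of approaches, the sketch below computes pointwise mutual information for a candidate 2-character word from toy counts. It is plain PMI, not the exact NGMI formulation from the paper, and the corpus and counts are illustrative assumptions (the paper draws its statistics from the Chinese Wikipedia).

```python
import math
from collections import Counter

def pmi(ngram, unigram_counts, ngram_counts, total):
    """Pointwise mutual information of an n-character string: how much more
    often it occurs than expected if its characters were independent."""
    p_ngram = ngram_counts[ngram] / total
    p_indep = 1.0
    for ch in ngram:
        p_indep *= unigram_counts[ch] / total
    return math.log(p_ngram / p_indep)

# Toy corpus; the paper draws these statistics from the Chinese Wikipedia.
corpus = "维基百科是一个自由的百科全书"
total = len(corpus)
unigrams = Counter(corpus)
bigrams = Counter(corpus[i:i + 2] for i in range(len(corpus) - 1))

# High cohesion within a candidate word suggests it should not be split.
print(pmi("百科", unigrams, bigrams, total))
```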
Abstract:
A laboratory-scale twin-screw extruder has been interfaced with a near-infrared (NIR) spectrometer via a fibre-optic link so that NIR spectra can be collected continuously during small-scale experimental melt-state processing of polymeric materials. This system can be used to investigate melt-state processes such as reactive extrusion, in real time, in order to explore the kinetics and mechanism of the reaction. A further advantage of the system is its capability to measure apparent viscosity simultaneously, which gives important additional information about molecular weight changes and polymer degradation during processing. The system was used to study the melt processing of a nanocomposite consisting of a thermoplastic polyurethane and an organically modified layered silicate.
Abstract:
A successful urban management system for a Ubiquitous Eco City requires an integrated approach. This integration includes bringing together economic, socio-cultural and urban development with a well-orchestrated, transparent and open decision-making mechanism and the necessary infrastructure and technologies. The information and telecommunication technologies and platforms that developed rapidly in the late 20th Century improve urban management and enhance the quality of life and place. Telecommunication technologies provide an important base for monitoring and managing activities over wired, wireless or fibre-optic networks. In particular, technology convergence creates new ways in which information and telecommunication technologies are used. The 21st Century is an era of converged information, in which people are able to access a variety of services, including internet and location-based services, through multi-functional devices such as mobile phones; this convergence provides new opportunities for the management of Ubiquitous Eco Cities. This paper discusses recent developments in telecommunication networks and trends in convergence technologies, their implications for the management of Ubiquitous Eco Cities, and how this technological shift is likely to be beneficial in improving the quality of life and place. The paper also introduces recent approaches to urban management systems, such as intelligent urban management systems, that are suitable for Ubiquitous Eco Cities.
Abstract:
This work presents an extended Joint Factor Analysis (JFA) model that includes explicit modelling of unwanted within-session variability. The goals of the proposed extended JFA model are to improve verification performance with short utterances by compensating for the effects of limited or imbalanced phonetic coverage, and to produce a flexible JFA model that is effective over a wide range of utterance lengths without adjusting model parameters, such as by retraining session subspaces. Experimental results on the 2006 NIST SRE corpus demonstrate the flexibility of the proposed model by providing competitive results over a wide range of utterance lengths without retraining, and also yield modest improvements over the current state of the art in a number of conditions.
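For context, the standard JFA decomposition of the speaker- and session-dependent GMM mean supervector is commonly written as below; the paper's extended model adds an explicit within-session term to this decomposition, whose exact form is defined in the paper and is not reproduced here.

```latex
% Standard JFA decomposition of the speaker- and session-dependent
% GMM mean supervector (the extended model adds a within-session term):
\begin{equation}
  \mathbf{M} = \mathbf{m} + \mathbf{V}\mathbf{y} + \mathbf{U}\mathbf{x} + \mathbf{D}\mathbf{z}
\end{equation}
% m : speaker- and session-independent UBM mean supervector
% V : eigenvoice (speaker) subspace,    y : speaker factors
% U : eigenchannel (session) subspace,  x : session factors
% D : diagonal residual matrix,         z : speaker-specific residual factors
```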
Abstract:
This paper presents a novel approach to estimating the confidence interval of speaker verification scores. This approach is used to minimise the utterance length required to produce a confident verification decision. The confidence estimation method is also extended to address both the high correlation between consecutive frame scores and robustness with very limited training samples. The proposed technique achieves a drastic reduction in the data typically required to produce confident decisions in an automatic speaker verification system. When evaluated on the NIST 2005 SRE, the early verification decision method demonstrates that an average of 5–10 seconds of speech is sufficient to produce verification rates approaching those achieved previously with an average of more than 100 seconds of speech.
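A minimal sketch of the early-decision idea follows, assuming frame-level verification scores are available and using simple decimation as a crude stand-in for the paper's treatment of correlated consecutive frames; the function name, threshold and toy data are illustrative assumptions, not the published method.

```python
import numpy as np

def early_decision(frame_scores, threshold, z=1.96, decim=4):
    """Accumulate frame-level verification scores and stop as soon as the
    confidence interval around the running mean lies entirely above or below
    the decision threshold.  Consecutive frames are highly correlated, so the
    interval is computed on decimated scores as a crude effective-sample-size
    correction."""
    for n in range(2 * decim, len(frame_scores) + 1):
        used = frame_scores[:n:decim]                    # thin out correlated frames
        mean = used.mean()
        half_width = z * used.std(ddof=1) / np.sqrt(len(used))
        if mean - half_width > threshold:
            return "accept", n                           # confident above threshold
        if mean + half_width < threshold:
            return "reject", n                           # confident below threshold
    return "undecided", len(frame_scores)

# Toy frame-level scores for a (hypothetical) target trial.
rng = np.random.default_rng(0)
scores = 0.5 + rng.normal(0.0, 0.3, 1000)
print(early_decision(scores, threshold=0.0))
```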
Abstract:
New models of human cognition inspired by quantum theory could underpin information technologies that are better aligned with how we recall information.
Abstract:
We argue that web service discovery technology should help users navigate a complex problem space by providing suggestions for services that they may not be able to formulate themselves because they lack the epistemic resources to do so. Free-text documents in service environments provide an untapped source of information for augmenting the epistemic state of users and hence their ability to search effectively for services. A quantitative approach to semantic knowledge representation is adopted in the form of semantic space models computed from these free-text documents. Knowledge of the user's agenda is promoted by associational inferences computed from the semantic space. The inferences are suggestive and aim to promote human abductive reasoning, guiding the user from fuzzy search goals towards a better understanding of the problem space surrounding the given agenda. Experimental results are discussed based on a complex and realistic planning activity.
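A minimal sketch of a semantic space built from free-text documents and queried for associational suggestions, assuming a simple term-by-term co-occurrence model and cosine similarity; the documents, window size and function names are illustrative assumptions, not the representation used in the paper.

```python
import numpy as np

def build_semantic_space(docs, window=2):
    """Build a simple term-by-term co-occurrence space from free text; each
    row is a term vector whose neighbours are candidate associations."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    space = np.zeros((len(vocab), len(vocab)))
    for d in docs:
        words = d.split()
        for i, w in enumerate(words):
            for j in range(max(0, i - window), min(len(words), i + window + 1)):
                if i != j:
                    space[index[w], index[words[j]]] += 1
    return vocab, space

def suggest(term, vocab, space, k=3):
    """Return the k terms whose co-occurrence vectors are closest (cosine) to
    the query term; these act as associational inferences for the user."""
    i = vocab.index(term)
    sims = space @ space[i] / (np.linalg.norm(space, axis=1) * np.linalg.norm(space[i]) + 1e-12)
    sims[i] = -1.0
    return [vocab[j] for j in np.argsort(sims)[::-1][:k]]

# Hypothetical free-text descriptions from a service environment.
docs = [
    "book flight to conference venue",
    "book hotel near conference venue",
    "hire car at airport for conference",
]
vocab, space = build_semantic_space(docs)
print(suggest("conference", vocab, space))
```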
Abstract:
Spoken term detection (STD) typically involves performing word- or sub-word-level speech recognition and indexing the result. This work challenges the assumption that improved speech recognition accuracy implies better indexing for STD. Using an index derived from phone lattices, this paper examines the effect of language model selection on the relationship between phone recognition accuracy and STD accuracy. The results suggest that language models usually improve phone recognition accuracy, but that their inclusion does not always translate to improved STD accuracy. The findings suggest that using phone recognition accuracy to measure the quality of an STD index can be problematic, and highlight the need for an alternative measure that is more closely aligned with the goals of the specific detection task.
Abstract:
While spoken term detection (STD) systems based on word indices provide good accuracy, there are several practical applications in which it is infeasible or too costly to employ an LVCSR engine. An STD system is presented that is designed to incorporate a fast phonetic decoding front-end and to be robust to decoding errors whilst still allowing rapid search speeds. This goal is achieved through mono-phone open-loop decoding coupled with fast hierarchical phone lattice search. Results demonstrate that an STD system designed under the constraint of a fast and simple phonetic decoding front-end requires a compromise between search speed and search accuracy.
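As a simplified stand-in for the search side of such a system, the sketch below matches a query phone sequence against a 1-best phone transcript within an edit-distance tolerance, illustrating robustness to decoding errors; the paper's hierarchical phone lattice search is more elaborate, and the phone strings and tolerance here are illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phone sequences."""
    d = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        prev, d[0] = d[0], i
        for j, pb in enumerate(b, 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (pa != pb))
    return d[-1]

def phonetic_search(query_phones, decoded_phones, max_errors=1):
    """Slide the query over a 1-best phone transcript and report putative
    occurrences within an edit-distance tolerance, giving some robustness to
    phone substitutions, insertions and deletions from the decoder."""
    hits = []
    n = len(query_phones)
    for start in range(len(decoded_phones) - n + 1):
        if edit_distance(query_phones, decoded_phones[start:start + n]) <= max_errors:
            hits.append(start)
    return hits

# Hypothetical query "data" /d ey t ax/ against an errorful phone decoding of
# "the database" in which /t/ was recognised as /d/.
decoded = "sil dh ax d ey d ax b ey s sil".split()
print(phonetic_search("d ey t ax".split(), decoded))   # -> [3]
```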
Abstract:
The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.
Abstract:
The Autistic Behavioural Indicators Instrument (ABII) is an 18-item instrument developed to identify children with Autistic Disorder (AD) based on the presence of unique autistic behavioural indicators. The ABII was administered to 20 children with AD, 20 children with speech and language impairment (SLI) and 20 typically developing (TD) children aged 2-6 years. Results indicated that the ABII discriminated children diagnosed with AD from those diagnosed with SLI and those who were TD, based on the presence of specific social attention, sensory, and behavioural symptoms. A combination of symptomology across these domains correctly classified 100% of children with and without AD. The paper concludes that the ABII shows considerable promise as an instrument for the early identification of AD.
Abstract:
A method of improving the security of biometric templates is presented that satisfies desirable properties such as (a) irreversibility of the template, (b) revocability and the assignment of a new template to the same biometric input, and (c) matching in the secure transformed domain. The method makes use of an iterative procedure based on the bispectrum, which serves as an irreversible transformation for biometric features because signal phase is discarded at each iteration. Unlike a conventional hash function, this transformation preserves closeness in the transformed domain for similar biometric inputs. A number of such templates can be generated from the same input. These properties are illustrated using synthetic data and applied to images from the FRGC 3D database with Gabor features. Verification can be performed successfully using these secure templates, with an EER of 5.85%.
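A minimal sketch of the phase-discarding idea, using an iterated key-dependent phase mask followed by a Fourier magnitude as a simplified stand-in for the bispectrum-based procedure; the feature vectors, keys and parameters are illustrative assumptions, not the paper's transform.

```python
import numpy as np

def protected_template(features, key, iterations=3):
    """Illustrative irreversible transform: each iteration multiplies the
    features by a key-dependent random phase mask and keeps only the Fourier
    magnitude, discarding phase so the input cannot be recovered.  Issuing a
    new key yields a new (revocable) template, while similar inputs remain
    close because every step is non-expansive.  This is a simplified
    stand-in, not the bispectrum procedure itself."""
    rng = np.random.default_rng(key)
    x = np.asarray(features, dtype=float)
    for _ in range(iterations):
        mask = np.exp(2j * np.pi * rng.random(len(x)))   # key-dependent phases
        x = np.abs(np.fft.fft(mask * x, norm="ortho"))   # discard signal phase
    return x

f1 = np.random.default_rng(1).normal(size=64)              # hypothetical feature vector
f2 = f1 + 0.05 * np.random.default_rng(2).normal(size=64)  # similar biometric input
same_key = np.linalg.norm(protected_template(f1, 7) - protected_template(f2, 7))
new_key = np.linalg.norm(protected_template(f1, 7) - protected_template(f1, 8))
print(same_key, new_key)   # matching stays possible under one key; a new key revokes it
```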
Abstract:
Recent research has emphasised the importance, for international new ventures, of specific resources and competencies. The present study, which follows this line of inquiry, shows in particular the importance of the human capital that entrepreneurs have acquired through their prior international experience. At the same time, we show that these organisations are underpinned by a deliberate intention to internationalise from the outset. Our empirical research is based on the analysis of a sample of 466 British and German new high-technology ventures. We show that this human capital is an asset that facilitates rapid entry into foreign markets, and all the more so when the new venture is accompanied by a deliberate strategic intention to internationalise. Similar conclusions extend to the resources the entrepreneur devotes to the start-up: the greater these resources, the more the internationalisation process tends to occur on a large scale; here too, the influence of these resources is amplified by the strategic intention to internationalise. Within the body of empirical work on born-globals (firms that start up in a globalised market), this research provides one of the first empirical studies linking the influence of initial founding conditions to the likelihood of rapid international growth.