228 results for noisy speaker verification


Relevance:

10.00%

Publisher:

Abstract:

Robust image hashing seeks to transform a given input image into a shorter hashed version using a key-dependent non-invertible transform. These image hashes can be used for watermarking, image integrity authentication or image indexing for fast retrieval. This paper introduces a new method of generating image hashes based on extracting Higher Order Spectral features from the Radon projection of an input image. The feature extraction process is non-invertible and non-linear, and different hashes can be produced from the same image through the use of random permutations of the input. We show that the transform is robust to typical image transformations such as JPEG compression, noise, scaling, rotation, smoothing and cropping. We evaluate our system using a verification-style framework based on calculating false match and false non-match likelihoods using the publicly available Uncompressed Colour Image database (UCID) of 1320 images. We also compare our results to Swaminathan’s Fourier-Mellin based hashing method and show an equal error rate (EER) improvement of at least 1% under noise, scaling and sharpening.
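The verification-style evaluation mentioned in this abstract can be illustrated with a minimal sketch. It assumes that hashes are compared by a distance, that a pair counts as a match when the distance is at or below a threshold, and that the function names `error_rates` and `equal_error_rate` are hypothetical; none of this is taken from the paper itself.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false match rate, false non-match rate) at a distance threshold.

    Assumption: a pair is accepted as a match when distance <= threshold.
    """
    fnmr = sum(d > threshold for d in genuine) / len(genuine)
    fmr = sum(d <= threshold for d in impostor) / len(impostor)
    return fmr, fnmr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed distance as a threshold."""
    thresholds = sorted(set(genuine) | set(impostor))

    def gap(t):
        fmr, fnmr = error_rates(genuine, impostor, t)
        return abs(fmr - fnmr)

    best = min(thresholds, key=gap)
    fmr, fnmr = error_rates(genuine, impostor, best)
    # Report the midpoint of the two rates at the closest crossing point.
    return (fmr + fnmr) / 2, best
```

Here `genuine` would hold distances between hashes of an image and its transformed versions, and `impostor` distances between hashes of different images.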

Relevance:

10.00%

Publisher:

Abstract:

Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimise parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracies under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present – namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is seen during evaluation.

Relevance:

10.00%

Publisher:

Abstract:

Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but such approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks, on the other hand, optimise the parameters of speech enhancement algorithms based on state sequences generated by a speech recogniser for utterances of known transcriptions. Previous applications of LIMA frameworks have generated a set of global enhancement parameters for all model states without taking into account the distribution of model occurrence, making optimisation susceptible to favouring frequently occurring models, in particular silence. In this paper, we demonstrate the existence of highly disproportionate phonetic distributions on two corpora with distinct speech tasks, and propose to normalise the influence of each phone based on a priori occurrence probabilities. Likelihood analysis and speech recognition experiments verify this approach for improving ASR performance in noisy environments.

Relevance:

10.00%

Publisher:

Abstract:

Shell structures find use in many fields of engineering, notably structural, mechanical, aerospace and nuclear-reactor disciplines. Axisymmetric shell structures are used as dome-type roofs, hyperbolic cooling towers, silos for the storage of grain, oil and industrial chemicals, and water tanks. Despite their thin walls, they derive strength from their curvature. The generally high strength-to-weight ratio of the shell form, combined with its inherent stiffness, has formed the basis of this vast application. With advances in computation technology, the finite element method and optimisation techniques, structural engineers have extremely versatile tools for the optimum design of such structures. Optimisation of shell structures can result not only in improved designs, but also in a large saving of material. The finite element method, being a general numerical procedure that can be used to treat any shell problem to any desired degree of accuracy, requires several runs in order to obtain a complete picture of the effect of one parameter on the shell structure. This redesign/re-analysis cycle has been achieved via structural optimisation in the present research, and MSC/NASTRAN (a commercially available finite element code) has been used in this context for volume optimisation of axisymmetric shell structures under axisymmetric and non-axisymmetric loading conditions. The parametric study of different axisymmetric shell structures has revealed that the hyperbolic shape is the most economical solution for shells of revolution.
To establish this, axisymmetric loading (self-weight and hydrostatic pressure) and non-axisymmetric loading (wind pressure and earthquake dynamic forces) have been modelled on a graphical pre- and post-processor (PATRAN), and analysis has been performed on two finite element codes (ABAQUS and NASTRAN). Numerical model verification studies are performed, and the optimum material volumes required in the walls of cylindrical, conical, parabolic and hyperbolic forms of axisymmetric shell structures are evaluated and reviewed. Free vibration and transient earthquake analyses of hyperbolic shells were performed once it was established that the hyperbolic shape is the most economical under all possible loading conditions. The effects of important parameters of hyperbolic shell structures (shell wall thickness, height and curvature) have been evaluated, and empirical relationships have been developed to estimate an approximate value of the lowest (first) natural frequency of vibration. The outcome of this thesis has been the generation of new research information on performance characteristics of axisymmetric shell structures that will facilitate improved designs of shells with better choice of shapes and enhanced levels of economy and performance. Key words: Axisymmetric shell structures, Finite element analysis, Volume optimisation, Free vibration, Transient response.

Relevance:

10.00%

Publisher:

Abstract:

The material presented in this thesis may be viewed as comprising two key parts: the first concerns batch cryptography specifically, whilst the second deals with how this form of cryptography may be applied to security-related applications such as electronic cash to improve the efficiency of the protocols. The objective of batch cryptography is to devise more efficient primitive cryptographic protocols. In general, these primitives make use of some property such as homomorphism to perform a computationally expensive operation on a collective input set. The idea is to amortise an expensive operation, such as modular exponentiation, over the input. Most of the research work in this field has concentrated on its employment as a batch verifier of digital signatures. It is shown that several new attacks may be launched against these published schemes as some weaknesses are exposed. Another common use of batch cryptography is the simultaneous generation of digital signatures. There is significantly less previous work in this area, and the present schemes have limited use in practical applications. Several new batch signature schemes are introduced that improve upon the existing techniques, and some practical uses are illustrated. Electronic cash is a technology that demands complex protocols in order to furnish several security properties. These typically include anonymity, traceability of a double spender, and off-line payment features. Presently, the most efficient schemes make use of coin divisibility to withdraw one large financial amount that may be progressively spent with one or more merchants. Several new cash schemes are introduced here that make use of batch cryptography for improving the withdrawal, payment, and deposit of electronic coins. The devised schemes apply both the batch signature and verification techniques introduced, demonstrating improved performance over the contemporary divisible-based structures.
The solutions also provide an alternative paradigm for the construction of electronic cash systems. Whilst electronic cash is used as the vehicle for demonstrating the relevance of batch cryptography to security related applications, the applicability of the techniques introduced extends well beyond this.
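The idea of amortising modular exponentiation over a batch can be sketched with the well-known small-exponents test. This is a generic illustration under stated assumptions, not one of the thesis's schemes: `g` is assumed to generate a subgroup of prime order `q` modulo `p`, every claimed value is assumed to lie in that subgroup, and the name `batch_verify` and its `bits` parameter are hypothetical.

```python
import secrets


def batch_verify(g, p, q, pairs, bits=16):
    """Check that y == g**x (mod p) for every (x, y) in pairs.

    Each claim is raised to a short random exponent s, so the batch needs
    only one full-size exponentiation of g instead of one per pair.
    """
    exponent = 0
    product = 1
    for x, y in pairs:
        s = secrets.randbits(bits) + 1          # short random multiplier
        exponent = (exponent + s * x) % q       # combine exponents additively
        product = (product * pow(y, s, p)) % p  # combine claims multiplicatively
    return pow(g, exponent, p) == product
```

A valid batch always passes. An invalid batch is rejected except with a small probability that shrinks as `bits` and the group order grow, which is why real parameters are far larger than any toy values used for experimentation.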

Relevance:

10.00%

Publisher:

Abstract:

Literally, the word compliance suggests conformity in fulfilling official requirements. The thesis presents the results of the analysis and design of a class of protocols called compliant cryptologic protocols (CCP). The thesis presents a notion for compliance in cryptosystems that is conducive as a cryptologic goal. CCP are employed in security systems used by at least two mutually mistrusting sets of entities. The individuals in the sets of entities only trust the design of the security system and any trusted third party the security system may include. Such a security system can be thought of as a broker between the mistrusting sets of entities. In order to provide confidence in operation for the mistrusting sets of entities, CCP must provide compliance verification mechanisms. These mechanisms are employed either by all the entities or a set of authorised entities in the system to verify the compliance of the behaviour of various participating entities with the rules of the system. It is often stated that confidentiality, integrity and authentication are the primary interests of cryptology. It is evident from the literature that authentication mechanisms employ confidentiality and integrity services to achieve their goal. Therefore, the fundamental services that any cryptographic algorithm may provide are confidentiality and integrity only. Since controlling the behaviour of the entities is not a feasible cryptologic goal, the verification of the confidentiality of any data is a futile cryptologic exercise. For example, there exists no cryptologic mechanism that would prevent an entity from willingly or unwillingly exposing its private key corresponding to a certified public key. The confidentiality of the data can only be assumed. Therefore, any verification in cryptologic protocols must take the form of integrity verification mechanisms. Thus, compliance verification must take the form of integrity verification in cryptologic protocols.
A definition of compliance that is conducive as a cryptologic goal is presented as a guarantee on the confidentiality and integrity services. The definitions are employed to provide a classification mechanism for various message formats in a cryptologic protocol. The classification assists in the characterisation of protocols, which assists in providing a focus for the goals of the research. The resulting concrete goal of the research is the study of those protocols that employ message formats to provide restricted confidentiality and universal integrity services to selected data. The thesis proposes an informal technique to understand, analyse and synthesise the integrity goals of a protocol system. The thesis contains a study of key recovery, electronic cash, peer-review, electronic auction, and electronic voting protocols. All these protocols contain message formats that provide restricted confidentiality and universal integrity services to selected data. The study of key recovery systems aims to achieve robust key recovery relying only on the certification procedure and without the need for tamper-resistant system modules. The result of this study is a new technique for the design of key recovery systems called hybrid key escrow. The thesis identifies a class of compliant cryptologic protocols called secure selection protocols (SSP). The uniqueness of this class of protocols is the similarity in the goals of the member protocols, namely peer-review, electronic auction and electronic voting. The problem statement describing the goals of these protocols contains a tuple, (I, D), where I usually refers to an identity of a participant and D usually refers to the data selected by the participant. SSP are interested in providing confidentiality service to the tuple for hiding the relationship between I and D, and integrity service to the tuple after its formation to prevent the modification of the tuple.
The thesis provides a schema to solve the instances of SSP by employing the electronic cash technology. The thesis makes a distinction between electronic cash technology and electronic payment technology. It treats electronic cash technology as a certification mechanism that allows the participants to obtain a certificate on their public key, without revealing the certificate or the public key to the certifier. The thesis abstracts the certificate and the public key as a data structure called an anonymous token. It proposes design schemes for the peer-review, e-auction and e-voting protocols by employing the schema with the anonymous token abstraction. The thesis concludes by providing a variety of problem statements for future research that would further enrich the literature.

Relevance:

10.00%

Publisher:

Abstract:

Keyword Spotting is the task of detecting keywords of interest within continuous speech. The applications of this technology range from call centre dialogue systems to covert speech surveillance devices. Keyword spotting is particularly well suited to data mining tasks such as real-time keyword monitoring and unrestricted vocabulary audio document indexing. However, to date, many keyword spotting approaches have suffered from poor detection rates, high false alarm rates, or slow execution times, thus reducing their commercial viability. This work investigates the application of keyword spotting to data mining tasks. The thesis makes a number of major contributions to the field of keyword spotting. The first major contribution is the development of a novel keyword verification method named Cohort Word Verification. This method combines high level linguistic information with cohort-based verification techniques to obtain dramatic improvements in verification performance, in particular for the problematic short duration target word class. The second major contribution is the development of a novel audio document indexing technique named Dynamic Match Lattice Spotting. This technique augments lattice-based audio indexing principles with dynamic sequence matching techniques to provide robustness to erroneous lattice realisations. The resulting algorithm obtains significant improvement in detection rate over lattice-based audio document indexing while still maintaining extremely fast search speeds. The third major contribution is the study of multiple verifier fusion for the task of keyword verification. The reported experiments demonstrate that substantial improvements in verification performance can be obtained through the fusion of multiple keyword verifiers. The research focuses on combinations of speech background model based verifiers and cohort word verifiers.
The final major contribution is a comprehensive study of the effects of limited training data for keyword spotting. This study is performed with consideration as to how these effects impact the immediate development and deployment of speech technologies for non-English languages.

Relevance:

10.00%

Publisher:

Abstract:

Automatic spoken Language Identification (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identification systems provide. A prominent application arises in call centers dealing with speakers speaking different languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable features for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Phonetic speech information is extracted using existing speech recognition technology. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by different speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
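Linear fusion of per-language scores from several subsystems can be sketched as a weighted sum. The function names, the score dictionaries and the weights below are illustrative assumptions; the abstract does not say how the thesis chose or trained its fusion weights.

```python
def fuse_scores(system_scores, weights):
    """Combine per-language scores from several systems by a weighted sum.

    system_scores: list of dicts mapping language -> score (one dict per system).
    weights: one weight per system (assumed to be tuned on held-out data).
    """
    languages = system_scores[0].keys()
    return {
        lang: sum(w * scores[lang] for w, scores in zip(weights, system_scores))
        for lang in languages
    }


def identify_language(system_scores, weights):
    """Return the language with the highest fused score."""
    fused = fuse_scores(system_scores, weights)
    return max(fused, key=fused.get)
```

For instance, one dictionary could hold acoustic (Gaussian Mixture Model) log-likelihoods and another phonetic scores, with the weights controlling how much each subsystem contributes to the decision.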

Relevance:

10.00%

Publisher:

Abstract:

In this paper, the commonly used switching schemes for sliding mode control of power converters are analyzed and designed in the frequency domain. A particular application of a distribution static compensator (DSTATCOM) in voltage control mode is investigated in a power distribution system. Tsypkin's method and the describing function are used to obtain the switching conditions for the two-level and three-level voltage source inverters. Magnitude conditions of carrier signals are developed for robust switching of the inverter under a carrier-based modulation scheme of sliding mode control. The existence of border collision bifurcation is identified to avoid the complex switching states of the inverter. The load bus voltage of an unbalanced three-phase nonstiff radial distribution system is controlled using the proposed carrier-based design. The results are validated using PSCAD/EMTDC simulation studies and through a scaled laboratory model of DSTATCOM that is developed for experimental verification.

Relevance:

10.00%

Publisher:

Abstract:

The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. In order to construct the final mixed-speech database, over 10 hours of background noise was collected across 10 unique locations covering 5 common noise scenarios, to create the QUT-NOISE corpus. This background noise corpus was then mixed with speech events chosen from the TIMIT clean speech corpus over a wide variety of noise lengths, signal-to-noise ratios (SNRs) and active speech proportions to form the mixed-speech QUT-NOISE-TIMIT corpus. The evaluation of five baseline VAD systems on the QUT-NOISE-TIMIT corpus is conducted to validate the data and show that the variety of noise available will allow for better evaluation of VAD systems than existing approaches in the literature.
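Mixing clean speech with background noise at a chosen SNR can be sketched as follows. This is a minimal illustration assuming equal-length sample lists, power measured over the whole segment, and a hypothetical function name `mix_at_snr`; the corpus's actual mixing procedure may differ, for example by measuring power over active speech only.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so the mixture has the requested SNR in dB, then add it.

    Assumes speech and noise have the same length and the noise is not silent.
    """
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise segment is silent")
    # Choose the gain so that speech_power / (gain**2 * noise_power) == 10**(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]
```

Lower `snr_db` values produce a larger noise gain and therefore a harder condition for a VAD system.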

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we present a microphone array beamforming approach to blind speech separation. Unlike previous beamforming approaches, our system does not require a priori knowledge of the microphone placement and speaker location, making the system directly comparable to other blind source separation methods, which require no prior knowledge of recording conditions. Microphone location is automatically estimated using an assumed noise field model, and speaker locations are estimated using cross-correlation based methods. The system is evaluated on the data provided for the PASCAL Speech Separation Challenge 2 (SSC2), achieving a word error rate of 58% on the evaluation set.
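Cross-correlation based location estimates usually start from the time delay between two microphone signals. The sketch below shows only that first step, using a brute-force search over integer lags; the function name `estimate_delay` and the `max_lag` parameter are assumptions, and the paper's actual estimator is not described in the abstract.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag in samples that maximises the cross-correlation.

    A positive result means `delayed` lags behind `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += value * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Given the delay, the sampling rate and the speed of sound, the path-length difference between the two microphones follows directly, which constrains the direction of the speaker.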

Relevance:

10.00%

Publisher:

Abstract:

Voice recognition is one of the key enablers to reduce driver distraction as in-vehicle systems become more and more complex. With the integration of voice recognition in vehicles, safety and usability are improved as the driver’s eyes and hands are not required to operate system controls. Whilst speaker-independent voice recognition is well developed, performance in high-noise environments (e.g. vehicles) is still limited. La Trobe University and Queensland University of Technology have developed a low-cost hardware-based speech enhancement system for automotive environments based on spectral subtraction and delay–sum beamforming techniques. The enhancement algorithms have been optimised using authentic Australian English collected under typical driving conditions. Performance tests conducted using speech data collected under a variety of vehicle noise conditions demonstrate a word recognition rate improvement in the order of 10% or more under the noisiest conditions. As the system is currently developed to a proof-of-concept stage, there is potential for even greater performance improvement.
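Delay-and-sum beamforming can be sketched in a few lines. This is a simplified illustration assuming integer sample delays that are already known for each microphone; the function name `delay_and_sum` is hypothetical, and the hardware system described in the abstract is not specified at this level of detail.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay and average the result.

    channels: list of equal-rate sample lists, one per microphone.
    delays: samples by which each channel lags the earliest-arriving one.
    """
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(length)
    ]
```

Speech arriving from the steered direction adds coherently after alignment, while noise that is uncorrelated between microphones is attenuated by the averaging.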

Relevance:

10.00%

Publisher:

Abstract:

Purpose: The purpose of this study was to determine whether adiposity affects the attainment of VO2max. Methods: Sixty-seven male and 68 female overweight (body mass index (BMI) = 25-29.9 kg·m⁻²) and obese (BMI ≥ 30 kg·m⁻²) participants undertook a graded treadmill test to volitional exhaustion (phase 1) followed by a verification test (phase 2) to determine the proportion who could achieve a plateau in VO2 and other "maximal" markers (RER, lactate, HR, RPE). Results: At the end of phase 1, 46% of the participants reached a plateau in VO2, 83% increased HR to within 11 beats of age-predicted maximum, 89% reached an RER of ≥ 1.15, 70% reached a blood lactate concentration of ≥ 8 mmol·L⁻¹, and 74% reached an RPE of ≥ 18. No significant differences between genders and between BMI groups were found with the exception of blood lactate concentration (males = 84% vs females = 56%, P < 0.05). Neither gender nor fatness predicted the number of other markers attained, and attainment of other markers did not differentiate whether a VO2 plateau was achieved. The verification test (phase 2) revealed that an additional 52 individuals (39%) who did not exhibit a plateau in VO2 in phase 1 had no further increase in VO2 in phase 2 despite an increase in workload. Conclusions: These findings indicate that the absence of a plateau in VO2 alone is not indicative of a failure to reach a true maximal VO2 and that individuals with excessive body fat are no less likely than "normal-weight" individuals to exhibit a plateau in VO2 provided that the protocol is appropriate and encouragement to exercise to maximal exertion is provided.