846 resultados para robust speaker verification


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Presently different audio watermarking methods are available; most of them inclined towards copyright protection and copy protection. This is the key motive for the notion to develop a speaker verification scheme that guar- antees non-repudiation services and the thesis is its outcome. The research presented in this thesis scrutinizes the field of audio water- marking and the outcome is a speaker verification scheme that is proficient in addressing issues allied to non-repudiation to a great extent. This work aimed in developing novel audio watermarking schemes utilizing the fun- damental ideas of Fast-Fourier Transform (FFT) or Fast Walsh-Hadamard Transform (FWHT). The Mel-Frequency Cepstral Coefficients (MFCC) the best parametric representation of the acoustic signals along with few other key acoustic characteristics is employed in crafting of new schemes. The au- dio watermark created is entirely dependent to the acoustic features, hence named as FeatureMark and is crucial in this work. In any watermarking scheme, the quality of the extracted watermark de- pends exclusively on the pre-processing action and in this work framing and windowing techniques are involved. The theme non-repudiation provides immense significance in the audio watermarking schemes proposed in this work. Modification of the signal spectrum is achieved in a variety of ways by selecting appropriate FFT/FWHT coefficients and the watermarking schemes were evaluated for imperceptibility, robustness and capacity char- acteristics. The proposed schemes are unequivocally effective in terms of maintaining the sound quality, retrieving the embedded FeatureMark and in terms of the capacity to hold the mark bits. Robust nature of these marking schemes is achieved with the help of syn- chronization codes such as Barker Code with FFT based FeatureMarking scheme and Walsh Code with FWHT based FeatureMarking scheme. An- other important feature associated with this scheme is the employment of an encryption scheme towards the preparation of its FeatureMark that scrambles the signal features that helps to keep the signal features unreve- laed. A comparative study with the existing watermarking schemes and the ex- periments to evaluate imperceptibility, robustness and capacity tests guar- antee that the proposed schemes can be baselined as efficient audio water- marking schemes. The four new digital audio watermarking algorithms in terms of their performance are remarkable thereby opening more opportu- nities for further research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a study on wavelets and their characteristics for the specific purpose of serving as a feature extraction tool for speaker verification (SV), considering a Radial Basis Function (RBF) classifier, which is a particular type of Artificial Neural Network (ANN). Examining characteristics such as support-size, frequency and phase responses, amongst others, we show how Discrete Wavelet Transforms (DWTs), particularly the ones which derive from Finite Impulse Response (FIR) filters, can be used to extract important features from a speech signal which are useful for SV. Lastly, an SV algorithm based on the concepts presented is described.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Speaker Recognition, Speaker Verification, Sparse Kernel Logistic Regression, Support Vector Machine

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loud-speakers to give easy accessibility to increasingly intelligent machines.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Phonation distortion leaves relevant marks in a speaker's biometric profile. Dysphonic voice production may be used for biometrical speaker characterization. In the present paper phonation features derived from the glottal source (GS) parameterization, after vocal tract inversion, is proposed for dysphonic voice characterization in Speaker Verification tasks. The glottal source derived parameters are matched in a forensic evaluation framework defining a distance-based metric specification. The phonation segments used in the study are derived from fillers, long vowels, and other phonation segments produced in spontaneous telephone conversations. Phonated segments from a telephonic database of 100 male Spanish native speakers are combined in a 10-fold cross-validation task to produce the set of quality measurements outlined in the paper. Shimmer, mucosal wave correlate, vocal fold cover biomechanical parameter unbalance and a subset of the GS cepstral profile produce accuracy rates as high as 99.57 for a wide threshold interval (62.08-75.04%). An Equal Error Rate of 0.64 % can be granted. The proposed metric framework is shown to behave more fairly than classical likelihood ratios in supporting the hypothesis of the defense vs that of the prosecution, thus ofering a more reliable evaluation scoring. Possible applications are Speaker Verification and Dysphonic Voice Grading.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Tässä diplomityössä perehdytään puhujantunnistukseen ja sen käyttökelpoisuuteen käyttäjän henkilöllisyyden todentamisessa osana puhelinverkon lisäarvopalveluja. Puhelimitse ohjattavat palvelut ovat yleensä perustuneet puhelimen näppäimillä lähetettäviin äänitaajuusvalintoihin. Käyttäjän henkilöllisyydestä on voitu varmistua esimerkiksi käyttäjätunnuksen ja salaisen tunnusluvun perusteella. Tulevaisuudessa palvelut voivat perustua puheentunnistukseen, jolloin myös käyttäjän todentaminen äänen perusteella vaikuttaa järkevältä. Työssä esitellään aluksi erilaisia biometrisiä tunnistamismenetelmiä. Työssä perehdytään tarkemmin äänen perusteella tapahtuvaan puhujan todentamiseen. Työn käytännön osuudessa toteutettiin puhelinverkon palveluihin soveltuva puhujantodennussovelluksen prototyyppi. Työn tarkoituksena oli selvittää teknologian käyttömahdollisuuksia sekä kerätä kokemusta puhujantodennuspalvelun toteuttamisesta tulevaisuutta silmällä pitäen. Prototyypin toteutuksessa ohjelmointikielenä käytettiin Javaa.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Situational awareness is achieved naturally by the human senses of sight and hearing in combination. Automatic scene understanding aims at replicating this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstractions. We detect and track a speaker who is relatively unconstrained, i.e., free to move indoors within an area larger than the comparable reported work, which is usually limited to round table meetings. The system is relatively simple: consisting of just 4 microphone pairs and a single camera. Results show that the overall multimodal tracker is more reliable than single modality systems, tolerating large occlusions and cross-talk. System evaluation is performed on both single and multi-modality tracking. The performance improvement given by the audio–video integration and fusion is quantified in terms of tracking precision and accuracy as well as speaker diarisation error rate and precision–recall (recognition). Improvements vs. the closest works are evaluated: 56% sound source localisation computational cost over an audio only system, 8% speaker diarisation error rate over an audio only speaker recognition unit and 36% on the precision–recall metric over an audio–video dominant speaker recognition method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertação para obtenção do Grau de Doutor em Engenharia Biomédica

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted sum fusion for score combinations, by means of a likelihood model. The maximum likelihood estimation is set as a linear programming problem. The scores are derived from a GMM classifier working on a different feature extractor. Our experimental results assesed the robustness of the system in front a changes on time (different sessions) and robustness in front a change of microphone. The improvements obtained were significantly better (error bars of two standard deviations) than a uniform weighted sum or a uniform weighted product or the best single classifier. The proposed method scales computationaly with the number of scores to be fussioned as the simplex method for linear programming.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The verification possibilities of dynamically collimated treatment beams with a scanning liquid ionization chamber electronic portal image device (SLIC-EPID) are investigated. The ion concentration in the liquid of a SLIC-EPID and therefore the read-out signal is determined by two parameters of a differential equation describing the creation and recombination of the ions. Due to the form of this equation, the portal image detector describes a nonlinear dynamic system with memory. In this work, the parameters of the differential equation were experimentally determined for the particular chamber in use and for an incident open 6 MV photon beam. The mathematical description of the ion concentration was then used to predict portal images of intensity-modulated photon beams produced by a dynamic delivery technique, the sliding window approach. Due to the nature of the differential equation, a mathematical condition for 'reliable leaf motion verification' in the sliding window technique can be formulated. It is shown that the time constants for both formation and decay of the equilibrium concentration in the chamber is in the order of seconds. In order to guarantee reliable leaf motion verification, these time constants impose a constraint on the rapidity of the image-read out for a given maximum leaf speed. For a leaf speed of 2 cm s(-1), a minimum image acquisition frequency of about 2 Hz is required. Current SLIC-EPID systems are usually too slow since they need about a second to acquire a portal image. However, if the condition is fulfilled, the memory property of the system can be used to reconstruct the leaf motion. It is shown that a simple edge detecting algorithm can be employed to determine the leaf positions. The method is also very robust against image noise.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present a global description of a telematic voting system based on advanced cryptography and on the use of smart cards (VOTESCRIPT system) whose most outstanding characteristic is the ability to verify that the tally carried out by the system is correct, meaning that the results published by the system correspond with votes cast. The VOTESCRIPT system provides an individual verification mechanism allowing each Voter to confirm whether his vote has been correctly counted. The innovation with respect to other solutions lies in the fact that the verification process is private so that Voters have no way of proving what they voted in the presence of a non-authorized third party. Vote buying and selling or any other kind of extortion are prevented. The existence of the Intervention Systems allows the whole electoral process to be controlled by groups of citizens or authorized candidatures. In addition to this the system can simply make an audit not only of the final results, but also of the whole process. Global verification provides the Scrutineers with robust cryptographic evidence which enables unequivocal proof if the system has operated in a fraudulent way.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Five axis machine tools are increasing and becoming more popular as customers demand more complex machined parts. In high value manufacturing, the importance of machine tools in producing high accuracy products is essential. High accuracy manufacturing requires producing parts in a repeatable manner and precision in compliance to the defined design specifications. The performance of the machine tools is often affected by geometrical errors due to a variety of causes including incorrect tool offsets, errors in the centres of rotation and thermal growth. As a consequence, it can be difficult to produce highly accurate parts consistently. It is, therefore, essential to ensure that machine tools are verified in terms of their geometric and positioning accuracy. When machine tools are verified in terms of their accuracy, the resulting numerical values of positional accuracy and process capability can be used to define design for verification rules and algorithms so that machined parts can be easily produced without scrap and little or no after process measurement. In this paper the benefits of machine tool verification are listed and a case study is used to demonstrate the implementation of robust machine tool performance measurement and diagnostics using a ballbar system.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Our modular approach to data hiding is an innovative concept in the data hiding research field. It enables the creation of modular digital watermarking methods that have extendable features and are designed for use in web applications. The methods consist of two types of modules – a basic module and an application-specific module. The basic module mainly provides features which are connected with the specific image format. As JPEG is a preferred image format on the Internet, we have put a focus on the achievement of a robust and error-free embedding and retrieval of the embedded data in JPEG images. The application-specific modules are adaptable to user requirements in the concrete web application. The experimental results of the modular data watermarking are very promising. They indicate excellent image quality, satisfactory size of the embedded data and perfect robustness against JPEG transformations with prespecified compression ratios. ACM Computing Classification System (1998): C.2.0.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

People depend on various sources of information when trying to verify their autobiographical memories. Yet recent research shows that people prefer to use cheap-and-easy verification strategies, even when these strategies are not reliable. We examined the robustness of this cheap strategy bias, with scenarios designed to encourage greater emphasis on source reliability. In three experiments, subjects described real (Experiments 1 and 2) or hypothetical (Experiment 3) autobiographical events, and proposed strategies they might use to verify their memories of those events. Subjects also rated the reliability, cost, and the likelihood that they would use each strategy. In line with previous work, we found that the preference for cheap information held when people described how they would verify childhood or recent memories (Experiment 1); personally-important or trivial memories (Experiment 2), and even when the consequences of relying on incorrect information could be significant (Experiment 3). Taken together, our findings fit with an account of source monitoring in which the tendency to trust one’s own autobiographical memories can discourage people from systematically testing or accepting strong disconfirmatory evidence.