898 resultados para QUT Speaker Identity Verification System
Resumo:
Literally, the word compliance suggests conformity in fulfilling official requirements. The thesis presents the results of the analysis and design of a class of protocols called compliant cryptologic protocols (CCP). The thesis presents a notion for compliance in cryptosystems that is conducive as a cryptologic goal. CCP are employed in security systems used by at least two mutually mistrusting sets of entities. The individuals in the sets of entities only trust the design of the security system and any trusted third party the security system may include. Such a security system can be thought of as a broker between the mistrusting sets of entities. In order to provide confidence in operation for the mistrusting sets of entities, CCP must provide compliance verification mechanisms. These mechanisms are employed either by all the entities or a set of authorised entities in the system to verify the compliance of the behaviour of various participating entities with the rules of the system. It is often stated that confidentiality, integrity and authentication are the primary interests of cryptology. It is evident from the literature that authentication mechanisms employ confidentiality and integrity services to achieve their goal. Therefore, the fundamental services that any cryptographic algorithm may provide are confidentiality and integrity only. Since controlling the behaviour of the entities is not a feasible cryptologic goal,the verification of the confidentiality of any data is a futile cryptologic exercise. For example, there exists no cryptologic mechanism that would prevent an entity from willingly or unwillingly exposing its private key corresponding to a certified public key. The confidentiality of the data can only be assumed. Therefore, any verification in cryptologic protocols must take the form of integrity verification mechanisms. Thus, compliance verification must take the form of integrity verification in cryptologic protocols. A definition of compliance that is conducive as a cryptologic goal is presented as a guarantee on the confidentiality and integrity services. The definitions are employed to provide a classification mechanism for various message formats in a cryptologic protocol. The classification assists in the characterisation of protocols, which assists in providing a focus for the goals of the research. The resulting concrete goal of the research is the study of those protocols that employ message formats to provide restricted confidentiality and universal integrity services to selected data. The thesis proposes an informal technique to understand, analyse and synthesise the integrity goals of a protocol system. The thesis contains a study of key recovery,electronic cash, peer-review, electronic auction, and electronic voting protocols. All these protocols contain message format that provide restricted confidentiality and universal integrity services to selected data. The study of key recovery systems aims to achieve robust key recovery relying only on the certification procedure and without the need for tamper-resistant system modules. The result of this study is a new technique for the design of key recovery systems called hybrid key escrow. The thesis identifies a class of compliant cryptologic protocols called secure selection protocols (SSP). The uniqueness of this class of protocols is the similarity in the goals of the member protocols, namely peer-review, electronic auction and electronic voting. The problem statement describing the goals of these protocols contain a tuple,(I, D), where I usually refers to an identity of a participant and D usually refers to the data selected by the participant. SSP are interested in providing confidentiality service to the tuple for hiding the relationship between I and D, and integrity service to the tuple after its formation to prevent the modification of the tuple. The thesis provides a schema to solve the instances of SSP by employing the electronic cash technology. The thesis makes a distinction between electronic cash technology and electronic payment technology. It will treat electronic cash technology to be a certification mechanism that allows the participants to obtain a certificate on their public key, without revealing the certificate or the public key to the certifier. The thesis abstracts the certificate and the public key as the data structure called anonymous token. It proposes design schemes for the peer-review, e-auction and e-voting protocols by employing the schema with the anonymous token abstraction. The thesis concludes by providing a variety of problem statements for future research that would further enrich the literature.
Resumo:
Automatic spoken Language Identi¯cation (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identi¯ca- tion systems provide. A prominent application arises in call centers dealing with speakers speaking di®erent languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable fea- tures for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Pho- netic speech information is extracted using existing speech recognition technol- ogy. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by di®erent speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.
Resumo:
This paper proposes the use of the Bayes Factor as a distance metric for speaker segmentation within a speaker diarization system. The proposed approach uses a pair of constant sized, sliding windows to compute the value of the Bayes Factor between the adjacent windows over the entire audio. Results obtained on the 2002 Rich Transcription Evaluation dataset show an improved segmentation performance compared to previous approaches reported in literature using the Generalized Likelihood Ratio. When applied in a speaker diarization system, this approach results in a 5.1% relative improvement in the overall Diarization Error Rate compared to the baseline.
Resumo:
In this paper we extend the concept of speaker annotation within a single-recording, or speaker diarization, to a collection wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering expectantly homogenous intersession clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. In this paper, an attribution system is proposed using mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix and the complete linkage algorithm is employed to conduct clustering of the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
Resumo:
The major purpose of Vehicular Ad Hoc Networks (VANETs) is to provide safety-related message access for motorists to react or make a life-critical decision for road safety enhancement. Accessing safety-related information through the use of VANET communications, therefore, must be protected, as motorists may make critical decisions in response to emergency situations in VANETs. If introducing security services into VANETs causes considerable transmission latency or processing delays, this would defeat the purpose of using VANETs to improve road safety. Current research in secure messaging for VANETs appears to focus on employing certificate-based Public Key Cryptosystem (PKC) to support security. The security overhead of such a scheme, however, creates a transmission delay and introduces a time-consuming verification process to VANET communications. This paper proposes an efficient public key management system for VANETs: the Public Key Registry (PKR) system. Not only does this paper demonstrate that the proposed PKR system can maintain security, but it also asserts that it can improve overall performance and scalability at a lower cost, compared to the certificate-based PKC scheme. It is believed that the proposed PKR system will create a new dimension to the key management and verification services for VANETs.
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Resumo:
This paper presents a novel technique for the tracking of moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. Reduction of order in lip features is obtained via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS1 database, show encouraging results
Resumo:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998), however this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that a MSHMM is a valid structure for multi-modal speaker identification
Resumo:
This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to its ability to adequately represent a speaker based on sparse training data, as well as an improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
Resumo:
Usability in HCI (Human-Computer Interaction) is normally understood as the simplicity and clarity with which the interaction with a computer program or a web site is designed. Identity management systems need to provide adequate usability and should have a simple and intuitive interface. The system should not only be designed to satisfy service provider requirements but it has to consider user requirements, otherwise it will lead to inconvenience and poor usability for users when managing their identities. With poor usability and a poor user interface with regard to security, it is highly likely that the system will have poor security. The rapid growth in the number of online services leads to an increasing number of different digital identities each user needs to manage. As a result, many people feel overloaded with credentials, which in turn negatively impacts their ability to manage them securely. Passwords are perhaps the most common type of credential used today. To avoid the tedious task of remembering difficult passwords, users often behave less securely by using low entropy and weak passwords. Weak passwords and bad password habits represent security threats to online services. Some solutions have been developed to eliminate the need for users to create and manage passwords. A typical solution is based on generating one-time passwords, i.e. passwords for single session or transaction usage. Unfortunately, most of these solutions do not satisfy scalability and/or usability requirements, or they are simply insecure. In this thesis, the security and usability aspects of contemporary methods for authentication based on one-time passwords (OTP) are examined and analyzed. In addition, more scalable solutions that provide a good user experience while at the same time preserving strong security are proposed.
Resumo:
Robust speaker verification on short utterances remains a key consideration when deploying automatic speaker recognition, as many real world applications often have access to only limited duration speech data. This paper explores how the recent technologies focused around total variability modeling behave when training and testing utterance lengths are reduced. Results are presented which provide a comparison of Joint Factor Analysis (JFA) and i-vector based systems including various compensation techniques; Within-Class Covariance Normalization (WCCN), LDA, Scatter Difference Nuisance Attribute Projection (SDNAP) and Gaussian Probabilistic Linear Discriminant Analysis (GPLDA). Speaker verification performance for utterances with as little as 2 sec of data taken from the NIST Speaker Recognition Evaluations are presented to provide a clearer picture of the current performance characteristics of these techniques in short utterance conditions.
Resumo:
Radiotherapy is a cancer treatment modality in which a dose of ionising radiation is delivered to a tumour. The accurate calculation of the dose to the patient is very important in the design of an effective therapeutic strategy. This study aimed to systematically examine the accuracy of the radiotherapy dose calculations performed by clinical treatment planning systems by comparison againstMonte Carlo simulations of the treatment delivery. A suite of software tools known as MCDTK (Monte Carlo DICOM ToolKit) was developed for this purpose, and is capable of: • Importing DICOM-format radiotherapy treatment plans and producing Monte Carlo simulation input files (allowing simple simulation of complex treatments), and calibrating the results; • Analysing the predicted doses of and deviations between the Monte Carlo simulation results and treatment planning system calculations in regions of interest (tumours and organs-at-risk) and generating dose-volume histograms, so that conformity with dose prescriptions can be evaluated. The code has been tested against various treatment planning systems, linear acceleratormodels and treatment complexities. Six clinical head and neck cancer treatments were simulated and the results analysed using this software. The deviations were greatest where the treatment volume encompassed tissues on both sides of an air cavity. This was likely due to the method the planning system used to model low density media.
Resumo:
For teachers working in a standards-based assessment system, professional conversations through organised social moderation meetings are a vital element. This qualitative research investigated the learning that occurred as a result of online moderation discussions. Findings illustrate how participating in social moderation meetings in an online context can support teachers to understand themselves as assessors, and can provide opportunities for teachers to imagine possibilities for their teaching that move beyond the moderation practice.
Resumo:
The quality assurance of stereotactic radiotherapy and radiosurgery treatments requires the use of small-field dose measurements that can be experimentally challenging. This study used Monte Carlo simulations to establish that PAGAT dosimetry gel can be used to provide accurate, high resolution, three-dimensional dose measurements of stereotactic radiotherapy fields. A small cylindrical container (4 cm height, 4.2 cm diameter) was filled with PAGAT gel, placed in the parietal region inside a CIRS head phantom, and irradiated with a 12 field stereotactic radiotherapy plan. The resulting three-dimensional dose measurement was read out using an optical CT scanner and compared with the treatment planning prediction of the dose delivered to the gel during the treatment. A BEAMnrc DOSXYZnrc simulation of this treatment was completed, to provide a standard against which the accuracy of the gel measurement could be gauged. The three dimensional dose distributions obtained from Monte Carlo and from the gel measurement were found to be in better agreement with each other than with the dose distribution provided by the treatment planning system's pencil beam calculation. Both sets of data showed close agreement with the treatment planning system's dose distribution through the centre of the irradiated volume and substantial disagreement with the treatment planning system at the penumbrae. The Monte Carlo calculations and gel measurements both indicated that the treated volume was up to 3 mm narrower, with steeper penumbrae and more variable out-of-field dose, than predicted by the treatment planning system. The Monte Carlo simulations allowed the accuracy of the PAGAT gel dosimeter to be verified in this case, allowing PAGAT gel to be utilised in the measurement of dose from stereotactic and other radiotherapy treatments, with greater confidence in the future.
Resumo:
Is it possible to control identities using performance management systems (PMSs)? This paper explores the theoretical fusion of management accounting and identity studies, providing a synthesised view of control, PMSs and identification processes. It argues that the effective use of PMSs generates a range of obtrusive mechanistic and unobtrusive organic controls that mediate identification processes to achieve a high level of identity congruency between individuals and collectives—groups and organisations. This paper contends that mechanistic control of PMSs provides sensebreaking effects and also creates structural conditions for sensegiving in top-down identification processes. These processes encourage individuals to continue the bottom-up processes of sensemaking, enacting identity and constructing identity narratives. Over time, PMS activities and conversations periodically mediate several episode(s) of identification to connect past, current and future identities. To explore this relationship, the dual locus of control—collectives and individuals—is emphasised to explicate their interplay. This multidisciplinary approach contributes to explaining the multidirectional effects of PMSs in obtrusive as well as unobtrusive ways, in order to control the nature of collectives and individuals in organisations.