898 resultados para QUT Speaker Identity Verification System
Resumo:
Reliability of the performance of biometric identity verification systems remains a significant challenge. Individual biometric samples of the same person (identity class) are not identical at each presentation and performance degradation arises from intra-class variability and inter-class similarity. These limitations lead to false accepts and false rejects that are dependent. It is therefore difficult to reduce the rate of one type of error without increasing the other. The focus of this dissertation is to investigate a method based on classifier fusion techniques to better control the trade-off between the verification errors using text-dependent speaker verification as the test platform. A sequential classifier fusion architecture that integrates multi-instance and multisample fusion schemes is proposed. This fusion method enables a controlled trade-off between false alarms and false rejects. For statistically independent classifier decisions, analytical expressions for each type of verification error are derived using base classifier performances. As this assumption may not be always valid, these expressions are modified to incorporate the correlation between statistically dependent decisions from clients and impostors. The architecture is empirically evaluated by applying the proposed architecture for text dependent speaker verification using the Hidden Markov Model based digit dependent speaker models in each stage with multiple attempts for each digit utterance. The trade-off between the verification errors is controlled using the parameters, number of decision stages (instances) and the number of attempts at each decision stage (samples), fine-tuned on evaluation/tune set. The statistical validation of the derived expressions for error estimates is evaluated on test data. The performance of the sequential method is further demonstrated to depend on the order of the combination of digits (instances) and the nature of repetitive attempts (samples). The false rejection and false acceptance rates for proposed fusion are estimated using the base classifier performances, the variance in correlation between classifier decisions and the sequence of classifiers with favourable dependence selected using the 'Sequential Error Ratio' criteria. The error rates are better estimated by incorporating user-dependent (such as speaker-dependent thresholds and speaker-specific digit combinations) and class-dependent (such as clientimpostor dependent favourable combinations and class-error based threshold estimation) information. The proposed architecture is desirable in most of the speaker verification applications such as remote authentication, telephone and internet shopping applications. The tuning of parameters - the number of instances and samples - serve both the security and user convenience requirements of speaker-specific verification. The architecture investigated here is applicable to verification using other biometric modalities such as handwriting, fingerprints and key strokes.
Resumo:
In a digital world, users’ Personally Identifiable Information (PII) is normally managed with a system called an Identity Management System (IMS). There are many types of IMSs. There are situations when two or more IMSs need to communicate with each other (such as when a service provider needs to obtain some identity information about a user from a trusted identity provider). There could be interoperability issues when communicating parties use different types of IMS. To facilitate interoperability between different IMSs, an Identity Meta System (IMetS) is normally used. An IMetS can, at least theoretically, join various types of IMSs to make them interoperable and give users the illusion that they are interacting with just one IMS. However, due to the complexity of an IMS, attempting to join various types of IMSs is a technically challenging task, let alone assessing how well an IMetS manages to integrate these IMSs. The first contribution of this thesis is the development of a generic IMS model called the Layered Identity Infrastructure Model (LIIM). Using this model, we develop a set of properties that an ideal IMetS should provide. This idealized form is then used as a benchmark to evaluate existing IMetSs. Different types of IMS provide varying levels of privacy protection support. Unfortunately, as observed by Jøsang et al (2007), there is insufficient privacy protection in many of the existing IMSs. In this thesis, we study and extend a type of privacy enhancing technology known as an Anonymous Credential System (ACS). In particular, we extend the ACS which is built on the cryptographic primitives proposed by Camenisch, Lysyanskaya, and Shoup. We call this system the Camenisch, Lysyanskaya, Shoup - Anonymous Credential System (CLS-ACS). The goal of CLS-ACS is to let users be as anonymous as possible. Unfortunately, CLS-ACS has problems, including (1) the concentration of power to a single entity - known as the Anonymity Revocation Manager (ARM) - who, if malicious, can trivially reveal a user’s PII (resulting in an illegal revocation of the user’s anonymity), and (2) poor performance due to the resource-intensive cryptographic operations required. The second and third contributions of this thesis are the proposal of two protocols that reduce the trust dependencies on the ARM during users’ anonymity revocation. Both protocols distribute trust from the ARM to a set of n referees (n > 1), resulting in a significant reduction of the probability of an anonymity revocation being performed illegally. The first protocol, called the User Centric Anonymity Revocation Protocol (UCARP), allows a user’s anonymity to be revoked in a user-centric manner (that is, the user is aware that his/her anonymity is about to be revoked). The second protocol, called the Anonymity Revocation Protocol with Re-encryption (ARPR), allows a user’s anonymity to be revoked by a service provider in an accountable manner (that is, there is a clear mechanism to determine which entity who can eventually learn - and possibly misuse - the identity of the user). The fourth contribution of this thesis is the proposal of a protocol called the Private Information Escrow bound to Multiple Conditions Protocol (PIEMCP). This protocol is designed to address the performance issue of CLS-ACS by applying the CLS-ACS in a federated single sign-on (FSSO) environment. Our analysis shows that PIEMCP can both reduce the amount of expensive modular exponentiation operations required and lower the risk of illegal revocation of users’ anonymity. Finally, the protocols proposed in this thesis are complex and need to be formally evaluated to ensure that their required security properties are satisfied. In this thesis, we use Coloured Petri nets (CPNs) and its corresponding state space analysis techniques. All of the protocols proposed in this thesis have been formally modeled and verified using these formal techniques. Therefore, the fifth contribution of this thesis is a demonstration of the applicability of CPN and its corresponding analysis techniques in modeling and verifying privacy enhancing protocols. To our knowledge, this is the first time that CPN has been comprehensively applied to model and verify privacy enhancing protocols. From our experience, we also propose several CPN modeling approaches, including complex cryptographic primitives (such as zero-knowledge proof protocol) modeling, attack parameterization, and others. The proposed approaches can be applied to other security protocols, not just privacy enhancing protocols.
Resumo:
There is no doubt that fraud in relation to land transactions is a problem that resonates amongst land academics, practitioners, and stakeholders involved in conveyancing. As each land registration and conveyancing process increasingly moves towards a fully electronic environment, we need to make sure that we understand and guard against the frauds that can occur. What this paper does is examine the types of fraud that have occurred in paper-based conveyancing systems in Australia and considers how they might be undertaken in the National Electronic Conveyancing System (NECS) that is currently under development. Whilst no system can ever be infallible, it is suggested that by correctly imposing the responsibility for identity verification on the appropriate individual, the conveyancing system adopted can achieve the optimum level of fairness in terms of allocation of responsibility and loss. As we sit on the cusp of a new era of electronic conveyancing, the framework suggested here provides a model for minimising the risks of forged mortgages and appropriately allocating the loss. Importantly it also recognises that the electronic environment will see new opportunities for those with criminal intent to undermine the integrity of land transactions. An appreciation of this now, can see the appropriate measures put in place to minimise the risk.
Resumo:
Fusion techniques have received considerable attention for achieving performance improvement with biometrics. While a multi-sample fusion architecture reduces false rejects, it also increases false accepts. This impact on performance also depends on the nature of subsequent attempts, i.e., random or adaptive. Expressions for error rates are presented and experimentally evaluated in this work by considering the multi-sample fusion architecture for text-dependent speaker verification using HMM based digit dependent speaker models. Analysis incorporating correlation modeling demonstrates that the use of adaptive samples improves overall fusion performance compared to randomly repeated samples. For a text dependent speaker verification system using digit strings, sequential decision fusion of seven instances with three random samples is shown to reduce the overall error of the verification system by 26% which can be further reduced by 6% for adaptive samples. This analysis novel in its treatment of random and adaptive multiple presentations within a sequential fused decision architecture, is also applicable to other biometric modalities such as finger prints and handwriting samples.
Resumo:
Statistical dependence between classifier decisions is often shown to improve performance over statistically independent decisions. Though the solution for favourable dependence between two classifier decisions has been derived, the theoretical analysis for the general case of 'n' client and impostor decision fusion has not been presented before. This paper presents the expressions developed for favourable dependence of multi-instance and multi-sample fusion schemes that employ 'AND' and 'OR' rules. The expressions are experimentally evaluated by considering the proposed architecture for text-dependent speaker verification using HMM based digit dependent speaker models. The improvement in fusion performance is found to be higher when digit combinations with favourable client and impostor decisions are used for speaker verification. The total error rate of 20% for fusion of independent decisions is reduced to 2.1% for fusion of decisions that are favourable for both client and impostors. The expressions developed here are also applicable to other biometric modalities, such as finger prints and handwriting samples, for reliable identity verification.
Resumo:
This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification system in real world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance, however, the robustness of HTPLDA to the limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regards to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.
Resumo:
This paper investigates the use of mel-frequency deltaphase (MFDP) features in comparison to, and in fusion with, traditional mel-frequency cepstral coefficient (MFCC) features within joint factor analysis (JFA) speaker verification. MFCC features, commonly used in speaker recognition systems, are derived purely from the magnitude spectrum, with the phase spectrum completely discarded. In this paper, we investigate if features derived from the phase spectrum can provide additional speaker discriminant information to the traditional MFCC approach in a JFA based speaker verification system. Results are presented which provide a comparison of MFCC-only, MFDPonly and score fusion of the two approaches within a JFA speaker verification approach. Based upon the results presented using the NIST 2008 Speaker Recognition Evaluation (SRE) dataset, we believe that, while MFDP features alone cannot compete with MFCC features, MFDP can provide complementary information that result in improved speaker verification performance when both approaches are combined in score fusion, particularly in the case of shorter utterances.
Resumo:
A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces a source and utterance duration normalized linear discriminant analysis (SUN-LDA) approaches to compensate session variability in short-utterance i-vector speaker verification systems. Two variations of SUN-LDA are proposed where normalization techniques are used to capture source variation from both short and full-length development i-vectors, one based upon pooling (SUN-LDA-pooled) and the other on concatenation (SUN-LDA-concat) across the duration and source-dependent session variation. Both the SUN-LDA-pooled and SUN-LDA-concat techniques are shown to provide improvement over traditional LDA on NIST 08 truncated 10sec-10sec evaluation conditions, with the highest improvement obtained with the SUN-LDA-concat technique achieving a relative improvement of 8% in EER for mis-matched conditions and over 3% for matched conditions over traditional LDA approaches.
Resumo:
The thesis presented in this paper is that the land fraud committed by Matthew Perrin in Queensland and inflicted upon Roger Mildenhall in Western Australia demonstrates the need for urgent procedural reform to the conveyancing process. Should this not occur, then calls to reform the substantive principles of the Torrens system will be heard throughout the jurisdictions that adopt title by registration, particularly in those places where immediate indefeasibility is still the norm. This paper closely examines the factual matrix behind both of these frauds, and asks what steps should have been taken to prevent them occurring. With 2012 bringing us Australian legislation embedding a national e-conveyancing system and a new Land Transfer Act for New Zealand we ask what legislative measures should be introduced to minimise the potential for such fraud. In undertaking this study, we reflect on whether the activities of Perrin and the criminals responsible for stealing Mildenhall's land would have succeeded under the present system for automated registration utilised in New Zealand.
Resumo:
This research makes a major contribution which enables efficient searching and indexing of large archives of spoken audio based on speaker identity. It introduces a novel technique dubbed as “speaker attribution” which is the task of automatically determining ‘who spoke when?’ in recordings and then automatically linking the unique speaker identities within each recording across multiple recordings. The outcome of the research will also have significant impact in improving the performance of automatic speech recognition systems through the extracted speaker identities.
Resumo:
Classifier selection is a problem encountered by multi-biometric systems that aim to improve performance through fusion of decisions. A particular decision fusion architecture that combines multiple instances (n classifiers) and multiple samples (m attempts at each classifier) has been proposed in previous work to achieve controlled trade-off between false alarms and false rejects. Although analysis on text-dependent speaker verification has demonstrated better performance for fusion of decisions with favourable dependence compared to statistically independent decisions, the performance is not always optimal. Given a pool of instances, best performance with this architecture is obtained for certain combination of instances. Heuristic rules and diversity measures have been commonly used for classifier selection but it is shown that optimal performance is achieved for the `best combination performance' rule. As the search complexity for this rule increases exponentially with the addition of classifiers, a measure - the sequential error ratio (SER) - is proposed in this work that is specifically adapted to the characteristics of sequential fusion architecture. The proposed measure can be used to select a classifier that is most likely to produce a correct decision at each stage. Error rates for fusion of text-dependent HMM based speaker models using SER are compared with other classifier selection methodologies. SER is shown to achieve near optimal performance for sequential fusion of multiple instances with or without the use of multiple samples. The methodology applies to multiple speech utterances for telephone or internet based access control and to other systems such as multiple finger print and multiple handwriting sample based identity verification systems.
Resumo:
New Zealand and Australia are leading the world in terms of automated land registry systems. Landonline was introduced some ten years ago for New Zealand, and the Electronic Conveyancing National Law (ECNL) is to be released over the next few years in support of a national electronic conveyancing system to be used throughout Australia. With the assistance of three proof requirements, developed for this purpose, this article measures the integrity of both systems as against the old, manual Torrens system. The authors take the position that any introduced system should at least have the same level of integrity and safety as the originally conceived manual system. The authors argue both Landonline and ECNL, as presently set up, have less credibility than the manual system as it was designed to operate, leading to the possibility of increased fraud or misuse.
Resumo:
This paper analyzes the limitations upon the amount of in- domain (NIST SREs) data required for training a probabilistic linear discriminant analysis (PLDA) speaker verification system based on out-domain (Switchboard) total variability subspaces. By limiting the number of speakers, the number of sessions per speaker and the length of active speech per session available in the target domain for PLDA training, we investigated the relative effect of these three parameters on PLDA speaker verification performance in the NIST 2008 and NIST 2010 speaker recognition evaluation datasets. Experimental results indicate that while these parameters depend highly on each other, to beat out-domain PLDA training, more than 10 seconds of active speech should be available for at least 4 sessions/speaker for a minimum of 800 speakers. If further data is available, considerable improvement can be made over solely out-domain PLDA training.
Resumo:
In recent years a great deal of case law has been generated in relation to mortgages where the mortgagee has not engaged in adequate identity verification of the mortgagor and the mortgage has subsequently been found to be forged. As a result, careless mortgagee provisions operate in Queensland as an exception to indefeasibility. Similar provisions are expected to commence soon in New South Wales. This article examines the mortgagee’s position with the benefit of indefeasibility and then considers the impact of the careless mortgagee provisions on the rights of a mortgagee under a forged mortgage, concluding that the provisions significantly change the dynamic between a registered mortgagee and registered owner who has not signed the mortgage. These provisions appear to give the mortgagee a conditional indefeasibility, with the intention of reducing the State’s exposure to the payment of compensation in the case of identity fraud. They are however, more successful in the case of forgery by a third party rather than forgery by a co-owner.
Resumo:
Texture information in the iris image is not uniform in discriminatory information content for biometric identity verification. The bits in an iris code obtained from the image differ in their consistency from one sample to another for the same identity. In this work, errors in bit strings are systematically analysed in order to investigate the effect of light-induced and drug-induced pupil dilation and constriction on the consistency of iris texture information. The statistics of bit errors are computed for client and impostor distributions as functions of radius and angle. Under normal conditions, a V-shaped radial trend of decreasing bit errors towards the central region of the iris is obtained for client matching, and it is observed that the distribution of errors as a function of angle is uniform. When iris images are affected by pupil dilation or constriction the radial distribution of bit errors is altered. A decreasing trend from the pupil outwards is observed for constriction, whereas a more uniform trend is observed for dilation. The main increase in bit errors occurs closer to the pupil in both cases.