109 resultados para OC-SVM

em Queensland University of Technology - ePrints Archive


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A data-driven background dataset refinement technique was recently proposed for SVM based speaker verification. This method selects a refined SVM background dataset from a set of candidate impostor examples after individually ranking examples by their relevance. This paper extends this technique to the refinement of the T-norm dataset for SVM-based speaker verification. The independent refinement of the background and T-norm datasets provides a means of investigating the sensitivity of SVM-based speaker verification performance to the selection of each of these datasets. Using refined datasets provided improvements of 13% in min. DCF and 9% in EER over the full set of impostor examples on the 2006 SRE corpus with the majority of these gains due to refinement of the T-norm dataset. Similar trends were observed for the unseen data of the NIST 2008 SRE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents Scatter Difference Nuisance Attribute Projection (SD-NAP) as an enhancement to NAP for SVM-based speaker verification. While standard NAP may inadvertently remove desirable speaker variability, SD-NAP explicitly de-emphasises this variability by incorporating a weighted version of the between-class scatter into the NAP optimisation criterion. Experimental evaluation of SD-NAP with a variety of SVM systems on the 2006 and 2008 NIST SRE corpora demonstrate that SD-NAP provides improved verification performance over standard NAP in most cases, particularly at the EER operating point.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a persons identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel { wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMM's are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability | an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMMbased classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier. Further, the domains for the modelling of session variation were contrasted to find a number of common factors, however, the SVM-domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several beneficial characteristics to the task of unsupervised model adaptation prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate the approach to provide performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly-informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more disperse representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less-informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mis-matched to the evaluation conditions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The ability to accurately predict the remaining useful life of machine components is critical for machine continuous operation, and can also improve productivity and enhance system safety. In condition-based maintenance (CBM), maintenance is performed based on information collected through condition monitoring and an assessment of the machine health. Effective diagnostics and prognostics are important aspects of CBM for maintenance engineers to schedule a repair and to acquire replacement components before the components actually fail. All machine components are subjected to degradation processes in real environments and they have certain failure characteristics which can be related to the operating conditions. This paper describes a technique for accurate assessment of the remnant life of machines based on health state probability estimation and involving historical knowledge embedded in the closed loop diagnostics and prognostics systems. The technique uses a Support Vector Machine (SVM) classifier as a tool for estimating health state probability of machine degradation, which can affect the accuracy of prediction. To validate the feasibility of the proposed model, real life historical data from bearings of High Pressure Liquefied Natural Gas (HP-LNG) pumps were analysed and used to obtain the optimal prediction of remaining useful life. The results obtained were very encouraging and showed that the proposed prognostic system based on health state probability estimation has the potential to be used as an estimation tool for remnant life prediction in industrial machinery.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The extensive use of alkoxyamines in controlled radical polymerisation and polymer stabilisation is based on rapid cycling between the alkoxyamine (R1R2NO–R3) and a stable nitroxyl radical (R1R2NO•) via homolysis of the labile O–C bond. Competing homolysis of the alkoxyamine N–O bond has been predicted to occur for some substituents leading to production of aminyl and alkoxyl radicals. This intrinsic competition between the O–C and N–O bond homolysis processes has to this point been difficult to probe experimentally. Herein we examine the effect of local molecular structure on the competition between N–O and O–C bond cleavage in the gas phase by variable energy tandem mass spectrometry in a triple quadrupole mass spectrometer. A suite of cyclic alkoxyamines with remote carboxylic acid moieties (HOOC–R1R2NO–R3) were synthesised and subjected to negative ion electrospray ionisation to yield [M – H]− anions where the charge is remote from the alkoxyamine moiety. Collision-induced dissociation of these anions yield product ions resulting, almost exclusively, from homolysis of O–C and/or N–O bonds. The relative efficacy of N–O and O–C bond homolysis was examined for alkoxyamines incorporating different R3 substituents by varying the potential difference applied to the collision cell, and comparing dissociation thresholds of each product ion channel. For most R3 substituents, product ions from homolysis of the O–C bond are observed and product ions resulting from cleavage of the N–O bond are minor or absent. A limited number of examples were encountered however, where N–O homolysis is a competitive dissociation pathway because the O–C bond is stabilised by adjacent heteroatom(s) (e.g., R3 = CH2F). The dissociation threshold energies were compared for different alkoxyamine substituents (R3) and the relative ordering of these experimentally determined energies is shown to correlate with the bond dissociation free energies, calculated by ab initio methods. Understanding the structure-dependent relationship between these rival processes will assist in the design and selection of alkoxyamine motifs that selectively promote the desirable O–C homolysis pathway.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Lipooligosaccharide (LOS) is a complex surface structure that is linked to many pathogenic properties of Acinetobacter baumannii. In A. baumannii, the genes responsible for the synthesis of the outer core (OC) component of the LOS are located between ilvE and aspS. The content of the OC locus is usually variable within a species, and examination of 6 complete and 227 draft A. baumannii genome sequences available in GenBank non-redundant and Whole Genome Shotgun databases revealed nine distinct new types, OCL4-OCL12, in addition to the three known ones. The twelve gene clusters fell into two distinct groups, designated Group A and Group B, based on similarities in the genes present. OCL6 (Group B) was unique in that it included genes for the synthesis of L-Rhamnosep. Genetic exchange of the different configurations between strains has occurred as some OC forms were found in several different sequence types (STs). OCL1 (Group A) was the most widely distributed being present in 18 STs, and OCL6 was found in 16 STs. Variation within clones was also observed, with more than one OC locus type found in the two globally disseminated clones, GC1 and GC2, that include the majority of multiply antibiotic resistant isolates. OCL1 was the most abundant gene cluster in both GC1 and GC2 genomes but GC1 isolates also carried OCL2, OCL3 or OCL5, and OCL3 was also present in GC2. As replacement of the OC locus in the major global clones indicates the presence of sub-lineages, a PCR typing scheme was developed to rapidly distinguish Group A and Group B types, and to distinguish the specific forms found in GC1 and GC2 isolates.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Membrane proteins play important roles in many biochemical processes and are also attractive targets of drug discovery for various diseases. The elucidation of membrane protein types provides clues for understanding the structure and function of proteins. Recently we developed a novel system for predicting protein subnuclear localizations. In this paper, we propose a simplified version of our system for predicting membrane protein types directly from primary protein structures, which incorporates amino acid classifications and physicochemical properties into a general form of pseudo-amino acid composition. In this simplified system, we will design a two-stage multi-class support vector machine combined with a two-step optimal feature selection process, which proves very effective in our experiments. The performance of the present method is evaluated on two benchmark datasets consisting of five types of membrane proteins. The overall accuracies of prediction for five types are 93.25% and 96.61% via the jackknife test and independent dataset test, respectively. These results indicate that our method is effective and valuable for predicting membrane protein types. A web server for the proposed method is available at http://www.juemengt.com/jcc/memty_page.php

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification plays a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however the common multi-class SVM approaches applied suffer from limitations including having easily confused classes and been computationally inefficient. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We proposed constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every new created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to popular the `one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrates comparable performance over the baseline bag-of-features method.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper proposes a new prognosis model based on the technique for health state estimation of machines for accurate assessment of the remnant life. For the evaluation of health stages of machines, the Support Vector Machine (SVM) classifier was employed to obtain the probability of each health state. Two case studies involving bearing failures were used to validate the proposed model. Simulated bearing failure data and experimental data from an accelerated bearing test rig were used to train and test the model. The result obtained is very encouraging and shows that the proposed prognostic model produces promising results and has the potential to be used as an estimation tool for machine remnant life prediction.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In condition-based maintenance (CBM), effective diagnostics and prognostics are essential tools for maintenance engineers to identify imminent fault and to predict the remaining useful life before the components finally fail. This enables remedial actions to be taken in advance and reschedules production if necessary. This paper presents a technique for accurate assessment of the remnant life of machines based on historical failure knowledge embedded in the closed loop diagnostic and prognostic system. The technique uses the Support Vector Machine (SVM) classifier for both fault diagnosis and evaluation of health stages of machine degradation. To validate the feasibility of the proposed model, the five different level data of typical four faults from High Pressure Liquefied Natural Gas (HP-LNG) pumps were used for multi-class fault diagnosis. In addition, two sets of impeller-rub data were analysed and employed to predict the remnant life of pump based on estimation of health state. The results obtained were very encouraging and showed that the proposed prognosis system has the potential to be used as an estimation tool for machine remnant life prediction in real life industrial applications.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. To learn to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing sematic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches had to deal with low frequency pattern issues. The measures used by the data mining technique (for example, “support” and “confidences”) to learn the profile have turned out to be not suitable for filtering. They can lead to a mismatch problem. This thesis uses the rough set-based reasoning (term-based) and pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages - topic filtering and pattern mining stages. The topic filtering stage is intended to minimize information overloading by filtering out the most likely irrelevant information based on the user profiles. A novel user-profiles learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory. The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because there is a relatively small amount of documents left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR bench-marking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system outperforms significantly the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including the BM25, Support Vector Machines (SVM), and Pattern Taxonomy Model (PTM).

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Summary of findings: QISU estimates that approximately 1,000 skateboard-associated injuries are seen at emergency departments each year in Queensland. 10-14 year old males are the most likely group to present to a QISU ED with a skateboard related injury. The peak time for skateboard injuries to occur is on the weekend and in the late afternoon. Only 19% of skateboard injuries occur at skate park facilities, with the remainder oc-curring in non-skate parks, on roads and footpaths. Risk factors associated with more severe injuries are, age less than 10 years and involvement in a motor vehicle crash The most common types of injuries are fractures and sprains of the upper limbs. Isolated head injuries represent approximately 5% of skateboarding injuries, but 60% of serious injuries requiring resuscitation. These injuries may be minimized or pre-vented with helmet use.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Pure Tungsten Oxide (WO3) and Iron-doped (10 at%) Tungsten Oxide (WO3:Fe) nanostructured thin films were prepared using a dual crucible Electron Beam Evaporation techniques. The films were deposited at room temperature in high vacuum condition on glass substrate and post-heat treated at 300 oC for 1 hour. From the study of X-ray diffraction and Raman the characteristics of the as-deposited WO3 and WO3:Fe films indicated non-crystalline nature. The surface roughness of all the films showed in the order of 2.5 nm as observed using Atomic Force Microscopy (AFM). X-Ray Photoelectron Spectroscopy (XPS) analysis revealed tungsten oxide films with stoichiometry close to WO3. The addition of Fe to WO3 produced a smaller particle size and lower porosity as observed using Transmission Electron Microscopy (TEM). A slight difference in optical band gap energies of 3.22 eV and 3.12 eV were found between the as-deposited WO3 and WO3:Fe films, respectively. However, the difference in the band gap energies of the annealed films were significantly higher having values of 3.12 eV and 2.61 eV for the WO3 and WO3:Fe films, respectively. The heat treated samples were investigated for gas sensing applications using noise spectroscopy and doping of Fe to WO3 reduced the sensitivity to certain gasses. Detailed study of the WO3 and WO3:Fe films gas sensing properties is the subject of another paper.