969 results for Vector analysis



Abstract:

Gaussian mixture models (GMMs) have become an established means of modeling feature distributions in speaker recognition systems. For experimentation and practical implementation, it is useful to develop and test these models efficiently, particularly when computational resources are limited. A method of combining vector quantization (VQ) with single multi-dimensional Gaussians is proposed to rapidly generate a robust model approximation to the Gaussian mixture model. A fast method of testing these systems is also proposed and implemented. Results on the NIST 1996 Speaker Recognition Database suggest comparable, and in some cases improved, verification performance relative to the traditional GMM-based analysis scheme. In addition, previous research on speaker identification indicated similar system performance between the VQ Gaussian based technique and GMMs.


Abstract:

Robust speaker verification on short utterances remains a key consideration when deploying automatic speaker recognition, as many real-world applications have access to only limited-duration speech data. This paper explores how the recent technologies focused around total variability modeling behave when training and testing utterance lengths are reduced. Results are presented which provide a comparison of Joint Factor Analysis (JFA) and i-vector based systems, including various compensation techniques: Within-Class Covariance Normalization (WCCN), LDA, Scatter Difference Nuisance Attribute Projection (SDNAP) and Gaussian Probabilistic Linear Discriminant Analysis (GPLDA). Speaker verification performance for utterances with as little as 2 sec of data taken from the NIST Speaker Recognition Evaluations is presented to provide a clearer picture of the current performance characteristics of these techniques in short utterance conditions.


Abstract:

In this paper, we present a method for the recovery of position and absolute attitude (including pitch, roll and yaw) using a novel fusion of monocular Visual Odometry and GPS measurements in a similar manner to a classic loosely-coupled GPS/INS error state navigation filter. The proposed filter does not require additional restrictions or assumptions such as platform-specific dynamics, map-matching, feature-tracking, visual loop-closing, a gravity vector or additional sensors such as an IMU or magnetic compass. An observability analysis of the proposed filter is performed, showing that the scale factor, position and attitude errors are fully observable under acceleration that is non-parallel to the velocity vector in the navigation frame. The observability properties of the proposed filter are demonstrated using numerical simulations. We conclude the article with an implementation of the proposed filter using real flight data collected from a Cessna 172 equipped with a downwards-looking camera and GPS, showing the feasibility of the algorithm in real-world conditions.


Abstract:

It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of the large number of terms, patterns, and noise. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. This research presents a promising method, Relevance Feature Discovery (RFD), for solving this challenging issue. It discovers both positive and negative patterns in text documents as high-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the high-level features. The thesis also introduces an adaptive model (called ARFD) to enhance the flexibility of using RFD in adaptive environments. ARFD automatically updates the system's knowledge based on a sliding window over new incoming feedback documents. It can efficiently decide which incoming documents can bring new knowledge into the system. Substantial experiments using the proposed models on Reuters Corpus Volume 1 and TREC topics show that the proposed models significantly outperform both the state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio or Support Vector Machine and other pattern-based methods.


Abstract:

This paper introduces the Weighted Linear Discriminant Analysis (WLDA) technique, based upon the weighted pairwise Fisher criterion, for the purposes of improving i-vector speaker verification in the presence of high intersession variability. By taking advantage of the speaker discriminative information that is available in the distances between pairs of speakers clustered in the development i-vector space, the WLDA technique is shown to provide an improvement in speaker verification performance over traditional Linear Discriminant Analysis (LDA) approaches. A similar approach is also taken to extend the recently developed Source Normalised LDA (SNLDA) into Weighted SNLDA (WSNLDA) which, similarly, shows an improvement in speaker verification performance in both matched and mismatched enrolment/verification conditions. Based upon the results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset, we believe that both WLDA and WSNLDA are viable as replacement techniques to improve the performance of LDA and SNLDA-based i-vector speaker verification.


Abstract:

Background: Panicum streak virus (PanSV; Family Geminiviridae; Genus Mastrevirus) is a close relative of Maize streak virus (MSV), the most serious viral threat to maize production in Africa. PanSV and MSV have the same leafhopper vector species, largely overlapping natural host ranges and similar geographical distributions across Africa and its associated Indian Ocean Islands. Unlike MSV, however, PanSV has no known economic relevance. Results: Here we report on 16 new PanSV full genome sequences sampled throughout Africa and use these together with others in public databases to reveal that PanSV and MSV populations in general share very similar patterns of genetic exchange and geographically structured diversity. A potentially important difference between the species, however, is that the movement of MSV strains throughout Africa is apparently less constrained than that of PanSV strains. Interestingly the MSV-A strain which causes maize streak disease is apparently the most mobile of all the PanSV and MSV strains investigated. Conclusion: We therefore hypothesize that the generally increased mobility of MSV relative to other closely related species such as PanSV, may have been an important evolutionary step in the eventual emergence of MSV-A as a serious agricultural pathogen. The GenBank accession numbers for the sequences reported in this paper are GQ415386-GQ415401. © 2009 Varsani et al; licensee BioMed Central Ltd.


Abstract:

The aim of this paper is to provide a comparison of various algorithms and parameters to build reduced semantic spaces. The effect of dimension reduction, the stability of the representation and the effect of word order are examined in the context of five algorithms bearing on semantic vectors: random projection (RP), singular value decomposition (SVD), non-negative matrix factorization (NMF), permutations and holographic reduced representations (HRR). The quality of semantic representation was tested by means of a synonym-finding task using the TOEFL test on the TASA corpus. Dimension reduction was found to improve the quality of semantic representation, but it is hard to find the optimal parameter settings. Even though dimension reduction by RP was found to be more generally applicable than SVD, the semantic vectors produced by RP are somewhat unstable. Encoding word order into the semantic vector representation via HRR did not lead to any increase in scores over vectors constructed from word co-occurrence in context information. In this regard, very small context windows resulted in better semantic vectors for the TOEFL test.
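
As a rough illustration of the random projection (RP) step named in the abstract above, the sketch below reduces two word vectors with a seeded Gaussian random matrix and compares their cosine similarity before and after. The co-occurrence counts, the dimensions and the function names are hypothetical, and the Gaussian matrix is only one common way to build a random projection; the abstract does not say which variant the paper used.

```python
import math
import random


def random_projection_matrix(n_in, n_out, seed=0):
    """Build an n_out x n_in matrix of Gaussian entries scaled by 1/sqrt(n_out)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)]
            for _ in range(n_out)]


def project(matrix, vector):
    """Map a high-dimensional vector to the reduced space (matrix-vector product)."""
    return [sum(r * x for r, x in zip(row, vector)) for row in matrix]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical co-occurrence counts for two words over a 6-term context vocabulary.
word_a = [3, 0, 1, 0, 2, 0]
word_b = [2, 0, 1, 0, 3, 0]

R = random_projection_matrix(n_in=6, n_out=3, seed=42)
reduced_a = project(R, word_a)
reduced_b = project(R, word_b)

# Similarity in the original space versus the reduced space.
print(cosine(word_a, word_b), cosine(reduced_a, reduced_b))
```

Rerunning with a different `seed` gives a different reduced space, which may be one simple way to probe the instability of RP vectors that the abstract reports.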


Abstract:

This paper investigates advanced channel compensation techniques for the purpose of improving i-vector speaker verification performance in the presence of high intersession variability using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques has been investigated: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA), and (d) source-normalized WLDA (SN-WLDA). We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system provides over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to the SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, providing over 8% improvement in DCF over the best single approach (SN-WLDA) for the NIST 2008 interview/telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA, with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification.
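
The cosine similarity scoring (CSS) mentioned in the abstract above can be sketched as follows. This is a minimal illustration that assumes the i-vectors have already been extracted and channel-compensated; the example vectors, the function names and the decision threshold are hypothetical and not taken from the paper.

```python
import math


def cosine_score(w_target, w_test):
    """Cosine of the angle between an enrolment i-vector and a test i-vector."""
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm_target = math.sqrt(sum(a * a for a in w_target))
    norm_test = math.sqrt(sum(b * b for b in w_test))
    return dot / (norm_target * norm_test)


def accept(w_target, w_test, threshold=0.5):
    """Accept the claimed identity when the score reaches the threshold.

    The threshold value is an assumption; in practice it would be tuned
    on development data.
    """
    return cosine_score(w_target, w_test) >= threshold


# Hypothetical 4-dimensional i-vectors (real i-vectors have hundreds of dimensions).
enrolment = [0.2, -0.5, 0.1, 0.7]
same_speaker = [0.25, -0.45, 0.05, 0.65]
other_speaker = [-0.6, 0.3, 0.4, -0.1]

print(cosine_score(enrolment, same_speaker), accept(enrolment, same_speaker))
print(cosine_score(enrolment, other_speaker), accept(enrolment, other_speaker))
```

Because the score depends only on the angle between the two vectors, rescaling either i-vector leaves the decision unchanged.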


Abstract:

Authenticated Encryption (AE) is the cryptographic process of providing simultaneous confidentiality and integrity protection to messages. This approach is more efficient than applying a two-step process of providing confidentiality for a message by encrypting the message, and in a separate pass providing integrity protection by generating a Message Authentication Code (MAC). AE using symmetric ciphers can be provided by either stream ciphers with built-in authentication mechanisms or block ciphers using appropriate modes of operation. However, stream ciphers have the potential for higher performance and smaller footprint in hardware and/or software than block ciphers. This property makes stream ciphers suitable for resource-constrained environments, where storage and computational power are limited. There have been several recent stream cipher proposals that claim to provide AE. These ciphers can be analysed using existing techniques that consider confidentiality or integrity separately; however, there is currently no framework for the analysis of AE stream ciphers that analyses these two properties simultaneously. This thesis introduces a novel framework for the analysis of AE using stream cipher algorithms. This thesis analyzes the mechanisms for providing confidentiality and for providing integrity in AE algorithms using stream ciphers. There is a greater emphasis on the analysis of the integrity mechanisms, as there is little in the public literature on this in the context of authenticated encryption. The thesis has four main contributions as follows. The first contribution is the design of a framework that can be used to classify AE stream ciphers based on three characteristics. The first classification applies Bellare and Namprempre's work on the order in which encryption and authentication processes take place.
The second classification is based on the method used for accumulating the input message (either directly or indirectly) into the internal states of the cipher to generate a MAC. The third classification is based on whether the sequence that is used to provide encryption and authentication is generated using a single key and initial vector, or two keys and two initial vectors. The second contribution is the application of an existing algebraic method to analyse the confidentiality algorithms of two AE stream ciphers, namely SSS and ZUC. The algebraic method is based on considering the nonlinear filter (NLF) of these ciphers as a combiner with memory. This method enables us to construct equations for the NLF that relate the inputs, outputs and memory of the combiner to the output keystream. We show that both of these ciphers are secure from this type of algebraic attack. We conclude that using a key-dependent SBox in the NLF twice, and using two different SBoxes in the NLF of ZUC, prevents this type of algebraic attack. The third contribution is a new general matrix-based model for MAC generation where the input message is injected directly into the internal state. This model describes the accumulation process when the input message is injected directly into the internal state of a nonlinear filter generator. We show that three recently proposed AE stream ciphers can be considered as instances of this model, namely SSS, NLSv2 and SOBER-128. Our model is more general than previous investigations into direct injection. Possible forgery attacks against this model are investigated. It is shown that using a nonlinear filter in the accumulation process of the input message when either the input message or the initial state of the register is unknown prevents forgery attacks based on collisions. The last contribution is a new general matrix-based model for MAC generation where the input message is injected indirectly into the internal state.
This model uses the input message as a controller to accumulate a keystream sequence into an accumulation register. We show that three current AE stream ciphers can be considered as instances of this model; namely ZUC, Grain-128a and Sfinks. We establish the conditions under which the model is susceptible to forgery and side-channel attacks.
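
To make the idea of indirect injection more concrete, the toy sketch below lets each message bit decide whether the corresponding keystream word is XORed into an accumulation register. It is a deliberately simplified illustration under assumed names and word sizes, not the thesis's matrix-based model and not the actual MAC construction of ZUC, Grain-128a or Sfinks.

```python
def accumulate_tag(message_bits, keystream_words):
    """Toy indirect-injection accumulator.

    The message never enters the register itself: each message bit only
    controls whether the matching keystream word is XORed into the register.
    """
    if len(keystream_words) < len(message_bits):
        raise ValueError("need one keystream word per message bit")
    register = 0
    for bit, word in zip(message_bits, keystream_words):
        if bit:
            register ^= word
    return register


# Hypothetical 4-bit keystream words; a real cipher would generate these
# from a key and an initial vector.
keystream = [0b0011, 0b0101, 0b1001]

m1 = [1, 0, 1]
m2 = [0, 1, 1]
m_xor = [a ^ b for a, b in zip(m1, m2)]

t1 = accumulate_tag(m1, keystream)
t2 = accumulate_tag(m2, keystream)

# In this toy model the tag is linear in the message for a fixed keystream.
print(t1, t2, accumulate_tag(m_xor, keystream) == t1 ^ t2)
```

In this toy model, two tags computed under the same keystream reveal the tag of the XOR of the two messages, which illustrates why the conditions under which such accumulation is open to forgery deserve analysis.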


Abstract:

A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces source- and utterance-duration-normalized linear discriminant analysis (SUN-LDA) approaches to compensate for session variability in short-utterance i-vector speaker verification systems. Two variations of SUN-LDA are proposed where normalization techniques are used to capture source variation from both short and full-length development i-vectors, one based upon pooling (SUN-LDA-pooled) and the other on concatenation (SUN-LDA-concat) across the duration and source-dependent session variation. Both the SUN-LDA-pooled and SUN-LDA-concat techniques are shown to provide improvement over traditional LDA on NIST 08 truncated 10sec-10sec evaluation conditions, with the highest improvement obtained with the SUN-LDA-concat technique, which achieves a relative improvement of 8% in EER for mismatched conditions and over 3% for matched conditions over traditional LDA approaches.


Abstract:

Spatial organisation of proteins according to their function plays an important role in the specificity of their molecular interactions. Emerging proteomics methods seek to assign proteins to sub-cellular locations by partial separation of organelles and computational analysis of protein abundance distributions among partially separated fractions. Such methods permit simultaneous analysis of unpurified organelles and promise proteome-wide localisation in scenarios wherein perturbation may prompt dynamic re-distribution. Resolving organelles that display similar behavior during a protocol designed to provide partial enrichment represents a possible shortcoming. We employ the Localisation of Organelle Proteins by Isotope Tagging (LOPIT) organelle proteomics platform to demonstrate that combining information from distinct separations of the same material can improve organelle resolution and assignment of proteins to sub-cellular locations. Two previously published experiments, whose distinct gradients are alone unable to fully resolve six known protein-organelle groupings, are subjected to a rigorous analysis to assess protein-organelle association via a contemporary pattern recognition algorithm. Upon straightforward combination of single-gradient data, we observe significant improvement in protein-organelle association via both a non-linear support vector machine algorithm and partial least-squares discriminant analysis. The outcome yields suggestions for further improvements to present organelle proteomics platforms, and a robust analytical methodology via which to associate proteins with sub-cellular organelles.


Abstract:

We have tested a methodology for the elimination of the selectable marker gene after Agrobacterium-mediated transformation of barley. This involves segregation of the selectable marker gene away from the gene of interest following co-transformation using a plasmid carrying two T-DNAs, which were located adjacent to each other with no intervening region. A standard binary transformation vector was modified by insertion of a small section composed of an additional left and right T-DNA border, so that the selectable marker gene and the site for insertion of the gene of interest (GOI) were each flanked by a left and right border. Using this vector three different GOIs were transformed into barley. Analysis of transgene inheritance was facilitated by a novel and rapid assay utilizing PCR amplification from macerated leaf tissue. Co-insertion was observed in two thirds of transformants, and among these approximately one quarter had transgene inserts which segregated in the next generation to yield selectable marker-free transgenic plants. Insertion of non-T-DNA plasmid sequences was observed in only one of fourteen SMF lines tested. This technique thus provides a workable system for generating transgenic barley free from selectable marker genes, thereby obviating public concerns regarding proliferation of these genes.


Abstract:

Near-infrared spectroscopy (NIRS) calibrations were developed for the discrimination of Chinese hawthorn (Crataegus pinnatifida Bge. var. major) fruit from three geographical regions as well as for the estimation of the total sugar, total acid, total phenolic content, and total antioxidant activity. Principal component analysis (PCA) was used for the discrimination of the fruit on the basis of their geographical origin. Three pattern recognition methods, linear discriminant analysis, partial least-squares-discriminant analysis, and back-propagation artificial neural networks, were applied to classify and compare these samples. Furthermore, three multivariate calibration models based on the first derivative NIR spectroscopy, partial least-squares regression, back-propagation artificial neural networks, and least-squares-support vector machines, were constructed for quantitative analysis of the four analytes, total sugar, total acid, total phenolic content, and total antioxidant activity, and validated by prediction data sets.


Abstract:

This paper proposes techniques to improve the performance of i-vector based speaker verification systems when only short utterances are available. Short-length utterance i-vectors vary with speaker, session variations, and the phonetic content of the utterance. Well-established methods such as linear discriminant analysis (LDA), source-normalized LDA (SN-LDA) and within-class covariance normalisation (WCCN) exist for compensating the session variation, but we have identified the variability introduced by phonetic content due to utterance variation as an additional source of degradation when short-duration utterances are used. To compensate for utterance variations in short i-vector speaker verification systems using cosine similarity scoring (CSS), we have introduced a short utterance variance normalization (SUVN) technique and a short utterance variance (SUV) modelling approach at the i-vector feature level. A combination of SUVN with LDA and SN-LDA is proposed to compensate for the session and utterance variations and is shown to provide improvement in performance over the traditional approach of using LDA and/or SN-LDA followed by WCCN. An alternative approach is also introduced that uses probabilistic linear discriminant analysis (PLDA) to directly model the SUV. The combination of SUVN, LDA and SN-LDA followed by SUV PLDA modelling provides an improvement over the baseline PLDA approach. We also show that for this combination of techniques, the utterance variation information needs to be artificially added to full-length i-vectors for PLDA modelling.


Abstract:

Due to the health impacts caused by exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction using a combination of Support Vector Machine (SVM) as predictor and Partial Least Square (PLS) as a data selection tool based on the measured values of CO concentrations. The CO concentrations of Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS–SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS–SVM has better accuracy. In the analysis presented in this paper, statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare the performance of the models. It has been concluded that the errors decrease after size reduction and the coefficients of determination increase from 56–81% for the SVM model to 65–85% for the hybrid PLS–SVM model. It was also found that the hybrid PLS–SVM model required lower computational time than the SVM model, as expected, hence supporting the more accurate and faster prediction ability of the hybrid PLS–SVM model.
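
The error measures named in the abstract above can be sketched as follows. The abstract does not give the paper's exact formulas, so the conventional definitions of root mean squared error, mean absolute relative error and coefficient of determination are assumed here, and the CO values and function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mean_absolute_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed|; observed values must be non-zero."""
    n = len(observed)
    return sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical daily CO concentrations (ppm) and two sets of model predictions.
observed = [3.1, 4.0, 2.8, 5.2, 4.4]
model_a = [3.4, 3.6, 3.0, 4.6, 4.9]
model_b = [3.2, 3.9, 2.9, 5.0, 4.5]

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    print(name,
          rmse(observed, predicted),
          mean_absolute_relative_error(observed, predicted),
          r_squared(observed, predicted))
```

Lower error values together with a higher coefficient of determination indicate the better-fitting model under these assumed definitions.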