995 results for centrality measures


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present a new approach to spoken language modeling for language identification (LID) using the Lempel-Ziv-Welch (LZW) algorithm. The LZW technique is applicable to any kind of tokenization of the speech signal. Because the LZW algorithm efficiently extracts variable-length symbol strings from the training data, the LZW codebook captures the essentials of a language effectively. We develop two new deterministic measures for LID based on the LZW algorithm: (i) a compression ratio score (LZW-CR) and (ii) a weighted discriminant score (LZW-WDS). To assess these measures, we consider error-free tokenization of speech as well as artificially induced noise in the tokenization. For a six-language LID task on the OGI-TS database with clean tokenization, the new model (LZW-WDS) performs slightly better than the conventional bigram model. For noisy tokenization, which is the more realistic case, LZW-WDS significantly outperforms the bigram technique.
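The idea of scoring a token sequence by how well a language's LZW codebook compresses it can be illustrated with a minimal Python sketch. The abstract does not give the exact definition of LZW-CR, so the ratio used here (tokens per emitted code under a fixed codebook) and the function names `build_codebook`, `compression_ratio` and `identify` are assumptions made for illustration only.

```python
def build_codebook(tokens):
    """Run LZW over a training token sequence and return the learned phrases."""
    codebook = {(t,) for t in tokens}  # start with every single token
    phrase = ()
    for t in tokens:
        candidate = phrase + (t,)
        if candidate in codebook:
            phrase = candidate          # keep extending the current phrase
        else:
            codebook.add(candidate)     # learn the new, longer phrase
            phrase = (t,)
    return codebook


def compression_ratio(tokens, codebook):
    """Greedy longest-match parse with a fixed codebook.

    Returns tokens per emitted code (assumed score, not the paper's formula).
    Unknown single tokens still cost one code each.
    """
    n = len(tokens)
    codes = 0
    i = 0
    while i < n:
        j = i + 1
        # LZW codebooks are prefix-closed, so extending one token at a
        # time finds the longest known phrase starting at position i.
        while j < n and tuple(tokens[i:j + 1]) in codebook:
            j += 1
        codes += 1
        i = j
    return n / codes if codes else 0.0


def identify(tokens, training_data):
    """Pick the language whose codebook compresses the test tokens best."""
    codebooks = {lang: build_codebook(seq) for lang, seq in training_data.items()}
    return max(codebooks, key=lambda lang: compression_ratio(tokens, codebooks[lang]))
```

With toy training data such as `{"x": list("abababab"), "y": list("cdcdcdcd")}`, the sequence `list("abab")` is parsed into fewer codes by the `"x"` codebook, so `identify` returns `"x"`.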

Relevance:

20.00% 20.00%

Publisher:

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Time series classification deals with classifying data that is multivariate in nature: one or more of the attributes takes the form of a sequence. The notion of similarity or distance used for time series data is significant and affects the accuracy, time, and space complexity of the classification algorithm. Numerous similarity measures exist for time series data, but each has its own disadvantages. Instead of relying on a single similarity measure, we aim to find a near-optimal solution to the classification problem by combining different similarity measures. In this work, we use genetic algorithms to combine the similarity measures so as to get the best performance. The weights given to the different similarity measures evolve over a number of generations to reach the best combination. We test our approach on a number of benchmark time series datasets and present promising results.
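A rough Python sketch of evolving weights over several distance measures is given below. The abstract does not name the measures, the classifier or the genetic operators, so everything here is an assumption: Euclidean, Manhattan and Chebyshev distances stand in for the similarity measures, fitness is leave-one-out 1-nearest-neighbour accuracy, and the operators are uniform crossover with Gaussian mutation.

```python
import random


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


# Stand-in measures (assumption); any time series distance could be added.
MEASURES = [euclidean, manhattan, chebyshev]


def combined(weights, a, b):
    """Weighted sum of the individual distances."""
    return sum(w * m(a, b) for w, m in zip(weights, MEASURES))


def loo_accuracy(weights, series, labels):
    """Leave-one-out 1-NN accuracy under the weighted distance (fitness)."""
    correct = 0
    for i, s in enumerate(series):
        nearest = min((j for j in range(len(series)) if j != i),
                      key=lambda j: combined(weights, s, series[j]))
        correct += labels[nearest] == labels[i]
    return correct / len(series)


def evolve(series, labels, pop_size=20, generations=30, seed=0):
    """Evolve a weight vector; the fitter half survives each generation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in MEASURES] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: loo_accuracy(w, series, labels), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            p, q = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(p, q)]   # uniform crossover
            k = rng.randrange(len(child))
            child[k] = max(0.0, child[k] + rng.gauss(0, 0.1))  # mutate one weight
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: loo_accuracy(w, series, labels))
```

Calling `evolve(series, labels)` on equal-length sequences returns one non-negative weight per measure. Evaluating fitness on held-out data rather than on the training set would be the safer design for real benchmarks.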

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper proposes an automatic acoustic-phonetic method for estimating voice-onset time of stops. This method requires neither transcription of the utterance nor training of a classifier. It makes use of the plosion index for the automatic detection of burst onsets of stops. Having detected the burst onset, the onset of the voicing following the burst is detected using the epochal information and a temporal measure named the maximum weighted inner product. For validation, several experiments are carried out on the entire TIMIT database and two of the CMU Arctic corpora. The performance of the proposed method compares well with three state-of-the-art techniques. (C) 2014 Acoustical Society of America

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The performance of prediction models is often based on "abstract metrics" that estimate the model's ability to limit residual errors between the observed and predicted values. However, meaningful evaluation and selection of prediction models for end-user domains requires holistic and application-sensitive performance measures. Inspired by energy consumption prediction models used in the emerging "big data" domain of Smart Power Grids, we propose a suite of performance measures to rationally compare models along the dimensions of scale independence, reliability, volatility and cost. We include both application-independent and application-dependent measures, the latter parameterized to allow customization by domain experts to fit their scenario. While our measures are generalizable to other domains, we offer an empirical analysis using real energy use data for three Smart Grid applications: planning, customer education and demand response, which are relevant for energy sustainability. Our results underscore the value of the proposed measures in offering deeper insight into models' behavior and their impact on real applications, which benefits both data mining researchers and practitioners.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Serovars of Salmonella enterica, namely Typhi and Typhimurium, are reported to be the bacterial pathogens causing systemic infections such as gastroenteritis and typhoid fever. To elucidate their role and importance in such infections, research has focused mainly on the proteins of the Type III secretion system of the Salmonella pathogenicity islands and of the two-component signal transduction systems. However, the most indispensable of these virulent proteins and their hierarchical roles have not yet been studied extensively. Results: We adopted a theoretical approach to build an interactome comprising the proteins from the Salmonella pathogenicity islands (SPI) and two-component signal transduction systems. This interactome was then analyzed using network parameters such as centrality and k-core measures. An initial step to capture the fingerprint of the core network yielded a set of proteins that are involved in invasion and colonization and are therefore more important in the process of infection. These proteins pertained to the Inv, Org, Prg, Sip, Spa, Ssa and Sse operons, along with the chaperone protein SicA. Among them, SicA was identified as the most indispensable protein by different network parametric analyses. Subsequently, the gene expression levels of all these theoretically identified important proteins were confirmed by microarray data analysis. Finally, we propose a hierarchy of the proteins involved in the total infection process. This theoretical approach is the first of its kind to identify potential virulence determinants encoded by SPI as therapeutic targets for enteric infection. Conclusions: A set of responsible virulent proteins was identified, and the expression level of their genes was validated using independent, published microarray data. The result was a targeted set of proteins that could serve as sensitive predictors and form the foundation for a series of trials in the wet-lab setting.
Understanding these regulatory and virulent proteins would provide insight into the conditions encountered by this intracellular enteric pathogen during the course of infection. This would further contribute to identifying novel targets for antimicrobial agents. (C) 2014 Elsevier Ltd. All rights reserved.
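As a small illustration of the two kinds of network parameters mentioned above, the Python sketch below computes degree centrality and k-core numbers on a toy undirected graph. The graph, its node labels `A` to `E` and the choice of degree centrality are assumptions for illustration; they are not the interactome or the specific centrality measures used in the study.

```python
def degree_centrality(adj):
    """Fraction of the other nodes that each node is connected to."""
    n = len(adj)
    return {v: len(neighbours) / (n - 1) for v, neighbours in adj.items()}


def core_numbers(adj):
    """k-core decomposition by repeatedly peeling the lowest-degree node.

    The core number of a node is the largest k such that the node belongs
    to a subgraph in which every node has at least k neighbours.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # node with the fewest remaining links
        k = max(k, degree[v])               # core level never decreases
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical toy network: a triangle A-B-C with a tail A-D-E.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}
centrality = degree_centrality(toy)
cores = core_numbers(toy)
```

In this toy graph the triangle forms the innermost core, and `A` also has the highest degree centrality, which is the kind of agreement between measures that would single out a node as a candidate hub.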

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The study introduces two new alternatives for global response sensitivity analysis based on the application of the L2-norm and Hellinger's metric for measuring the distance between two probabilistic models. Both procedures are shown to be capable of treating dependent non-Gaussian random variable models for the input variables. The sensitivity indices obtained from the L2-norm involve second-order moments of the response and, when applied to an independent and identically distributed sequence of input random variables, are shown to be related to the classical Sobol response sensitivity indices. The analysis based on Hellinger's metric addresses variability across the entire range, or segments, of the response probability density function. The measure is shown to be conceptually a more satisfying alternative to the Kullback-Leibler divergence based analysis reported in the existing literature. Other issues addressed in the study cover Monte Carlo simulation based methods for computing the sensitivity indices and sensitivity analysis with respect to grouped variables. Illustrative examples consist of studies on global sensitivity analysis of natural frequencies of a random multi-degree-of-freedom system, the response of a nonlinear frame, and the safety margin associated with a nonlinear performance function. (C) 2015 Elsevier Ltd. All rights reserved.
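To make the comparison between Hellinger's metric and the Kullback-Leibler divergence concrete, here is a minimal Python sketch for discrete probability vectors. The study works with response probability density functions, so the discrete setting and the function names `hellinger` and `kl_divergence` are simplifying assumptions, and the sketch is not the sensitivity index defined in the paper.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Symmetric in its arguments and bounded between 0 and 1.
    """
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats.

    Not symmetric, and infinite when q is zero where p is not.
    """
    total = 0.0
    for a, b in zip(p, q):
        if a == 0:
            continue          # 0 * log(0 / b) is taken as 0
        if b == 0:
            return math.inf   # p puts mass where q has none
        total += a * math.log(a / b)
    return total
```

For two distributions with disjoint support, such as `[1, 0]` and `[0, 1]`, `hellinger` returns its maximum value of 1 while `kl_divergence` is infinite. This boundedness and symmetry is one commonly cited practical difference between the two measures.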

Relevance:

20.00% 20.00%

Publisher: