819 results for Classification error rate


Relevance:

100.00% 100.00%

Publisher:

Abstract:

There are several papers on pruning methods in the artificial neural networks area. However, with rare exceptions, none of them presents an appropriate statistical evaluation of such methods. In this article, we statistically verify the ability of some methods to reduce the number of neurons in the hidden layer of a multilayer perceptron (MLP) neural network while maintaining the same level of classification error as the initial network. Seven pruning methods are evaluated. The experimental investigation was carried out on five groups of generated data and two groups of real data. Three variables were monitored in the study: the apparent classification error rate in the test group (REA); the number of hidden neurons obtained after applying the pruning method; and the number of training/retraining epochs, used to evaluate the computational effort. The non-parametric Friedman test was used for the statistical analysis.
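As a rough illustration of the kind of analysis this abstract describes, the sketch below computes the Friedman chi-square statistic from a table of error rates. The table values, the function names, and the use of three methods on four data sets are all hypothetical; the abstract does not give its data, and this version omits the tie correction.

```python
def average_ranks(row):
    """Rank the values of one block (lowest value gets rank 1), averaging ties."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square statistic, without tie correction.

    table[i][j] is the score of method j on data set i.
    """
    n = len(table)      # number of blocks (data sets)
    k = len(table[0])   # number of treatments (methods)
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    sum_sq = sum(s * s for s in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical test-set error rates: rows are data sets, columns are methods.
example_table = [
    [0.10, 0.12, 0.15],
    [0.08, 0.09, 0.11],
    [0.20, 0.22, 0.21],
    [0.05, 0.07, 0.06],
]
print(friedman_statistic(example_table))
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom; a library routine would normally be preferred over this hand-rolled version when one is available.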

Relevance:

100.00% 100.00%

Publisher:

Abstract:

To build an Automatic Speech Recognition system, one can use a task called Phonetic Classification, in which, given a speech sample, the system decides which phoneme was uttered by a speaker. To ease classification and highlight the most salient characteristics of the phonemes, speech samples are normally pre-processed by a front-end. A front-end generally extracts a set of parameters for each speech sample. After this processing, the parameters are fed into a classification algorithm that (already properly trained) tries to decide which phoneme was uttered. There is a tendency that the larger the number of parameters used in the system, the better the classification hit rate. The trade-off is the higher computational cost involved. The Parameter Selection technique serves to show which parameters are most relevant (or most used) in a classification task, making it possible to identify the redundant parameters, which contribute little (or nothing) to the classification task. The aim of this work is to apply the SVM classifier to phonetic classification, using the TIMIT database, and to find the most relevant parameters for classification by applying the Boosting technique for Parameter Selection.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work proposes a method for data clustering based on complex networks theory. A data set is represented as a network by considering different metrics to establish the connection between each pair of objects. The clusters are obtained by taking into account five community detection algorithms. The network-based clustering approach is applied in two real-world databases and two sets of artificially generated data. The obtained results suggest that the exponential of the Minkowski distance is the most suitable metric to quantify the similarities between pairs of objects. In addition, the community identification method based on the greedy optimization provides the best cluster solution. We compare the network-based clustering approach with some traditional clustering algorithms and verify that it provides the lowest classification error rate. (C) 2012 Elsevier B.V. All rights reserved.
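The first step described in this abstract, turning a data set into a weighted network, can be sketched as follows. The abstract does not state the exact form of the "exponential of the Minkowski distance", so the weight exp(-d) used here is an assumption, as are the function names and the sample points.

```python
import math


def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def similarity_matrix(points, p=2.0):
    """Symmetric edge-weight matrix with w[i][j] = exp(-d(i, j)).

    Assumed form: the weight decays exponentially with distance, so close
    objects get weights near 1 and distant ones near 0. The diagonal is
    left at 0 (no self-loops).
    """
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w[i][j] = w[j][i] = math.exp(-minkowski(points[i], points[j], p))
    return w


# Hypothetical 2-D objects: two close together, one far away.
example_points = [(0.0, 0.0), (0.1, 0.0), (3.0, 4.0)]
example_weights = similarity_matrix(example_points)
```

A community detection algorithm would then be run on this weighted network to obtain the clusters; that step is not shown here.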

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the last few years the number of systems and devices that use voice-based interaction has grown significantly. For continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However, there are currently very few studies that try to evaluate how good a voice is when the application is a speech-based interface. In this paper we present a new automatic voice pleasantness classification system based on prosodic and acoustic patterns of voice preference. Our study is based on a multi-language database composed of female voices. In the objective performance evaluation the system achieved a 7.3% error rate.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Affiliation: Centre Robert-Cedergren de l'Université de Montréal en bio-informatique et génomique & Département de biochimie, Université de Montréal

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, an improved stochastic discrimination (SD) is introduced to reduce the error rate of the standard SD in the context of a multi-class classification problem. The learning procedure of the improved SD consists of two stages. In the first stage, a standard SD, but with a shorter learning period, is carried out to identify an important space where all the misclassified samples are located. In the second stage, the standard SD is modified by (i) restricting sampling to the important space; and (ii) introducing a new discriminant function for samples in the important space. It is shown by mathematical derivation that the new discriminant function has the same mean as, but a smaller variance than, that of the standard SD for samples in the important space. It is also shown that the smaller the variance of the discriminant function, the lower the error rate of the classifier. Consequently, the proposed improved SD improves on the standard SD by achieving higher classification accuracy. Illustrative examples are provided to demonstrate the effectiveness of the proposed improved SD.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Stochastic discrimination (SD) depends on a discriminant function for classification. In this paper, an improved SD is introduced to reduce the error rate of the standard SD in the context of a two-class classification problem. The learning procedure of the improved SD consists of two stages. Initially a standard SD, but with a shorter learning period, is carried out to identify an important space where all the misclassified samples are located. Then the standard SD is modified by 1) restricting sampling to the important space, and 2) introducing a new discriminant function for samples in the important space. It is shown by mathematical derivation that the new discriminant function has the same mean as, but a smaller variance than, that of the standard SD for samples in the important space. It is also shown that the smaller the variance of the discriminant function, the lower the error rate of the classifier. Consequently, the proposed improved SD improves on the standard SD by achieving higher classification accuracy. Illustrative examples are provided to demonstrate the effectiveness of the proposed improved SD.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Predictive performance evaluation is a fundamental issue in the design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to their simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. Due to this, various graphical performance evaluation methods are increasingly drawing the attention of the machine learning, data mining, and pattern recognition communities. The main advantage of these types of methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to appropriately select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.
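One widely used graphical method of this kind is the ROC curve, which plots the true positive rate against the false positive rate as the decision threshold varies. The abstract does not say which methods the survey covers, so ROC is chosen here only as an example, and the scores and labels are made up. The sketch assumes binary labels (1 for positive, 0 for negative) with at least one example of each class.

```python
def roc_points(scores, labels):
    """Return the (false positive rate, true positive rate) pairs of a ROC curve.

    The threshold is swept from the highest score downward; examples with
    equal scores are handled together so that ties yield a single point.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores and true labels.
example_scores = [0.9, 0.8, 0.7, 0.6]
example_labels = [1, 0, 1, 0]
print(roc_points(example_scores, example_labels))
```

Each point on the curve corresponds to a different error rate, which is why a single error-rate figure captures only one operating point of the classifier.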

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we address the challenge of gender classification using large databases of images with two goals. The first objective is to evaluate whether the error rate decreases compared to smaller databases. The second goal is to determine if the classifier that provides the best classification rate for one database improves the classification results for other databases, that is, the cross-database performance.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.

In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.

In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.

Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.

We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to that of the MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal the acoustic content of speech signals.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Sequences of two chloroplast photosystem genes, psaA and psbB, together comprising about 3,500 bp, were obtained for all five major groups of extant seed plants and several outgroups among other vascular plants. Strongly supported, but significantly conflicting, phylogenetic signals were obtained in parsimony analyses from partitions of the data into first and second codon positions versus third positions. In the former, both genes agreed on a monophyletic gymnosperms, with Gnetales closely related to certain conifers. In the latter, Gnetales are inferred to be the sister group of all other seed plants, with gymnosperms paraphyletic. None of the data supported the modern “anthophyte hypothesis,” which places Gnetales as the sister group of flowering plants. A series of simulation studies were undertaken to examine the error rate for parsimony inference. Three kinds of errors were examined: random error, systematic bias (both properties of finite data sets), and statistical inconsistency owing to long-branch attraction (an asymptotic property). Parsimony reconstructions were extremely biased for third-position data for psbB. Regardless of the true underlying tree, a tree in which Gnetales are sister to all other seed plants was likely to be reconstructed for these data. None of the combinations of genes or partitions permits the anthophyte tree to be reconstructed with high probability. Simulations of progressively larger data sets indicate the existence of long-branch attraction (statistical inconsistency) for third-position psbB data if either the anthophyte tree or the gymnosperm tree is correct. This is also true for the anthophyte tree using either psaA third positions or psbB first and second positions. A factor contributing to bias and inconsistency is extremely short branches at the base of the seed plant radiation, coupled with extremely high rates in Gnetales and nonseed plant outgroups.

M. J. Sanderson, M. F. Wojciechowski, J.-M. Hu, T. Sher Khan, and S. G. Brady

Relevance:

90.00% 90.00%

Publisher:

Abstract:

In this paper, we generalize the existing rate-one space frequency (SF) and space-time frequency (STF) code constructions. The objective of this exercise is to provide a systematic design of full-diversity STF codes with high coding gain. Under this generalization, STF codes are formulated as linear transformations of data. Conditions on these linear transforms are then derived so that the resulting STF codes achieve full diversity and high coding gain with a moderate decoding complexity. Many of these conditions involve channel parameters like delay profile (DP) and temporal correlation. When these quantities are not available at the transmitter, design of codes that exploit full diversity on channels with arbitrary DP and temporal correlation is considered. Complete characterization of a class of such robust codes is provided and their bit error rate (BER) performance is evaluated. On the other hand, when channel DP and temporal correlation are available at the transmitter, linear transforms are optimized to maximize the coding gain of full-diversity STF codes. BER performance of such optimized codes is shown to be better than those of existing codes.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

This paper presents a validation study on the application of a novel interslice interpolation technique for musculoskeletal structure segmentation of articulated joints and muscles on human magnetic resonance imaging data. The interpolation technique is based on morphological shape-based interpolation combined with intensity-based voxel classification. Shape-based interpolation in the absence of the original intensity image has been investigated intensively. However, in some applications of medical image analysis, the intensity image of the slice to be interpolated is available. For example, when manual segmentation is conducted on selected slices, the segmentation on those unselected slices can be obtained by interpolation. We proposed a two-step interpolation method to utilize both the shape information in the manual segmentation and local intensity information in the image. The method was tested on segmentations of knee, hip and shoulder joint bones and hamstring muscles. The results were compared with two existing interpolation methods. Based on the calculated Dice similarity coefficient and normalized error rate, the proposed method outperformed the other two methods.
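The Dice similarity coefficient mentioned in this abstract measures the overlap of two segmentations A and B as 2|A ∩ B| / (|A| + |B|). The sketch below represents each segmentation as a set of voxel coordinates; that representation, the function name, and the sample voxels are assumptions made for illustration, and the paper's "normalized error rate" is not reproduced because the abstract does not define it.

```python
def dice_coefficient(seg_a, seg_b):
    """Dice similarity coefficient between two sets of labeled voxels.

    Returns 1.0 for identical segmentations and 0.0 for disjoint ones.
    Two empty segmentations are treated as a perfect match by convention.
    """
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical voxel coordinates (x, y, z) for a reference and an
# interpolated segmentation that share two of their three voxels.
reference = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
interpolated = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
print(dice_coefficient(reference, interpolated))
```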

Relevance:

90.00% 90.00%

Publisher:

Abstract:

In this paper, we present a low-complexity algorithm for detection in high-rate, non-orthogonal space-time block coded (STBC) large-multiple-input multiple-output (MIMO) systems that achieve high spectral efficiencies of the order of tens of bps/Hz. We also present a training-based iterative detection/channel estimation scheme for such large STBC MIMO systems. Our simulation results show that excellent bit error rate and nearness-to-capacity performance are achieved by the proposed multistage likelihood ascent search (M-LAS) detector in conjunction with the proposed iterative detection/channel estimation scheme at low complexities. The fact that we could show such good results for large STBCs like 16 X 16 and 32 X 32 STBCs from Cyclic Division Algebras (CDA) operating at spectral efficiencies in excess of 20 bps/Hz (even after accounting for the overheads meant for pilot based training for channel estimation and turbo coding) establishes the effectiveness of the proposed detector and channel estimator. We decode perfect codes of large dimensions using the proposed detector. With the feasibility of such a low-complexity detection/channel estimation scheme, large-MIMO systems with tens of antennas operating at several tens of bps/Hz spectral efficiencies can become practical, enabling interesting high data rate wireless applications.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

802.11 WLANs are characterized by a high bit error rate and frequent changes in network topology. The key feature that distinguishes WLANs from wired networks is the multi-rate transmission capability, which helps to accommodate a wide range of channel conditions. This has a significant impact on higher layers such as the routing and transport levels. While many WLAN products provide rate control at the hardware level to adapt to the channel conditions, some chipsets like Atheros do not have support for automatic rate control. We first present a design and implementation of an FER-based automatic rate control state machine, which utilizes the statistics available at the device driver to find the optimal rate. The results show that the proposed rate switching mechanism adapts quite fast to the channel conditions. The hop count metric used by current routing protocols has proven itself for single-rate networks, but it fails to take into account other important factors in a multi-rate network environment. We propose transmission time as a better path quality metric to guide routing decisions. It incorporates the effects of contention for the channel, the air time to send the data, and the asymmetry of links. In this paper, we present a new design for a multi-rate mechanism as well as a new routing metric that is responsive to the rate. We address the issues involved in using transmission time as a metric and present a comparison of the performance of different metrics for dynamic routing.
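To see why hop count can mislead in a multi-rate network, the sketch below compares two paths using a simplified transmission-time estimate. The formula (packet size divided by link rate, scaled by the expected number of attempts) is an assumption chosen for illustration, not the metric defined in the paper; it ignores the contention and link asymmetry effects the abstract says the real metric includes, and all rates and delivery probabilities are hypothetical.

```python
def link_time(packet_bits, rate_bps, delivery_prob):
    """Expected air time in seconds to deliver one packet over one link.

    Assumed model: each attempt takes packet_bits / rate_bps seconds and
    succeeds independently with probability delivery_prob, so the expected
    number of attempts is 1 / delivery_prob.
    """
    return packet_bits / rate_bps / delivery_prob


def path_time(links, packet_bits=8000):
    """Sum of expected link times along a path of (rate_bps, delivery_prob) hops."""
    return sum(link_time(packet_bits, rate, prob) for rate, prob in links)


# Hypothetical paths: one slow direct hop versus two fast hops.
one_hop = [(1_000_000, 0.9)]
two_hops = [(11_000_000, 0.95), (11_000_000, 0.95)]

print(path_time(one_hop), path_time(two_hops))
```

Hop count would prefer the direct path, while this time-based estimate prefers the two faster hops under the stated assumptions.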