25 results for Bayesian classifier


Relevance:

10.00%

Publisher:

Abstract:

PURPOSE: Fatty liver disease (FLD) is an increasingly prevalent disease that can be reversed if detected early. Ultrasound is the safest and most widely available method for identifying FLD. Since expert sonographers are required to interpret liver ultrasound images accurately, their absence leads to interobserver variability. For more objective interpretation, high accuracy, and quick second opinions, computer-aided diagnostic (CAD) techniques may be exploited. The purpose of this work is to develop one such CAD technique for the accurate classification of normal livers and abnormal livers affected by FLD. METHODS: The authors present a CAD technique (called Symtosis) that uses a novel combination of significant features, based on the texture, wavelet transform, and higher-order spectra of the liver ultrasound images, in various supervised learning-based classifiers in order to determine parameters that separate normal from FLD-affected abnormal livers. RESULTS: Evaluated on a database of 58 abnormal and 42 normal liver ultrasound images, the proposed technique achieved a high classification accuracy of 93.3% using the decision tree classifier. CONCLUSIONS: This high accuracy, together with the completely automated classification procedure, makes the proposed technique highly suitable for clinical deployment and usage.
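
As a rough illustration of the classification stage described in this abstract, the sketch below trains a decision tree on pre-extracted feature vectors. The feature values, dimensionality, and cross-validation setup are placeholders; the paper's texture, wavelet, and higher-order-spectra feature extraction is not reproduced.

```python
# Sketch of the supervised classification stage: a decision tree over
# pre-extracted feature vectors. Feature extraction from the ultrasound
# images is assumed to have happened upstream and is not shown.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 12))          # placeholder: 100 images x 12 features
X[:58, 0] += 1.5                        # make one feature weakly informative
y = np.array([1] * 58 + [0] * 42)       # 58 abnormal (FLD), 42 normal

clf = DecisionTreeClassifier(max_depth=4, random_state=0)
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean CV accuracy: {scores.mean():.3f}")
```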

Relevance:

10.00%

Publisher:

Abstract:

In this work, the identification and diagnosis of various stages of chronic liver disease are addressed. The classification results of a support vector machine, a decision tree, and a k-nearest neighbor classifier are compared. Ultrasound image intensity and textural features are used jointly with clinical and laboratory data in the staging process. The classifiers are trained on a population of 97 patients at six different stages of chronic liver disease, using a leave-one-out cross-validation strategy. The best results are obtained with the support vector machine with a radial-basis kernel, which reaches an overall accuracy of 73.20%. The good performance of the method is a promising indicator that it can be used, in a non-invasive way, to provide reliable information about chronic liver disease staging.
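
The staging experiment lends itself to a short sketch: an RBF-kernel SVM evaluated with leave-one-out cross-validation, as the abstract describes. The feature matrix and stage labels below are random placeholders standing in for the ultrasound, clinical, and laboratory data.

```python
# Sketch of the staging evaluation: RBF-kernel SVM with leave-one-out
# cross-validation over 97 patients and six disease stages.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(97, 20))           # 97 patients x 20 joint features
y = rng.integers(0, 6, size=97)         # six disease stages (0..5)
X[:, 0] += y                            # couple one feature to the stage

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
print(f"leave-one-out accuracy: {acc:.4f}")
```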

Relevance:

10.00%

Publisher:

Abstract:

In the last decade, local image features have been widely used in robot visual localization. To assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image to those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. A deeper insight into the potential of the sum and product combiners is provided by testing two extensions of these algebraic rules: threshold and weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions achieve the best overall performance. The voting method, whilst competitive with the algebraic rules in their standard form, is shown to be outperformed by both their modified versions.
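
The algebraic combiners at the core of this comparison can be sketched in a few lines. The score matrix below is invented; each row holds one feature matcher's normalized scores over candidate places, and the sum and product rules aggregate them per place.

```python
# Sketch of algebraic score combination over multiple feature matchers.
# Each row of `scores` holds one matcher's normalized similarity scores
# for every candidate place; the combiners aggregate them per place.
import numpy as np

scores = np.array([                      # 3 matchers x 4 candidate places
    [0.80, 0.10, 0.60, 0.30],
    [0.70, 0.20, 0.65, 0.25],
    [0.90, 0.05, 0.50, 0.40],
])

sum_rule = scores.sum(axis=0)            # sum combiner
product_rule = scores.prod(axis=0)       # product combiner

print("sum rule picks place", sum_rule.argmax())
print("product rule picks place", product_rule.argmax())
```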

Relevance:

10.00%

Publisher:

Abstract:

Research on cluster analysis for categorical data continues to develop, with new clustering algorithms regularly being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Boulton, 1968). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number from a set of pre-estimated candidate models. The performance of our approach is compared with the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to recover the true number of clusters while outperforming BIC and ICL in speed, which is especially relevant when dealing with large data sets.
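
A minimal sketch of the component-annihilation idea follows, using a batch-EM simplification of the Figueiredo and Jain (2002) procedure (which updates components one at a time) for a multinomial mixture. The data, the number of initial components, and the penalty are illustrative assumptions; the toy run prunes superfluous components but does not guarantee exact recovery of the true number.

```python
# Batch-EM sketch of MML-penalized mixture estimation for multinomial
# data: start with too many components and let the penalized weight
# update annihilate those with too little support.
import numpy as np

rng = np.random.default_rng(0)
N, C, K0 = 400, 6, 8                     # samples, categories, initial components
true_theta = np.array([[.70, .10, .05, .05, .05, .05],
                       [.05, .70, .10, .05, .05, .05],
                       [.05, .05, .05, .10, .70, .05]])
X = np.array([rng.multinomial(10, true_theta[k])
              for k in rng.integers(0, 3, N)])   # multinomial count vectors

theta = rng.dirichlet(np.ones(C), K0)    # component parameters
w = np.full(K0, 1.0 / K0)                # mixing weights
d = C - 1                                # free parameters per component

for _ in range(300):
    # E-step: responsibilities (the multinomial coefficient cancels out)
    lik = np.exp(X @ np.log(theta).T) * w
    r = lik / lik.sum(axis=1, keepdims=True)

    # MML-penalized weight update: weak components are annihilated
    raw = np.maximum(r.sum(axis=0) - d / 2.0, 0.0)
    keep = raw > 0
    w, r, theta = raw[keep] / raw[keep].sum(), r[:, keep], theta[keep]

    # M-step for the surviving components' multinomial parameters
    theta = (r.T @ X) + 1e-9
    theta /= theta.sum(axis=1, keepdims=True)

print("surviving components:", len(w))
```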

Relevance:

10.00%

Publisher:

Abstract:

Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection into a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, compared with the above-mentioned criteria, is its speed of execution, which is especially relevant when dealing with large data sets.
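
For contrast, the baseline strategy that both abstracts argue against, fitting one model per candidate number of clusters and ranking the fits by an information criterion, can be sketched as follows. Gaussian mixtures stand in here for the multinomial mixtures of the paper, since scikit-learn ships a ready-made BIC for them.

```python
# Sketch of the pre-estimated-candidates baseline: fit one mixture per
# candidate k and pick the number of clusters with the lowest BIC.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.3, size=(100, 2)) for m in (0.0, 2.0, 4.0)])

bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(1, 7)}
print("BIC-selected number of clusters:", min(bic, key=bic.get))
```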

Relevance:

10.00%

Publisher:

Abstract:

Discrete data representations are necessary, or at least convenient, in many machine learning problems. While feature selection (FS) techniques aim at finding relevant subsets of features, the goal of feature discretization (FD) is to find concise (quantized) data representations adequate for the learning task at hand. In this paper, we propose two incremental methods for FD. The first method belongs to the filter family, in which the quality of the discretization is assessed by a (supervised or unsupervised) relevance criterion. The second method is a wrapper, in which discretized features are assessed using a classifier. Both methods can be coupled with any static (unsupervised or supervised) discretization procedure and can be used to perform FS as a pre-processing or post-processing stage. The proposed methods attain efficient representations suitable for binary and multi-class problems with different types of data, and are competitive with existing methods. Moreover, using well-known FS methods on the features discretized by our techniques leads to better accuracy than on the features discretized by other methods or on the original features.
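
A hedged sketch of the filter-style pipeline follows: discretize the features, then select the most relevant quantized features. The uniform binning and mutual-information filter are illustrative stand-ins, not the incremental procedures proposed in the paper.

```python
# Sketch of feature discretization followed by feature selection on the
# quantized representation. Bin count and relevance criterion are
# illustrative choices.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 3] > 0).astype(int)  # labels depend on features 0 and 3

Xd = KBinsDiscretizer(n_bins=4, encode="ordinal",
                      strategy="uniform").fit_transform(X)
selected = SelectKBest(mutual_info_classif, k=2).fit(Xd, y)
print("selected features:", selected.get_support(indices=True))
```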

Relevance:

10.00%

Publisher:

Abstract:

This study focuses on the probabilistic modelling of the mechanical properties of prestressing strands, based on data collected from tensile tests carried out at the Laboratório Nacional de Engenharia Civil (LNEC), Portugal, for certification purposes, covering a period of about nine years of production. The strands studied were produced by six manufacturers from four countries, namely Portugal, Spain, Italy, and Thailand. The variability of the most important mechanical properties is examined, and the results are compared with the recommendations of the Probabilistic Model Code, as well as with the Eurocodes and earlier studies. The obtained results show a very low variability which, of course, benefits structural safety. Based on those results, probabilistic models for the most important mechanical properties of prestressing strands are proposed.
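
A minimal sketch of the kind of probabilistic modelling involved: fitting a distribution to tensile-test results and reading off the coefficient of variation and a characteristic value. The normal model and the sample values are assumptions for illustration; the paper derives its models from the LNEC certification data.

```python
# Sketch of fitting a probabilistic model to tensile-test results.
# The normal model and the sample values are invented for illustration.
import numpy as np
from scipy import stats

strength = np.array([1870, 1885, 1862, 1890, 1878,   # MPa, invented values
                     1869, 1881, 1874, 1888, 1866])

mu, sigma = stats.norm.fit(strength)
cov = sigma / mu                          # coefficient of variation
print(f"mean = {mu:.1f} MPa, std = {sigma:.1f} MPa, CoV = {cov:.4f}")

# 5% characteristic value under the fitted normal model
print(f"characteristic value: {stats.norm.ppf(0.05, mu, sigma):.1f} MPa")
```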

Relevance:

10.00%

Publisher:

Abstract:

In this article, we present the first study on probabilistic tsunami hazard assessment for the Northeast (NE) Atlantic region related to earthquake sources. The methodology combines probabilistic seismic hazard assessment, tsunami numerical modeling, and statistical approaches. We consider three main tsunamigenic areas, namely the Southwest Iberian Margin, the Gloria, and the Caribbean. For each tsunamigenic zone, we derive the annual recurrence rate for each magnitude range, from Mw 8.0 up to Mw 9.0 at regular intervals, using the Bayesian method, which incorporates seismic information from historical and instrumental catalogs. A numerical code solving the shallow water equations is employed to simulate the tsunami propagation and compute nearshore wave heights. The probability of exceeding a specific tsunami hazard level during a given time period is calculated using the Poisson distribution. The results are presented in terms of the probability of exceedance of a given tsunami amplitude for 100- and 500-year return periods. The hazard level varies along the NE Atlantic coast, being highest along the northern segment of the Moroccan Atlantic coast, the southern Portuguese coast, and the Spanish coast of the Gulf of Cadiz. We find that the probability that a maximum wave height exceeds 1 m somewhere in the NE Atlantic region reaches 60% and 100% for the 100- and 500-year return periods, respectively. These probabilities decrease to about 15% and 50%, respectively, when considering an exceedance threshold of 5 m for the same return periods.
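
The Poisson exceedance computation mentioned above reduces to a one-line formula: given an annual rate λ of tsunamis exceeding a hazard level, the probability of at least one exceedance in T years is 1 - exp(-λT). The rate in the sketch below is an illustrative assumption, not a value from the study.

```python
# Poisson exceedance probability for a given return period.
# `annual_rate` is an assumed rate of exceedances per year, not a
# value taken from the study.
import math

annual_rate = 0.009                       # assumed exceedances per year

for T in (100, 500):                      # return periods from the study
    p = 1.0 - math.exp(-annual_rate * T)
    print(f"P(at least one exceedance in {T} years) = {p:.2%}")
```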

Relevance:

10.00%

Publisher:

Abstract:

Electrocardiogram (ECG) biometrics are a relatively recent trend in biometric recognition, with at least 13 years of development in the peer-reviewed literature. Most of the proposed biometric techniques perform classification on features extracted either from heartbeats or from ECG-based transformed signals. The best representation is yet to be decided. This paper studies an alternative representation, a dissimilarity space, based on the pairwise dissimilarity between templates and subjects' signals. Additionally, this representation can make use of ECG signals sourced from multiple leads. Configurations of three leads are tested and contrasted with single-lead experiments. Using the same k-NN classifier, the results proved superior to those obtained with a similar algorithm that does not employ a dissimilarity representation. The best authentication EER went as low as 1.53% for a database of 503 subjects. However, the use of extra leads did not prove advantageous.
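
The dissimilarity-space representation can be sketched briefly: each signal is re-described by its distances to a fixed set of templates, and a k-NN classifier operates in that space. The signals, templates, and distance metric below are placeholders for the ECG heartbeats and dissimilarity measure used in the paper.

```python
# Sketch of a dissimilarity-space representation with a k-NN classifier.
# Random prototypes stand in for subjects' heartbeat templates.
import numpy as np
from sklearn.metrics import pairwise_distances
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
templates = rng.normal(size=(8, 64))      # 8 reference templates
prototypes = rng.normal(size=(4, 64))     # one signal prototype per subject
labels = rng.integers(0, 4, size=40)      # 4 subjects, 40 heartbeat segments
signals = prototypes[labels] + 0.3 * rng.normal(size=(40, 64))

# Map each signal to its vector of dissimilarities to the templates
D = pairwise_distances(signals, templates, metric="euclidean")

knn = KNeighborsClassifier(n_neighbors=3).fit(D[:30], labels[:30])
print("accuracy on held-out signals:", knn.score(D[30:], labels[30:]))
```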

Relevance:

10.00%

Publisher:

Abstract:

In the last decade, local image features have been widely used in robot visual localization. In order to assess image similarity, a strategy exploiting these features compares raw descriptors extracted from the current image with those in the models of places. This paper addresses the ensuing step in this process, where a combining function must be used to aggregate results and assign each place a score. Casting the problem in the multiple classifier systems framework, we compare several candidate combiners with respect to their performance in the visual localization task. For this evaluation, we selected the most popular methods in the class of non-trained combiners, namely the sum rule and the product rule. A deeper insight into the potential of these combiners is provided through a discriminativity analysis involving the algebraic rules and two extensions of these methods: the threshold and weighted modifications. In addition, a voting method, previously used in robot visual localization, is assessed. Furthermore, we address the process of constructing a model of the environment by describing how the model granularity impacts performance. All combiners are tested on a visual localization task, carried out on a public dataset. It is experimentally demonstrated that the sum rule extensions achieve the best overall performance, confirming the general agreement on the robustness of this rule in other classification problems. The voting method, whilst competitive with the product rule in its standard form, is shown to be outperformed by its modified versions.
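
The threshold and weighted modifications of the sum rule, tested in this work, can be sketched as follows; the specific threshold and matcher weights are illustrative values, not those tuned in the paper.

```python
# Sketch of the threshold and weighted modifications of the sum rule.
# Scores below a floor are clamped before summing, and each matcher's
# contribution is scaled by an assumed reliability weight.
import numpy as np

scores = np.array([                       # 3 matchers x 4 candidate places
    [0.80, 0.10, 0.60, 0.30],
    [0.70, 0.20, 0.65, 0.25],
    [0.90, 0.05, 0.50, 0.40],
])

# Thresholded sum: suppress weak, noise-dominated scores
t = 0.25
thresholded_sum = np.where(scores >= t, scores, 0.0).sum(axis=0)

# Weighted sum: trust matchers according to assumed reliability
w = np.array([0.5, 0.3, 0.2])
weighted_sum = (w[:, None] * scores).sum(axis=0)

print("thresholded sum picks place", thresholded_sum.argmax())
print("weighted sum picks place", weighted_sum.argmax())
```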