948 resultados para Kullback-leibler Divergence
Resumo:
International audience
Resumo:
A dolgozatban a döntéselméletben fontos szerepet játszó páros összehasonlítás mátrix prioritásvektorának meghatározására új megközelítést alkalmazunk. Az A páros összehasonlítás mátrix és a prioritásvektor által definiált B konzisztens mátrix közötti eltérést a Kullback-Leibler relatív entrópia-függvény segítségével mérjük. Ezen eltérés minimalizálása teljesen kitöltött mátrix esetében konvex programozási feladathoz vezet, nem teljesen kitöltött mátrix esetében pedig egy fixpont problémához. Az eltérésfüggvényt minimalizáló prioritásvektor egyben azzal a tulajdonsággal is rendelkezik, hogy az A mátrix elemeinek összege és a B mátrix elemeinek összege közötti különbség éppen az eltérésfüggvény minimumának az n-szerese, ahol n a feladat mérete. Így az eltérésfüggvény minimumának értéke két szempontból is lehet alkalmas az A mátrix inkonzisztenciájának a mérésére. _____ In this paper we apply a new approach for determining a priority vector for the pairwise comparison matrix which plays an important role in Decision Theory. The divergence between the pairwise comparison matrix A and the consistent matrix B defined by the priority vector is measured with the help of the Kullback-Leibler relative entropy function. The minimization of this divergence leads to a convex program in case of a complete matrix, leads to a fixed-point problem in case of an incomplete matrix. The priority vector minimizing the divergence also has the property that the difference of the sums of elements of the matrix A and the matrix B is n times the minimum of the divergence function where n is the dimension of the problem. Thus we developed two reasons for considering the value of the minimum of the divergence as a measure of inconsistency of the matrix A.
Resumo:
The content-based image retrieval is important for various purposes like disease diagnoses from computerized tomography, for example. The relevance, social and economic of image retrieval systems has created the necessity of its improvement. Within this context, the content-based image retrieval systems are composed of two stages, the feature extraction and similarity measurement. The stage of similarity is still a challenge due to the wide variety of similarity measurement functions, which can be combined with the different techniques present in the recovery process and return results that aren’t always the most satisfactory. The most common functions used to measure the similarity are the Euclidean and Cosine, but some researchers have noted some limitations in these functions conventional proximity, in the step of search by similarity. For that reason, the Bregman divergences (Kullback Leibler and I-Generalized) have attracted the attention of researchers, due to its flexibility in the similarity analysis. Thus, the aim of this research was to conduct a comparative study over the use of Bregman divergences in relation the Euclidean and Cosine functions, in the step similarity of content-based image retrieval, checking the advantages and disadvantages of each function. For this, it was created a content-based image retrieval system in two stages: offline and online, using approaches BSM, FISM, BoVW and BoVW-SPM. With this system was created three groups of experiments using databases: Caltech101, Oxford and UK-bench. The performance of content-based image retrieval system using the different functions of similarity was tested through of evaluation measures: Mean Average Precision, normalized Discounted Cumulative Gain, precision at k, precision x recall. Finally, this study shows that the use of Bregman divergences (Kullback Leibler and Generalized) obtains better results than the Euclidean and Cosine measures with significant gains for content-based image retrieval.
Resumo:
Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one quantifying the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing to proportion values in the interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, they are deemed inappropriate due to the presence of zeros and/or ones. To evade this, we introduce a class of general proportion density, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and application to a real dataset from a clinical periodontology study.
Resumo:
We propose and analyze two different Bayesian online algorithms for learning in discrete Hidden Markov Models and compare their performance with the already known Baldi-Chauvin Algorithm. Using the Kullback-Leibler divergence as a measure of generalization we draw learning curves in simplified situations for these algorithms and compare their performances.
Resumo:
R from http://www.r-project.org/ is ‘GNU S’ – a language and environment for statistical computingand graphics. The environment in which many classical and modern statistical techniques havebeen implemented, but many are supplied as packages. There are 8 standard packages and many moreare available through the cran family of Internet sites http://cran.r-project.org .We started to develop a library of functions in R to support the analysis of mixtures and our goal isa MixeR package for compositional data analysis that provides support foroperations on compositions: perturbation and power multiplication, subcomposition with or withoutresiduals, centering of the data, computing Aitchison’s, Euclidean, Bhattacharyya distances,compositional Kullback-Leibler divergence etc.graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features:barycenter, geometric mean of the data set, the percentiles lines, marking and coloring ofsubsets of the data set, theirs geometric means, notation of individual data in the set . . .dealing with zeros and missing values in compositional data sets with R procedures for simpleand multiplicative replacement strategy,the time series analysis of compositional data.We’ll present the current status of MixeR development and illustrate its use on selected data sets
Resumo:
Optimal robust M-estimates of a multidimensional parameter are described using Hampel's infinitesimal approach. The optimal estimates are derived by minimizing a measure of efficiency under the model, subject to a bounded measure of infinitesimal robustness. To this purpose we define measures of efficiency and infinitesimal sensitivity based on the Hellinger distance.We show that these two measures coincide with similar ones defined by Yohai using the Kullback-Leibler divergence, and therefore the corresponding optimal estimates coincide too.We also give an example where we fit a negative binomial distribution to a real dataset of "days of stay in hospital" using the optimal robust estimates.
Resumo:
In a recent paper, Komaki studied the second-order asymptotic properties of predictive distributions, using the Kullback-Leibler divergence as a loss function. He showed that estimative distributions with asymptotically efficient estimators can be improved by predictive distributions that do not belong to the model. The model is assumed to be a multidimensional curved exponential family. In this paper we generalize the result assuming as a loss function any f divergence. A relationship arises between alpha connections and optimal predictive distributions. In particular, using an alpha divergence to measure the goodness of a predictive distribution, the optimal shift of the estimate distribution is related to alpha-covariant derivatives. The expression that we obtain for the asymptotic risk is also useful to study the higher-order asymptotic properties of an estimator, in the mentioned class of loss functions.
Resumo:
On étudie l’application des algorithmes de décomposition matricielles tel que la Factorisation Matricielle Non-négative (FMN), aux représentations fréquentielles de signaux audio musicaux. Ces algorithmes, dirigés par une fonction d’erreur de reconstruction, apprennent un ensemble de fonctions de base et un ensemble de coef- ficients correspondants qui approximent le signal d’entrée. On compare l’utilisation de trois fonctions d’erreur de reconstruction quand la FMN est appliquée à des gammes monophoniques et harmonisées: moindre carré, divergence Kullback-Leibler, et une mesure de divergence dépendente de la phase, introduite récemment. Des nouvelles méthodes pour interpréter les décompositions résultantes sont présentées et sont comparées aux méthodes utilisées précédemment qui nécessitent des connaissances du domaine acoustique. Finalement, on analyse la capacité de généralisation des fonctions de bases apprises par rapport à trois paramètres musicaux: l’amplitude, la durée et le type d’instrument. Pour ce faire, on introduit deux algorithmes d’étiquetage des fonctions de bases qui performent mieux que l’approche précédente dans la majorité de nos tests, la tâche d’instrument avec audio monophonique étant la seule exception importante.
Resumo:
R from http://www.r-project.org/ is ‘GNU S’ – a language and environment for statistical computing and graphics. The environment in which many classical and modern statistical techniques have been implemented, but many are supplied as packages. There are 8 standard packages and many more are available through the cran family of Internet sites http://cran.r-project.org . We started to develop a library of functions in R to support the analysis of mixtures and our goal is a MixeR package for compositional data analysis that provides support for operations on compositions: perturbation and power multiplication, subcomposition with or without residuals, centering of the data, computing Aitchison’s, Euclidean, Bhattacharyya distances, compositional Kullback-Leibler divergence etc. graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features: barycenter, geometric mean of the data set, the percentiles lines, marking and coloring of subsets of the data set, theirs geometric means, notation of individual data in the set . . . dealing with zeros and missing values in compositional data sets with R procedures for simple and multiplicative replacement strategy, the time series analysis of compositional data. We’ll present the current status of MixeR development and illustrate its use on selected data sets
Resumo:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models since the error distributions cover both skewness and heavy-tailed distributions such as the skew-t, skew-slash and the skew-contaminated normal distributions. The main advantage of these class of distributions is that they have a nice hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robust aspects of this flexible class, against outlying and influential observations, we present a Bayesian case deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussions on the model selection criteria are given. The newly developed procedures are illustrated considering two simulations study, and a real data previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The purpose of this paper is to develop a Bayesian approach for log-Birnbaum-Saunders Student-t regression models under right-censored survival data. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the considered model. In order to attenuate the influence of the outlying observations on the parameter estimates, we present in this paper Birnbaum-Saunders models in which a Student-t distribution is assumed to explain the cumulative damage. Also, some discussions on the model selection to compare the fitted models are given and case deletion influence diagnostics are developed for the joint posterior distribution based on the Kullback-Leibler divergence. The developed procedures are illustrated with a real data set. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The segmentation of an image aims to subdivide it into constituent regions or objects that have some relevant semantic content. This subdivision can also be applied to videos. However, in these cases, the objects appear in various frames that compose the videos. The task of segmenting an image becomes more complex when they are composed of objects that are defined by textural features, where the color information alone is not a good descriptor of the image. Fuzzy Segmentation is a region-growing segmentation algorithm that uses affinity functions in order to assign to each element in an image a grade of membership for each object (between 0 and 1). This work presents a modification of the Fuzzy Segmentation algorithm, for the purpose of improving the temporal and spatial complexity. The algorithm was adapted to segmenting color videos, treating them as 3D volume. In order to perform segmentation in videos, conventional color model or a hybrid model obtained by a method for choosing the best channels were used. The Fuzzy Segmentation algorithm was also applied to texture segmentation by using adaptive affinity functions defined for each object texture. Two types of affinity functions were used, one defined using the normal (or Gaussian) probability distribution and the other using the Skew Divergence. This latter, a Kullback-Leibler Divergence variation, is a measure of the difference between two probability distributions. Finally, the algorithm was tested in somes videos and also in texture mosaic images composed by images of the Brodatz album
Resumo:
In this paper, we propose a bivariate distribution for the bivariate survival times based on Farlie-Gumbel-Morgenstern copula to model the dependence on a bivariate survival data. The proposed model allows for the presence of censored data and covariates. For inferential purpose a Bayesian approach via Markov Chain Monte Carlo (MCMC) is considered. Further, some discussions on the model selection criteria are given. In order to examine outlying and influential observations, we present a Bayesian case deletion influence diagnostics based on the Kullback-Leibler divergence. The newly developed procedures are illustrated via a simulation study and a real dataset.
Resumo:
We propose a method to measure real-valued time series irreversibility which combines two different tools: the horizontal visibility algorithm and the Kullback-Leibler divergence. This method maps a time series to a directed network according to a geometric criterion. The degree of irreversibility of the series is then estimated by the Kullback-Leibler divergence (i.e. the distinguishability) between the in and out degree distributions of the associated graph. The method is computationally efficient and does not require any ad hoc symbolization process. We find that the method correctly distinguishes between reversible and irreversible stationary time series, including analytical and numerical studies of its performance for: (i) reversible stochastic processes (uncorrelated and Gaussian linearly correlated), (ii) irreversible stochastic processes (a discrete flashing ratchet in an asymmetric potential), (iii) reversible (conservative) and irreversible (dissipative) chaotic maps, and (iv) dissipative chaotic maps in the presence of noise. Two alternative graph functionals, the degree and the degree-degree distributions, can be used as the Kullback-Leibler divergence argument. The former is simpler and more intuitive and can be used as a benchmark, but in the case of an irreversible process with null net current, the degree-degree distribution has to be considered to identify the irreversible nature of the series