54 results for Kullback leibler


Relevance:

100.00%

Publisher:

Abstract:

In this paper we apply a new approach to determining the priority vector of a pairwise comparison matrix, which plays an important role in decision theory. The divergence between the pairwise comparison matrix A and the consistent matrix B defined by the priority vector is measured with the Kullback-Leibler relative entropy function. Minimizing this divergence leads to a convex programming problem for a completely filled-in matrix and to a fixed-point problem for an incomplete one. The priority vector minimizing the divergence also has the property that the difference between the sum of the elements of A and the sum of the elements of B is exactly n times the minimum of the divergence function, where n is the dimension of the problem. The value of this minimum is therefore suitable, on two counts, as a measure of the inconsistency of the matrix A.
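A minimal numerical sketch of the complete-matrix case (an illustration, not the authors' formulation: it assumes the divergence is the generalized Kullback-Leibler divergence between positive matrices, with the consistent matrix b_ij = w_i / w_j, and optimizes in log-coordinates, where the objective is convex):

```python
# Recover a priority vector w from a complete pairwise comparison matrix A
# by minimizing the generalized KL divergence between A and the consistent
# matrix B with b_ij = w_i / w_j. In u = log(w) the objective is convex;
# it is invariant under adding a constant to u, so w is normalized at the end.
import numpy as np
from scipy.optimize import minimize

def kl_priority_vector(A):
    n = A.shape[0]

    def objective(u):
        B = np.exp(u[:, None] - u[None, :])          # b_ij = w_i / w_j
        return np.sum(A * np.log(A / B) - A + B)     # generalized KL divergence

    res = minimize(objective, np.zeros(n), method="BFGS")
    w = np.exp(res.x)
    return w / w.sum(), res.fun                      # normalized priorities, D_min

# Example: a slightly inconsistent 3x3 comparison matrix.
A = np.array([[1.0, 2.0, 6.0],
              [0.5, 1.0, 3.0],
              [1/6, 1/3, 1.0]])
w, dmin = kl_priority_vector(A)
print(w, dmin)  # per the abstract, sum(A) and sum(B) differ by n * dmin
```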

Relevance:

100.00%

Publisher:

Abstract:

Content-based image retrieval is important for purposes such as disease diagnosis from computerized tomography. The social and economic relevance of image retrieval systems has created the need for their improvement. In this context, content-based image retrieval systems consist of two stages: feature extraction and similarity measurement. The similarity stage remains a challenge because of the wide variety of similarity functions, which can be combined with the different techniques present in the retrieval process and do not always return the most satisfactory results. The functions most commonly used to measure similarity are the Euclidean distance and the cosine similarity, but some researchers have noted limitations of these conventional proximity functions in the similarity-search step. For that reason, the Bregman divergences (Kullback-Leibler and generalized I-divergence) have attracted the attention of researchers owing to their flexibility in similarity analysis. The aim of this research was therefore to conduct a comparative study of the Bregman divergences against the Euclidean and cosine functions in the similarity step of content-based image retrieval, examining the advantages and disadvantages of each function. To this end, a content-based image retrieval system was built in two stages, offline and online, using the BSM, FISM, BoVW and BoVW-SPM approaches. With this system, three groups of experiments were run on the Caltech101, Oxford and UK-bench databases. The performance of the system under the different similarity functions was assessed with the evaluation measures Mean Average Precision, normalized Discounted Cumulative Gain, precision at k, and precision x recall. The study shows that the Bregman divergences (Kullback-Leibler and generalized) obtain better results than the Euclidean and cosine measures, with significant gains for content-based image retrieval.
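For concreteness, minimal implementations of the four proximity functions compared in the study, applied to feature vectors (illustrative sketches under the usual definitions; the eps guard and the example histograms are our assumptions, not the authors' code):

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)

def euclidean(p, q):
    return np.linalg.norm(p - q)

def cosine_distance(p, q):
    return 1.0 - p @ q / (np.linalg.norm(p) * np.linalg.norm(q) + EPS)

def kullback_leibler(p, q):
    p, q = p + EPS, q + EPS
    return np.sum(p * np.log(p / q))

def generalized_i_divergence(p, q):
    # Bregman divergence of the entropy function on unnormalized vectors.
    p, q = p + EPS, q + EPS
    return np.sum(p * np.log(p / q) - p + q)

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
for f in (euclidean, cosine_distance, kullback_leibler, generalized_i_divergence):
    print(f.__name__, f(p, q))
```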

Relevance:

100.00%

Publisher:

Abstract:


Relevance:

60.00%

Publisher:

Abstract:

Often in biomedical research we deal with continuous (clustered) proportion responses ranging between zero and one that quantify the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing proportion values in the closed interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects, but these densities are deemed inappropriate in the presence of zeros and/or ones. To circumvent this, we introduce a class of general proportion densities and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and an application to a real dataset from a clinical periodontology study.
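A minimal sketch of the augmentation idea (our illustration: the paper's general proportion density is stood in for by a beta density, and the clustering and covariates are omitted), with hypothetical point masses p0 and p1 at zero and one:

```python
import numpy as np
from scipy.stats import beta

def zoab_logpdf(y, p0, p1, a, b):
    """Log-density of a zero-and-one-augmented beta: point masses p0 at 0
    and p1 at 1, with the remaining mass spread as beta(a, b) on (0, 1)."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    out[y == 0.0] = np.log(p0)
    out[y == 1.0] = np.log(p1)
    inside = (y > 0.0) & (y < 1.0)
    out[inside] = np.log(1.0 - p0 - p1) + beta.logpdf(y[inside], a, b)
    return out

y = np.array([0.0, 0.15, 0.4, 1.0])
print(zoab_logpdf(y, p0=0.1, p1=0.05, a=2.0, b=5.0).sum())  # log-likelihood
```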

Relevance:

60.00%

Publisher:

Abstract:

We propose and analyze two different Bayesian online algorithms for learning in discrete hidden Markov models and compare their performance with the well-known Baldi-Chauvin algorithm. Using the Kullback-Leibler divergence as a measure of generalization, we draw learning curves for these algorithms in simplified situations and compare their performances.
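As a rough illustration of using the Kullback-Leibler divergence to measure generalization in a discrete HMM (a stand-in sketch; the paper's exact generalization functional is not reproduced here), one can compare the learned transition and emission matrices with the true ones row by row:

```python
import numpy as np

def mean_row_kl(P, Q, eps=1e-12):
    """Average KL divergence between corresponding rows of P and Q."""
    P, Q = P + eps, Q + eps
    return np.mean(np.sum(P * np.log(P / Q), axis=1))

A_true = np.array([[0.9, 0.1], [0.2, 0.8]])      # true transition matrix
B_true = np.array([[0.7, 0.3], [0.1, 0.9]])      # true emission matrix
A_hat  = np.array([[0.85, 0.15], [0.25, 0.75]])  # learned transitions
B_hat  = np.array([[0.65, 0.35], [0.15, 0.85]])  # learned emissions

print(mean_row_kl(A_true, A_hat) + mean_row_kl(B_true, B_hat))
```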

Relevance:

60.00%

Publisher:

Abstract:

The sensitivity of the output of a linear operator to its input can be quantified in various ways. In control theory, the input is usually interpreted as a disturbance and the output is to be minimized in some sense. In stochastic worst-case design settings, the disturbance is considered random with an imprecisely known probability distribution. The prior set of probability measures can be chosen so as to quantify how far the disturbance deviates from the white-noise hypothesis of linear quadratic Gaussian control. Such deviation can be measured by the minimal Kullback-Leibler informational divergence from the Gaussian distributions with zero mean and scalar covariance matrices. The resulting anisotropy functional is defined for finite-power random vectors. Originally, anisotropy was introduced for directionally generic random vectors as the relative entropy of the normalized vector with respect to the uniform distribution on the unit sphere. The associated a-anisotropic norm of a matrix is then its maximum root mean square or average energy gain with respect to finite-power or directionally generic inputs whose anisotropy is bounded above by a ≥ 0. We give a systematic comparison of the anisotropy functionals and the associated norms. These are considered for unboundedly growing fragments of homogeneous Gaussian random fields on a multidimensional integer lattice to yield the mean anisotropy. Correspondingly, the anisotropic norms of finite matrices are extended to bounded linear translation-invariant operators over such fields.
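For the Gaussian case the minimal divergence has a well-known closed form in the anisotropy literature (the notation m and Σ is ours, not the abstract's): for a zero-mean Gaussian random m-vector w with nonsingular covariance matrix Σ,

```latex
\mathbf{A}(w)
  = \min_{\lambda > 0} D\!\left( \mathcal{N}(0, \Sigma) \,\middle\|\, \mathcal{N}(0, \lambda I_m) \right)
  = -\frac{1}{2} \ln \det \frac{m \Sigma}{\operatorname{tr} \Sigma},
```

which vanishes exactly when Σ is scalar, i.e. when w is white noise up to scale; the minimizing scale is λ = tr Σ / m.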

Relevance:

60.00%

Publisher:

Abstract:

The information necessary to distinguish a local inhomogeneous mass density field from its spatial average on a compact domain of the universe can be measured by relative information entropy. The Kullback-Leibler (KL) formula arises very naturally in this context; however, it provides a very complicated way to compute the mutual information between spatially separated but causally connected regions of the universe in a realistic, inhomogeneous model. To circumvent this issue, by considering a parametric extension of the KL measure, we develop a simple model to describe the mutual information which is entangled via the gravitational field equations. We show that the Tsallis relative entropy can be a good approximation in the case of small inhomogeneities, and for measuring the independent relative information inside the domain we propose the Rényi relative entropy formula.
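For reference, the three relative entropies mentioned, in one standard convention (the paper's normalization may differ): for densities p and q and order α ≠ 1,

```latex
D_{\mathrm{KL}}(p \,\|\, q) = \int p \ln\frac{p}{q}\, dV, \qquad
D_{\alpha}^{T}(p \,\|\, q) = \frac{1}{\alpha - 1}\left( \int p^{\alpha} q^{1-\alpha}\, dV - 1 \right), \qquad
D_{\alpha}^{R}(p \,\|\, q) = \frac{1}{\alpha - 1} \ln \int p^{\alpha} q^{1-\alpha}\, dV,
```

where D^T is the Tsallis and D^R the Rényi relative entropy; both reduce to the KL formula in the limit α → 1.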

Relevance:

60.00%

Publisher:

Abstract:

R, available from http://www.r-project.org/, is ‘GNU S’: a language and environment for statistical computing and graphics. Many classical and modern statistical techniques are implemented in the base environment, but many more are supplied as packages. There are 8 standard packages, and many more are available through the CRAN family of Internet sites, http://cran.r-project.org. We started to develop a library of functions in R to support the analysis of mixtures, and our goal is a MixeR package for compositional data analysis that provides support for: operations on compositions (perturbation and power multiplication, subcomposition with or without residuals, centering of the data, computing the Aitchison, Euclidean and Bhattacharyya distances, the compositional Kullback-Leibler divergence, etc.); graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features (barycenter, geometric mean of the data set, percentile lines, marking and coloring of subsets of the data set and their geometric means, annotation of individual data in the set, ...); dealing with zeros and missing values in compositional data sets, with R procedures for the simple and multiplicative replacement strategies; and time series analysis of compositional data. We will present the current status of MixeR development and illustrate its use on selected data sets.
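Minimal Python sketches of a few of the compositional operations listed above (illustrative only; MixeR itself is an R package and these are not its functions):

```python
import numpy as np

def close(x):
    """Closure: rescale a positive vector to sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturb(x, y):
    """Perturbation: component-wise product, then closure."""
    return close(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(x, a):
    """Power multiplication by a scalar a."""
    return close(np.asarray(x, dtype=float) ** a)

def aitchison_distance(x, y):
    """Euclidean distance between centered log-ratio transforms."""
    clr = lambda z: np.log(z) - np.log(z).mean()
    return np.linalg.norm(clr(close(x)) - clr(close(y)))

def compositional_kl(x, y):
    """Kullback-Leibler divergence between closed compositions."""
    p, q = close(x), close(y)
    return np.sum(p * np.log(p / q))

x, y = [1.0, 2.0, 3.0], [2.0, 2.0, 2.0]
print(perturb(x, y), power(x, 2.0), aitchison_distance(x, y), compositional_kl(x, y))
```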

Relevance:

60.00%

Publisher:

Abstract:

Optimal robust M-estimates of a multidimensional parameter are described using Hampel's infinitesimal approach. The optimal estimates are derived by minimizing a measure of efficiency under the model, subject to a bounded measure of infinitesimal robustness. To this end we define measures of efficiency and infinitesimal sensitivity based on the Hellinger distance. We show that these two measures coincide with similar ones defined by Yohai using the Kullback-Leibler divergence, and that therefore the corresponding optimal estimates coincide too. We also give an example in which we fit a negative binomial distribution to a real dataset of "days of stay in hospital" using the optimal robust estimates.
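For reference, the Hellinger distance between densities p and q, on which the efficiency and sensitivity measures are built (standard definition; normalization conventions vary):

```latex
H(p, q) = \left( \frac{1}{2} \int \left( \sqrt{p} - \sqrt{q} \right)^{2} d\mu \right)^{1/2}.
```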

Relevance:

60.00%

Publisher:

Abstract:

The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as information or likelihood, can be identified in the algebraic structure of A2(P) with their corresponding notions in compositional data analysis, such as the Aitchison distance or the centered log-ratio transform. In this way, very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information, such as Bayesian updating or the combination of likelihood and robust M-estimation functions, are simple additions/perturbations in A2(Pprior), and weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood-based statistics for general exponential families turns out to have a particularly easy interpretation in terms of A2(P): regular exponential families form finite-dimensional linear subspaces of A2(P), and they correspond to the finite-dimensional subspaces formed by their posteriors in the dual information space A2(Pprior). The Aitchison norm can be identified with the mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be the Kullback-Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P)-valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.
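For orientation, the simplex-level objects that A2(P) generalizes, in standard Aitchison geometry (notation is ours): for compositions x and y with D parts and geometric means g(x), g(y),

```latex
\operatorname{clr}(x) = \left( \ln\frac{x_1}{g(x)}, \ldots, \ln\frac{x_D}{g(x)} \right), \qquad
\langle x, y \rangle_A = \sum_{i=1}^{D} \ln\frac{x_i}{g(x)} \, \ln\frac{y_i}{g(y)}, \qquad
d_A(x, y) = \left\| \operatorname{clr}(x) - \operatorname{clr}(y) \right\|_2 .
```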

Relevance:

60.00%

Publisher:

Abstract:

In a recent paper, Komaki studied the second-order asymptotic properties of predictive distributions, using the Kullback-Leibler divergence as a loss function. He showed that estimative distributions with asymptotically efficient estimators can be improved by predictive distributions that do not belong to the model. The model is assumed to be a multidimensional curved exponential family. In this paper we generalize the result, assuming as a loss function any f-divergence. A relationship arises between alpha-connections and optimal predictive distributions. In particular, using an alpha-divergence to measure the goodness of a predictive distribution, the optimal shift of the estimative distribution is related to alpha-covariant derivatives. The expression that we obtain for the asymptotic risk is also useful for studying the higher-order asymptotic properties of an estimator within the mentioned class of loss functions.
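For reference, the loss functions involved, in one common convention (parametrizations of the alpha-divergence vary across the literature; this one is assumed, not taken from the paper): for densities p and q and a convex function f with f(1) = 0,

```latex
D_f(p \,\|\, q) = \int q \, f\!\left( \frac{p}{q} \right) d\mu, \qquad
D_{\alpha}(p \,\|\, q) = \frac{4}{1 - \alpha^{2}} \left( 1 - \int p^{\frac{1-\alpha}{2}} \, q^{\frac{1+\alpha}{2}} \, d\mu \right),
```

the latter recovering the Kullback-Leibler divergence (in one direction of the arguments or the other) in the limits α → ±1.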

Relevance:

60.00%

Publisher:

Abstract:

Study of probabilized Markovian Whittle models. The probabilized Markovian Whittle model is a first-order simultaneous autoregressive spatial field model that simultaneously expresses each variable of the field as a random weighted mean of the adjacent field variables, damped by a multiplicative coefficient ρ and augmented with an error term (a homoscedastic, spatially independent Gaussian variable that is not directly measurable). In our case, the weighted mean is an arithmetic mean that is random because of two conditions: (a) two variables are adjacent (in the graph sense) with probability 1 − p if the distance separating them is below a certain threshold; (b) there is no adjacency for distances above this threshold. These conditions determine an adjacency model (or connectivity model) for the spatial field. With p = 0, the probabilized Markovian Whittle model reduces to the classical Whittle model, which is more familiar in geography, spatial econometrics, ecology, sociology, etc., and whose ρ is the autoregression coefficient. Our model is thus a version of the classical Whittle models probabilized at the level of the field's connectivity, yielding an innovative description of spatial autocorrelation. We begin by describing our spatial model, showing the effects of the complexity introduced by the connectivity model on the pattern of variances and the spatial correlation of the field. We then study the problem of estimating the autoregression coefficient ρ, for which we first carry out an in-depth analysis of its information in the sense of Fisher and of Kullback-Leibler. We show that an unbiased efficient estimator of ρ has an efficiency that varies with the parameter p, generally non-monotonically, and with the structure of the adjacency network. In the case where the connectivity of the field is not observed, we show that a misspecification of the maximum likelihood estimator of ρ can bias it as a function of p. In this context we propose other ways to estimate ρ. Finally, we study the power of significance tests for ρ whose test statistics are classical variants of Moran's I (the Cliff-Ord test) and of the maximal Moran's I (following Kooijman's method). We observe how the power varies with the parameter p and the coefficient ρ, thereby showing the duality of spatial autocorrelation between intensity and connectivity in the context of autoregressive models.
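In equation form, a standard first-order simultaneous autoregressive specification matching the verbal description above (our rendering, not the thesis' notation): with A the random 0/1 adjacency matrix in which A_ij = 1 with probability 1 − p when the distance between sites i and j is below the threshold, and A_ij = 0 otherwise,

```latex
X = \rho\, W X + \varepsilon, \qquad
W_{ij} = \frac{A_{ij}}{\sum_{k} A_{ik}}, \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^{2} I),
```

so that each X_i is ρ times the arithmetic mean of its (randomly selected) neighbours plus a homoscedastic, spatially independent Gaussian error.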

Relevance:

60.00%

Publisher:

Abstract:

We study the application of matrix decomposition algorithms such as Non-negative Matrix Factorization (NMF) to frequency-domain representations of musical audio signals. These algorithms, driven by a reconstruction error function, learn a set of basis functions and a corresponding set of coefficients that approximate the input signal. We compare the use of three reconstruction error functions when NMF is applied to monophonic and harmonized scales: least squares, the Kullback-Leibler divergence, and a recently introduced phase-dependent divergence measure. New methods for interpreting the resulting decompositions are presented and compared with previously used methods, which require acoustic domain knowledge. Finally, we analyze the generalization ability of the learned basis functions with respect to three musical parameters: amplitude, duration and instrument type. To do so, we introduce two basis-function labelling algorithms that outperform the previous approach in the majority of our tests, the instrument task on monophonic audio being the only notable exception.
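A minimal sketch of NMF under the (generalized) Kullback-Leibler cost, using the classic Lee-Seung multiplicative updates (an illustration, not the thesis implementation; the phase-dependent divergence is not covered here):

```python
# Factorize a magnitude spectrogram V (frequencies x frames) as V ~ W @ H,
# where W holds basis spectra and H their activations over time.
import numpy as np

def nmf_kl(V, rank, n_iter=200, eps=1e-12):
    n_freq, n_frames = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n_freq, rank)) + eps      # basis spectra
    H = rng.random((rank, n_frames)) + eps    # activations
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / W.sum(axis=0)[:, None]   # KL update for H
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / H.sum(axis=1)[None, :]   # KL update for W
    return W, H

V = np.abs(np.random.default_rng(1).random((64, 100)))   # stand-in spectrogram
W, H = nmf_kl(V, rank=4)
print(np.linalg.norm(V - W @ H))                          # reconstruction residual
```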

Relevance:

60.00%

Publisher:

Abstract:

The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as information or likelihood, can be identified in the algebraic structure of A2(P) with their corresponding notions in compositional data analysis, such as the Aitchison distance or the centered log-ratio transform. In this way, very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information, such as Bayesian updating or the combination of likelihood and robust M-estimation functions, are simple additions/perturbations in A2(Pprior), and weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood-based statistics for general exponential families turns out to have a particularly easy interpretation in terms of A2(P): regular exponential families form finite-dimensional linear subspaces of A2(P), and they correspond to the finite-dimensional subspaces formed by their posteriors in the dual information space A2(Pprior). The Aitchison norm can be identified with the mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be the Kullback-Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback-Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P)-valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.

Relevance:

60.00%

Publisher:

Abstract:

R, available from http://www.r-project.org/, is ‘GNU S’: a language and environment for statistical computing and graphics. Many classical and modern statistical techniques are implemented in the base environment, but many more are supplied as packages. There are 8 standard packages, and many more are available through the CRAN family of Internet sites, http://cran.r-project.org. We started to develop a library of functions in R to support the analysis of mixtures, and our goal is a MixeR package for compositional data analysis that provides support for: operations on compositions (perturbation and power multiplication, subcomposition with or without residuals, centering of the data, computing the Aitchison, Euclidean and Bhattacharyya distances, the compositional Kullback-Leibler divergence, etc.); graphical presentation of compositions in ternary diagrams and tetrahedrons with additional features (barycenter, geometric mean of the data set, percentile lines, marking and coloring of subsets of the data set and their geometric means, annotation of individual data in the set, ...); dealing with zeros and missing values in compositional data sets, with R procedures for the simple and multiplicative replacement strategies; and time series analysis of compositional data. We will present the current status of MixeR development and illustrate its use on selected data sets.