941 results for Kernel density estimation
Abstract:
Sigmoid-type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well.
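The "bounding likelihoods instead of computing them exactly" step can be stated generically: for any distribution q(h) over the hidden units, Jensen's inequality yields a tractable lower bound on the log-likelihood. The paper's extended and complementary representations refine this generic device; only the standard bound is sketched here.

```latex
\log P(v) \;=\; \log \sum_{h} P(v,h)
\;\ge\; \sum_{h} q(h)\,\log\frac{P(v,h)}{q(h)},
% with equality iff q(h) = P(h \mid v); maximizing the right-hand side
% over the network parameters replaces the intractable exact E step.
```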
Abstract:
This paper introduces a probability model, the mixture of trees, that can account for sparse, dynamically changing dependence relationships. We present a family of efficient algorithms that use EM and the Minimum Spanning Tree algorithm to find the ML and MAP mixture of trees for a variety of priors, including the Dirichlet and the MDL priors. We also show that the single-tree classifier acts like an implicit feature selector, thus making the classification performance insensitive to irrelevant attributes. Experimental results demonstrate the excellent performance of the new model both in density estimation and in classification.
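For a single tree component, the ML structure search the abstract pairs with EM is the classic Chow-Liu construction: weight each candidate edge by the pairwise mutual information, then take a maximum spanning tree. A minimal sketch for binary data follows (the function name is illustrative, and the paper's full EM loop over mixture components is not reproduced):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_edges(X):
    """Edges of the maximum-likelihood tree over the binary columns of X.

    Pairwise mutual information serves as the edge weight; negating the
    weights turns scipy's minimum spanning tree into a maximum one, and a
    tiny offset keeps zero-MI pairs connected."""
    n, d = X.shape
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            joint = np.histogram2d(X[:, i], X[:, j], bins=2)[0] / n
            outer = np.outer(joint.sum(axis=1), joint.sum(axis=0))
            nz = joint > 0
            mi[i, j] = np.sum(joint[nz] * np.log(joint[nz] / outer[nz]))
    weights = np.triu(-(mi + 1e-12), k=1)  # upper triangle = undirected edges
    tree = minimum_spanning_tree(weights)
    return list(zip(*tree.nonzero()))
```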
Abstract:
This thesis addresses the early detection of masses, one of the clearest signs of breast cancer, in mammographic images. First, an extensive analysis of the methods in the literature was carried out, concluding that these methods depend on different parameters: the size and shape of the mass and the density of the breast. The objective of the thesis is therefore to analyse, design and implement a detection method that is robust to and independent of these three parameters. To this end, a deformable template of the mass was built from the analysis of real masses; this model is then searched for in the images following a probabilistic scheme, yielding a set of suspicious regions. Using 2DPCA analysis, an algorithm was built that can discern whether or not these regions really are a mass. Breast density is a parameter that enters the algorithm naturally.
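As a rough illustration of the 2DPCA step used to vet suspicious regions, the sketch below computes the standard image covariance matrix and projects each image matrix onto its top eigenvectors (the interface and names are illustrative, not from the thesis):

```python
import numpy as np

def two_d_pca(images, k):
    """2DPCA: form G = mean((A - A_bar)^T (A - A_bar)) over the image
    matrices, then project every image onto the top-k eigenvectors of G."""
    A_bar = np.mean(images, axis=0)
    G = np.mean([(A - A_bar).T @ (A - A_bar) for A in images], axis=0)
    _, V = np.linalg.eigh(G)           # eigenvalues in ascending order
    X = V[:, -k:]                      # top-k eigenvectors
    return [A @ X for A in images]     # one feature matrix per image
```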
Abstract:
The thesis focuses on Computer Vision and, more specifically, on image segmentation, one of the basic stages of image analysis, which consists of dividing the image into a set of visually distinct and uniform regions with respect to intensity, colour or texture. A strategy is proposed based on the complementary use of region and boundary information during the segmentation process, an integration that alleviates some of the basic problems of traditional segmentation. The boundary information first makes it possible to identify the number of regions present in the image and to place a seed inside each of them, in order to model the regions' characteristics statistically and thereby define the region information. This information, together with the boundary information, is used to define an energy function that expresses the properties required of the desired segmentation: uniformity inside the regions and contrast with neighbouring regions at the boundaries. A set of active regions then begins to grow, competing for the pixels of the image, in order to optimise the energy function or, in other words, to find the segmentation that best fits the requirements expressed in that function. This whole process has been embedded in a pyramidal structure, which allows the segmentation result to be refined progressively and its computational cost to be reduced. The strategy has been extended to the texture segmentation problem, which entails some basic considerations such as modelling the regions with a set of texture features and extracting the boundary information when texture is present in the image. Finally, the extension to image segmentation that takes both colour and texture properties into account has been carried out. Here, the joint use of non-parametric density estimation techniques for describing colour and of textural features based on the co-occurrence matrix is proposed to model the image regions adequately and completely. The proposal has been evaluated objectively and compared with several integration techniques on synthetic images. Experiments with real images, with very positive results, are also included.
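The non-parametric colour modelling mentioned at the end can be illustrated with a kernel density estimate over a region's pixels; evaluating the density at a candidate pixel would supply the region term of the energy function. The data below are synthetic and the usage is a sketch, not the thesis' implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical seed region: 500 RGB pixels sampled around a seed point.
region_pixels = np.random.rand(500, 3)

# Non-parametric estimate of the region's colour density (gaussian_kde
# expects one column per sample, hence the transpose).
color_density = gaussian_kde(region_pixels.T)

# Density of the region model at one candidate RGB value.
p = color_density(np.array([[0.5], [0.5], [0.5]]))
```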
Abstract:
This paper describes the crowd image analysis challenge that forms part of the PETS 2009 workshop. The aim of this challenge is to use new or existing systems for i) crowd count and density estimation, ii) tracking of individual(s) within a crowd, and iii) detection of separate flows and specific crowd events, in a real-world environment. The dataset scenarios were filmed from multiple cameras and involve multiple actors.
Abstract:
We propose a unified data modeling approach that is equally applicable to supervised regression and classification applications, as well as to unsupervised probability density function estimation. A particle swarm optimization (PSO)-aided orthogonal forward regression (OFR) algorithm based on leave-one-out (LOO) criteria is developed to construct parsimonious radial basis function (RBF) networks with tunable nodes. Each stage of the construction process determines the center vector and diagonal covariance matrix of one RBF node by minimizing the LOO statistics. For regression applications, the LOO criterion is chosen to be the LOO mean square error, while the LOO misclassification rate is adopted in two-class classification applications. By adopting the Parzen window estimate as the desired response, the unsupervised density estimation problem is transformed into a constrained regression problem. This PSO-aided OFR algorithm for tunable-node RBF networks is capable of constructing very parsimonious RBF models that generalize well, and our analysis and experimental results demonstrate that the algorithm is computationally even simpler than the efficient regularization-assisted orthogonal least squares algorithm based on LOO criteria for selecting fixed-node RBF models. Another significant advantage of the proposed learning procedure is that it does not have learning hyperparameters that must be tuned using costly cross-validation. The effectiveness of the proposed PSO-aided OFR construction procedure is illustrated using several examples taken from regression and classification, as well as density estimation applications.
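The Parzen window estimate used as the "desired response" has a simple closed form; a sketch with a Gaussian kernel (the bandwidth h is a free parameter, not a value from the paper):

```python
import numpy as np

def parzen_window(X, x, h):
    """Gaussian-kernel Parzen window density estimate.

    X: (n, d) training samples; x: (m, d) query points; h: bandwidth.
    Returns the estimated density at each query point."""
    n, d = X.shape
    diff = x[:, None, :] - X[None, :, :]        # (m, n, d) pairwise offsets
    sq = np.sum(diff ** 2, axis=-1)             # squared distances
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)  # Gaussian normalizer
    return np.exp(-sq / (2.0 * h ** 2)).sum(axis=1) / (n * norm)
```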
Abstract:
Background: Microarray-based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems, including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and of CGH microarray data for examining genetic stability in oncogenes, there are none designed specifically to understand the mosaic nature of bacterial genomes. Consequently a bottleneck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process, which may be automated in the future, to understand bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test strains against three reference strains simultaneously. Each stage of the process is described, and we have compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, showing that a simple dynamic approach using a kernel density estimator performed better than established methods as well as a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
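One plausible reading of the "simple dynamic approach using a kernel density estimator" is to place the presence/absence threshold at the deepest interior valley of the estimated log-ratio density; the helper below is illustrative, not the authors' code:

```python
import numpy as np
from scipy.stats import gaussian_kde

def presence_cutoff(log_ratios):
    """Threshold CGH log-ratios at the deepest valley of their kernel
    density estimate (assumed to separate the present and absent modes)."""
    kde = gaussian_kde(log_ratios)
    grid = np.linspace(log_ratios.min(), log_ratios.max(), 512)
    dens = kde(grid)
    # interior local minima of the estimated density
    valley = (dens[1:-1] < dens[:-2]) & (dens[1:-1] < dens[2:])
    if not valley.any():
        return None
    return grid[1:-1][valley][np.argmin(dens[1:-1][valley])]
```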
Abstract:
We generalize the popular ensemble Kalman filter to an ensemble transform filter, in which the prior distribution can take the form of a Gaussian mixture or a Gaussian kernel density estimator. The design of the filter is based on a continuous formulation of the Bayesian filter analysis step. We call the new filter algorithm the ensemble Gaussian-mixture filter (EGMF). The EGMF is implemented for three simple test problems (Brownian dynamics in one dimension, Langevin dynamics in two dimensions and the three-dimensional Lorenz-63 model). It is demonstrated that the EGMF is capable of tracking systems with non-Gaussian uni- and multimodal ensemble distributions.
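The Gaussian-kernel form of the prior is simply an equal-weight Gaussian mixture centred on the ensemble members. The sketch below assembles those ingredients; the bandwidth scaling is an assumption, and the EGMF analysis step itself is not shown:

```python
import numpy as np

def kde_prior(ensemble, bandwidth):
    """Equal-weight Gaussian-mixture prior built from ensemble members.

    ensemble: (m, d) array of m members. Returns mixture weights, the
    component means (the members themselves) and a shared kernel
    covariance scaled from the ensemble spread."""
    m, d = ensemble.shape
    weights = np.full(m, 1.0 / m)
    cov = bandwidth ** 2 * np.cov(ensemble, rowvar=False)
    return weights, ensemble, cov
```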
Abstract:
The bubble crab Dotilla fenestrata forms very dense populations on the sand flats of the eastern coast of Inhaca Island, Mozambique, making it an interesting biological model to examine spatial distribution patterns and test the relative efficiency of common sampling methods. Due to its apparent ecological importance within the sandy intertidal community, understanding the factors ruling the dynamics of Dotilla populations is also a key issue. In this study, different techniques of estimating crab density are described, and the trends of spatial distribution of the different population categories are shown. The studied populations are arranged in discrete patches located at the well-drained crests of nearly parallel mega sand ripples. For a given sample size, there was an obvious gain in precision by using a stratified random sampling technique, considering discrete patches as strata, compared to the simple random design. Density average and variance differed considerably among patches since juveniles and ovigerous females were found clumped, with higher densities at the lower and upper shore levels, respectively. Burrow counting was found to be an adequate method for large-scale sampling, although consistently underestimating actual crab density by nearly half. Regression analyses suggested that crabs smaller than 2.9 mm carapace width tend to be undetected in visual burrow counts. A visual survey of sampling plots over several patches of a large Dotilla population showed that crab density varied in an interesting oscillating pattern, apparently following the topography of the sand flat. Patches extending to the lower shore contained higher densities than those mostly covering the higher shore. Within-patch density variability also pointed to the same trend, but the density increment towards the lowest shore level varied greatly among the patches compared.
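The precision gain reported for the stratified design can be made concrete with the standard area-weighted estimator, sketched here under the assumptions that patches serve as strata and all quadrats share one area:

```python
import numpy as np

def stratified_density(counts, areas, quadrat_area):
    """Area-weighted (stratified) estimate of mean crab density.

    counts: list of 1-d arrays of quadrat counts, one array per patch
    (stratum); areas: patch areas; quadrat_area: common quadrat size.
    All names and the common-quadrat assumption are illustrative."""
    means = np.array([np.mean(c) / quadrat_area for c in counts])
    w = np.asarray(areas, dtype=float)
    return float(np.sum(w * means) / w.sum())
```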
Abstract:
The aim of this study is to characterise, for the first time, some aspects of the reproduction of the caranguejo-uçá crab in mangroves of Babitonga Bay (Santa Catarina). In addition, the density and stock size of this fishery resource were estimated. Specimens were collected monthly, from May 2002 to April 2003, in two distinct areas, Iperoba and Palmital; a total of 2,265 specimens (1,623 males and 642 females) was analysed. Males with mature gonads were recorded throughout the year, whereas females with mature gonads occurred in only five months. Ovigerous females were recorded only in December and January. The ethogram of the reproductive migration phenomenon ("andada") was consistent with greater crab activity being associated with full and new moons, with the highest intensity in December and January, during the austral summer. Total density in the Iperoba mangrove was 2.05 ± 0.97 ind./m², not differing significantly from that recorded for the Palmital mangrove (2.06 ± 1.08 ind./m²) (p < 0.05). The overall mean density estimate for Babitonga Bay was 2.05 ± 1.00 ind./m², corresponding to 1.42 ± 0.89 ind./m² based on open burrows and 0.64 ± 0.63 ind./m² for closed burrows.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
In this work, we propose a two-stage algorithm for real-time fault detection and identification of industrial plants. Our proposal is based on the analysis of selected features using recursive density estimation and a new evolving classifier algorithm. More specifically, the proposed approach for the detection stage is based on the concept of density in the data space, which is not the same as the probability density function but is a very useful measure for abnormality/outlier detection. This density can be expressed by a Cauchy function and can be calculated recursively, which makes it memory- and computation-efficient and, therefore, suitable for on-line applications. The identification/diagnosis stage is based on a self-developing (evolving) fuzzy rule-based classifier system proposed in this work, called AutoClass. An important property of AutoClass is that it can start learning "from scratch". Not only do the fuzzy rules not need to be prespecified, but neither does the number of classes (the number may grow, with new class labels being added by the on-line learning process), in a fully unsupervised manner. In the event that an initial rule base exists, AutoClass can evolve/develop it further based on newly arrived faulty-state data. To validate our proposal, we present experimental results from a level-control didactic process, where control and error signals are used as features for the fault detection and identification systems; the approach is generic, however, and the number of features can be large thanks to the computationally lean methodology, since covariance or more complex calculations, as well as storage of old data, are not required. The results obtained are significantly better than those of the traditional approaches used for comparison.
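The recursive density of the detection stage can be sketched with the standard Cauchy-type formulation, which keeps only a running mean of the samples and of their squared norms (a sketch of the general recursive-density idea, not the paper's exact code):

```python
import numpy as np

class RecursiveDensity:
    """Recursive Cauchy-type density D(x) = 1 / (1 + ||x - mu||^2 + s - ||mu||^2),
    where mu and s are running means of the samples and of their squared
    norms; no covariance computation or storage of old data is needed."""

    def __init__(self, dim):
        self.k = 0
        self.mu = np.zeros(dim)   # running mean of samples
        self.s = 0.0              # running mean of squared norms

    def update(self, x):
        """Fold in sample x and return its density under the updated model."""
        self.k += 1
        self.mu += (x - self.mu) / self.k
        self.s += (x @ x - self.s) / self.k
        # expands to 1 / (1 + ||x - mu||^2 + s - ||mu||^2)
        return 1.0 / (1.0 + x @ x - 2.0 * (x @ self.mu) + self.s)
```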
Abstract:
In the work reported here we present theoretical and numerical results about a risk model with interest rate and proportional reinsurance, based on the article "Inequalities for the ruin probability in a controlled discrete-time risk process" by Rosário Romera and Maikol Diasparra (see [5]). Recursive and integral equations as well as upper bounds for the ruin probability are given, considering three different approaches, namely the classical Lundberg inequality, the inductive approach and the martingale approach. Non-parametric density estimation techniques are used to derive upper bounds for the ruin probability, and the algorithms used in the simulation are presented.
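Alongside the analytical bounds, the ruin probability of such a discrete-time risk process can be approximated by crude Monte Carlo simulation. The sketch below assumes exponential claims purely for illustration and omits the reinsurance control:

```python
import numpy as np

rng = np.random.default_rng(0)

def ruin_probability(u0, premium, rate, n_steps, n_paths=10_000):
    """Monte Carlo estimate of P(ruin within n_steps) for the recursion
    U_{k+1} = U_k * (1 + rate) + premium - claim, starting from capital u0.
    Claim sizes are exponential(1) here purely as a stand-in."""
    ruined = 0
    for _ in range(n_paths):
        u = u0
        for _ in range(n_steps):
            u = u * (1.0 + rate) + premium - rng.exponential(1.0)
            if u < 0.0:
                ruined += 1
                break
    return ruined / n_paths
```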
Abstract:
The goal of this work is to assess the efficacy of texture measures for estimating levels of crowd densities in images. This estimation is crucial for the problem of crowd monitoring and control. The assessment is carried out on a set of nearly 300 real images captured from Liverpool Street Train Station, London, UK, using texture measures extracted from the images through the following four different methods: gray level dependence matrices, straight line segments, Fourier analysis, and fractal dimensions. The estimations of crowd densities are given in terms of the classification of the input images into five classes of densities (very low, low, moderate, high and very high). Three types of classifiers are used: neural (implemented according to the Kohonen model), Bayesian, and an approach based on fitting functions. The results obtained by these three classifiers, using the four texture measures, allowed the conclusion that, for the problem of crowd density estimation, texture analysis is very effective.
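Of the four texture measures, the gray level dependence (co-occurrence) matrices are the most standard; a sketch of extracting a few such descriptors with scikit-image (the image is synthetic and the feature choice illustrative):

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Synthetic 8-bit grayscale stand-in for a crowd image.
image = (np.random.rand(128, 128) * 255).astype(np.uint8)

# Co-occurrence matrices at distance 1 for two directions, followed by
# classic texture descriptors that could feed the density classifiers.
glcm = graycomatrix(image, distances=[1], angles=[0, np.pi / 2], levels=256)
features = [graycoprops(glcm, prop).mean()
            for prop in ("contrast", "homogeneity", "energy")]
```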