950 resultados para Gaussian random fields
Resumo:
Super-resolution refers to the process of obtaining a high resolution image from one or more low resolution images. In this work, we present a novel method for the super-resolution problem for the limited case, where only one image of low resolution is given as an input. The proposed method is based on statistical learning for inferring the high frequencies regions which helps to distinguish a high resolution image from a low resolution one. These inferences are obtained from the correlation between regions of low and high resolution that come exclusively from the image to be super-resolved, in term of small neighborhoods. The Markov random fields are used as a model to capture the local statistics of high and low resolution data when they are analyzed at different scales and resolutions. Experimental results show the viability of the method.
Resumo:
This article describes a finite element-based formulation for the statistical analysis of the response of stochastic structural composite systems whose material properties are described by random fields. A first-order technique is used to obtain the second-order statistics for the structural response considering means and variances of the displacement and stress fields of plate or shell composite structures. Propagation of uncertainties depends on sensitivities taken as measurement of variation effects. The adjoint variable method is used to obtain the sensitivity matrix. This method is appropriated for composite structures due to the large number of random input parameters. Dominant effects on the stochastic characteristics are studied analyzing the influence of different random parameters. In particular, a study of the anisotropy influence on uncertainties propagation of angle-ply composites is carried out based on the proposed approach.
Resumo:
L'un des modèles d'apprentissage non-supervisé générant le plus de recherche active est la machine de Boltzmann --- en particulier la machine de Boltzmann restreinte, ou RBM. Un aspect important de l'entraînement ainsi que l'exploitation d'un tel modèle est la prise d'échantillons. Deux développements récents, la divergence contrastive persistante rapide (FPCD) et le herding, visent à améliorer cet aspect, se concentrant principalement sur le processus d'apprentissage en tant que tel. Notamment, le herding renonce à obtenir un estimé précis des paramètres de la RBM, définissant plutôt une distribution par un système dynamique guidé par les exemples d'entraînement. Nous généralisons ces idées afin d'obtenir des algorithmes permettant d'exploiter la distribution de probabilités définie par une RBM pré-entraînée, par tirage d'échantillons qui en sont représentatifs, et ce sans que l'ensemble d'entraînement ne soit nécessaire. Nous présentons trois méthodes: la pénalisation d'échantillon (basée sur une intuition théorique) ainsi que la FPCD et le herding utilisant des statistiques constantes pour la phase positive. Ces méthodes définissent des systèmes dynamiques produisant des échantillons ayant les statistiques voulues et nous les évaluons à l'aide d'une méthode d'estimation de densité non-paramétrique. Nous montrons que ces méthodes mixent substantiellement mieux que la méthode conventionnelle, l'échantillonnage de Gibbs.
Resumo:
Les données provenant de l'échantillonnage fin d'un processus continu (champ aléatoire) peuvent être représentées sous forme d'images. Un test statistique permettant de détecter une différence entre deux images peut être vu comme un ensemble de tests où chaque pixel est comparé au pixel correspondant de l'autre image. On utilise alors une méthode de contrôle de l'erreur de type I au niveau de l'ensemble de tests, comme la correction de Bonferroni ou le contrôle du taux de faux-positifs (FDR). Des méthodes d'analyse de données ont été développées en imagerie médicale, principalement par Keith Worsley, utilisant la géométrie des champs aléatoires afin de construire un test statistique global sur une image entière. Il s'agit d'utiliser l'espérance de la caractéristique d'Euler de l'ensemble d'excursion du champ aléatoire sous-jacent à l'échantillon au-delà d'un seuil donné, pour déterminer la probabilité que le champ aléatoire dépasse ce même seuil sous l'hypothèse nulle (inférence topologique). Nous exposons quelques notions portant sur les champs aléatoires, en particulier l'isotropie (la fonction de covariance entre deux points du champ dépend seulement de la distance qui les sépare). Nous discutons de deux méthodes pour l'analyse des champs anisotropes. La première consiste à déformer le champ puis à utiliser les volumes intrinsèques et les compacités de la caractéristique d'Euler. La seconde utilise plutôt les courbures de Lipschitz-Killing. Nous faisons ensuite une étude de niveau et de puissance de l'inférence topologique en comparaison avec la correction de Bonferroni. Finalement, nous utilisons l'inférence topologique pour décrire l'évolution du changement climatique sur le territoire du Québec entre 1991 et 2100, en utilisant des données de température simulées et publiées par l'Équipe Simulations climatiques d'Ouranos selon le modèle régional canadien du climat.
Resumo:
Matheron's usual variogram estimator can result in unreliable variograms when data are strongly asymmetric or skewed. Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. This paper examines the effects of underlying asymmetry on the variogram and on the accuracy of prediction, and the second one examines the effects arising from outliers. Standard geostatistical texts suggest ways of dealing with underlying asymmetry; however, this is based on informed intuition rather than detailed investigation. To determine whether the methods generally used to deal with underlying asymmetry are appropriate, the effects of different coefficients of skewness on the shape of the experimental variogram and on the model parameters were investigated. Simulated annealing was used to create normally distributed random fields of different size from variograms with different nugget:sill ratios. These data were then modified to give different degrees of asymmetry and the experimental variogram was computed in each case. The effects of standard data transformations on the form of the variogram were also investigated. Cross-validation was used to assess quantitatively the performance of the different variogram models for kriging. The results showed that the shape of the variogram was affected by the degree of asymmetry, and that the effect increased as the size of data set decreased. Transformations of the data were more effective in reducing the skewness coefficient in the larger sets of data. Cross-validation confirmed that variogram models from transformed data were more suitable for kriging than were those from the raw asymmetric data. The results of this study have implications for the 'standard best practice' in dealing with asymmetry in data for geostatistical analyses. (C) 2007 Elsevier Ltd. All rights reserved.
Resumo:
Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly more for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.
Resumo:
Recently major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single core CPUs, the trend clearly goes towards multi core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be on the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution focuses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for dis- tributed memory systems of the frequent subgraphs mining problem. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational envi- ronments. The forth paper describes the combined use and the customization of software packages to facilitate a top down parallelism in the tuning of Support Vector Machines (SVM) and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution finally focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that has provided detailed reviews for each paper. We would like to also thank Matthew Otey who helped with publicity for the workshop.
Resumo:
Monte Carlo algorithms often aim to draw from a distribution π by simulating a Markov chain with transition kernel P such that π is invariant under P. However, there are many situations for which it is impractical or impossible to draw from the transition kernel P. For instance, this is the case with massive datasets, where is it prohibitively expensive to calculate the likelihood and is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace P by an approximation Pˆ. Using theory from the stability of Markov chains we explore a variety of situations where it is possible to quantify how ’close’ the chain given by the transition kernel Pˆ is to the chain given by P . We apply these results to several examples from spatial statistics and network analysis.
Resumo:
Clusters of galaxies are the most impressive gravitationally-bound systems in the universe, and their abundance (the cluster mass function) is an important statistic to probe the matter density parameter (Omega(m)) and the amplitude of density fluctuations (sigma(8)). The cluster mass function is usually described in terms of the Press-Schecther (PS) formalism where the primordial density fluctuations are assumed to be a Gaussian random field. In previous works we have proposed a non-Gaussian analytical extension of the PS approach with basis on the q-power law distribution (PL) of the nonextensive kinetic theory. In this paper, by applying the PL distribution to fit the observational mass function data from X-ray highest flux-limited sample (HIFLUGCS), we find a strong degeneracy among the cosmic parameters, sigma(8), Omega(m) and the q parameter from the PL distribution. A joint analysis involving recent observations from baryon acoustic oscillation (BAO) peak and Cosmic Microwave Background (CMB) shift parameter is carried out in order to break these degeneracy and better constrain the physically relevant parameters. The present results suggest that the next generation of cluster surveys will be able to probe the quantities of cosmological interest (sigma(8), Omega(m)) and the underlying cluster physics quantified by the q-parameter.
Resumo:
This letter presents pseudolikelihood equations for the estimation of the Potts Markov random field model parameter on higher order neighborhood systems. The derived equation for second-order systems is a significantly reduced version of a recent result found in the literature (from 67 to 22 terms). Also, with the proposed method, a completely original equation for Potts model parameter estimation in third-order systems was obtained. These equations allow the modeling of less restrictive contextual systems for a large number of applications in a computationally feasible way. Experiments with both simulated and real remote sensing images provided good results.
Resumo:
Purpose: To evaluate the microvessel density by comparing the performance of anti-factor VIII-related antigen, anti-CD31 and, anti-CD34 monoclonal antibodies in breast cancer. Methods: Twenty-three postmenopausal women diagnosed with Stage II breast cancer submitted to definitive surgical treatment were evaluated. The monoclonal antibodies used were anti-factor VIII, anti-CD31 and anti-CD34. Microvessels were counted in the areas of highest microvessel density in ten random fields (200 x). The data were analyzed using the Kruskal-Wallis nonparametric test (p < 0.05). Results: Mean microvessel densities with anti-factor VIII, anti-CD31 and anti-CD34 were 4.16 +/- 0.38, 4.09 +/- 0.23 and 6.59 +/- 0.42, respectively. Microvessel density as assessed by anti-CD34 was significantly greater than that detected by anti-CD31 or anti-factor VIII (p < 0.0001). There was no statistically significant difference between anti-CD31 and anti-factor VIII (p = 0.4889). Conclusion: The density of stained microvessels was greater and staining was more intense with anti-CD34 compared to anti-CD31 and anti-factor VII-related antigen.
Resumo:
We present a new version of the hglm package for fittinghierarchical generalized linear models (HGLM) with spatially correlated random effects. A CAR family for conditional autoregressive random effects was implemented. Eigen decomposition of the matrix describing the spatial structure (e.g. the neighborhood matrix) was used to transform the CAR random effectsinto an independent, but heteroscedastic, gaussian random effect. A linear predictor is fitted for the random effect variance to estimate the parameters in the CAR model.This gives a computationally efficient algorithm for moderately sized problems (e.g. n<5000).
Resumo:
We present a new version (> 2.0) of the hglm package for fitting hierarchical generalized linear models (HGLMs) with spatially correlated random effects. CAR() and SAR() families for conditional and simultaneous autoregressive random effects were implemented. Eigen decomposition of the matrix describing the spatial structure (e.g., the neighborhood matrix) was used to transform the CAR/SAR random effects into an independent, but eteroscedastic, Gaussian random effect. A linear predictor is fitted for the random effect variance to estimate the parameters in the CAR and SAR models. This gives a computationally efficient algorithm for moderately sized problems.
Resumo:
Point pattern matching in Euclidean Spaces is one of the fundamental problems in Pattern Recognition, having applications ranging from Computer Vision to Computational Chemistry. Whenever two complex patterns are encoded by two sets of points identifying their key features, their comparison can be seen as a point pattern matching problem. This work proposes a single approach to both exact and inexact point set matching in Euclidean Spaces of arbitrary dimension. In the case of exact matching, it is assured to find an optimal solution. For inexact matching (when noise is involved), experimental results confirm the validity of the approach. We start by regarding point pattern matching as a weighted graph matching problem. We then formulate the weighted graph matching problem as one of Bayesian inference in a probabilistic graphical model. By exploiting the existence of fundamental constraints in patterns embedded in Euclidean Spaces, we prove that for exact point set matching a simple graphical model is equivalent to the full model. It is possible to show that exact probabilistic inference in this simple model has polynomial time complexity with respect to the number of elements in the patterns to be matched. This gives rise to a technique that for exact matching provably finds a global optimum in polynomial time for any dimensionality of the underlying Euclidean Space. Computational experiments comparing this technique with well-known probabilistic relaxation labeling show significant performance improvement for inexact matching. The proposed approach is significantly more robust under augmentation of the sizes of the involved patterns. In the absence of noise, the results are always perfect.
Resumo:
Th17 cells have been strongly associated to the pathogenesis of inflammatory and autoimmune diseases, although their influence on the carcinogenesis is still little known, there are reports of anti-tumor and protumoral actions. The objective of this study is to research the presence of Th17 lineage in lip and tongue SCC, using the analysis of the immunoexpression of IL-17 and RORγt, relating this immunoexpression with clinical and morphological findings in the attempt to better comprehend the role of these cells on the tumoral immunity of OSCCs. The results were submitted to non-parametric statistical tests with significance level of 5%. On the histomorphological analysis, it was observed the predominance of low level lesions on lip and high level lesions on tongue (p=0,024). It was not observed statistical significance between clinical stage and histological gradation of malignancy (p=0,644). For the immunohistochemical study, 5 random fields with greater immunoreactivity of the peritumoral inflammatory infiltrate were photomicrographed on the 400x magnification. It was done the count of lymphocytes which showed cytoplasmic and pericytoplasmic staining for the IL-17 cytokine as well as nuclear and cytoplasmic staining for RORγt. It was observed statistical significance difference on the quantity of immunopositive lymphocytes to IL-17 between the groups of SCC of lip and tongue (p=0,028). For the RORγt it was not observed statistical significance difference between the groups of SCC of lip and tongue (p=0,915). It was not observed statistical difference between the immunostaining of IL-17 and RORγt with histological gradation of malignancy and clinical staging. The findings of this research suggest a possible anti-tumor role of IL-17 for cases of lip. The results of the analysis of the RORγt are possibly due to the wide duality of the anti-tumor and protumoral role of the Th17 cells and their plasticity which, in the presence of different cytokines expressed on the tumor microenvironment, can alter its phenotype.