932 resultados para Probabilistic latent semantic model
Resumo:
L’un des problèmes importants en apprentissage automatique est de déterminer la complexité du modèle à apprendre. Une trop grande complexité mène au surapprentissage, ce qui correspond à trouver des structures qui n’existent pas réellement dans les données, tandis qu’une trop faible complexité mène au sous-apprentissage, c’est-à-dire que l’expressivité du modèle est insuffisante pour capturer l’ensemble des structures présentes dans les données. Pour certains modèles probabilistes, la complexité du modèle se traduit par l’introduction d’une ou plusieurs variables cachées dont le rôle est d’expliquer le processus génératif des données. Il existe diverses approches permettant d’identifier le nombre approprié de variables cachées d’un modèle. Cette thèse s’intéresse aux méthodes Bayésiennes nonparamétriques permettant de déterminer le nombre de variables cachées à utiliser ainsi que leur dimensionnalité. La popularisation des statistiques Bayésiennes nonparamétriques au sein de la communauté de l’apprentissage automatique est assez récente. Leur principal attrait vient du fait qu’elles offrent des modèles hautement flexibles et dont la complexité s’ajuste proportionnellement à la quantité de données disponibles. Au cours des dernières années, la recherche sur les méthodes d’apprentissage Bayésiennes nonparamétriques a porté sur trois aspects principaux : la construction de nouveaux modèles, le développement d’algorithmes d’inférence et les applications. Cette thèse présente nos contributions à ces trois sujets de recherches dans le contexte d’apprentissage de modèles à variables cachées. Dans un premier temps, nous introduisons le Pitman-Yor process mixture of Gaussians, un modèle permettant l’apprentissage de mélanges infinis de Gaussiennes. Nous présentons aussi un algorithme d’inférence permettant de découvrir les composantes cachées du modèle que nous évaluons sur deux applications concrètes de robotique. Nos résultats démontrent que l’approche proposée surpasse en performance et en flexibilité les approches classiques d’apprentissage. Dans un deuxième temps, nous proposons l’extended cascading Indian buffet process, un modèle servant de distribution de probabilité a priori sur l’espace des graphes dirigés acycliques. Dans le contexte de réseaux Bayésien, ce prior permet d’identifier à la fois la présence de variables cachées et la structure du réseau parmi celles-ci. Un algorithme d’inférence Monte Carlo par chaîne de Markov est utilisé pour l’évaluation sur des problèmes d’identification de structures et d’estimation de densités. Dans un dernier temps, nous proposons le Indian chefs process, un modèle plus général que l’extended cascading Indian buffet process servant à l’apprentissage de graphes et d’ordres. L’avantage du nouveau modèle est qu’il admet les connections entres les variables observables et qu’il prend en compte l’ordre des variables. Nous présentons un algorithme d’inférence Monte Carlo par chaîne de Markov avec saut réversible permettant l’apprentissage conjoint de graphes et d’ordres. L’évaluation est faite sur des problèmes d’estimations de densité et de test d’indépendance. Ce modèle est le premier modèle Bayésien nonparamétrique permettant d’apprendre des réseaux Bayésiens disposant d’une structure complètement arbitraire.
Resumo:
The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.
Resumo:
We study a climatologically important interaction of two of the main components of the geophysical system by adding an energy balance model for the averaged atmospheric temperature as dynamic boundary condition to a diagnostic ocean model having an additional spatial dimension. In this work, we give deeper insight than previous papers in the literature, mainly with respect to the 1990 pioneering model by Watts and Morantine. We are taking into consideration the latent heat for the two phase ocean as well as a possible delayed term. Non-uniqueness for the initial boundary value problem, uniqueness under a non-degeneracy condition and the existence of multiple stationary solutions are proved here. These multiplicity results suggest that an S-shaped bifurcation diagram should be expected to occur in this class of models generalizing previous energy balance models. The numerical method applied to the model is based on a finite volume scheme with nonlinear weighted essentially non-oscillatory reconstruction and Runge–Kutta total variation diminishing for time integration.
Resumo:
The amount of information contained within the Internet has exploded in recent decades. As more and more news, blogs, and many other kinds of articles that are published on the Internet, categorization of articles and documents are increasingly desired. Among the approaches to categorize articles, labeling is one of the most common method; it provides a relatively intuitive and effective way to separate articles into different categories. However, manual labeling is limited by its efficiency, even thought the labels selected manually have relatively high quality. This report explores the topic modeling approach of Online Latent Dirichlet Allocation (Online-LDA). Additionally, a method to automatically label articles with their latent topics by combining the Online-LDA posterior with a probabilistic automatic labeling algorithm is implemented. The goal of this report is to examine the accuracy of the labels generated automatically by a topic model and probabilistic relevance algorithm for a set of real-world, dynamically updated articles from an online Rich Site Summary (RSS) service.
Resumo:
In this paper, the problem of semantic place categorization in mobile robotics is addressed by considering a time-based probabilistic approach called dynamic Bayesian mixture model (DBMM), which is an improved variation of the dynamic Bayesian network. More specifically, multi-class semantic classification is performed by a DBMM composed of a mixture of heterogeneous base classifiers, using geometrical features computed from 2D laserscanner data, where the sensor is mounted on-board a moving robot operating indoors. Besides its capability to combine different probabilistic classifiers, the DBMM approach also incorporates time-based (dynamic) inferences in the form of previous class-conditional probabilities and priors. Extensive experiments were carried out on publicly available benchmark datasets, highlighting the influence of the number of time-slices and the effect of additive smoothing on the classification performance of the proposed approach. Reported results, under different scenarios and conditions, show the effectiveness and competitive performance of the DBMM.
Resumo:
The quantification of the available energy in the environment is important because it determines photosynthesis, evapotranspiration and, therefore, the final yield of crops. Instruments for measuring the energy balance are costly and indirect estimation alternatives are desirable. This study assessed the Deardorff's model performance during a cycle of a sugarcane crop in Piracicaba, State of São Paulo, Brazil, in comparison to the aerodynamic method. This mechanistic model simulates the energy fluxes (sensible, latent heat and net radiation) at three levels (atmosphere, canopy and soil) using only air temperature, relative humidity and wind speed measured at a reference level above the canopy, crop leaf area index, and some pre-calibrated parameters (canopy albedo, soil emissivity, atmospheric transmissivity and hydrological characteristics of the soil). The analysis was made for different time scales, insolation conditions and seasons (spring, summer and autumn). Analyzing all data of 15 minute intervals, the model presented good performance for net radiation simulation in different insolations and seasons. The latent heat flux in the atmosphere and the sensible heat flux in the atmosphere did not present differences in comparison to data from the aerodynamic method during the autumn. The sensible heat flux in the soil was poorly simulated by the model due to the poor performance of the soil water balance method. The Deardorff's model improved in general the flux simulations in comparison to the aerodynamic method when more insolation was available in the environment.
Resumo:
Stavskaya's model is a one-dimensional probabilistic cellular automaton (PCA) introduced in the end of the 1960s as an example of a model displaying a nonequilibrium phase transition. Although its absorbing state phase transition is well understood nowadays, the model never received a full numerical treatment to investigate its critical behavior. In this Brief Report we characterize the critical behavior of Stavskaya's PCA by means of Monte Carlo simulations and finite-size scaling analysis. The critical exponents of the model are calculated and indicate that its phase transition belongs to the directed percolation universality class of critical behavior, as would be expected on the basis of the directed percolation conjecture. We also explicitly establish the relationship of the model with the Domany-Kinzel PCA on its directed site percolation line, a connection that seems to have gone unnoticed in the literature so far.
Resumo:
Electrical or chemical stimulation of the inferior colliculus (IC) induces fear-like behaviors. More recently, consistent evidence has shown that electrical stimulation of the central nucleus of the IC supports Pavlovian conditioning and latent inhibition (Li). LI is characterized by retardation in conditioning and also by an impaired ability to ignore irrelevant stimuli, after a non-reinforced pre-exposure to the conditioned stimulus. LI has been proposed as a behavioral model of cognitive abnormalities seen in schizophrenia. The aim of the present study was to determine whether dopaminergic mechanisms in the IC are involved in LI of the conditioned emotional response (CER). To induce LI, a group of rats was pre-exposed (PE) to six tones in two sessions, while rats that were not pre-exposed (NPE) had two sessions without tone presentations. The conditioning consisted of two tone presentations to the animal, followed immediately by a foot shock. PE and NPE rats received IC microinjections of physiological saline, the dopaminergic agonist apomorphine (9.0 mu g/0.5 mu L/side), or the dopaminergic antagonist haloperidol (0.5 mu g/0.5 mu L/side) before both pre-exposure and conditioning. During the test, the PE rats that received saline or haloperidol had a lower suppression of the licking response compared to NPE rats that received vehicle or haloperidol, indicating that latent inhibition was induced. There was no significant difference in the suppression ratio in rats that received apomorphine injections into the IC, indicating reduced latent inhibition. These results suggest that dopamine-mediated mechanisms of the IC are involved in the development of LI. (C) 2008 Elsevier Inc. All rights reserved.
Resumo:
Latent inhibition, retarded learning after preexposure to the to-be-conditioned stimulus, has been implied as a tool for the investigation of attentional deficits in schizophrenia and related disorders. The present paper reviews research that used Pavlovian conditioning as indexed by autonomic responses (electrodermal, vasomotor, cardiac) to investigate latent inhibition in adult humans. Latent inhibition has been demonstrated repeatedly in healthy subjects in absence of a masking task that is required in other latent inhibition paradigms. Moreover, latent inhibition of Pavlovian conditioning is stimulus-specific and increases with an increased number of preexposure trials which mirrors results from research in animals. A reduction of latent inhibition has been shown in healthy subjects who score high on questionnaire measures of psychosis proneness and in unmedicated schizophrenic patients. The latter result was obtained in a within-subject paradigm that holds promise for research with patient samples. (C) 1997 Elsevier Science B.V.
Resumo:
The present paper addresses two major concerns that were identified when developing neural network based prediction models and which can limit their wider applicability in the industry. The first problem is that it appears neural network models are not readily available to a corrosion engineer. Therefore the first part of this paper describes a neural network model of CO2 corrosion which was created using a standard commercial software package and simple modelling strategies. It was found that such a model was able to capture practically all of the trends noticed in the experimental data with acceptable accuracy. This exercise has proven that a corrosion engineer could readily develop a neural network model such as the one described below for any problem at hand, given that sufficient experimental data exist. This applies even in the cases when the understanding of the underlying processes is poor. The second problem arises from cases when all the required inputs for a model are not known or can be estimated with a limited degree of accuracy. It seems advantageous to have models that can take as input a range rather than a single value. One such model, based on the so-called Monte Carlo approach, is presented. A number of comparisons are shown which have illustrated how a corrosion engineer might use this approach to rapidly test the sensitivity of a model to the uncertainities associated with the input parameters. (C) 2001 Elsevier Science Ltd. All rights reserved.
Resumo:
We used event-related functional magnetic resonance imaging (fMRI) to investigate neural responses associated with the semantic interference (SI) effect in the picture-word task. Independent stage models of word production assume that the locus of the SI effect is at the conceptual processing level (Levelt et al. [1999]: Behav Brain Sci 22:1-75), whereas interactive models postulate that it occurs at phonological retrieval (Starreveld and La Heij [1996]: J Exp Psychol Learn Mem Cogn 22:896-918). In both types of model resolution of the SI effect occurs as a result of competitive, spreading activation without the involvement of inhibitory links. These assumptions were tested by randomly presenting participants with trials from semantically-related and lexical control distractor conditions and acquiring image volumes coincident with the estimated peak hemodynamic response for each trial. Overt vocalization of picture names occurred in the absence of scanner noise, allowing reaction time (RT) data to be collected. Analysis of the RT data confirmed the SI effect. Regions showing differential hemodynamic responses during the SI effect included the left mid section of the middle temporal gyrus, left posterior superior temporal gyrus, left anterior cingulate cortex, and bilateral orbitomedial prefrontal cortex. Additional responses were observed in the frontal eye fields, left inferior parietal lobule, and right anterior temporal and occipital cortex. The results are interpreted as indirectly supporting interactive models that allow spreading activation between both conceptual processing and phonological retrieval levels of word production. In addition, the data confirm that selective attention/response suppression has a role in resolving the SI effect similar to the way in which Stroop interference is resolved. We conclude that neuroimaging studies can provide information about the neuroanatomical organization of the lexical system that may prove useful for constraining theoretical models of word production. (C) 2001 Wiley-Liss, Inc.
Resumo:
Latent inhibition (LI) is an important model for understanding cognitive deficits in schizophrenia, Disruption of LI is thought to result from an inability to ignore irrelevant stimuli. The study investigated LI in schizophrenic patients by using Pavlovian conditioning of electrodermal responses in a complete within-subject design. Thirty-two schizophrenic patients, ( 16 acute. unmedicated and 16 medicated patients) and 16 healthy control subjects (matched with respect to age and gender) participated in the study. The experiment consisted of two stages: preexposure and conditioning. During preexposure two visual stimuli were presented, one of which served as the to-be-conditioned stimulus (CSp +) and the other one was the not-to-be-conditioned stimulus (CSp -) during the following conditioning ( = acquisition). During acquisition. two novel visual stimuli (CSn + and CSn -) were introduced. A reaction time task was used as the unconditioned stimulus (US). LI was defined as the difference in response differentiation observed between proexposed and non-preexposed sets of CS + and CS -. During preexposure. the schizophrenic patients did not differ in electrodermal responding from the control subjects, neither concerning the extent of orienting nor the course of habituation. The exposure to novel stimuli at the beginning of the acquisition elicited reduced orienting responses in unmedicated patients compared to medicated patients and control subjects, LI was observed in medicated schizophrenic patients and healthy controls. but not in acute unmedicated patients. Furthermore LI was found to be correlated with the duration of illness: it was attenuated in patients who had suffered their first psychotic episode. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
This paper presents a numerical study of fluidized-bed coating on thin plates using an orthogonal collocation technique. Inclusion of the latent heat of fusion term in the boundary conditions of the mathematical model accounts for the fact that some polymer powders used in coating may be partially crystalline. Predictions of coating thickness on flat plates were made with actual polymers used in fluidized-bed coating. Reasonably good agreement between numerical predictions of the coating thickness and experimental coating data of Richart was obtained for steel panels preheated to 316 degreesC. A good agreement was also obtained between numerical predictions and our coating thickness data for nylon-11 and polyethylene powders. Predicted coating thickness for polyethylene powder on flat plates were obtained with values of heat transfer coefficient closer to those obtained from our experiments. (C) 2002 Elsevier Science B.V. All rights reserved.
Resumo:
Fixed-point roundoff noise in digital implementation of linear systems arises due to overflow, quantization of coefficients and input signals, and arithmetical errors. In uniform white-noise models, the last two types of roundoff errors are regarded as uniformly distributed independent random vectors on cubes of suitable size. For input signal quantization errors, the heuristic model is justified by a quantization theorem, which cannot be directly applied to arithmetical errors due to the complicated input-dependence of errors. The complete uniform white-noise model is shown to be valid in the sense of weak convergence of probabilistic measures as the lattice step tends to zero if the matrices of realization of the system in the state space satisfy certain nonresonance conditions and the finite-dimensional distributions of the input signal are absolutely continuous.
Resumo:
The purpose of this study was threefold: first, the study was designed to illustrate the use of data and information collected in food safety surveys in a quantitative risk assessment. In this case, the focus was on the food service industry; however, similar data from other parts of the food chain could be similarly incorporated. The second objective was to quantitatively describe and better understand the role that the food service industry plays in the safety of food. The third objective was to illustrate the additional decision-making information that is available when uncertainty and variability are incorporated into the modelling of systems. (C) 2002 Elsevier Science B.V. All rights reserved.