14 results for Probabilistic latent semantic analysis (PLSA)

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

100.00%

Abstract:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
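The topic-discovery step described in this abstract can be illustrated with a minimal sketch of pLSA fitted by expectation-maximization on a document-by-word count matrix (here, image by visual word). This is not the authors' implementation: the function name `plsa`, the toy count matrix, the number of topics and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM on a (documents x words) count matrix.

    Returns P(z|d) with shape (D, K) and P(w|z) with shape (K, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random initialization, normalized so each row is a distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z); shape (D, K, W).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: weight the responsibilities by the observed counts.
        weighted = counts[:, None, :] * resp
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z


# Hypothetical toy data: 6 "images", vocabulary of 10 "visual words".
counts = np.random.default_rng(1).integers(1, 5, size=(6, 10)).astype(float)
topics, word_dists = plsa(counts, n_topics=2)
print(topics.shape)  # one topic-distribution vector per image
```

Each row of `topics` is the kind of topic distribution vector on which a k-nearest neighbor or SVM classifier could then be trained, as the abstract describes.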

Relevance:

100.00%

Abstract:

We present a new approach to model and classify breast parenchymal tissue. Given a mammogram, we first discover the distribution of the different tissue densities in an unsupervised manner, and then use this tissue distribution to perform the classification. We achieve this using a classifier based on local descriptors and probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature. We studied the influence of different descriptors, such as texture and SIFT features, at the classification stage, showing that textons outperform SIFT in all cases. Moreover, we demonstrate that pLSA automatically extracts meaningful latent aspects, generating a compact tissue representation based on their densities that is useful for discrimination in mammogram classification. We show the results of tissue classification over the MIAS and DDSM datasets. We compare our method with approaches that classified these same datasets, showing a better performance of our proposal.

Relevance:

100.00%

Abstract:

This thesis investigates epistemic indefinites (EIs), elements noteworthy for their grammaticalized ignorance implicature, i.e. inability to provide further information about the identity of the expression's referent. This work contributes to the effort of finding a unified account of the cross-linguistic repertoire of EIs. It comprises a corpus survey and a semantic analysis of Slovak voľa- and -si, EI items not studied until now. First, the following hypothesis was tested: the semantic/syntactic functions expressed by an indefinite will fall into contiguous areas on an implicational map (Haspelmath 1997). The results of the corpus analysis revealed that the map does not entirely capture the Slovak EIs' functional distribution and interpretations. Secondly, the semantic analysis was developed within the alternatives-and-exhaustification framework (Chierchia 2013). I show that some of the EIs' behavior can be explained as a consequence of an assumed sensitivity to parameters proposed by Chierchia. I situate voľa- and -si with respect to the framework’s typology and offer a critical assessment of this theoretical perspective.

Relevance:

100.00%

Abstract:

This article defines the class of attitude verbs, both intensionally, on the basis of syntactic and semantic criteria, and extensionally. As regards syntax, the verbs belonging to this group are observed to share at least one distinctive structure; as regards semantics, besides identifying the same meaning components in them, they have also been observed to present the same general type of event structure. The working hypothesis is that of Levin [6], namely that verbs with the same meaning share the same syntactic behavior, although with some nuances. The starting point is likewise this author's classification, on which a regrouping is proposed according to the criteria mentioned.

Relevance:

30.00%

Abstract:

This paper analyzes and evaluates, in the context of ontology learning, some techniques to identify and extract candidate terms for the classes of a taxonomy. In addition, it points out some inconsistencies that may occur in the preprocessing of text corpora and proposes techniques to obtain good candidate terms for the classes of a taxonomy.

Relevance:

30.00%

Abstract:

In this paper we present a description of the role of definitional verbal patterns for the extraction of semantic relations. Several studies show that semantic relations can be extracted from analytic definitions contained in machine-readable dictionaries (MRDs). In addition, definitions found in specialised texts are a good starting point to search for different types of definitions where other semantic relations occur. The extraction of definitional knowledge from specialised corpora represents another interesting approach for the extraction of semantic relations. Here, we present a descriptive analysis of definitional verbal patterns in Spanish and the first steps towards the development of a system for the automatic extraction of definitional knowledge.

Relevance:

30.00%

Abstract:

In this paper a method for extracting semantic information from online music discussion forums is proposed. The semantic relations are inferred from the co-occurrence of musical concepts in forum posts, using network analysis. The method starts by defining a dictionary of common music terms in an art music tradition. Then, it creates a complex network representation of the online forum by matching such dictionary against the forum posts. Once the complex network is built we can study different network measures, including node relevance, node co-occurrence and term relations via semantically connecting words. Moreover, we can detect communities of concepts inside the forum posts. The rationale is that some music terms are more related to each other than to other terms. All in all, this methodology allows us to obtain meaningful and relevant information from forum discussions.
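The dictionary-matching and co-occurrence steps described in this abstract can be sketched as follows. This is a simplified illustration, not the authors' method: the posts and dictionary terms are hypothetical, only single-word terms are matched by whitespace tokenization, and weighted degree stands in for the (unspecified) node relevance measure.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(posts, dictionary):
    """Count how often each pair of dictionary terms appears in the same post."""
    terms = {term.lower() for term in dictionary}
    edges = Counter()
    for post in posts:
        tokens = set(post.lower().split())
        present = sorted(terms & tokens)  # sorted so each pair has one key
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical forum posts and music-term dictionary.
posts = [
    "the raga uses a slow tala",
    "this raga and tala pair is common",
    "the tala is fast",
]
dictionary = ["raga", "tala", "alap"]

edges = cooccurrence_edges(posts, dictionary)

# Weighted degree as a simple stand-in for node relevance.
degree = Counter()
for (a, b), weight in edges.items():
    degree[a] += weight
    degree[b] += weight
print(dict(edges), dict(degree))
```

The resulting edge weights could be loaded into a graph library to compute other network measures or to detect communities of concepts.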

Relevance:

30.00%

Abstract:

Using data from the Spanish household budget survey, we investigate life-cycle effects on several product expenditures. A latent-variable model approach is adopted to evaluate the impact of income on expenditures, controlling for the number of members in the family. Two latent factors underlying repeated measures of monetary and non-monetary income are used as explanatory variables in the expenditure regression equations, thus avoiding possible bias associated with the measurement error in income. The proposed methodology also takes care of the case in which product expenditures exhibit a pattern of infrequent purchases. Multiple-group analysis is used to assess the variation of key parameters of the model across various household life-cycle typologies. The analysis discloses significant life-cycle effects on the mean levels of expenditures; it also detects significant life-cycle effects on the way expenditures are affected by income and family size. Asymptotic robust methods are used to account for possible non-normality of the data.

Relevance:

30.00%

Abstract:

Using data from the Spanish household budget survey, we investigate life-cycle effects on several product expenditures. A latent-variable model approach is adopted to evaluate the impact of income on expenditures, controlling for the number of members in the family. Two latent factors underlying repeated measures of monetary and non-monetary income are used as explanatory variables in the expenditure regression equations, thus avoiding possible bias associated with the measurement error in income. The proposed methodology also takes care of the case in which product expenditures exhibit a pattern of infrequent purchases. Multiple-group analysis is used to assess the variation of key parameters of the model across various household life-cycle typologies. The analysis discloses significant life-cycle effects on the mean levels of expenditures; it also detects significant life-cycle effects on the way expenditures are affected by income and family size. Asymptotic robust methods are used to account for possible non-normality of the data.

Relevance:

30.00%

Abstract:

Standard methods for the analysis of linear latent variable models often rely on the assumption that the vector of observed variables is normally distributed. This normality assumption (NA) plays a crucial role in assessing optimality of estimates, in computing standard errors, and in designing an asymptotic chi-square goodness-of-fit test. The asymptotic validity of NA inferences when the data deviates from normality has been called asymptotic robustness. In the present paper we extend previous work on asymptotic robustness to a general context of multi-sample analysis of linear latent variable models, with a latent component of the model allowed to be fixed across (hypothetical) sample replications, and with the asymptotic covariance matrix of the sample moments not necessarily finite. We will show that, under certain conditions, the matrix $\Gamma$ of asymptotic variances of the analyzed sample moments can be substituted by a matrix $\Omega$ that is a function only of the cross-product moments of the observed variables. The main advantage of this is that inferences based on $\Omega$ are readily available in standard software for covariance structure analysis, and do not require computing sample fourth-order moments. An illustration with simulated data in the context of regression with errors in variables will be presented.

Relevance:

30.00%

Abstract:

We present a new model for decision making under risk that considers subjective and objective information in the same formulation. We also present the uncertain probabilistic weighted average (UPWA). Its main advantage is that it unifies the probability and the weighted average in the same formulation while taking into account the degree of importance that each case has in the analysis. Moreover, it is able to deal with uncertain environments represented in the form of interval numbers. We study some of its main properties and particular cases. We also study the applicability of the UPWA and find it to be very broad, because all previous studies that use the probability or the weighted average can be revised with this new approach. Focus is placed on a multi-person decision making problem regarding the selection of strategies by using the theory of expertons.
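One plausible reading of the aggregation described in this abstract is sketched below. It assumes, and the abstract does not state this, that the probabilities and the weights are merged by a convex combination with a parameter `beta` expressing their relative importance, and that each argument is an interval given as a `(low, high)` pair. The function name `upwa`, the parameter `beta` and all numbers are hypothetical.

```python
def upwa(intervals, probabilities, weights, beta):
    """Aggregate interval payoffs with merged probability/weight coefficients.

    Assumed form: v_i = beta * p_i + (1 - beta) * w_i, result = sum(v_i * a_i),
    applied separately to the lower and upper interval bounds.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    merged = [beta * p + (1.0 - beta) * w
              for p, w in zip(probabilities, weights)]
    # All coefficients are non-negative, so bounds can be aggregated directly.
    low = sum(v * lo for v, (lo, _) in zip(merged, intervals))
    high = sum(v * hi for v, (_, hi) in zip(merged, intervals))
    return low, high


# Hypothetical example: three states of nature with interval payoffs.
payoffs = [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)]
probabilities = [0.5, 0.3, 0.2]   # objective information
weights = [0.2, 0.3, 0.5]         # subjective importance
result = upwa(payoffs, probabilities, weights, beta=0.4)
print(result)
```

Under this assumed form, setting `beta` to 1 reduces the operator to an expected value over intervals, and setting it to 0 reduces it to a plain weighted average.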

Relevance:

30.00%

Abstract:

This article develops a system capable of automatically categorizing the database of images that serve as the starting point for ideation and design in the artistic production of the sculptor M. Planas. The methodology is based on local features. To build a visual vocabulary, a procedure analogous to the one used in automatic text analysis (the 'Bag-of-Words' model, BOW) is followed; in the image domain we refer to 'Bag-of-Visual Terms' (BOV) representations. In this approach, images are analyzed as a set of regions, describing only their appearance and ignoring their spatial structure. To overcome the drawbacks of polysemy and synonymy associated with this methodology, probabilistic latent semantic analysis (PLSA) is used, which detects underlying aspects in the images, that is, formal patterns. The results obtained are promising and, beyond the intrinsic usefulness of automatic image categorization, this method can provide the artist with a very interesting auxiliary point of view.

Relevance:

30.00%

Abstract:

This paper focuses on the study of the factorial structure of an inventory to estimate the subjective perception of insecurity and fear of crime. Built from a review of the literature on the subject and the results obtained in previous works, this factor structure shows that the attitude towards insecurity and fear of crime is identified through a number of latent factors, which are schematically summarized as (a) personal safety, (b) the perception of personal and social control, (c) the presence of threatening people or situations, (d) the processes of identity and space appropriation, (e) satisfaction with the environment, and (f) the environmental and the use of space. Such factors are relevant dimensions to analyze the phenomenon. Method: A sample of 571 participants in a neighborhood of Barcelona was evaluated with the proposed inventory, which yielded data from the distributions of all the items provided. The inventory was administered by researchers specially trained for it, and the results were analyzed using standard confirmatory factor analysis (CFA) procedures based on the hypothesized theoretical structure. The analysis was performed by decatypes according to the different response scales prepared in the inventory and their ordinal nature, and by estimating the polychoric correlation coefficients. The results show an acceptable fit of the proposed model, an appropriate behavior of the residuals and statistically significant estimates of the factor loadings. This would indicate the goodness of the proposed factor structure.

Relevance:

30.00%

Abstract:

Objectives: The aim of the study was to combine clinical results from the European Cohort of the REVERSE study and costs associated with the addition of cardiac resynchronization therapy (CRT) to optimal medical therapy (OMT) in patients with mild symptomatic (NYHA I-II) or asymptomatic left ventricular dysfunction and markers of cardiac dyssynchrony in Spain. Methods: A Markov model was developed with CRT + OMT (CRT-ON) versus OMT only (CRT-OFF) based on a retrospective cost-effectiveness analysis. Raw data was derived from literature and expert opinion, reflecting clinical and economic consequences of patient management in Spain. Time horizon was 10 years. Both costs (euro 2010) and effects were discounted at 3 percent per annum. Results: CRT-ON showed higher total costs than CRT-OFF; however, CRT reduced the length of hospitalization in the ICU by 94 percent (0.006 versus 0.091 days) and in the general ward by 34 percent (0.705 versus 1.076 days). Surviving CRT-ON patients (88.2 percent versus 77.5 percent) remained in better functional class longer, and they achieved an improvement of 0.9 life years gained (LYGs) and 0.77 quality-adjusted life years (QALYs). CRT-ON proved to be cost-effective after 6 years, except for the 7th year due to battery depletion. At 10 years, the results were €18,431 per LYG and €21,500 per QALY gained. Probabilistic sensitivity analysis showed CRT-ON was cost-effective in 75.4 percent of the cases at 10 years. Conclusions: The use of CRT added to OMT represents an efficient use of resources in patients suffering from heart failure in NYHA functional classes I and II.
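The discounting and cost-per-outcome arithmetic mentioned in this abstract can be sketched as below. This is a generic illustration, not the study's Markov model: the function names are hypothetical, the convention that the first year is discounted by one period is an assumption, and all input values are made up rather than taken from the study.

```python
def discounted_total(yearly_values, rate=0.03):
    """Sum yearly costs or effects, discounting year t by (1 + rate) ** t.

    Assumes the first value belongs to year 1 (end-of-year convention).
    """
    return sum(value / (1.0 + rate) ** year
               for year, value in enumerate(yearly_values, start=1))


def icer(cost_new, cost_old, effect_new, effect_old):
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("effects are equal; the ratio is undefined")
    return (cost_new - cost_old) / delta_effect


# Hypothetical inputs, NOT figures from the study.
one_year = discounted_total([103.0])          # 103 paid in one year's time
icer_value = icer(2000.0, 1000.0, 1.5, 1.0)   # cost per extra unit of effect
print(one_year, icer_value)
```

In a full analysis, the yearly costs and effects passed to `discounted_total` would come from the state occupancy of each arm of the Markov model over the 10-year horizon.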