898 results for Latent semantic indexing


Relevance: 20.00%

Abstract:

In this paper, we introduce an application of matrix factorization to produce corpus-derived, distributional
models of semantics that demonstrate cognitive plausibility. We find that word representations
learned by Non-Negative Sparse Embedding (NNSE), a variant of matrix factorization, are sparse,
effective, and highly interpretable. To the best of our knowledge, this is the first approach which
yields semantic representation of words satisfying these three desirable properties. Through extensive
experimental evaluations on multiple real-world tasks and datasets, we demonstrate the superiority
of semantic models learned by NNSE over other state-of-the-art baselines.
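
The abstract above does not state the NNSE objective, so the following is only a minimal sketch under stated assumptions: a word-by-feature matrix X is approximated as A @ D, with A constrained to be non-negative and penalized with an L1 term (to encourage sparsity), and the rows of D kept at norm at most 1. The function name `sparse_nonneg_factorize`, the parameters `lam` and `steps`, the alternating proximal-gradient solver, and the random input are all illustrative choices, not the authors' implementation.

```python
import numpy as np

def sparse_nonneg_factorize(X, k, lam=0.1, steps=200, seed=0):
    """Approximate X (words x features) as A @ D with A >= 0 and an L1 penalty on A.

    Hypothetical helper: minimizes 0.5*||X - A D||^2 + lam*sum(A) by alternating
    proximal/projected gradient steps. Not the solver used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    A = np.abs(rng.normal(size=(n, k)))
    D = rng.normal(size=(k, m))
    for _ in range(steps):
        # Step on A: gradient of the squared error, then shrink by lam and
        # clip at zero (non-negative soft-thresholding).
        step_a = 1.0 / (np.linalg.norm(D @ D.T, 2) + 1e-12)
        grad_a = (A @ D - X) @ D.T
        A = np.maximum(A - step_a * (grad_a + lam), 0.0)
        # Step on D: plain gradient step, then project each row onto the unit ball.
        step_d = 1.0 / (np.linalg.norm(A.T @ A, 2) + 1e-12)
        grad_d = A.T @ (A @ D - X)
        D = D - step_d * grad_d
        D = D / np.maximum(np.linalg.norm(D, axis=1, keepdims=True), 1.0)
    return A, D

# Synthetic stand-in for a co-occurrence matrix: 20 "words" x 10 "features".
X = np.random.default_rng(1).random((20, 10))
A, D = sparse_nonneg_factorize(X, k=4, lam=0.05)
print("fraction of zero entries in A:", float(np.mean(A == 0.0)))
```

Each row of A plays the role of a word representation; raising `lam` pushes more of its entries to exactly zero, which is the sense in which such embeddings are sparse.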

Relevance: 20.00%

Abstract:

Computational models of meaning trained on naturally occurring text successfully model human performance on tasks involving simple similarity measures, but they characterize meaning in terms of undifferentiated bags of words or topical dimensions. This has led some to question their psychological plausibility (Murphy, 2002; Schunn, 1999). We present here a fully automatic method for extracting a structured and comprehensive set of concept descriptions directly from an English part-of-speech-tagged corpus. Concepts are characterized by weighted properties, enriched with concept-property types that approximate classical relations such as hypernymy and function. Our model outperforms comparable algorithms in cognitive tasks pertaining not only to concept-internal structures (discovering properties of concepts, grouping properties by property type) but also to inter-concept relations (clustering into superordinates), suggesting the empirical validity of the property-based approach. Copyright © 2009 Cognitive Science Society, Inc. All rights reserved.

Relevance: 20.00%

Abstract:

Achieving a clearer picture of categorial distinctions in the brain is essential for our understanding of the conceptual lexicon, but much more fine-grained investigations are required in order for this evidence to contribute to lexical research. Here we present a collection of advanced data-mining techniques that allows the category of individual concepts to be decoded from single trials of EEG data. Neural activity was recorded while participants silently named images of mammals and tools, and category could be detected in single trials with an accuracy well above chance, both when considering data from single participants, and when group-training across participants. By aggregating across all trials, single concepts could be correctly assigned to their category with an accuracy of 98%. The pattern of classifications made by the algorithm confirmed that the neural patterns identified are due to conceptual category, and not any of a series of processing-related confounds. The time intervals, frequency bands and scalp locations that proved most informative for prediction permit physiological interpretation: the widespread activation shortly after appearance of the stimulus (from 100 ms) is consistent both with accounts of multi-pass processing, and distributed representations of categories. These methods provide an alternative to fMRI for fine-grained, large-scale investigations of the conceptual lexicon. © 2010 Elsevier Inc.

Relevance: 20.00%

Abstract:

Many studies suggest a large capacity memory for briefly presented pictures of whole scenes. At the same time, visual working memory (WM) of scene elements is limited to only a few items. We examined the role of retroactive interference in limiting memory for visual details. Participants viewed a scene for 5 s and then, after a short delay containing either a blank screen or 10 distracter scenes, answered questions about the location, color, and identity of objects in the scene. We found that the influence of the distracters depended on whether they were from a similar semantic domain, such as "kitchen" or "airport." Increasing the number of similar scenes reduced, and eventually eliminated, memory for scene details. Although scene memory was firmly established over the initial study period, this memory was fragile and susceptible to interference. This may help to explain the discrepancy in the literature between studies showing limited visual WM and those showing a large capacity memory for scenes.

Relevance: 20.00%

Abstract:

In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSMs for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.

Relevance: 20.00%

Abstract:

Purpose – Under investigation is Prosecco wine, a sparkling white wine from North-East Italy.
Information collection on consumer perceptions is particularly relevant when developing market
strategies for wine, especially so when local production and certification of origin play an important
role in the wine market of a given district, as in the case at hand. Investigating and characterizing the
structure of preference heterogeneity become crucial steps in every successful marketing strategy. The
purpose of this paper is to investigate the sources of systematic differences in consumer preferences.
Design/methodology/approach – The paper explores the effect of inclusion of answers to
attitudinal questions in a latent class regression model of stated willingness to pay (WTP) for this
specialty wine. These additional variables were included in the membership equations to investigate
whether they could be of help in the identification of latent classes. The individual specific WTPs from
the sampled respondents were then derived from the best fitting model and examined for consistency.
Findings – The use of answers to attitudinal questions in the latent class regression model is found to
improve model fit, thereby helping in the identification of latent classes. The best performing model
obtained makes use of both attitudinal scores and socio-economic covariates identifying five latent
classes. A reasonable pattern of differences in WTP for Prosecco between CDO and TGI types was
derived from this model.
Originality/value – The approach appears informative and promising: attitudes emerge as
important ancillary indicators of taste differences for specialty wines. This might be of interest per se
and of practical use in market segmentation. If future research shows that these variables can be of use
in other contexts, it is quite possible that more attitudinal questions will be routinely incorporated in
structural latent class hedonic models.

Relevance: 20.00%

Abstract:

There has long been substantial interest in understanding consumer food choices, where a key complexity in this context is the potentially large amount of heterogeneity in tastes across individual consumers, as well as the role of underlying attitudes towards food and cooking. The present paper underlines that both tastes and attitudes are unobserved, and makes the case for a latent variable treatment of these components. Using empirical data collected in Northern Ireland as part of a wider study to elicit intra-household trade-offs between home-cooked meal options, we show how these latent sensitivities and attitudes drive both the choice behaviour as well as the answers to supplementary questions. We find significant heterogeneity across respondents in these underlying factors and show how incorporating them in our models leads to important insights into preferences. 

Relevance: 20.00%

Abstract:

The Supreme Court of the United States in Feist v. Rural (Feist, 1991) specified that compilations or databases, and other works, must have a minimal degree of creativity to be copyrightable. The significance and global diffusion of the decision is only matched by the difficulties it has posed for interpretation. The judgment does not specify what is to be understood by creativity, although it does give a full account of the negative of creativity, as ‘so mechanical or routine as to require no creativity whatsoever’ (Feist, 1991, p.362). The negative of creativity as highly mechanical has particularly diffused globally.

A recent interpretation has correlated ‘so mechanical’ (Feist, 1991) with an automatic mechanical procedure or computational process, using a rigorous exegesis fully to correlate the two uses of mechanical. The negative of creativity is then understood as an automatic computation and as a highly routine process. Creativity itself is conversely understood as non-computational activity, above a certain level of routinicity (Warner, 2013).

The distinction between the negative of creativity and creativity is strongly analogous to an independently developed distinction between forms of mental labour, between semantic and syntactic labour. Semantic labour is understood as human labour motivated by considerations of meaning and syntactic labour as concerned solely with patterns. Semantic labour is distinctively human while syntactic labour can be directly humanly conducted or delegated to machine, as an automatic computational process (Warner, 2005; 2010, pp.33-41).

The value of the analogy is to greatly increase the intersubjective scope of the distinction between semantic and syntactic mental labour. The global diffusion of the standard for extreme absence of copyrightability embodied in the judgment also indicates the possibility that the distinction fully captures the current transformation in the distribution of mental labour, where syntactic tasks which were previously humanly performed are now increasingly conducted by machine.

The paper has substantive and methodological relevance to the conference themes. Substantively, it is concerned with human creativity, with rationality as not reducible to computation, and has relevance to the language myth, through its indirect endorsement of a non-computable or not mechanical semantics. These themes are supported by the underlying idea of technology as a human construction. Methodologically, it is rooted in the humanities and conducts critical thinking through exegesis and empirically tested theoretical development.

References

Feist. (1991). Feist Publications, Inc. v. Rural Tel. Service Co., Inc. 499 U.S. 340.

Warner, J. (2005). Labor in information systems. Annual Review of Information Science and Technology. 39, 2005, pp.551-573.

Warner, J. (2010). Human Information Retrieval (History and Foundations of Information Science Series). Cambridge, MA: MIT Press.

Warner, J. (2013). Creativity for Feist. Journal of the American Society for Information Science and Technology. 64, 6, 2013, pp.1173-1192.

Relevance: 20.00%

Abstract:

Creep of Steel Fiber Reinforced Concrete (SFRC) under flexural loads in the cracked state and to what extent different factors determine creep behaviour are quite understudied topics within the general field of SFRC mechanical properties. A series of prismatic specimens have been produced and subjected to sustained flexural loads. The effect of a number of variables (fiber length and slenderness, fiber content, and concrete compressive strength) has been studied in a comprehensive fashion. Twelve response variables (creep parameters measured at different times) have been retained as descriptive of flexural creep behaviour. Multivariate techniques have been used: the experimental results have been projected to their latent structure by means of Principal Components Analysis (PCA), so that all the information has been reduced to a set of three latent variables. They have been related to the variables considered and statistical significance of their effects on creep behaviour has been assessed. The result is a unified view on the effects of the different variables considered upon creep behaviour: fiber content and fiber slenderness have been detected to clearly modify the effect that load ratio has on flexural creep behaviour.
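
The projection step described in this abstract (twelve response variables reduced to three latent variables by PCA) can be illustrated roughly as follows. This is a sketch under assumptions: the data are synthetic stand-ins, not the study's measurements, standardizing the variables before the decomposition is an assumed preprocessing choice, and the helper name `pca_project` is hypothetical.

```python
import numpy as np

def pca_project(Y, n_components=3):
    """Return scores, loadings and explained-variance ratios for the columns of Y."""
    # Standardize each response variable (zero mean, unit variance).
    Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    # SVD of the standardized data; rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # specimens in latent space
    loadings = Vt[:n_components].T                    # weight of each variable
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]

# Synthetic stand-in: 30 specimens x 12 creep parameters driven by 3 hidden factors.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(30, 3))
Y = hidden @ rng.normal(size=(3, 12)) + 0.05 * rng.normal(size=(30, 12))

scores, loadings, explained = pca_project(Y)
print("variance captured by 3 components:", float(explained.sum()))
```

The columns of `scores` would then be the latent variables to relate to the experimental factors (fiber content, slenderness, and so on), for example by regression or analysis of variance.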