310 resultados para Topic modeling
Resumo:
Topic modeling has been widely utilized in the fields of information retrieval, text mining, text classification etc. Most existing statistical topic modeling methods such as LDA and pLSA generate a term based representation to represent a topic by selecting single words from multinomial word distribution over this topic. There are two main shortcomings: firstly, popular or common words occur very often across different topics that bring ambiguity to understand topics; secondly, single words lack coherent semantic meaning to accurately represent topics. In order to overcome these problems, in this paper, we propose a two-stage model that combines text mining and pattern mining with statistical modeling to generate more discriminative and semantic rich topic representations. Experiments show that the optimized topic representations generated by the proposed methods outperform the typical statistical topic modeling method LDA in terms of accuracy and certainty.
Resumo:
Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models to represent multiple topics in a collection of documents, which has been widely utilized in the fields of machine learning and information retrieval, etc. But its effectiveness in information filtering is rarely known. Patterns are always thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, Pattern-based Topic Model(PBTM) , is proposed to represent the text documents not only using the topic distributions at general level but also using semantic pattern representations at detailed specific level, both of which contribute to the accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM by using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
A tag-based personalized item recommendation system using tensor modeling and topic model approaches
Resumo:
This research falls in the area of enhancing the quality of tag-based item recommendation systems. It aims to achieve this by employing a multi-dimensional user profile approach and by analyzing the semantic aspects of tags. Tag-based recommender systems have two characteristics that need to be carefully studied in order to build a reliable system. Firstly, the multi-dimensional correlation, called as tag assignment
Resumo:
Quantum theory has recently been employed to further advance the theory of information retrieval (IR). A challenging research topic is to investigate the so called quantum-like interference in users’ relevance judgement process, where users are involved to judge the relevance degree of each document with respect to a given query. In this process, users’ relevance judgement for the current document is often interfered by the judgement for previous documents, due to the interference on users’ cognitive status. Research from cognitive science has demonstrated some initial evidence of quantum-like cognitive interference in human decision making, which underpins the user’s relevance judgement process. This motivates us to model such cognitive interference in the relevance judgement process, which in our belief will lead to a better modeling and explanation of user behaviors in relevance judgement process for IR and eventually lead to more user-centric IR models. In this paper, we propose to use probabilistic automaton(PA) and quantum finite automaton (QFA), which are suitable to represent the transition of user judgement states, to dynamically model the cognitive interference when the user is judging a list of documents.
Resumo:
Software development and Web site development techniques have evolved significantly over the past 20 years. The relatively young Web Application development area has borrowed heavily from traditional software development methodologies primarily due to the similarities in areas of data persistence and User Interface (UI) design. Recent developments in this area propose a new Web Modeling Language (WebML) to facilitate the nuances specific to Web development. WebML is one of a number of implementations designed to enable modeling of web site interaction flows while being extendable to accommodate new features in Web site development into the future. Our research aims to extend WebML with a focus on stigmergy which is a biological term originally used to describe coordination between insects. We see design features in existing Web sites that mimic stigmergic mechanisms as part of the UI. We believe that we can synthesize and embed stigmergy in Web 2.0 sites. This paper focuses on the sub-topic of site UI design and stigmergic mechanism designs required to achieve this.
Resumo:
The SimCalc Vision and Contributions Advances in Mathematics Education 2013, pp 419-436 Modeling as a Means for Making Powerful Ideas Accessible to Children at an Early Age Richard Lesh, Lyn English, Serife Sevis, Chanda Riggs … show all 4 hide » Look Inside » Get Access Abstract In modern societies in the 21st century, significant changes have been occurring in the kinds of “mathematical thinking” that are needed outside of school. Even in the case of primary school children (grades K-2), children not only encounter situations where numbers refer to sets of discrete objects that can be counted. Numbers also are used to describe situations that involve continuous quantities (inches, feet, pounds, etc.), signed quantities, quantities that have both magnitude and direction, locations (coordinates, or ordinal quantities), transformations (actions), accumulating quantities, continually changing quantities, and other kinds of mathematical objects. Furthermore, if we ask, what kind of situations can children use numbers to describe? rather than restricting attention to situations where children should be able to calculate correctly, then this study shows that average ability children in grades K-2 are (and need to be) able to productively mathematize situations that involve far more than simple counts. Similarly, whereas nearly the entire K-16 mathematics curriculum is restricted to situations that can be mathematized using a single input-output rule going in one direction, even the lives of primary school children are filled with situations that involve several interacting actions—and which involve feedback loops, second-order effects, and issues such as maximization, minimization, or stabilizations (which, many years ago, needed to be postponed until students had been introduced to calculus). …This brief paper demonstrates that, if children’s stories are used to introduce simulations of “real life” problem solving situations, then average ability primary school children are quite capable of dealing productively with 60-minute problems that involve (a) many kinds of quantities in addition to “counts,” (b) integrated collections of concepts associated with a variety of textbook topic areas, (c) interactions among several different actors, and (d) issues such as maximization, minimization, and stabilization.