35 resultados para Ontologies (Information retrieval)


Relevância:

80.00% 80.00%

Publicador:

Resumo:

Web document cluster analysis plays an important role in information retrieval by organizing large amounts of documents into a small number of meaningful clusters. Traditional web document clustering is based on the Vector Space Model (VSM), which takes into account only two-level (document and term) knowledge granularity but ignores the bridging paragraph granularity. However, this two-level granularity may lead to unsatisfactory clustering results with “false correlation”. In order to deal with the problem, a Hierarchical Representation Model with Multi-granularity (HRMM), which consists of five-layer representation of data and a twophase clustering process is proposed based on granular computing and article structure theory. To deal with the zero-valued similarity problemresulted from the sparse term-paragraphmatrix, an ontology based strategy and a tolerance-rough-set based strategy are introduced into HRMM. By using granular computing, structural knowledge hidden in documents can be more efficiently and effectively captured in HRMM and thus web document clusters with higher quality can be generated. Extensive experiments show that HRMM, HRMM with tolerancerough-set strategy, and HRMM with ontology all outperform VSM and a representative non VSM-based algorithm, WFP, significantly in terms of the F-Score.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The practice of evidence-based medicine involves consulting documents from repositories such as Scopus, PubMed, or the Cochrane Library. The most common approach for presenting retrieved documents is in the form of a list, with the assumption that the higher a document is on a list, the more relevant it is. Despite this list-based presentation, it is seldom studied how physicians perceive the importance of the order of documents presented in a list. This paper describes an empirical study that elicited and modeled physicians' preferences with regard to list-based results. Preferences were analyzed using a GRIP method that relies on pairwise comparisons of selected subsets of possible rank-ordered lists composed of 3 documents. The results allow us to draw conclusions regarding physicians' attitudes towards the importance of having documents ranked correctly on a result list, versus the importance of retrieving relevant but misplaced documents. Our findings should help developers of clinical information retrieval applications when deciding how retrieved documents should be presented and how performance of the application should be assessed. © 2012 Springer-Verlag Berlin Heidelberg.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Term dependence is a natural consequence of language use. Its successful representation has been a long standing goal for Information Retrieval research. We present a methodology for the construction of a concept hierarchy that takes into account the three basic dimensions of term dependence. We also introduce a document evaluation function that allows the use of the concept hierarchy as a user profile for Information Filtering. Initial experimental results indicate that this is a promising approach for incorporating term dependence in the way documents are filtered.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Timeline generation is an important research task which can help users to have a quick understanding of the overall evolution of any given topic. It thus attracts much attention from research communities in recent years. Nevertheless, existing work on timeline generation often ignores an important factor, the attention attracted to topics of interest (hereafter termed "social attention"). Without taking into consideration social attention, the generated timelines may not reflect users' collective interests. In this paper, we study how to incorporate social attention in the generation of timeline summaries. In particular, for a given topic, we capture social attention by learning users' collective interests in the form of word distributions from Twitter, which are subsequently incorporated into a unified framework for timeline summary generation. We construct four evaluation sets over six diverse topics. We demonstrate that our proposed approach is able to generate both informative and interesting timelines. Our work sheds light on the feasibility of incorporating social attention into traditional text mining tasks. Copyright © 2013 ACM.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Two studies aiming to identify the nature and extent of problems that people have when completing theory of planned behaviour (TPB) questionnaires, using a cognitive interviewing approach are reported. Both studies required participants to 'think aloud' as they completed TPB questionnaires about: (a) increasing physical activity (six general public participants); and (b) binge drinking (13 students). Most people had no identifiable problems with the majority of questions. However, there were problems common to both studies, relating to information retrieval and to participants answering different questions from those intended by researchers. Questions about normative influence were particularly problematic. The standard procedure for developing TPB questionnaires may systematically produce problematic questions. Suggestions are made for improving this procedure. Copyright © 2007 SAGE Publications.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In recent years, learning word vector representations has attracted much interest in Natural Language Processing. Word representations or embeddings learned using unsupervised methods help addressing the problem of traditional bag-of-word approaches which fail to capture contextual semantics. In this paper we go beyond the vector representations at the word level and propose a novel framework that learns higher-level feature representations of n-grams, phrases and sentences using a deep neural network built from stacked Convolutional Restricted Boltzmann Machines (CRBMs). These representations have been shown to map syntactically and semantically related n-grams to closeby locations in the hidden feature space. We have experimented to additionally incorporate these higher-level features into supervised classifier training for two sentiment analysis tasks: subjectivity classification and sentiment classification. Our results have demonstrated the success of our proposed framework with 4% improvement in accuracy observed for subjectivity classification and improved the results achieved for sentiment classification over models trained without our higher level features.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Continuing advances in digital image capture and storage are resulting in a proliferation of imagery and associated problems of information overload in image domains. In this work we present a framework that supports image management using an interactive approach that captures and reuses task-based contextual information. Our framework models the relationship between images and domain tasks they support by monitoring the interactive manipulation and annotation of task-relevant imagery. During image analysis, interactions are captured and a task context is dynamically constructed so that human expertise, proficiency and knowledge can be leveraged to support other users in carrying out similar domain tasks using case-based reasoning techniques. In this article we present our framework for capturing task context and describe how we have implemented the framework as two image retrieval applications in the geo-spatial and medical domains. We present an evaluation that tests the efficiency of our algorithms for retrieving image context information and the effectiveness of the framework for carrying out goal-directed image tasks. © 2010 Springer Science+Business Media, LLC.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents the design and results of a task-based user study, based on Information Foraging Theory, on a novel user interaction framework - uInteract - for content-based image retrieval (CBIR). The framework includes a four-factor user interaction model and an interactive interface. The user study involves three focused evaluations, 12 simulated real life search tasks with different complexity levels, 12 comparative systems and 50 subjects. Information Foraging Theory is applied to the user study design and the quantitative data analysis. The systematic findings have not only shown how effective and easy to use the uInteract framework is, but also illustrate the value of Information Foraging Theory for interpreting user interaction with CBIR. © 2011 Springer-Verlag Berlin Heidelberg.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The paper proposes an ISE (Information goal, Search strategy, Evaluation threshold) user classification model based on Information Foraging Theory for understanding user interaction with content-based image retrieval (CBIR). The proposed model is verified by a multiple linear regression analysis based on 50 users' interaction features collected from a task-based user study of interactive CBIR systems. To our best knowledge, this is the first principled user classification model in CBIR verified by a formal and systematic qualitative analysis of extensive user interaction data. Copyright 2010 ACM.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Representing knowledge using domain ontologies has shown to be a useful mechanism and format for managing and exchanging information. Due to the difficulty and cost of building ontologies, a number of ontology libraries and search engines are coming to existence to facilitate reusing such knowledge structures. The need for ontology ranking techniques is becoming crucial as the number of ontologies available for reuse is continuing to grow. In this paper we present AKTiveRank, a prototype system for ranking ontologies based on the analysis of their structures. We describe the metrics used in the ranking system and present an experiment on ranking ontologies returned by a popular search engine for an example query.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three ma jor steps in ontology building: associating terms, constructing hierarchies and labelling relations. A number of methods are presented for these purposes but we conclude that the issue of data-sparsity still is a ma jor challenge. We argue for the use of resources external tot he domain specific corpus.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Owing to the rise in the volume of literature, problems arise in the retrieval of required information. Various retrieval strategies have been proposed, but most of that are not flexible enough for their users. Specifically, most of these systems assume that users know exactly what they are looking for before approaching the system, and that users are able to precisely express their information needs according to l aid- down specifications. There has, however, been described a retrieval program THOMAS which aims at satisfying incompletely- defined user needs through a man- machine dialogue which does not require any rigid queries. Unlike most systems, Thomas attempts to satisfy the user's needs from a model which it builds of the user's area of interest. This model is a subset of the program's "world model" - a database in the form of a network where the nodes represent concepts since various concepts have various degrees of similarities and associations, this thesis contends that instead of models which assume equal levels of similarities between concepts, the links between the concepts should have values assigned to them to indicate the degree of similarity between the concepts. Furthermore, the world model of the system should be structured such that concepts which are related to one another be clustered together, so that a user- interaction would involve only the relevant clusters rather than the entire database such clusters being determined by the system, not the user. This thesis also attempts to link the design work with the current notion in psychology centred on the use of the computer to simulate human cognitive processes. In this case, an attempt has been made to model a dialogue between two people - the information seeker and the information expert. The system, called Thomas-II, has been implemented and found to require less effort from the user than Thomas.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Over recent years, evidence has been accumulating in favour of the importance of long-term information as a variable which can affect the success of short-term recall. Lexicality, word frequency, imagery and meaning have all been shown to augment short term recall performance. Two competing theories as to the causes of this long-term memory influence are outlined and tested in this thesis. The first approach is the order-encoding account, which ascribes the effect to the usage of resources at encoding, hypothesising that word lists which require less effort to process will benefit from increased levels of order encoding, in turn enhancing recall success. The alternative view, trace redintegration theory, suggests that order is automatically encoded phonologically, and that long-term information can only influence the interpretation of the resultant memory trace. The free recall experiments reported here attempted to determine the importance of order encoding as a facilitatory framework and to determine the locus of the effects of long-term information in free recall. Experiments 1 and 2 examined the effects of word frequency and semantic categorisation over a filled delay, and experiments 3 and 4 did the same for immediate recall. Free recall was improved by both long-term factors tested. Order information was not used over a short filled delay, but was evident in immediate recall. Furthermore, it was found that both long-term factors increased the amount of order information retained. Experiment 5 induced an order encoding effect over a filled delay, leaving a picture of short-term processes which are closely associated with long-term processes, and which fit conceptions of short-term memory being part of language processes rather better than either the encoding or the retrieval-based models. Experiments 6 and 7 aimed to determine to what extent phonological processes were responsible for the pattern of results observed. Articulatory suppression affected the encoding of order information where speech rate had no direct influence, suggesting that it is ease of lexical access which is the most important factor in the influence of long-term memory on immediate recall tasks. The evidence presented in this thesis does not offer complete support for either the retrieval-based account or the order encoding account of long-term influence. Instead, the evidence sits best with models that are based upon language-processing. The path urged for future research is to find ways in which this diffuse model can be better specified, and which can take account of the versatility of the human brain.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis examines children's consumer choice behaviour using an information processing perspective, with the fundamental goal of applying academic research to practical marketing and commercial problems. Proceeding a preface, which describes the academic and commercial terms of reference within which this interdisciplinary study is couched, the thesis comprises four discernible parts. Initially, the rationale inherent in adopting an information processing perspective is justified and the diverse array of topics which have bearing on children's consumer processing and behaviour are aggregated. The second part uses this perspective as a springboard to appraise the little explored role of memory, and especially memory structure, as a central cognitive component in children's consumer choice processing. The main research theme explores the ease with which 10 and 11 year olds retrieve contemporary consumer information from subjectively defined memory organisations. Adopting a sort-recall paradigm, hierarchical retrieval processing is stimulated and it is contended that when two items, known to be stored proximally in the memory organisation are not recalled adjacently, this discrepancy is indicative of retrieval processing ease. Results illustrate the marked influence of task conditions and orientation of memory structure on retrieval; these conclusions are accounted for in terms of input and integration failure. The third section develops the foregoing interpellations in the marketing context. A straightforward methodology for structuring marketing situations is postulated, a basis for segmenting children's markets using processing characteristics is adopted, and criteria for communicating brand support information to children are discussed. A taxonomy of market-induced processing conditions is developed. Finally, a case study with topical commercial significance is described. The development, launch and marketing of a new product in the confectionery market is outlined, the aetiology of its subsequent demise identified and expounded, and prescriptive guidelines are put forward to help avert future repetition of marketing misjudgements.