336 resultados para Information retrieval - Australia


Relevância:

80.00% 80.00%

Publicador:

Resumo:

Fundamental tooling is required in order to apply USDL in practical settings. This chapter discusses three fundamental types of tools for USDL. First, USDL editors have been developed for expert and casual users, respectively. Second, several USDL repositories have been built to allow editors accessing and storing USDL descriptions. Third, our generic USDL marketplace allows providers to describe their services once and potentially trade them anywhere. In addition, the iosyncrasies of service trading as opposed to the simpler case of product trading. The chapter also presents several deployment scenarios of such tools to foster individual value chains and support new business models across organizational boundaries. We close the chapter with an application of USDL in the context of service engineering.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper develops a framework for classifying term dependencies in query expansion with respect to the role terms play in structural linguistic associations. The framework is used to classify and compare the query expansion terms produced by the unigram and positional relevance models. As the unigram relevance model does not explicitly model term dependencies in its estimation process it is often thought to ignore dependencies that exist between words in natural language. The framework presented in this paper is underpinned by two types of linguistic association, namely syntagmatic and paradigmatic associations. It was found that syntagmatic associations were a more prevalent form of linguistic association used in query expansion. Paradoxically, it was the unigram model that exhibited this association more than the positional relevance model. This surprising finding has two potential implications for information retrieval models: (1) if linguistic associations underpin query expansion, then a probabilistic term dependence assumption based on position is inadequate for capturing them; (2) the unigram relevance model captures more term dependency information than its underlying theoretical model suggests, so its normative position as a baseline that ignores term dependencies should perhaps be reviewed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Traditional recommendation methods offer items, that are inanimate and one way recommendation, to users. Emerging new applications such as online dating or job recruitments require reciprocal people-to-people recommendations that are animate and two-way recommendations. In this paper, we propose a reciprocal collaborative method based on the concepts of users' similarities and common neighbors. The dataset employed for the experiment is gathered from a real life online dating network. The proposed method is compared with baseline methods that use traditional collaborative algorithms. Results show the proposed method can achieve noticeably better performance than the baseline methods.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper discusses users’ query reformulation behaviour while searching information on the Web. Query reformulations have emerged as an important component of Web search behaviour and human-computer interaction (HCI) because a user’s success of information retrieval (IR) depends on how he or she formulates queries. There are various factors, such as cognitive styles, that influence users’ query reformulation behaviour. Understanding how users with different cognitive styles formulate their queries while performing Web searches can help HCI researchers and information systems (IS) developers to provide assistance to the users. This paper aims to examine the effects of users’ cognitive styles on their query reformation behaviour. To achieve the goal of the study, a user study was conducted in which a total of 3613 search terms and 872 search queries were submitted by 50 users who engaged in 150 scenario-based search tasks. Riding’s (1991) Cognitive Style Analysis (CSA) test was used to assess users’ cognitive style as wholist or analytic, and verbaliser or imager. The study findings show that users’ query reformulation behaviour is affected by their cognitive styles. The results reveal that analytic users tended to prefer Add queries while all other users preferred New queries. A significant difference was found among wholists and analytics in the manner they performed Remove query reformulations. Future HCI researchers and IS developers can utilize the study results to develop interactive and user-cantered search model, and to provide context-based query suggestions for users.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Divergence from a random baseline is a technique for the evaluation of document clustering. It ensures cluster quality measures are performing work that prevents ineffective clusterings from giving high scores to clusterings that provide no useful result. These concepts are defined and analysed using intrinsic and extrinsic approaches to the evaluation of document cluster quality. This includes the classical clusters to categories approach and a novel approach that uses ad hoc information retrieval. The divergence from a random baseline approach is able to differentiate ineffective clusterings encountered in the INEX XML Mining track. It also appears to perform a normalisation similar to the Normalised Mutual Information (NMI) measure but it can be applied to any measure of cluster quality. When it is applied to the intrinsic measure of distortion as measured by RMSE, subtraction from a random baseline provides a clear optimum that is not apparent otherwise. This approach can be applied to any clustering evaluation. This paper describes its use in the context of document clustering evaluation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Success of query reformulation and relevant information retrieval depends on many factors, such as users’ prior knowledge, age, gender, and cognitive styles. One of the important factors that affect a user’s query reformulation behaviour is that of the nature of the search tasks. Limited studies have examined the impact of the search task types on query reformulation behaviour while performing Web searches. This paper examines how the nature of the search tasks affects users’ query reformulation behaviour during information searching. The paper reports empirical results from a user study in which 50 participants performed a set of three Web search tasks – exploratory, factorial and abstract. Users’ interactions with search engines were logged by using a monitoring program. 872 unique search queries were classified into five query types – New, Add, Remove, Replace and Repeat. Users submitted fewer queries for the factual task, which accounted for 26%. They completed a higher number of queries (40% of the total queries) while carrying out the exploratory task. A one-way MANOVA test indicated a significant effect of search task types on users’ query reformulation behaviour. In particular, the search task types influenced the manner in which users reformulated the New and Repeat queries.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper analyses the pairwise distances of signatures produced by the TopSig retrieval model on two document collections. The distribution of the distances are compared to purely random signatures. It explains why TopSig is only competitive with state of the art retrieval models at early precision. Only the local neighbourhood of the signatures is interpretable. We suggest this is a common property of vector space models.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Many existing information retrieval models do not explicitly take into account in- formation about word associations. Our approach makes use of rst and second order relationships found in natural language, known as syntagmatic and paradigmatic associ- ations, respectively. This is achieved by using a formal model of word meaning within the query expansion process. On ad hoc retrieval, our approach achieves statistically sig- ni cant improvements in MAP (0.158) and P@20 (0.396) over our baseline model. The ERR@20 and nDCG@20 of our system was 0.249 and 0.192 respectively. Our results and discussion suggest that information about both syntagamtic and paradigmatic associa- tions can assist with improving retrieval eectiveness on ad hoc retrieval.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. One of the most popular web personalization systems is recommender systems. In recommender systems choosing user information that can be used to profile users is very crucial for user profiling. In Web 2.0, one facility that can help users organize Web resources of their interest is user tagging systems. Exploring user tagging behavior provides a promising way for understanding users’ information needs since tags are given directly by users. However, free and relatively uncontrolled vocabulary makes the user self-defined tags lack of standardization and semantic ambiguity. Also, the relationships among tags need to be explored since there are rich relationships among tags which could provide valuable information for us to better understand users. In this paper, we propose a novel approach for learning tag ontology based on the widely used lexical database WordNet for capturing the semantics and the structural relationships of tags. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers and users’ tagging behavior together. To personalize further, clustering of users is performed to generate a more accurate ontology for a particular group of users. In order to evaluate the usefulness of the tag ontology, we use the tag ontology in a pilot tag recommendation experiment for improving the recommendation performance by exploiting the semantic information in the tag ontology. The initial result shows that the personalized information has improved the accuracy of the tag recommendation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Many existing information retrieval models do not explicitly take into account in- formation about word associations. Our approach makes use of rst and second order relationships found in natural language, known as syntagmatic and paradigmatic associ- ations, respectively. This is achieved by using a formal model of word meaning within the query expansion process. On ad hoc retrieval, our approach achieves statistically sig- ni cant improvements in MAP (0.158) and P@20 (0.396) over our baseline model. The ERR@20 and nDCG@20 of our system was 0.249 and 0.192 respectively. Our results and discussion suggest that information about both syntagamtic and paradigmatic associa- tions can assist with improving retrieval eectiveness on ad hoc retrieval.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, we describe a machine-translated parallel English corpus for the NTCIR Chinese, Japanese and Korean (CJK) Wikipedia collections. This document collection is named CJK2E Wikipedia XML corpus. The corpus could be used by the information retrieval research community and knowledge sharing in Wikipedia in many ways; for example, this corpus could be used for experimentations in cross-lingual information retrieval, cross-lingual link discovery, or omni-lingual information retrieval research. Furthermore, the translated CJK articles could be used to further expand the current coverage of the English Wikipedia.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper we examine automated Chinese to English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese to English translation on the hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bi-lingual users where a particular topic is not covered in Chinese, is not equally covered in both languages, or is biased in one language; as well as for language learning.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Free association norms indicate that words are organized into semantic/associative neighborhoods within a larger network of words and links that bind the net together. We present evidence indicating that memory for a recent word event can depend on implicitly and simultaneously activating related words in its neighborhood. Processing a word during encoding primes its network representation as a function of the density of the links in its neighborhood. Such priming increases recall and recognition and can have long lasting effects when the word is processed in working memory. Evidence for this phenomenon is reviewed in extralist cuing, primed free association, intralist cuing, and single-item recognition tasks. The findings also show that when a related word is presented to cue the recall of a studied word, the cue activates it in an array of related words that distract and reduce the probability of its selection. The activation of the semantic network produces priming benefits during encoding and search costs during retrieval. In extralist cuing recall is a negative function of cue-to-distracter strength and a positive function of neighborhood density, cue-to-target strength, and target-to cue strength. We show how four measures derived from the network can be combined and used to predict memory performance. These measures play different roles in different tasks indicating that the contribution of the semantic network varies with the context provided by the task. We evaluate spreading activation and quantum-like entanglement explanations for the priming effect produced by neighborhood density.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Information retrieval (IR) by clinicians in the healthcare setting is critical for informing clinical decision-making. However, a large part of this information is in the form of free-text and inhibits clinical decision support and effective healthcare services. This makes meaningful use of clinical free-­text in electronic health records (EHRs) for patient care a difficult task. Within the context of IR, given a repository of free-­text clinical reports, one might want to retrieve and analyse data for patients who have a known clinical finding.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Due to the development of XML and other data models such as OWL and RDF, sharing data is an increasingly common task since these data models allow simple syntactic translation of data between applications. However, in order for data to be shared semantically, there must be a way to ensure that concepts are the same. One approach is to employ commonly usedschemas—called standard schemas —which help guarantee that syntactically identical objects have semantically similar meanings. As a result of the spread of data sharing, there has been widespread adoption of standard schemas in a broad range of disciplines and for a wide variety of applications within a very short period of time. However, standard schemas are still in their infancy and have not yet matured or been thoroughly evaluated. It is imperative that the data management research community takes a closer look at how well these standard schemas have fared in real-world applications to identify not only their advantages, but also the operational challenges that real users face. In this paper, we both examine the usability of standard schemas in a comparison that spans multiple disciplines, and describe our first step at resolving some of these issues in our Semantic Modeling System. We evaluate our Semantic Modeling System through a careful case study of the use of standard schemas in architecture, engineering, and construction, which we conducted with domain experts. We discuss how our Semantic Modeling System can help the broader problem and also discuss a number of challenges that still remain.