868 results for Focussed retrieval
Abstract:
It has long been hypothesized that negative feedback should substantially improve the performance of information filtering systems; however, effective models that support this hypothesis have been lacking. This paper proposes an effective model that uses negative relevance feedback, based on a pattern mining approach, to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three groups (the positive specific category, the general category and the negative specific category) so that their weights can be updated easily. An iterative algorithm is also proposed to implement this approach on RCV1 data collections, and substantial experiments show that the proposed approach achieves encouraging performance.
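The idea of offender selection described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a document is a list of terms and that its positive-class score is the sum of the weights of its terms; the function name `select_offenders`, the scoring rule and the parameter `k` are hypothetical and are not taken from the paper.

```python
def select_offenders(neg_docs, term_weights, k):
    """Return up to k negative documents that look most like positive ones.

    neg_docs     -- list of negative documents, each a list of terms
    term_weights -- dict mapping a term to its weight in the positive profile
                    (assumed scoring rule: sum of the weights of a document's terms)
    k            -- maximum number of offenders to keep (hypothetical parameter)
    """
    def score(doc):
        return sum(term_weights.get(term, 0.0) for term in doc)

    # Rank negatives by how strongly the positive profile matches them.
    ranked = sorted(neg_docs, key=score, reverse=True)
    # Keep only documents the profile actually matches (score above zero).
    return [doc for doc in ranked[:k] if score(doc) > 0]


if __name__ == "__main__":
    weights = {"oil": 2.0, "price": 1.0}
    negatives = [["oil", "price"], ["sport"], ["price"]]
    print(select_offenders(negatives, weights, k=2))
```

Restricting feature revision to these high-scoring negatives is what reduces the space of negative examples; how the features are then reweighted is specific to the paper and is not reproduced here.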
Abstract:
This paper investigates self-Googling through the monitoring of search engine activities of users and adds to the few quantitative studies on this topic already in existence. We explore this phenomenon by answering the following questions: To what extent is self-Googling visible in the usage of search engines? Is any significant difference measurable between queries related to self-Googling and generic search queries? To what extent do self-Googling search requests match the selected personalised Web pages? To address these questions we explore the theory of narcissism in order to help define self-Googling and present the results from a 14-month online experiment using Google search engine usage data.
Abstract:
An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. Learning to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing semantic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. These approaches also have to deal with low-frequency pattern issues. The measures used by data mining techniques (for example, “support” and “confidence”) to learn the profile have turned out to be unsuitable for filtering and can lead to a mismatch problem. This thesis uses rough set-based reasoning (term-based) and a pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. The system consists of two stages: topic filtering and pattern mining. The topic filtering stage is intended to minimize information overload by filtering out the most likely irrelevant information based on the user profiles. A novel user-profile learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory.
The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents are assigned higher scores by the ranking function. Because a relatively small number of documents is left after the first stage, the computational cost is markedly reduced; at the same time, pattern discovery yields more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR benchmarking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system significantly outperforms the other IF systems, such as the traditional Rocchio IF model and the state-of-the-art term-based models, including BM25, Support Vector Machines (SVM), and the Pattern Taxonomy Model (PTM).
Abstract:
Random Indexing K-tree is the combination of two algorithms suited for large scale document clustering.
Abstract:
The evolution of organisms that cause healthcare acquired infections (HAI) puts extra stress on hospitals already struggling with rising costs and demands for greater productivity and cost containment. Infection control can save scarce resources, lives, and possibly a facility’s reputation, but statistics and epidemiology are not always sufficient to make the case for the added expense. Economics and Preventing Healthcare Acquired Infection presents a rigorous analytic framework for dealing with this increasingly serious problem.
Engagingly written for the economics non-specialist, and brimming with tables, charts, and case examples, the book lays out the concepts of economic analysis in clear, real-world terms so that infection control professionals or infection preventionists will gain competence in developing analyses of their own, and be confident in the arguments they present to decision-makers. The authors:
- Ground the reader in the basic principles and language of economics.
- Explain the role of health economists in general and in terms of infection prevention and control.
- Introduce the concept of economic appraisal, showing how to frame the problem, evaluate and use data, and account for uncertainty.
- Review methods of estimating and interpreting the costs and health benefits of HAI control programs and prevention methods.
- Walk the reader through a published economic appraisal of an infection reduction program.
- Identify current and emerging applications of economics in infection control.
Economics and Preventing Healthcare Acquired Infection is a unique resource for practitioners and researchers in infection prevention, control and healthcare economics. It offers a valuable alternate perspective for professionals in health services research, healthcare epidemiology, healthcare management, and hospital administration.
Written for: Professionals and researchers in infection control, health services research, hospital epidemiology, healthcare economics, healthcare management, hospital administration; Association of Professionals in Infection Control (APIC), Society for Healthcare Epidemiologists of America (SHEA)
Abstract:
We argue that web service discovery technology should help users navigate a complex problem space by suggesting services that they may not be able to formulate themselves because they lack the epistemic resources to do so. Free text documents in service environments provide an untapped source of information for augmenting the epistemic state of the user and hence their ability to search effectively for services. A quantitative approach to semantic knowledge representation is adopted in the form of semantic space models computed from these free text documents. Knowledge of the user’s agenda is promoted by associational inferences computed from the semantic space. The inferences are suggestive and aim to promote human abductive reasoning to guide the user from fuzzy search goals into a better understanding of the problem space surrounding the given agenda. Experimental results are discussed based on a complex and realistic planning activity.
Abstract:
Intuitively, any `bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. First, the term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document's initial distribution. A secondary contribution is to investigate the practical application of this representation when queries become increasingly verbose. In the experiments (based on Lemur's search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than those of more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
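To make the notion of a stationary distribution concrete, the following is a minimal sketch of computing it by power iteration for a small row-stochastic transition matrix. It is a generic illustration, not the paper's implementation: the helper names `row_normalize` and `stationary_distribution`, the tolerance, and the toy co-occurrence counts are all assumptions, and the sketch presumes the chain is ergodic so that the iteration converges to a unique limit.

```python
def row_normalize(counts):
    """Turn a square matrix of non-negative co-occurrence counts into a
    row-stochastic transition matrix (each row sums to 1)."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("every term needs at least one co-occurrence")
        matrix.append([value / total for value in row])
    return matrix


def stationary_distribution(P, start=None, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi * P by repeated multiplication.

    P     -- row-stochastic transition matrix as a list of lists
    start -- optional initial distribution; uniform if omitted
    """
    n = len(P)
    pi = list(start) if start is not None else [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    # Toy co-occurrence counts for a two-term vocabulary (made-up numbers).
    P = row_normalize([[9, 1], [5, 5]])
    print(stationary_distribution(P))
    print(stationary_distribution(P, start=[0.0, 1.0]))
```

For an ergodic chain both calls approach the same vector regardless of the starting distribution, which is the property the abstract relies on when it replaces the initial distribution with the stationary one.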
Abstract:
XML document clustering is essential for many document handling applications such as information storage, retrieval, integration and transformation. An XML clustering algorithm should process both the structural and the content information of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. This paper introduces a novel approach that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The proposed method reduces the high dimensionality of input data by using only the structure-constrained content. The empirical analysis reveals that the proposed method can effectively cluster even very large XML datasets and outperform other existing methods.
Abstract:
One of the definitions of the term myth is ‘an unproved or false collective belief that is used to justify a social institution’ (see http://dictionary.reference.com/browse/myth). Before we are criticized for suggesting such an irreverent thought might apply to tourism academia, readers must recognize that organizations and industries often operate using shared collective myths (see Meyer and Rowan 1977). Institutionalized rules and processes function as myths that provide legitimacy. The question of interest in this paper is not in the context of the quality of tourism academic research output, which is addressed by other papers in this research probe section. Rather, of importance is enhancing understanding of the extent to which our collective knowledge, legitimized through publishing in peer reviewed academic publications, is proving of value to industry stakeholders, an axiom that appears to be largely unquestioned and unproven.
Abstract:
Recommender systems are one of the effective tools for dealing with the information overload issue. Like explicit ratings and implicit rating behaviours such as purchases, click streams and browsing history, tagging information reflects a user’s personal interests and preferences, which can be used to recommend personalized items to users. This paper explores how to utilize tagging information to make personalized recommendations. Based on the distinctive three-dimensional relationships among users, tags and items, a new user profiling and similarity measure method is proposed. The experiments suggest that the proposed approach is better than traditional collaborative filtering recommender systems that use only rating data.
Abstract:
With the size and state of the Internet today, a good quality approach to organizing this mass of information is of great importance. Clustering web pages into groups of similar documents is one approach, but relies heavily on good feature extraction and document representation as well as a good clustering approach and algorithm. Due to the changing nature of the Internet, resulting in a dynamic dataset, an incremental approach is preferred. In this work we propose an enhanced incremental clustering approach to develop a better clustering algorithm that can help to better organize the information available on the Internet in an incremental fashion. Experiments show that the enhanced algorithm outperforms the original histogram based algorithm by up to 7.5%.
Abstract:
Association rule mining is one technique that is widely used when querying databases, especially those that are transactional, in order to obtain useful associations or correlations among sets of items. Much work has been done focusing on efficiency, effectiveness and redundancy. There has also been a focus on the quality of rules from single level datasets, with many interestingness measures proposed. However, with multi-level datasets now being common, there is a lack of interestingness measures developed for multi-level and cross-level rules. Single level measures do not take into account the hierarchy found in a multi-level dataset. This leaves the Support-Confidence approach, which does not consider the hierarchy anyway and has other drawbacks, as one of the few measures available. In this paper we propose two approaches which measure multi-level association rules to help evaluate their interestingness. These measures of diversity and peculiarity can be used to help identify those rules from multi-level datasets that are potentially useful.
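For readers unfamiliar with the Support-Confidence approach mentioned above, here is a minimal sketch of the two standard measures on a flat (single-level) transaction list. The function names and the toy transactions are illustrative assumptions; the diversity and peculiarity measures proposed in the paper are not reproduced here.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent:
    support(antecedent and consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


if __name__ == "__main__":
    # Made-up market-basket data for illustration only.
    baskets = [
        {"milk", "bread"},
        {"milk"},
        {"bread", "butter"},
        {"milk", "bread", "butter"},
    ]
    print(support(baskets, {"milk", "bread"}))
    print(confidence(baskets, {"milk"}, {"bread"}))
```

Both measures treat items as flat labels, which is why, as the abstract notes, they cannot reflect the hierarchy of a multi-level dataset.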
Abstract:
Recommender systems are widely used online to help users find other products, items, etc., that they may be interested in, based on what is known about that user in their profile. Often, however, user profiles are short on information, and when there is not sufficient knowledge about a user it is difficult for a recommender system to make quality recommendations. This problem is often referred to as the cold-start problem. Here we investigate whether association rules can be used as a source of information to expand a user profile and thus avoid this problem, leading to improved recommendations to users. Our pilot study shows that it is indeed possible to use association rules to improve the performance of a recommender system. We believe this can lead to further work in utilising appropriate association rules to lessen the impact of the cold-start problem.
Abstract:
This thesis by publication contributes to our knowledge of psychological factors underlying a modern day phenomenon, young people’s mobile phone behaviour. Specifically, the thesis reports a PhD program of research which adopted a social psychological approach to explore mobile phone behaviour among young Australians aged between 15 and 24 years. A particular focus of the research program was to explore both the cognitive and behavioural aspects of young people’s mobile phone behaviour which for the purposes of this thesis is defined as mobile phone involvement. The research program comprised three separate stages which were developmental in nature, in that, the findings of each stage of the research program informed the next. The overarching goal of the program of research was to improve our understanding of the psychosocial factors influencing young people’s mobile phone behaviour. To achieve this overall goal, there were a number of aims to the research program which reflect the developmental nature of this thesis. Given the limited research into the mobile phone behaviour in Australia, the first two aims of the research program were to explore patterns of mobile phone behaviour among Australian youth and explore the social psychological factors relating to their mobile phone behaviour. Following this exploration, the research program sought to develop a measure which captures the cognitive and behavioural aspects of mobile phone behaviour. Finally, the research program aimed to examine and differentiate the psychosocial predictors of young people’s frequency of mobile phone use and their level of involvement with their mobile phone. Both qualitative and quantitative methodologies were used throughout the program of research. Five papers prepared during the three stages of the research program form the bulk of this thesis. The first stage of the research program was a qualitative investigation of young people’s mobile phone behaviour. 
Thirty-two young Australians participated in a series of focus groups in which they discussed their mobile phone behaviour. Thematic data analysis explored patterns of mobile phone behaviour among young people, developed an understanding of psychological factors influencing their use of mobile phones, and identified that symptoms of addiction were emerging in young people’s mobile phone behaviour. Two papers (Papers 1 and 2) emanated from this first stage of the research program. Paper 1 explored patterns of mobile phone behaviour and revealed that mobile phones were perceived as being highly beneficial to young people’s lives, with the ability to remain in constant contact with others being particularly valued. The paper also identified that symptoms of behavioural addiction including withdrawal, cognitive and behavioural salience, and loss of control, emerged in participants’ descriptions of their mobile phone behaviour. Paper 2 explored how young people’s need to belong and their social identity (two constructs previously unexplored in the context of mobile phone behaviour) related to their mobile phone behaviour. It was revealed that young people use their mobile phones to facilitate social attachments. Additionally, friends and peers influenced young people’s mobile phone behaviour; for example, their choice of mobile phone carrier and their most frequent type of mobile phone use. These papers laid the foundation for the further investigation of addictive patterns of behaviour and the role of social psychological factors on young people’s mobile behaviour throughout the research program. Stage 2 of the research program focussed on developing a new parsimonious measure of mobile phone behaviour, the Mobile Phone Involvement Questionnaire (MPIQ), which captured the cognitive and behavioural aspects of mobile phone use. Additionally, the stage included a preliminary exploration of factors influencing young people’s mobile phone behaviour. 
Participants (N = 946) completed a questionnaire which included a pool of items assessing symptoms of behavioural addiction, the uses and gratifications relating to mobile phone use, and self-identity and validation from others in the context of mobile phone behaviour. Two papers (Papers 3 & 4) emanated from the second stage of the research program. Paper 3 provided an important link between the qualitative and quantitative components of the research program. Qualitative data from Stage 1 indicated the reasons young people use their mobile phones and identified addictive characteristics present in young people’s mobile phone behaviour. Results of the quantitative study conducted in Stage 2 of the research program revealed the uses and gratifications relating to young people’s mobile phone behaviour and the effect of these gratifications on young people’s frequency of mobile phone use and three indicators of addiction, withdrawal, salience, and loss of control. Three major uses and gratifications: self (such as feeling good or as a fashion item), social (such as contacting friends), and security (such as use in an emergency) were found to underlie much of young people’s mobile phone behaviour. Self and social gratifications predicted young people’s frequency of mobile phone use and the three indicators of addiction but security gratifications did not. These results provided an important foundation for the inclusion of more specific psychosocial predictors in the later stages of the research program. Paper 4 reported the development of the mobile phone involvement questionnaire and a preliminary exploration of the effect of self-identity and validation from others on young people’s mobile phone behaviour. The MPIQ assessed a unitary construct and was a reliable measure amongst this cohort. Results found that self-identity influenced the frequency of young people’s use whereas self-identity and validation from others influenced their level of mobile phone involvement. 
These findings provided an important indication that, in addition to self factors, other people have a strong influence on young people’s involvement with their mobile phone and that mobile phone involvement is conceptually different to frequency of mobile phone use. Stage 3 of the research program empirically examined the psychosocial predictors of young people’s mobile behaviour and one paper, Paper 5, emanated from this stage. Young people (N = 292) from throughout Australia completed an online survey assessing the role of self-identity, ingroup norm, the need to belong, and self-esteem on their frequency of mobile phone use and their mobile phone involvement. Self-identity was the only psychosocial predictor of young people’s frequency of mobile phone use. In contrast, self-identity, ingroup norm, and need to belong all influenced young people’s level of involvement with their mobile phone. Additionally, the effect of self-esteem on young people’s mobile phone involvement was mediated by their need to belong. These results indicate that young people who perceive their mobile phone to be an integral part of their self-identity, who perceive that mobile phone use is common amongst friends and peers, and who have a strong need for attachment to others, in some cases driven by a desire to enhance their self-esteem, are most likely to become highly involved with their mobile phones. Overall, this PhD program of research has provided an important contribution to our understanding of young Australians’ mobile phone behaviour. Results of the program have broadened our knowledge of factors influencing mobile phone behaviour beyond the approaches used in previous research. The use of various social psychological theories combined with a behavioural addiction framework provided a novel examination of young people’s mobile behaviour.
In particular, the development of a new measure of mobile phone behaviour in the research program facilitated the differentiation of the psychosocial factors influencing frequency of young people’s mobile phone behaviour and their level of involvement with their mobile phone. Results of the research program indicate the important role that mobile phone behaviour plays in young people’s social development and also signal the characteristics of those people who may become highly involved with their mobile phone. Future research could build on this thesis by exploring whether mobile phones are affecting traditional social psychological processes and whether the results in this research program are generalisable to other cohorts and other communication technologies.
Abstract:
Purpose – The purpose of this paper is to explore the similarities and differences between jargon used to describe future-focussed commercial building product. This is not so much an exercise in semantics as an attempt to demonstrate that responses to challenges facing the construction and property sectors may have more to do with language than is generally appreciated. Design/methodology/approach – This is a conceptual analysis which draws upon relevant literature. Findings – Social responsibility and sustainability are often held to be much the same thing, with each term presupposing the existence of the other. Clearly, however, there are instances where sustainable commercial property investment (SCPI) may not be particularly socially responsible, despite being understood as an environmentally friendly initiative. By contrast, socially responsible assets, at least in theory, should always be more sustainable than mainstream non-ethically based investment. Put simply, the expression of social responsibility in the built environment may evoke, and thereby deliver, a more sustainable product, as defined by wider socially inclusive parameters. Practical implications – The findings show that promoting an ethic of social responsibility may well result in more SCPI. Thus, the further articulation and celebration of social responsibility concepts may well help to further advance a sustainable property investment agenda, which is arguably more concerned about demonstrability of efficiency than wider public good outcomes. Originality/value – The idea that jargon affects outcomes is not new. However, this idea has rarely, if ever, been applied to the distinctions between social responsibility and sustainability. Even a moderate re-emphasis on social responsibility in preference to sustainability may well provide significant future benefits with respect to the investment, building and refurbishment of commercial property.