852 results for Information seeking behavior
Abstract:
Clearly identifying the boundary between positive and negative streams is a major challenge. Several attempts have used negative feedback to address this challenge; however, there are two issues in using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern mining based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on RCV1, and substantial experiments show that the proposed approach achieves encouraging performance.
Abstract:
Over the years, people have often held the hypothesis that negative feedback should be very useful for improving the performance of information filtering systems; however, very effective models to support this hypothesis have not yet been obtained. This paper proposes an effective model that uses negative relevance feedback, based on a pattern mining approach, to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three categories (positive specific, general, and negative specific) so that their weights can be updated easily. An iterative algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance.
Abstract:
This qualitative study views international students as information-using learners, through an information literacy lens. Focusing on the experiences of 25 international students at two Australian universities, the study investigates how international students use online information resources to learn, and identifies associated information literacy learning needs. An expanded critical incident approach provided the methodological framework for the study. Building on critical incident technique, this approach integrated a variety of concepts and research strategies. The investigation centred on real-life critical incidents experienced by the international students whilst using online resources for assignment purposes. Data collection involved semi-structured interviews and an observed online resource-using task. Inductive data analysis and interpretation enabled the creation of a multifaceted word picture of international students using online resources and a set of critical findings about their information literacy learning needs. The study’s key findings reveal:
• the complexity of the international students’ experience of using online information resources to learn, which involves an interplay of their interactions with online resources, their affective and reflective responses to using them, and the cultural and linguistic dimensions of their information use;
• the array of strengths as well as challenges that the international students experience in their information use and learning;
• an apparent information literacy imbalance between the international students’ more developed information skills and less developed critical and strategic approaches to using information;
• the need for enhanced information literacy education that responds to international students’ identified information literacy needs.
Responding to the findings, the study proposes an inclusive informed learning approach to support reflective information use and inclusive information literacy learning in culturally diverse higher education environments.
Abstract:
This paper investigates self-Googling by monitoring users' search engine activity, and adds to the few quantitative studies of this topic. We explore the phenomenon by answering the following questions: To what extent is self-Googling visible in the usage of search engines? Is any significant difference measurable between queries related to self-Googling and generic search queries? To what extent do self-Googling search requests match the selected personalised Web pages? To address these questions we draw on the theory of narcissism to help define self-Googling, and present the results of a 14-month online experiment using Google search engine usage data.
Abstract:
An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. Learning to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing semantic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. These approaches also have to deal with low-frequency pattern issues. The measures used by data mining techniques (for example, “support” and “confidence”) to learn the profile have turned out to be unsuitable for filtering and can lead to a mismatch problem. This thesis uses rough set-based reasoning (term-based) and a pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. The system consists of two stages: topic filtering and pattern mining. The topic filtering stage is intended to minimize information overload by filtering out the most likely irrelevant information based on the user profiles. A novel user-profile learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory.
The second stage (pattern mining) aims at solving the problem of information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents are assigned higher scores by the ranking function. Because a relatively small number of documents is left after the first stage, the computational cost is markedly reduced; at the same time, pattern discovery yields more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on well-known IR benchmarking processes, using the latest version of the Reuters dataset, namely the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system significantly outperforms the other IF systems, such as the traditional Rocchio IF model, state-of-the-art term-based models including BM25 and Support Vector Machines (SVM), and the Pattern Taxonomy Model (PTM).
Abstract:
In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", or NGMI, which is used to segment Chinese documents into n-character words or phrases, using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort required to prepare and maintain manually segmented Chinese text for training purposes, and to manually maintain ever-expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words. The NGMI approach extends this to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.
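As a rough illustration of the 2-character baseline mentioned above (not the NGMI method itself, whose details the abstract does not give), the sketch below scores adjacent character pairs by pointwise mutual information and greedily merges pairs that clear a threshold. The toy corpus, the `threshold` value, and the function names `pmi_table` and `segment` are assumptions made for this example.

```python
import math
from collections import Counter


def pmi_table(corpus):
    """Pointwise mutual information for every adjacent character pair in the corpus."""
    chars = Counter()
    pairs = Counter()
    for sentence in corpus:
        chars.update(sentence)
        pairs.update(sentence[i:i + 2] for i in range(len(sentence) - 1))
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())
    return {
        pair: math.log(
            (count / n_pairs)
            / ((chars[pair[0]] / n_chars) * (chars[pair[1]] / n_chars))
        )
        for pair, count in pairs.items()
    }


def segment(text, pmi, threshold):
    """Greedy left-to-right pass: merge two characters when their PMI clears the threshold."""
    words, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pmi.get(pair, float("-inf")) >= threshold:
            words.append(pair)
            i += 2
        else:
            words.append(text[i])
            i += 1
    return words


# Hypothetical three-sentence corpus, far too small for real statistics.
toy_corpus = ["我爱北京", "北京很大", "我很爱你"]
pmi = pmi_table(toy_corpus)
print(segment("我爱北京", pmi, threshold=2.0))  # ['我', '爱', '北京']
```

On a corpus this small, rare characters inflate PMI (here 很大 scores as high as 北京), which is one reason real systems estimate the statistics from a large collection such as Wikipedia.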
Abstract:
The Queensland Injury Surveillance Unit (QISU) has been collecting and analysing injury data in Queensland since 1988. QISU data is collected from participating emergency departments (EDs) in urban, rural and remote areas of Queensland. Using this data, QISU produces several injury bulletins per year on selected topics, providing a picture of Queensland injury, and setting this in the context of relevant local, national and international research and policy. These bulletins are used by numerous government and non-government groups to inform injury prevention and practice throughout the state. QISU bulletins are also used by local and state media to inform the general public of injury risk and prevention strategies. In addition to producing the bulletins, QISU regularly responds to requests for information from a variety of sources. These requests often require additional analysis of QISU data to tailor the response to the needs of the end user. This edition of the bulletin reviews 5 years of information requests to QISU.
Abstract:
Since the industrial revolution, our world has experienced rapid and unplanned industrialization and urbanization. As a result, we have had to cope with serious environmental challenges. In this context, an explanation of how smart urban ecosystems can emerge gains crucial importance. Capacity building and community involvement have always been key issues in achieving sustainable development and enhancing urban ecosystems. With these in mind, this paper looks at new approaches to increasing public awareness of environmental decision making. It discusses the role of Information and Communication Technologies (ICT), particularly Web-based Geographic Information Systems (Web-based GIS), as spatial decision support systems to aid public participatory environmental decision making. The paper also explores the potential and constraints of these web-based tools for collaborative decision making.
Abstract:
Process modeling grammars are used by analysts to describe information systems domains in terms of the business operations an organization is conducting. While prior research has examined the factors that lead to continued usage behavior, little knowledge has been established as to what extent characteristics of the users of process modeling grammars inform usage behavior. In this study, a theoretical model is advanced that incorporates determinants of continued usage behavior as well as key antecedent individual difference factors of the grammar users, such as modeling experience, modeling background and perceived grammar familiarity. Findings from a global survey of 529 grammar users support the hypothesized relationships of the model. The study offers three central contributions. First, it provides a validated theoretical model of post-adoptive modeling grammar usage intentions. Second, it discusses the effects of individual difference factors of grammar users in the context of modeling grammar usage. Third, it provides implications for research and practice.
Abstract:
1. Ecological data sets often use clustered measurements or use repeated sampling in a longitudinal design. Choosing the correct covariance structure is an important step in the analysis of such data, as the covariance describes the degree of similarity among the repeated observations. 2. Three methods for choosing the covariance are: the Akaike information criterion (AIC), the quasi-information criterion (QIC), and the deviance information criterion (DIC). We compared the methods using a simulation study and using a data set that explored effects of forest fragmentation on avian species richness over 15 years. 3. The overall success was 80.6% for the AIC, 29.4% for the QIC and 81.6% for the DIC. For the forest fragmentation study the AIC and DIC selected the unstructured covariance, whereas the QIC selected the simpler autoregressive covariance. Graphical diagnostics suggested that the unstructured covariance was probably correct. 4. We recommend using DIC for selecting the correct covariance structure.
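To make the selection step concrete, here is a minimal sketch of choosing a covariance structure by AIC, using the standard definition AIC = 2k - 2 ln L. The log-likelihood values, the number of time points, and the helper names are invented purely for illustration and are not taken from the study; the parameter counts assume a common variance for the structured options.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def covariance_param_count(structure, n_times):
    """Number of free covariance parameters for n_times repeated measurements."""
    if structure == "unstructured":
        return n_times * (n_times + 1) // 2  # every variance and covariance
    if structure in ("ar1", "exchangeable"):
        return 2  # one variance plus one correlation
    if structure == "independent":
        return 1  # one variance
    raise ValueError(f"unknown structure: {structure}")


def select_by_aic(fits, n_times, n_fixed):
    """fits maps a structure name to its maximised log-likelihood."""
    scores = {
        name: aic(log_lik, n_fixed + covariance_param_count(name, n_times))
        for name, log_lik in fits.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical log-likelihoods for three candidate structures.
fits = {"independent": -250.0, "ar1": -231.0, "unstructured": -222.0}
best, scores = select_by_aic(fits, n_times=4, n_fixed=3)
print(best, scores)
```

With these made-up numbers the unstructured covariance wins narrowly (AIC 470 against 472 for the autoregressive option), which shows how the penalty term trades fit against the extra parameters.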
Abstract:
Purpose: Physical activity has become a focus of cancer recovery research as it has the potential to reduce treatment-related burden and optimize health-related quality of life (HRQoL). However, the potential for physical activity to influence recovery may be age-dependent. This paper describes physical activity levels and HRQoL among younger and older women after surgery for breast cancer and explores the correlates of physical inactivity. Methods: A population-based sample of breast cancer patients diagnosed in South-East Queensland, Australia (n=287), was assessed once every three months, from 6 to 18 months post-surgery. The Functional Assessment of Cancer Therapy-Breast questionnaire (FACTB+4) and items from the Behavioral Risk Factor Surveillance System (BRFSS) questionnaire were used to measure HRQoL and physical activity, respectively. Physical activity was assigned metabolic equivalent task (MET) values, and categorized as <3, 3 to 17.9 and 18+ MET-hours/week. Descriptive statistics, generalized linear models with age stratification (<50 years versus 50+ years), and logistic regression were used for analyses (p=0.05, two-tailed). Results: Younger women who engaged in 3 or more MET-hours/week of physical activity reported a higher HRQoL at 18 months compared to their more sedentary counterparts (p<0.05). Older women reported similar HRQoL irrespective of activity level and consistently reported clinically higher HRQoL than younger women. Increasing age, being overweight or obese, and restricting use of the treated side at six months post-surgery increased the likelihood of sedentary behavior (OR>3, p<0.05). Conclusions: Age influences the potential to observe HRQoL benefits related to physical activity participation. These results also provide relevant information for the design of exercise interventions for breast cancer survivors and highlight that some groups of women are at greater risk of long-term sedentary behavior.
Abstract:
Intuitively, any 'bag of words' approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies into more useful statistics. This is done in three steps. The term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation than just the document's initial distribution. A secondary contribution is to investigate the practical application of this representation when the queries become increasingly verbose. In the experiments (based on Lemur's search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par with or better than more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.
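The idea of replacing an initial term distribution with the stationary distribution of a co-occurrence chain can be sketched as follows. This is a minimal illustration and not the paper's construction: the sliding-window co-occurrence counts, the additive `smoothing` (used here simply to make every transition positive and so guarantee an ergodic chain), and the function names are all assumptions.

```python
def transition_matrix(tokens, window=2, smoothing=0.01):
    """Row-stochastic matrix built from co-occurrence counts in a sliding window."""
    vocab = sorted(set(tokens))
    index = {term: i for i, term in enumerate(vocab)}
    n = len(vocab)
    # Assumed smoothing: every entry starts positive, so the chain is ergodic.
    counts = [[smoothing] * n for _ in range(n)]
    for i, term in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[index[term]][index[tokens[j]]] += 1.0
    matrix = [[c / sum(row) for c in row] for row in counts]
    return vocab, matrix


def stationary_distribution(matrix, tol=1e-12, max_iter=10_000):
    """Power iteration: repeatedly apply pi <- pi P until it stops changing."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical verbose "query" used only to exercise the code.
query = "the cat sat on the mat the cat ran".split()
vocab, P = transition_matrix(query)
pi = stationary_distribution(P)
for term, weight in sorted(zip(vocab, pi), key=lambda pair: -pair[1]):
    print(f"{term}\t{weight:.3f}")
```

Because the chain is ergodic, the result does not depend on the uniform starting vector; any other starting distribution converges to the same weights.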
Abstract:
Collaborative tagging can help users organize, share and retrieve information quickly and easily. Because collaborative tagging information carries important information about users' personal preferences, it can be used to recommend personalized items to users. This paper proposes a novel tag-based collaborative filtering approach for recommending personalized items to users of online communities that are equipped with tagging facilities. Based on the distinctive three-dimensional relationships among users, tags and items, a new similarity measure is proposed to generate the neighborhood of users with similar tagging behavior instead of similar implicit ratings. The promising experimental results show that, by using the tagging information, the proposed approach outperforms the standard user-based and item-based collaborative filtering approaches.
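A neighborhood built from tagging behavior rather than ratings might look like the sketch below. The abstract does not define the paper's similarity measure, so the average of two Jaccard overlaps (tags used and items tagged) is a stand-in chosen for this example; the data and all function names are hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap of two sets, 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_similarity(assignments_u, assignments_v):
    """Assumed measure: mean of tag overlap and item overlap between two users."""
    tags_u = {tag for tag, _ in assignments_u}
    tags_v = {tag for tag, _ in assignments_v}
    items_u = {item for _, item in assignments_u}
    items_v = {item for _, item in assignments_v}
    return 0.5 * jaccard(tags_u, tags_v) + 0.5 * jaccard(items_u, items_v)


def recommend(target, taggings, k=2, top_n=3):
    """Score unseen items by the similarity of the k nearest neighbours who tagged them."""
    neighbours = sorted(
        (
            (user_similarity(taggings[target], assignments), other)
            for other, assignments in taggings.items()
            if other != target
        ),
        reverse=True,
    )[:k]
    seen = {item for _, item in taggings[target]}
    scores = defaultdict(float)
    for sim, other in neighbours:
        if sim <= 0:
            continue  # ignore users with no tagging behaviour in common
        for item in {item for _, item in taggings[other]} - seen:
            scores[item] += sim
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Hypothetical (tag, item) assignments per user.
taggings = {
    "alice": {("python", "book1"), ("ml", "book2")},
    "bob": {("python", "book1"), ("ml", "book3")},
    "carol": {("cooking", "book4")},
}
print(recommend("alice", taggings))  # ['book3']
```

Here bob shares both tags and one item with alice, so his other item is recommended, while carol has nothing in common and contributes nothing.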
Abstract:
Recommender systems are one of the effective tools for dealing with the information overload problem. Like explicit ratings and implicit rating behaviours such as purchases, click streams, and browsing history, tagging information reflects users' important personal interests and preferences, which can be used to recommend personalized items to users. This paper explores how to use tagging information to make personalized recommendations. Based on the distinctive three-dimensional relationships among users, tags and items, a new user profiling and similarity measure method is proposed. The experiments suggest that the proposed approach performs better than traditional collaborative filtering recommender systems that use only rating data.