976 results for missing information
Abstract:
It is a big challenge to clearly identify the boundary between positive and negative streams. Several attempts have used negative feedback to solve this challenge; however, there are two issues in using negative relevance feedback to improve the effectiveness of information filtering. The first is how to select constructive negative samples in order to reduce the space of negative documents. The second is how to decide which noisy extracted features should be updated based on the selected negative samples. This paper proposes a pattern mining based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on RCV1, and substantial experiments show that the proposed approach achieves encouraging performance.
Abstract:
Over the years, people have often held the hypothesis that negative feedback should be very useful for largely improving the performance of information filtering systems; however, we have not obtained very effective models to support this hypothesis. This paper proposes an effective model that uses negative relevance feedback based on a pattern mining approach to improve extracted features. This study focuses on two main issues of using negative relevance feedback: the selection of constructive negative examples to reduce the space of negative examples; and the revision of existing features based on the selected negative examples. The former selects some offender documents, where offender documents are negative documents that are most likely to be classified in the positive group. The latter groups the extracted features into three categories (positive specific, general, and negative specific) so that their weights can be updated easily. An iterative algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance.
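The offender selection and three-way term grouping described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' algorithm: the abstract gives no scoring function, so the relative-frequency weights, the top-k cut-off, and the set-based category rules are all assumptions made for the example, as are the function names.

```python
from collections import Counter

def term_weights(positive_docs):
    """Assumed weighting: relative frequency of each term in the positive documents."""
    counts = Counter(t for doc in positive_docs for t in doc)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def select_offenders(negative_docs, weights, k):
    """Pick the k negative documents that look most 'positive' under the weights."""
    def score(doc):
        return sum(weights.get(t, 0.0) for t in set(doc))
    ranked = sorted(negative_docs, key=score, reverse=True)
    # Documents sharing no term with the positive set are never offenders.
    return [doc for doc in ranked[:k] if score(doc) > 0]

def categorise_terms(positive_docs, offenders):
    """Assumed rule: group terms by where they occur (positives, offenders, or both)."""
    pos_terms = {t for doc in positive_docs for t in doc}
    off_terms = {t for doc in offenders for t in doc}
    return {
        "positive_specific": pos_terms - off_terms,
        "general": pos_terms & off_terms,
        "negative_specific": off_terms - pos_terms,
    }

positive = [["oil", "price", "opec"], ["oil", "market"]]
negative = [["oil", "spill", "cleanup"], ["football", "match"], ["market", "crash"]]
offenders = select_offenders(negative, term_weights(positive), k=2)
categories = categorise_terms(positive, offenders)
```

With the categories in hand, a revising strategy could, for example, lower the weights of general terms and penalise negative specific terms; the abstract does not specify the actual update rules.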
Abstract:
This qualitative study views international students as information-using learners, through an information literacy lens. Focusing on the experiences of 25 international students at two Australian universities, the study investigates how international students use online information resources to learn, and identifies associated information literacy learning needs. An expanded critical incident approach provided the methodological framework for the study. Building on critical incident technique, this approach integrated a variety of concepts and research strategies. The investigation centred on real-life critical incidents experienced by the international students whilst using online resources for assignment purposes. Data collection involved semi-structured interviews and an observed online resource-using task. Inductive data analysis and interpretation enabled the creation of a multifaceted word picture of international students using online resources and a set of critical findings about their information literacy learning needs. The study’s key findings reveal:
• the complexity of the international students’ experience of using online information resources to learn, which involves an interplay of their interactions with online resources, their affective and reflective responses to using them, and the cultural and linguistic dimensions of their information use;
• the array of strengths as well as challenges that the international students experience in their information use and learning;
• an apparent information literacy imbalance between the international students’ more developed information skills and less developed critical and strategic approaches to using information;
• the need for enhanced information literacy education that responds to international students’ identified information literacy needs.
Responding to the findings, the study proposes an inclusive informed learning approach to support reflective information use and inclusive information literacy learning in culturally diverse higher education environments.
Abstract:
This paper investigates self-Googling through the monitoring of search engine activities of users and adds to the few quantitative studies on this topic already in existence. We explore this phenomenon by answering the following questions: To what extent is self-Googling visible in the usage of search engines; is any significant difference measurable between queries related to self-Googling and generic search queries; to what extent do self-Googling search requests match the selected personalised Web pages? To address these questions we explore the theory of narcissism in order to help define self-Googling and present the results from a 14-month online experiment using Google search engine usage data.
Abstract:
An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. Learning to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing semantic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches have had to deal with low-frequency pattern issues. The measures used by data mining techniques (for example, “support” and “confidence”) to learn the profile have turned out to be unsuitable for filtering. They can lead to a mismatch problem. This thesis uses rough set-based reasoning (term-based) and a pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. The system consists of two stages: topic filtering and pattern mining. The topic filtering stage is intended to minimize information overload by filtering out the most likely irrelevant information based on the user profiles. A novel user-profile learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory.
The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because a relatively small number of documents is left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR benchmarking processes, using the latest version of the Reuters dataset, namely the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system significantly outperforms the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including BM25 and Support Vector Machines (SVM), and the Pattern Taxonomy Model (PTM).
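The two-stage idea in this abstract (a cheap term-based filter first, then a precision-oriented pattern ranking on what survives) can be sketched as below. This is a minimal illustration under stated assumptions: the thesis derives its threshold from rough set decision theory and its ranking function from a pattern taxonomy, neither of which is given here, so the fixed threshold, the pattern weights, and the function names `topic_filter` and `pattern_rank` are hypothetical.

```python
def topic_filter(docs, term_weights, threshold):
    """Stage 1: drop documents whose term-based score falls below the threshold."""
    kept = []
    for doc in docs:
        score = sum(term_weights.get(t, 0.0) for t in set(doc))
        if score >= threshold:
            kept.append(doc)
    return kept

def pattern_rank(docs, patterns):
    """Stage 2: rank the surviving documents by the weights of the patterns they contain.

    `patterns` maps a frozenset of terms to an assumed weight; a pattern counts
    only if every one of its terms occurs in the document.
    """
    def score(doc):
        terms = set(doc)
        return sum(w for pattern, w in patterns.items() if pattern <= terms)
    return sorted(docs, key=score, reverse=True)

docs = [["oil", "spill"], ["football"], ["oil", "price", "rise"]]
weights = {"oil": 0.5, "price": 0.3}  # hypothetical profile weights
patterns = {frozenset({"oil", "price"}): 2.0, frozenset({"oil"}): 1.0}

survivors = topic_filter(docs, weights, threshold=0.5)
ranked = pattern_rank(survivors, patterns)
```

Because pattern matching runs only on the documents that pass the first stage, its cost scales with the size of the filtered set rather than the whole stream, which is the efficiency argument the abstract makes.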
Abstract:
In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", or NGMI, which is used to segment Chinese documents into n-character words or phrases, using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort that is required in preparing and maintaining the manually segmented Chinese text for training purposes, and manually maintaining ever expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words. The NGMI approach extends the approach to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.
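The 2-character mutual information baseline that this abstract mentions can be illustrated with a small sketch. It is not the NGMI method itself, whose formula the abstract does not give: the code below computes pointwise mutual information between adjacent characters from raw counts and merges neighbours whose score exceeds an assumed threshold. The tiny corpus, the threshold of 0.0, and the function names are all assumptions for the example.

```python
import math
from collections import Counter

def train_counts(corpus):
    """Count single characters and adjacent character pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for text in corpus:
        unigrams.update(text)
        bigrams.update(text[i:i + 2] for i in range(len(text) - 1))
    return unigrams, bigrams

def pmi(a, b, unigrams, bigrams):
    """Pointwise mutual information log(P(ab) / (P(a) P(b))); -inf for unseen pairs."""
    joint = bigrams[a + b]
    if joint == 0:
        return float("-inf")
    p_ab = joint / sum(bigrams.values())
    p_a = unigrams[a] / sum(unigrams.values())
    p_b = unigrams[b] / sum(unigrams.values())
    return math.log(p_ab / (p_a * p_b))

def segment(text, unigrams, bigrams, threshold=0.0):
    """Greedy segmentation: glue a character to the previous one if their PMI is high."""
    if not text:
        return []
    words = [text[0]]
    for prev, cur in zip(text, text[1:]):
        if pmi(prev, cur, unigrams, bigrams) > threshold:
            words[-1] += cur
        else:
            words.append(cur)
    return words

unigrams, bigrams = train_counts(["北京", "北京", "大学", "大学"])
words = segment("北京大学", unigrams, bigrams)
```

Note that this greedy rule can produce runs longer than two characters when consecutive pairs all score highly, but it still decides each boundary from a single pair; handling longer n-character words in a principled way is what the abstract says NGMI adds.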
Abstract:
The art of storytelling is one of the oldest forms of creative discourse. Apart from finding stories, the most important job in television is the construction of stories to have a broad audience appeal. This first-hand review of Missing Persons Unit, hereafter referred to as MPU, a prime time program on the Nine Network in Australia with immense audience appeal, is an original work by the executive producer (development and series producer Series One, executive producer Series Two and Three) based on an overview of two-and-a-half years of production on three series. Through a case study approach, this Masters project explores how story is constructed into a television format. The thesis comprises two parts: the creative component (weighted 50%) is demonstrated through two programs of MPU (one program for evaluation) and the academic component through a written exegesis (50%). This case study aims to demonstrate how observational hybrid series such as MPU can be managed to quick turn-around schedules with precise skill sets that cut across a number of traditional genre styles. With the advent of radio and then television, storytelling found a home and a series of labels called genres to help place them in a schedule for listeners and viewers to choose. Over recent years, with the advent of digital technology and the rush to collect the masses of content required to feed the growing television slate, storytelling has often been replaced by story gathering. Today even in factual series where a clear story construct is important, third party ‘quick fix’ specialists are hired to shape raw content shot by a field team, who never put their own work together and may never come into the edit suite during a project. This thesis explores the art of storytelling in fast turn-around television. 
In particular it explores the layer cake approach used in the production process of MPU, that enables producers of fast turn-around television to shepherd their own stories from field through to post-production. While each new hybrid series will require its own particular sets of skills, the exploration of the genesis of MPU will demonstrate the building blocks required to successfully produce this type of factual series. This study is also intended as a ‘road map’ for producers who wish to develop similar series.
Abstract:
The Queensland Injury Surveillance Unit (QISU) has been collecting and analysing injury data in Queensland since 1988. QISU data is collected from participating emergency departments (EDs) in urban, rural and remote areas of Queensland. Using this data, QISU produces several injury bulletins per year on selected topics, providing a picture of Queensland injury, and setting this in the context of relevant local, national and international research and policy. These bulletins are used by numerous government and non-government groups to inform injury prevention and practice throughout the state. QISU bulletins are also used by local and state media to inform the general public of injury risk and prevention strategies. In addition to producing the bulletins, QISU regularly responds to requests for information from a variety of sources. These requests often require additional analysis of QISU data to tailor the response to the needs of the end user. This edition of the bulletin reviews 5 years of information requests to QISU.
Abstract:
Since the industrial revolution, our world has experienced rapid and unplanned industrialization and urbanization. As a result, we have had to cope with serious environmental challenges. In this context, explaining how smart urban ecosystems can emerge becomes crucially important. Capacity building and community involvement have always been key issues in achieving sustainable development and enhancing urban ecosystems. By considering these, this paper looks at new approaches to increase public awareness of environmental decision making. This paper will discuss the role of Information and Communication Technologies (ICT), particularly Web-based Geographic Information Systems (Web-based GIS), as spatial decision support systems to aid public participatory environmental decision making. The paper also explores the potential and constraints of these web-based tools for collaborative decision making.
Abstract:
1. Ecological data sets often use clustered measurements or use repeated sampling in a longitudinal design. Choosing the correct covariance structure is an important step in the analysis of such data, as the covariance describes the degree of similarity among the repeated observations.
2. Three methods for choosing the covariance are: the Akaike information criterion (AIC), the quasi-information criterion (QIC), and the deviance information criterion (DIC). We compared the methods using a simulation study and using a data set that explored effects of forest fragmentation on avian species richness over 15 years.
3. The overall success was 80.6% for the AIC, 29.4% for the QIC and 81.6% for the DIC. For the forest fragmentation study the AIC and DIC selected the unstructured covariance, whereas the QIC selected the simpler autoregressive covariance. Graphical diagnostics suggested that the unstructured covariance was probably correct.
4. We recommend using DIC for selecting the correct covariance structure.
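As a reminder of how criterion-based selection works in the simplest of the three cases, the AIC is defined as 2k − 2 ln L, where k is the number of estimated parameters and L the maximised likelihood; the candidate with the lowest value is preferred. The sketch below applies that definition to three covariance structures. The log-likelihoods and parameter counts are invented for illustration and are not results from the study; QIC and DIC need model-specific quantities and are not shown.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def select_by_aic(candidates):
    """Return the name of the lowest-AIC candidate and all AIC values.

    `candidates` maps a model name to (maximised log-likelihood, parameter count).
    """
    scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fits, NOT values from the study.
fits = {
    "independence": (-520.0, 3),
    "autoregressive": (-505.0, 4),
    "unstructured": (-498.0, 13),
}
best, scores = select_by_aic(fits)
```

In these made-up numbers the unstructured covariance fits best in raw likelihood, but its extra parameters cost more than the gain, so the autoregressive structure has the lowest AIC.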
Abstract:
Recommender systems are among the effective tools for dealing with the information overload issue. Like explicit ratings and implicit rating behaviours such as purchases, click streams, and browsing history, tagging information reveals a user’s personal interests and preferences, which can be used to recommend personalized items to users. This paper explores how to utilize tagging information to make personalized recommendations. Based on the distinctive three-dimensional relationships among users, tags and items, a new user profiling and similarity measure method is proposed. The experiments suggest that the proposed approach performs better than traditional collaborative filtering recommender systems that use only rating data.
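One simple way to turn tagging data into user profiles and a similarity measure is sketched below. The abstract does not describe the proposed method, so this is only a common baseline under stated assumptions: each user is profiled by tag-usage counts drawn from (user, tag, item) triples, and users are compared by cosine similarity. The sample data and function names are hypothetical.

```python
import math
from collections import Counter

def tag_profile(assignments, user):
    """Profile a user as a count of the tags they applied, from (user, tag, item) triples."""
    return Counter(tag for u, tag, _item in assignments if u == user)

def cosine(p, q):
    """Cosine similarity between two tag-count profiles; 0.0 if either is empty."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

assignments = [
    ("alice", "python", "book1"),
    ("alice", "ml", "book2"),
    ("bob", "python", "book3"),
    ("bob", "ml", "book2"),
    ("carol", "cooking", "book4"),
]
alice = tag_profile(assignments, "alice")
bob = tag_profile(assignments, "bob")
carol = tag_profile(assignments, "carol")
```

Here alice and bob use the same tags even though they mostly tagged different items, so a tag-based profile finds them similar where a purely item-based comparison would see little overlap. This sketch ignores the item dimension, which the proposed three-dimensional approach is said to exploit.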
Abstract:
This chapter considers how open content licences of copyright-protected materials, specifically Creative Commons (CC) licences, can be used by governments as a simple and effective mechanism to enable reuse of their public sector information (PSI), particularly where materials are made available in digital form online or distributed on disk.
Abstract:
Information System (IS) success may be the most arguable and important dependent variable in the IS field. The purpose of the present study is to address IS success by empirically assessing and comparing DeLone and McLean’s (1992) and Gable et al.’s (2008) models of IS success in the context of Australian universities. The two models have some commonalities and several important distinctions. Both models integrate and interrelate multiple dimensions of IS success. Hence, it would be useful to compare the models to see which is superior, as it is not clear how IS researchers should respond to this controversy.
Abstract:
To assess the effects of information interventions which orient patients and their carers/family to a cancer care facility and the services available in the facility.