778 results for akaike information criterion
Abstract:
Term-based approaches can extract many features from text documents, but most include noise. Many popular text-mining strategies have been adapted to reduce noisy information in extracted features; however, text-mining techniques suffer from the low-frequency problem. The key issue is how to discover relevance features in text documents to fulfil user information needs. To address this issue, we propose a new method to extract specific features from user relevance feedback. The proposed approach includes two stages. The first stage extracts topics (or patterns) from text documents to focus on interesting topics. In the second stage, topics are deployed to lower-level terms to address the low-frequency problem and find specific terms. The specific terms are determined based on their appearances in relevance feedback and their distribution in topics or high-level patterns. We test our proposed method with extensive experiments on the Reuters Corpus Volume 1 dataset and TREC topics. Results show that our proposed approach significantly outperforms the state-of-the-art models.
Abstract:
In this paper we present a unified sequential Monte Carlo (SMC) framework for performing sequential experimental design for discriminating between a set of models. The model discrimination utility that we advocate is fully Bayesian and based upon the mutual information. SMC provides a convenient way to estimate the mutual information. Our experience suggests that the approach works well on either a set of discrete or continuous models and outperforms other model discrimination approaches.
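As a rough illustration of the mutual-information utility mentioned above, the sketch below computes the mutual information between a model indicator and a discrete observation by exact enumeration, then picks the design with the highest value. This is not the paper's SMC estimator: the function names, the discrete-outcome setting, and the `designs` dictionary are all assumptions made for illustration.

```python
import math


def mutual_information(prior, likelihoods):
    """Mutual information I(M; Y | d) for one fixed design d.

    prior[m]          -- p(m), prior probability of model m (hypothetical input)
    likelihoods[m][y] -- p(y | m, d), probability of outcome y under model m
    """
    n_outcomes = len(likelihoods[0])
    # Marginal predictive p(y | d) = sum_m p(m) p(y | m, d)
    marginal = [
        sum(prior[m] * likelihoods[m][y] for m in range(len(prior)))
        for y in range(n_outcomes)
    ]
    mi = 0.0
    for m, p_m in enumerate(prior):
        for y in range(n_outcomes):
            p_y_m = likelihoods[m][y]
            # Terms with zero probability contribute nothing to the sum.
            if p_m > 0 and p_y_m > 0:
                mi += p_m * p_y_m * math.log(p_y_m / marginal[y])
    return mi


def best_design(prior, designs):
    """Return the key of the design with the largest mutual information.

    designs maps a design label to its likelihood table (hypothetical layout).
    """
    return max(designs, key=lambda d: mutual_information(prior, designs[d]))
```

A design under which all models predict the same outcome distribution scores zero, while a design under which the models predict disjoint outcomes scores the entropy of the prior, so `best_design` prefers the latter.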
Abstract:
"Students transitioning from vocational education and training (VET) to university can experience a number of challenges. This small research project explored the information literacy needs of VET and university students and how they differ. Students studying early childhood related VET and university courses reported differences in how and where they searched for information in their studies. These differences reflect the more practical focus of VET compared with the more academic and theoretical approach of university. The author proposes a framework of support that could be provided to transitioning students to enable them to develop the necessary information literacy skills for university study."--publisher website
Abstract:
In this paper we introduce a formalization of Logical Imaging applied to IR in terms of Quantum Theory through the use of an analogy between states of a quantum system and terms in text documents. Our formalization relies upon the Schrödinger Picture, creating an analogy between the dynamics of a physical system and the kinematics of probabilities generated by Logical Imaging. By using Quantum Theory, it is possible to model contextual information more precisely, in a seamless and principled fashion, within the Logical Imaging process. While further work is needed to empirically validate this, the foundations for doing so are provided.
Abstract:
Retrieval with Logical Imaging is derived from belief revision and provides a novel mechanism for estimating the relevance of a document through logical implication (i.e. P(q -> d)). In this poster, we perform the first comprehensive evaluation of Logical Imaging (LI) in Information Retrieval (IR) across several TREC test collections. When compared against standard baseline models, we show that LI fails to improve performance. This failure can be attributed to a nuance within the model that means non-relevant documents are promoted in the ranking, while relevant documents are demoted. This is an important contribution because it not only contextualizes the effectiveness of LI, but crucially explains why it fails. By addressing this nuance, future LI models could be significantly improved.
Abstract:
Quantum-inspired models have recently attracted increasing attention in Information Retrieval. An intriguing characteristic of the mathematical framework of quantum theory is the presence of complex numbers. However, it is unclear what such numbers could or would actually represent or mean in Information Retrieval. The goal of this paper is to discuss the role of complex numbers within the context of Information Retrieval. First, we introduce how complex numbers are used in quantum probability theory. Then, we examine van Rijsbergen’s proposal of evoking complex-valued representations of information objects. We empirically show that such a representation is unlikely to be effective in practice (confuting its usefulness in Information Retrieval). We then explore alternative proposals which may be more successful at realising the power of complex numbers.
Creation of a new evaluation benchmark for information retrieval targeting patient information needs
Abstract:
Searching for health advice on the web is becoming increasingly common. Because of the great importance of this activity for patients and clinicians and the effect that incorrect information may have on health outcomes, it is critical to present relevant and valuable information to a searcher. Previous evaluation campaigns on health information retrieval (IR) have provided benchmarks that have been widely used to improve health IR and record these improvements. However, in general these benchmarks have targeted the specialised information needs of physicians and other healthcare workers. In this paper, we describe the development of a new collection for evaluation of effectiveness in IR seeking to satisfy the health information needs of patients. Our methodology features a novel way to create statements of patients’ information needs using realistic short queries associated with patient discharge summaries, which provide details of patient disorders. We adopt a scenario where the patient then creates a query to seek information relating to these disorders. Thus, discharge summaries provide us with a means to create contextually driven search statements, since they may include details on the stage of the disease, family history etc. The collection will be used for the first time as part of the ShARe/CLEF 2013 eHealth Evaluation Lab, which focuses on natural language processing and IR for clinical care.
Abstract:
Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics often overlooked the role of complex numbers. Specifically, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models of IR, one has to lift the constraint of real-valued representations of the information space, and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR, which explicitly uses complex valued Hilbert spaces, and thus where terms, documents and queries are represented as complex-valued vectors. The proposal consists of integrating distributional semantics evidence within the real component of a term vector; whereas, ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested in the TREC Medical Record task of retrieving cohorts for clinical studies.
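The representation described above can be sketched as follows: each term vector stores distributional evidence in its real component and ontological information in its imaginary component. The abstract does not specify how vectors are compared, so the normalized Hermitian inner product used here is an assumption, as are all function names and inputs.

```python
import math


def term_vector(distributional, ontological):
    """Build a complex-valued vector from two real-valued evidence lists.

    distributional -- real components (hypothetical distributional weights)
    ontological    -- imaginary components (hypothetical ontology weights)
    """
    return [complex(r, i) for r, i in zip(distributional, ontological)]


def inner(u, v):
    """Hermitian inner product <u, v> = sum(conj(u_k) * v_k)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def similarity(u, v):
    """Modulus of the inner product, normalized by both vector norms.

    This scoring rule is an illustrative assumption, not the paper's model.
    """
    norm = math.sqrt(inner(u, u).real * inner(v, v).real)
    return abs(inner(u, v)) / norm if norm else 0.0
```

Under this sketch a query and a document would each be built with `term_vector` and ranked by `similarity`, so both kinds of evidence contribute to a single score.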
Abstract:
This paper presents the results of task 3 of the ShARe/CLEF eHealth Evaluation Lab 2013. This evaluation lab focuses on improving access to medical information on the web. The task objective was to investigate the effect on IR effectiveness of using additional information, such as the discharge summaries, and external resources, such as medical ontologies. The participants were allowed to submit up to seven runs: one mandatory run using no additional information or external resources, and three runs each with and without discharge summaries.
Abstract:
Social Media Analytics is a new research field in which interdisciplinary methods are combined, extended, and adapted to analyse social media data. Besides answering research questions, a further goal is to provide architectural designs for the development of new information systems and applications based on social media. The article presents the most important aspects of the field of Social Media Analytics and points to the need for an interdisciplinary research agenda, in whose creation and execution business and information systems engineering (Wirtschaftsinformatik) has an important role to play.
Abstract:
Social Media Analytics is an emerging interdisciplinary research field that aims to combine, extend, and adapt methods for the analysis of social media data. On the one hand, it can support IS and other research disciplines in answering their research questions; on the other hand, it helps to provide architectural designs as well as solution frameworks for new social media-based applications and information systems. The authors suggest that IS should contribute to this field and help to develop and process an interdisciplinary research agenda.
Abstract:
In today's business world, where competition among companies is intense, quality in manufacturing and in providing products and services can be considered a means of pursuing excellence and success in this competitive arena. With the advent of e-commerce and the emergence of new production systems and new organizational structures, traditional management and quality assurance systems have been challenged. Consequently, the quality information system has gained a special place as one of the new tools of quality management. In this paper, the quality information system (QIS) is studied through a review of the QIS literature, and the role and position of the QIS among the other information systems of an organization are investigated. QIS models are analyzed and, by assessing the models presented in the literature, a conceptual and hierarchical model of the QIS is suggested and studied. As a case study, the hierarchical QIS model is developed on the basis of Shetabkar Co. by evaluating the hierarchical models presented in the QIS field.
Abstract:
This thesis opens up the design space for awareness research in CSCW and HCI. By challenging the prevalent understanding of roles in awareness processes and exploring different mechanisms for actively engaging users in the awareness process, this thesis provides a better understanding of the complexity of these processes and suggests practical solutions for designing and implementing systems that support active awareness. Mutual awareness, a prominent research topic in the fields of Computer-Supported Cooperative Work (CSCW) and Human-Computer Interaction (HCI), refers to a fundamental aspect of a person’s work: their ability to gain a better understanding of a situation by perceiving and interpreting their co-workers’ actions. Technologically-mediated awareness, used to support co-workers across distributed settings, distinguishes between the roles of the actor, whose actions are often limited to being the target of an automated data gathering process, and the receiver, who wants to be made aware of the actors’ actions. This receiver-centric view of awareness, focusing on helping receivers to deal with complex sets of awareness information, stands in stark contrast to our understanding of awareness as a social process involving complex interactions between both actors and receivers. It fails to take into account an actor’s intimate understanding of their own activities and the contribution that this subjective understanding could make in providing richer awareness information. In this thesis I challenge the prevalent receiver-centric notion of awareness, and explore the conceptual foundations, design, implementation and evaluation of an alternative active awareness approach by making the following five contributions. Firstly, I identify the limitations of existing awareness research and solicit further evidence to support the notion of active awareness.
I analyse ethnographic workplace studies that demonstrate how actors engage in an intricate interplay involving the monitoring of their co-workers’ progress and displaying aspects of their activities that may be of relevance to others. The examination of a large body of awareness research reveals that while disclosing information is a common practice in face-to-face collaborative settings, it has been neglected in implementations of technically mediated awareness. Based on these considerations, I introduce the notion of intentional disclosure to describe the action of users actively and deliberately contributing awareness information. I consider challenges and potential solutions for the design of active awareness. I compare a range of systems, each allowing users to share information about their activities at various levels of detail. I discuss one of the main challenges to active awareness: that disclosing information about activities requires some degree of effort. I discuss various representations of effort in collaborative work. These considerations reveal that there is a trade-off between the richness of awareness information and the effort required to provide this information. I propose a framework for active awareness, aimed at helping designers to understand the scope and limitations of different types of intentional disclosure. I draw on the identified richness/effort trade-off to develop two types of intentional disclosure, both of which aim to facilitate the disclosure of information while reducing the effort required to do so. For both of these approaches, direct and indirect disclosure, I delineate how they differ from related approaches and define a set of design criteria that is intended to guide their implementation. I demonstrate how the framework of active awareness can be practically applied by building two proof-of-concept prototypes that implement direct and indirect disclosure respectively.
AnyBiff, implementing direct disclosure, allows users to create, share and use shared representations of activities in order to express their current actions and intentions. SphereX, implementing indirect disclosure, represents shared areas of interests or working context, and links sets of activities to these representations. Lastly, I present the results of the qualitative evaluation of the two prototypes and analyse the results with regard to the extent to which they implemented their respective disclosure mechanisms and supported active awareness. Both systems were deployed and tested in real world environments. The results for AnyBiff showed that users developed a wide range of activity representations, some unanticipated, and actively used the system to disclose information. The results further highlighted a number of design considerations relating to the relationship between awareness and communication, and the role of ambiguity. The evaluation of SphereX validated the feasibility of the indirect disclosure approach. However, the study highlighted the challenges of implementing cross-application awareness support and translating the concept to users. The study resulted in design recommendations aimed to improve the implementation of future systems.
Abstract:
Early works on Private Information Retrieval (PIR) focused on minimizing the necessary communication overhead. They seemed to achieve this goal but at the expense of query response time. To mitigate this weakness, protocols with secure coprocessors were introduced. They achieve optimal communication complexity and better online processing complexity. Unfortunately, all secure coprocessor-based PIR protocols require heavy periodical preprocessing. In this paper, we propose a new protocol, which is free from the periodical preprocessing while offering the optimal communication complexity and almost optimal online processing complexity. The proposed protocol is proven to be secure.