814 results for Information Systems.
Abstract:
An analogy is established between the syntagm and paradigm from Saussurean linguistics and the message and messages for selection from the information theory initiated by Claude Shannon. The analogy is pursued both as an end in itself and for its analytic value in understanding patterns of retrieval from full-text systems. The multivalency of individual words when isolated from their syntagm is contrasted with the relative stability of meaning of multi-word sequences, when searching ordinary written discourse. The syntagm is understood as the linear sequence of oral and written language. Saussure's understanding of the word, as a unit which compels recognition by the mind, is endorsed, although not regarded as final. The lesser multivalency of multi-word sequences is understood as the greater determination of signification by the extended syntagm. The paradigm is primarily understood as the network of associations a word acquires when considered apart from the syntagm. The restriction of information theory to expression or signals, and its focus on the combinatorial aspects of the message, is sustained. The message in the model of communication in information theory can include sequences of written language. Shannon's understanding of the written word, as a cohesive group of letters with strong internal statistical influences, is added to the Saussurean conception. Sequences of more than one word are regarded as weakly correlated concatenations of cohesive units.
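The retrieval claim here is empirically checkable. A minimal sketch (not from the abstract; the toy corpus and the proxy measure are illustrative assumptions) uses the number of distinct words that follow a query term as a crude proxy for how many contexts, and hence candidate significations, it admits:

```python
# A minimal sketch of the claim that multi-word sequences are less
# multivalent than isolated words: the number of distinct following
# words serves as a crude proxy for the contexts a query admits.
def distinct_continuations(tokens, query):
    """Count distinct words that follow each occurrence of `query`,
    where `query` is a tuple of one or more tokens."""
    n = len(query)
    followers = set()
    for i in range(len(tokens) - n):
        if tuple(tokens[i:i + n]) == query:
            followers.add(tokens[i + n])
    return len(followers)

corpus = ("the bank raised its interest rate while the river bank "
          "flooded and the bank holiday closed the bank branch").split()

# The isolated word "bank" admits more distinct continuations than the
# two-word syntagm "bank holiday", illustrating the greater
# determination of signification by the extended sequence.
print(distinct_continuations(corpus, ("bank",)))            # 4
print(distinct_continuations(corpus, ("bank", "holiday")))  # 1
```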
Abstract:
In previous papers, we have presented a logic-based framework based on fusion rules for merging structured news reports. Structured news reports are XML documents in which the textentries are restricted to individual words or simple phrases, such as names and domain-specific terminology, and to numbers and units. We assume structured news reports do not require natural language processing. Fusion rules are a form of scripting language that defines how structured news reports should be merged. The antecedent of a fusion rule is a call to investigate the information in the structured news reports and the background knowledge, and the consequent of a fusion rule is a formula specifying an action to be undertaken to form a merged report. It is expected that a set of fusion rules is defined for any given application. In this paper we extend the approach to handle probability values, degrees of belief, or necessity measures associated with textentries in the news reports. We present the formal definition for each of these types of uncertainty and explain how they can be handled using fusion rules. We also discuss methods for detecting inconsistencies among sources.
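To make the antecedent/consequent shape concrete, here is a toy sketch in Python. The rule format, element names, and merge policy (keep the textentry with the higher attached probability and record any disagreement) are illustrative assumptions, not the authors' fusion-rule language:

```python
# Toy illustration of a fusion rule over two structured news reports
# whose textentries carry probability values. Antecedent: gather the
# candidate textentries. Consequent: emit the most probable one and
# flag an inconsistency when the sources disagree.
import xml.etree.ElementTree as ET

report_a = ET.fromstring(
    '<report><casualties prob="0.9">4</casualties></report>')
report_b = ET.fromstring(
    '<report><casualties prob="0.6">7</casualties></report>')

def merge(tag, *reports):
    candidates = [(float(r.find(tag).get("prob")), r.find(tag).text)
                  for r in reports]
    prob, text = max(candidates)          # keep the most probable value
    merged = ET.Element("merged_report")
    entry = ET.SubElement(merged, tag, prob=str(prob))
    entry.text = text
    # Disagreeing values signal an inconsistency among sources.
    if len({t for _, t in candidates}) > 1:
        ET.SubElement(merged, "inconsistency", on=tag)
    return merged

print(ET.tostring(merge("casualties", report_a, report_b),
                  encoding="unicode"))
```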
Abstract:
This research was published in the foremost international journal in information theory and shows the interplay between complex random matrices and multi-antenna information theory. Dr T. Ratnarajah is a leader in this area of research, and his work has contributed to the development of graduate curricula (course reader) at the Massachusetts Institute of Technology (MIT), USA, by Professor Alan Edelman. The course is named "The Mathematics and Applications of Random Matrices"; see http://web.mit.edu/18.338/www/projects.html
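The interplay mentioned above can be made concrete with the standard textbook relation (not drawn from the publication itself): the ergodic capacity of an i.i.d. Rayleigh multi-antenna channel is the expectation of a log-determinant of a complex Wishart-type random matrix, C = E[log2 det(I + (snr/t) H H^H)]. A minimal NumPy Monte Carlo sketch:

```python
# Monte Carlo estimate of ergodic MIMO capacity, where H is an r x t
# matrix of unit-variance complex Gaussian entries. Standard textbook
# formula, given only to illustrate the link between complex random
# matrices and multi-antenna information theory.
import numpy as np

def ergodic_capacity(t, r, snr, trials=10_000, seed=0):
    rng = np.random.default_rng(seed)
    caps = np.empty(trials)
    for k in range(trials):
        h = (rng.standard_normal((r, t)) +
             1j * rng.standard_normal((r, t))) / np.sqrt(2)
        m = np.eye(r) + (snr / t) * h @ h.conj().T
        # log2 det via slogdet for numerical stability
        caps[k] = np.linalg.slogdet(m)[1] / np.log(2)
    return caps.mean()

# Capacity grows roughly linearly in min(t, r) at high SNR.
print(ergodic_capacity(t=2, r=2, snr=10.0))   # bits/s/Hz
```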
Abstract:
Selection power is taken as the fundamental value for information retrieval systems. Selection power is regarded as produced by selection labor, which itself separates historically into description and search labor. As forms of mental labor, description and search labor participate in the conditions for labor and for mental labor. Concepts and distinctions applicable to physical and mental labor are indicated, introducing the necessity of labor for survival, the idea of technology as a human construction, and the possibility of the transfer of human labor to technology. Distinctions specific to mental labor, particularly between semantic and syntactic labor, are introduced. Description labor, exemplified by cataloging, classification, and database description, can be more formally understood as the labor involved in transforming objects for description into searchable descriptions, and is also understood to include interpretation. The costs of description labor are discussed. Search labor is conceived as the labor expended in searching systems. For both description and search labor, there has been a progressive reduction in direct human labor, with its syntactic aspects transferred to technology, effectively compelled by the high relative cost of direct human labor compared to machine processes.
Abstract:
This article synthesizes the labor theoretic approach to information retrieval. Selection power is taken as the fundamental value for information retrieval and is regarded as produced by selection labor. Selection power remains relatively constant while selection labor modulates across oral, written, and computational modes. A dynamic, stemming principally from the costs of direct human mental labor and effectively compelling the transfer of aspects of human labor to computational technology, is identified. The decision practices of major information system producers are shown to conform with the motivating forces identified in the dynamic. An enhancement of human capacities, from the increased scope of description processes, is revealed. Decision variation and decision considerations are identified. The value of the labor theoretic approach is considered in relation to pre-existing theories, real-world practice, and future possibilities. Finally, the continuing intractability of information retrieval is suggested.
Abstract:
Information retrieval in the age of Internet search engines has become part of ordinary discourse and everyday practice: "Google" is a verb in common usage. Thus far, more attention has been given to practical understanding of information retrieval than to a full theoretical account. In Human Information Retrieval, Julian Warner offers a comprehensive overview of information retrieval, synthesizing theories from different disciplines (information and computer science, librarianship and indexing, and information society discourse) and incorporating such disparate systems as WorldCat and Google into a single, robust theoretical framework. There is a need for such a theoretical treatment, he argues, one that reveals the structure and underlying patterns of this complex field while remaining congruent with everyday practice. Warner presents a labor theoretic approach to information retrieval, building on his previously formulated distinction between semantic and syntactic mental labor, arguing that the description and search labor of information retrieval can be understood as both semantic and syntactic in character. Warner's information science approach is rooted in the humanities and the social sciences but informed by an understanding of information technology and information theory. The chapters offer a progressive exposition of the topic, with illustrative examples to explain the concepts presented. Neither narrowly practical nor largely speculative, Human Information Retrieval meets the contemporary need for a broader treatment of information and information systems.
Abstract:
Purpose
– Information science has been conceptualized as a partly unreflexive response to developments in information and computer technology, and, most powerfully, as part of the gestalt of the computer. The computer was viewed as an historical accident in the original formulation of the gestalt. An alternative, and timely, approach to understanding, and then dissolving, the gestalt would be to address the motivating technology directly, fully recognizing it as a radical human construction. This paper aims to address the issues.
Design/methodology/approach
– The paper adopts a social epistemological perspective and is concerned with collective, rather than primarily individual, ways of knowing.
Findings
– In the language of discussions in information science, information technology tends to be received as objectively given, autonomously developing, and causing but not itself caused. It has also been characterized as artificial, in the sense of unnatural, and sometimes as threatening. Attitudes to technology are implied, rather than explicit, and can appear weak when articulated, corresponding to collective repression.
Research limitations/implications
– Receiving technology as objectively given has an analogy with the Platonist view of mathematical propositions as discovered, in its exclusion of human activity, opening up the possibility of a comparable critique which insists on human agency.
Originality/value
– Apprehensions of information technology have been raised to consciousness, exposing their limitations.
Abstract:
Voice over IP (VoIP) has experienced tremendous growth over the last few years and is now widely used among the population and for business purposes. The security of such VoIP systems is often assumed, creating a false sense of privacy. This paper investigates in detail the leakage of information from Skype, a widely used and protected VoIP application. Experiments have shown that isolated phonemes can be classified and that given sentences can be identified. By using the dynamic time warping (DTW) algorithm, frequently used in speech processing, an accuracy of 60% can be reached. The results can be further improved by choosing specific training data, reaching an accuracy of 83% under specific conditions. Since the initial results are speaker dependent, an approach involving the Kalman filter is proposed to extract the kernel of all training signals.
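For reference, below is a minimal implementation of the standard DTW distance the paper borrows from speech processing, applied to 1-D sequences with an absolute-difference local cost. The feature extraction from Skype packet traces is not reproduced; this only shows the alignment technique itself:

```python
# Standard dynamic time warping (DTW) distance between two 1-D
# sequences, using the classic (match, insertion, deletion) steps.
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local cost
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# Two warped versions of the same contour align closely...
print(dtw_distance([1, 2, 3, 3, 2], [1, 1, 2, 3, 2]))   # 0.0
# ...while a different contour is farther away.
print(dtw_distance([1, 2, 3, 3, 2], [3, 2, 1, 1, 2]))
```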
Abstract:
Learning or writing regular expressions to identify instances of a specific concept within text documents with high precision and recall is challenging. It is relatively easy to improve the precision of an initial regular expression by identifying the false positives it covers and tweaking the expression to avoid them. However, modifying the expression to improve recall is difficult, since false negatives can only be identified by manually analyzing all documents, in the absence of any tools to identify the missing instances. We focus on partially automating the discovery of missing instances by soliciting minimal user feedback. We present a technique to identify good generalizations of a regular expression that improve recall while retaining high precision. We empirically demonstrate the effectiveness of the proposed technique as compared to existing methods and show results for a variety of tasks such as identification of dates, phone numbers, product names, and course numbers on real-world datasets.
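A toy sketch of that feedback loop follows. The candidate-generation step and the `is_true_positive` callback are illustrative assumptions standing in for the paper's algorithm and its minimal user feedback; a generalization is kept only when it covers new matches and the user confirms them:

```python
# Toy version of recall-improving regex generalization with minimal
# user feedback: accept a relaxed pattern only if every newly covered
# string is confirmed as a true positive, preserving precision.
import re

def generalize(seed, candidates, corpus, is_true_positive):
    best = seed
    covered = set(re.findall(best, corpus))
    for cand in candidates:
        found = set(re.findall(cand, corpus))
        new = found - covered
        # Solicit feedback only on the newly covered instances.
        if new and all(is_true_positive(s) for s in new):
            best, covered = cand, found
    return best

corpus = "Due 2024-01-05, revised 2024/02/10, order #17."
seed = r"\d{4}-\d{2}-\d{2}"                 # misses the slash-separated date
candidates = [r"\d{4}[-/]\d{2}[-/]\d{2}"]   # relaxed separator class
print(generalize(seed, candidates, corpus,
                 is_true_positive=lambda s: True))
```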