911 results for automatic indexing
Abstract:
This paper reports research evaluating the potential and effects of using annotated paraconsistent logic in automatic indexing. Paraconsistent logic deals with contradictions and is concerned with the study and development of inconsistency-tolerant logical systems. Because it is flexible and admits logical states beyond the yes/no dichotomy, it supports the hypothesis that indexing results could be better than those obtained by traditional methods. Interactions between different disciplines, such as information retrieval, automatic indexing, information visualization, and nonclassical logics, were considered in this research. Methodologically, an algorithm for the treatment of uncertainty and imprecision, developed under paraconsistent logic, was used to modify the weights assigned to the indexing terms of the text collections. The tests were performed on an information visualization system named Projection Explorer (PEx), created at the Institute of Mathematics and Computer Science (ICMC - USP São Carlos), with available source code. PEx uses the traditional vector space model to represent the documents of a collection. The results were evaluated by criteria built into the information visualization system itself and demonstrated measurable gains in the quality of the displays, confirming the hypothesis that the use of the para-analyser, under the conditions of the experiment, can generate more effective clusters of similar documents. This is noteworthy, since more significant clusters can be used to enhance information indexing and retrieval. It can be argued that the adoption of non-dichotomous (non-exclusive) parameters provides new possibilities for relating similar information.
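As a rough illustration of the kind of computation a para-analyser performs, the sketch below uses the two standard quantities of two-valued annotated paraconsistent logic: the degree of certainty (favourable minus contrary evidence) and the degree of contradiction (favourable plus contrary evidence minus one). The abstract does not say how these degrees were applied to term weights, so the function adjust_weight and its scaling rule are purely hypothetical, as are the names mu and lam.

```python
def para_analyse(mu, lam):
    """Return (certainty, contradiction) for favourable evidence mu
    and contrary evidence lam, both expected in [0, 1]."""
    if not (0.0 <= mu <= 1.0 and 0.0 <= lam <= 1.0):
        raise ValueError("mu and lam must lie in [0, 1]")
    certainty = mu - lam              # ranges over [-1, 1]
    contradiction = mu + lam - 1.0    # ranges over [-1, 1]
    return certainty, contradiction


def adjust_weight(weight, mu, lam):
    """Hypothetical rule (not taken from the paper): scale a term weight
    by the certainty degree mapped linearly from [-1, 1] to [0, 1]."""
    certainty, _ = para_analyse(mu, lam)
    return weight * (certainty + 1.0) / 2.0


if __name__ == "__main__":
    # Illustrative evidence values only.
    print(para_analyse(0.75, 0.25))
    print(adjust_weight(2.0, 0.75, 0.25))
```

Under this assumed rule, a term with full favourable evidence and no contrary evidence keeps its weight, while a term with only contrary evidence is reduced to zero.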
Abstract:
Includes bibliography.
Abstract:
Evidence-based medicine relies on repositories of empirical research evidence that can be used to support clinical decision making for improved patient care. However, retrieving evidence from such repositories at local sites presents many challenges. This paper describes a methodological framework for automatically indexing and retrieving empirical research evidence in the form of systematic reviews and associated studies from The Cochrane Library, where retrieved documents are specific to a patient-physician encounter and thus can be used to support evidence-based decision making at the point of care. Such an encounter is defined by three pertinent groups of concepts (diagnosis, treatment, and patient), and the framework relies on these three groups to steer indexing and retrieval of reviews and associated studies. An evaluation of the indexing and retrieval components of the proposed framework was performed using documents relevant to the pediatric asthma domain. Precision and recall values for automatic indexing of systematic reviews and associated studies were 0.93 and 0.87, and 0.81 and 0.56, respectively. Moreover, precision and recall for the retrieval of relevant systematic reviews and associated studies were 0.89 and 0.81, and 0.92 and 0.89, respectively. With minor modifications, the proposed methodological framework can be customized for other evidence repositories. © 2010 Elsevier Inc.
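For readers unfamiliar with the two measures reported above, the following minimal sketch shows how precision and recall are computed for a single document or query by comparing the automatically produced set against a reference set. The identifiers in the example are invented for illustration and are not data from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one set of results.

    retrieved: items produced by the system (index terms or documents)
    relevant:  items in the reference (gold-standard) set
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    system_output = ["r1", "r2", "r3", "r4"]
    gold_standard = ["r1", "r2", "r3", "r5", "r6"]
    p, r = precision_recall(system_output, gold_standard)
    print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.75 recall=0.60
```

Precision penalizes items the system produced that are not in the reference set, while recall penalizes reference items the system missed.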
Abstract:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Abstract:
Work presented as part of the Master's programme in Informatics Engineering, in partial fulfilment of the requirements for the degree of Master in Informatics Engineering
Abstract:
There are still major challenges in the area of automatic indexing and retrieval of multimedia content data for very large multimedia content corpora. Current indexing and retrieval applications still use keywords to index multimedia content, and those keywords usually provide no knowledge about the semantic content of the data. With the increasing amount of multimedia content, it is inefficient to continue with this approach. In this paper, we describe the project DREAM, which addresses such challenges by proposing a new framework for semi-automatic annotation and retrieval of multimedia based on the semantic content. The framework uses Topic Map technology as a tool to model the knowledge automatically extracted from the multimedia content by an Automatic Labelling Engine. We describe how we acquire knowledge from the content and represent this knowledge, using the support of NLP, to automatically generate Topic Maps. The framework is described in the context of film post-production.
Abstract:
There are still major challenges in the area of automatic indexing and retrieval of digital data. The main problem arises from the ever increasing mass of digital media and the lack of efficient methods for indexing and retrieval of such data based on the semantic content rather than keywords. To enable intelligent web interactions or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. Research has been ongoing for a few years in the field of ontological engineering with the aim of using ontologies to add knowledge to information. In this paper we describe the architecture of a system designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval.
Abstract:
Automatic indexing and retrieval of digital data poses major challenges. The main problem arises from the ever increasing mass of digital media and the lack of efficient methods for indexing and retrieval of such data based on the semantic content rather than keywords. To enable intelligent web interactions, or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. For a number of years research has been ongoing in the field of ontological engineering with the aim of using ontologies to add such (meta) knowledge to information. In this paper, we describe the architecture of a system (Dynamic REtrieval Analysis and semantic metadata Management (DREAM)) designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval. The DREAM Demonstrator has been evaluated as deployed in the film post-production phase to support the process of storage, indexing and retrieval of large data sets of special effects video clips as an exemplar application domain. This paper provides its performance and usability results and highlights the scope for future enhancements of the DREAM architecture which has proven successful in its first and possibly most challenging proving ground, namely film production, where it is already in routine use within our test bed Partners' creative processes. (C) 2009 Published by Elsevier B.V.
Abstract:
In some applications of case-based systems, the attributes available for indexing are better described as linguistic variables than treated numerically. In these applications, the concept of the fuzzy hypercube can be applied to give a geometrical interpretation of similarities among cases. This paper presents an approach that uses geometrical properties of the fuzzy hypercube space to index and retrieve cases.
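To make the geometric reading concrete, the sketch below treats each case as a point in the unit hypercube, with one membership degree in [0, 1] per linguistic attribute, and ranks stored cases by their closeness to a query point. The abstract does not specify which geometrical properties the paper uses, so the similarity measure here (one minus the normalized Hamming distance) and all names and values are assumptions chosen for illustration.

```python
def similarity(a, b):
    """Similarity of two points in the unit hypercube:
    1 minus the Hamming (L1) distance divided by the dimension."""
    if len(a) != len(b) or not a:
        raise ValueError("points must be non-empty and of equal dimension")
    distance = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - distance / len(a)


def retrieve(query, case_base, k=1):
    """Return the names of the k stored cases most similar to the query."""
    ranked = sorted(
        case_base.items(),
        key=lambda item: similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical cases; each coordinate is a membership degree,
    # e.g. ("temperature is high", "pressure is low").
    cases = {
        "case_a": (1.0, 0.0),
        "case_b": (0.5, 0.5),
        "case_c": (0.0, 1.0),
    }
    print(retrieve((0.75, 0.25), cases, k=2))
```

With this measure, opposite corners of the hypercube have similarity 0 and identical points have similarity 1.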
Abstract:
Most of the tasks in genome annotation can be at least partially automated. Since this annotation is time-consuming, facilitating some parts of the process, and thus freeing the specialist to carry out more valuable tasks, has been the motivation for many tools and annotation environments. In particular, annotation of protein function can benefit from knowledge about enzymatic processes. Sequence homology alone is not a good approach to deriving this knowledge when there are only a few homologues of the sequence to be annotated. The alternative is to use motifs. This paper uses a symbolic machine learning approach to derive rules for the classification of enzymes according to the Enzyme Commission (EC). Our results show that, for the top class, the average global classification error is 3.13%. Our technique also produces a set of rules relating structural to functional information, which is important for understanding the three-dimensional structure of a protein and determining its biological function. © 2009 Springer Berlin Heidelberg.
Abstract:
"Issued October 1980."
Representing clinical documents to support automatic retrieval of evidence from the Cochrane Library
Abstract:
The overall aim of our research is to develop a clinical information retrieval system that retrieves systematic reviews and underlying clinical studies from the Cochrane Library to support physician decision making. We believe that in order to accomplish this goal we need to develop a mechanism for effectively representing documents that will be retrieved by the application. Therefore, as a first step in developing the retrieval application we have developed a methodology that semi-automatically generates high quality indices and applies them as descriptors to documents from The Cochrane Library. In this paper we present a description and implementation of the automatic indexing methodology and an evaluation that demonstrates that enhanced document representation results in the retrieval of relevant documents for clinical queries. We argue that the evaluation of information retrieval applications should also include an evaluation of the quality of the representation of documents that may be retrieved. ©2010 IEEE.
Abstract:
Given the evolution of media outlets' news portals on the Internet, the idea arises of a search engine that collects news items scattered across the websites of the major Spanish media outlets and retrieves information about the “contracted descriptors” of a portal's users. The first objective is to analyse the needs to be met for a hypothetical client of the application; the second is algorithmic: to devise a working methodology for obtaining a news item. The implementation comprises three stages: downloading the required web pages, using the tools provided by the cUrl library; analysing the news (extracting all links that correspond to news items, filtering by descriptor to decide whether an item should be saved, and parsing the internal structure of the selected items to keep only the established parts); and the database that organises and manages the chosen news items.
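The second of the three stages (extracting links and filtering them by descriptor) can be sketched as follows. This is an illustrative sketch only: it uses Python's standard html.parser module rather than the cUrl library named in the abstract, it skips the download and database stages, and the class name LinkCollector, the function matching_links, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only accumulate text while inside an <a href=...> element.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def matching_links(html, descriptors):
    """Keep links whose anchor text contains any descriptor (case-insensitive)."""
    parser = LinkCollector()
    parser.feed(html)
    wanted = [d.lower() for d in descriptors]
    return [
        (href, text)
        for href, text in parser.links
        if any(d in text.lower() for d in wanted)
    ]


if __name__ == "__main__":
    # Invented page fragment, for illustration only.
    page = '<a href="/n/1">Economia avui</a> <a href="/n/2">Esports</a>'
    print(matching_links(page, ["economia"]))
```

A real crawler would also need to resolve relative URLs and match descriptors against the article body, not just the link text.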
Abstract:
The terminological performance of the descriptors representing the Information Science domain in the SIBi/USP Controlled Vocabulary was evaluated in manual, automatic and semi-automatic indexing processes. The evaluation concludes that, to perform better (i.e., to adequately represent the content of the corpus), the current Information Science descriptors of the SIBi/USP Controlled Vocabulary must be extended and put into context by means of terminological definitions, so that users' information needs are fulfilled.