4 results for Databases as Topic

in Helda - Digital Repository of University of Helsinki


Relevance:

20.00%

Publisher:

Abstract:

Topic detection and tracking (TDT) is an area of information retrieval research that focuses on news events. The problems TDT deals with relate to segmenting news text into cohesive stories, detecting something new and previously unreported, tracking the development of a previously reported event, and grouping together news stories that discuss the same event. The performance of the traditional information retrieval techniques based on full-text similarity has remained inadequate for online production systems. It has been difficult to make the distinction between same and similar events. In this work, we explore ways of representing and comparing news documents in order to detect new events and track their development. First, however, we put forward a conceptual analysis of the notions of topic and event. The purpose is to clarify the terminology and align it with the process of news-making and the tradition of story-telling. Second, we present a framework for document similarity that is based on semantic classes, i.e., groups of words with similar meaning. We adopt people, organizations, and locations as semantic classes in addition to general terms. As each semantic class can be assigned its own similarity measure, document similarity can make use of ontologies, e.g., geographical taxonomies. The documents are compared class-wise, and the outcome is a weighted combination of class-wise similarities. Third, we incorporate temporal information into document similarity. We formalize the natural language temporal expressions occurring in the text, and use them to anchor the rest of the terms onto the time-line. When comparing documents for event-based similarity, we look not only at matching terms, but also at how near their anchors are on the time-line. Fourth, we experiment with an adaptive variant of the semantic class similarity system.
The news reflects changes in the real world, and in order to keep up, the system has to change its behavior based on the contents of the news stream. We put forward two strategies for rebuilding the topic representations and report experimental results. We run experiments with three annotated TDT corpora. The use of semantic classes increased the effectiveness of topic tracking by 10-30% depending on the experimental setup. The gain in spotting new events remained lower, around 3-4%. Anchoring the text to a time-line based on the temporal expressions gave a further 10% increase in the effectiveness of topic tracking. The gains in detecting new events, again, remained smaller. The adaptive systems did not improve the tracking results.
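
The class-wise comparison described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the class names, the weights, and the use of plain cosine similarity for every class are assumptions made for illustration, whereas the abstract notes that each semantic class can be given its own similarity measure (for example, one based on a geographical taxonomy for locations).

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classwise_similarity(doc_a: dict, doc_b: dict, weights: dict) -> float:
    """Weighted combination of per-class similarities.

    doc_a, doc_b: map a semantic class name to a Counter of its terms.
    weights: map a class name to a non-negative weight (assumed values).
    The result is normalized by the total weight, so it lies in [0, 1].
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    score = 0.0
    for cls, weight in weights.items():
        # Assumption: cosine is used for every class; a class-specific
        # measure could be substituted here.
        score += weight * cosine(doc_a.get(cls, Counter()),
                                 doc_b.get(cls, Counter()))
    return score / total


# Hypothetical documents, already split into semantic classes.
doc_a = {
    "people": Counter({"annan": 1}),
    "locations": Counter({"helsinki": 1}),
    "terms": Counter({"summit": 2, "talks": 1}),
}
doc_b = {
    "people": Counter({"annan": 1}),
    "locations": Counter({"geneva": 1}),
    "terms": Counter({"summit": 2, "talks": 1}),
}
weights = {"people": 1.0, "locations": 1.0, "terms": 2.0}

similarity = classwise_similarity(doc_a, doc_b, weights)
```

In this toy case the two documents agree on people and general terms but not on locations, so the location class pulls the combined score below 1. Under exact string matching, "helsinki" and "geneva" contribute nothing; an ontology-based measure of the kind the abstract mentions could instead give partial credit for nearby places.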

Relevance:

20.00%

Publisher:

Abstract:

"The Protection of Traditional Knowledge Associated with Genetic Resources: The Role of Databases and Registers", by Yovana Reyes Tagle. Abstract: The misappropriation of traditional knowledge (TK) has sparked a search for national and international laws to govern the use of indigenous peoples' knowledge and protect it against commercial exploitation. There is a widespread perception that biopiracy, or illegal access to genetic resources and associated TK, continues despite national and regional efforts to address this concern. The purpose of this research is to address the question of how documentation of TK through databases and registers could protect TK, in light of indigenous peoples' increasing demands to control their knowledge and benefit from its use. Throughout the international debate over the protection of TK, various options have been brought up and discussed. At its core, the discussion over the legal protection of TK comes down to these issues: 1) The doctrinal question: What is protection of TK? 2) The methodological question: How can protection of TK be achieved? 3) The legal question: What should be protected? And 4) The policy questions: Who has rights and how should they be implemented? What kind of rights should indigenous peoples have over their TK? What central concerns are TK databases meant to address? The acceptance of TK databases and registers may bring with it both opportunities and dangers. How can the rights of indigenous peoples over their documented knowledge be assured? Documentation of TK was envisaged as a means to protect TK, but there are concerns about how documented TK can be protected from misappropriation. The methodology used in this research seeks to contribute to the understanding of the protection of TK.
The steps taken in this research attempt to describe and to explain a) what has been done to protect TK through databases and registers, b) how this protection is taking place, and c) why the establishment of TK databases can or cannot be useful for the protection of TK. The selected case studies (Peru and Venezuela) seek to illustrate the complexity and multidisciplinary nature of the establishment of TK databases, which entails not only legal but also political, socio-economic and cultural issues. The study offers some conclusions and recommendations that have emerged after reviewing the national experiences, international instruments, work of international organizations, and indigenous peoples' perspectives. This thesis concludes that if TK is to be protected from disclosure and unauthorized use, confidential databases are required. Finally, the TK database strategy needs to be strengthened by the legal protection of the TK itself.