2 resultados para Vector Space IR, Search Engines, Document Clustering, Document

em Nottingham eTheses


Relevância:

100.00% 100.00%

Publicador:

Resumo:

For some years now the Internet and World Wide Web communities have envisaged moving to a next generation of Web technologies by promoting a globally unique, and persistent, identifier for identifying and locating many forms of published objects . These identifiers are called Universal Resource Names (URNs) and they hold out the prospect of being able to refer to an object by what it is (signified by its URN), rather than by where it is (the current URL technology). One early implementation of URN ideas is the Unicode-based Handle technology, developed at CNRI in Reston Virginia. The Digital Object Identifier (DOI) is a specific URN naming convention proposed just over 5 years ago and is now administered by the International DOI organisation, founded by a consortium of publishers and based in Washington DC. The DOI is being promoted for managing electronic content and for intellectual rights management of it, either using the published work itself, or, increasingly via metadata descriptors for the work in question. This paper describes the use of the CNRI handle parser to navigate a corpus of papers for the Electronic Publishing journal. These papers are in PDF format and based on our server in Nottingham. For each paper in the corpus a metadata descriptor is prepared for every citation appearing in the References section. The important factor is that the underlying handle is resolved locally in the first instance. In some cases (e.g. cross-citations within the corpus itself and links to known resources elsewhere) the handle can be handed over to CNRI for further resolution. This work shows the encouraging prospect of being able to use persistent URNs not only for intellectual property negotiations but also for search and discovery. In the test domain of this experiment every single resource, referred to within a given paper, can be resolved, at least to the level of metadata about the referred object. If the Web were to become more fully URN aware then a vast directed graph of linked resources could be accessed, via persistent names. Moreover, if these names delivered embedded metadata when resolved, the way would be open for a new generation of vastly more accurate and intelligent Web search engines.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

The purpose of this paper is twofold. Firstly it presents a preliminary and ethnomethodologically-informed analysis of the way in which the growing structure of a particular program's code was ongoingly derived from its earliest stages. This was motivated by an interest in how the detailed structure of completed program `emerged from nothing' as a product of the concrete practices of the programmer within the framework afforded by the language. The analysis is broken down into three sections that discuss: the beginnings of the program's structure; the incremental development of structure; and finally the code productions that constitute the structure and the importance of the programmer's stock of knowledge. The discussion attempts to understand and describe the emerging structure of code rather than focus on generating `requirements' for supporting the production of that structure. Due to time and space constraints, however, only a relatively cursory examination of these features was possible. Secondly the paper presents some thoughts on the difficulties associated with the analytic---in particular ethnographic---study of code, drawing on general problems as well as issues arising from the difficulties and failings encountered as part of the analysis presented in the first section.