3 resultados para digital libraries
em CentAUR: Central Archive University of Reading - UK
Resumo:
One of the main tasks of the mathematical knowledge management community must surely be to enhance access to mathematics on digital systems. In this paper we present a spectrum of approaches to solving the various problems inherent in this task, arguing that a variety of approaches is both necessary and useful. The main ideas presented are about the differences between digitised mathematics, digitally represented mathematics and formalised mathematics. Each has its part to play in managing mathematical information in a connected world. Digitised material is that which is embodied in a computer file, accessible and displayable locally or globally. Represented material is digital material in which there is some structure (usually syntactic in nature) which maps to the mathematics contained in the digitised information. Formalised material is that in which both the syntax and semantics of the represented material, is automatically accessible. Given the range of mathematical information to which access is desired, and the limited resources available for managing that information, we must ensure that these resources are applied to digitise, form representations of or formalise, existing and new mathematical information in such a way as to extract the most benefit from the least expenditure of resources. We also analyse some of the various social and legal issues which surround the practical tasks.
Resumo:
In any data mining applications, automated text and text and image retrieval of information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on the latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use Monte Carlo method to sample and to build much smaller size term-by-document matrix (e.g. we build k x k matrix) from where we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k"-largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms are running on a cluster of workstations under MPI and results of the experiments arising in textual retrieval of Web documents as well as comparison of the stochastic methods proposed are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
Resumo:
We describe the CHARMe project, which aims to link climate datasets with publications, user feedback and other items of "commentary metadata". The system will help users learn from previous community experience and select datasets that best suit their needs, as well as providing direct traceability between conclusions and the data that supported them. The project applies the principles of Linked Data and adopts the Open Annotation standard to record and publish commentary information. CHARMe contributes to the emerging landscape of "climate services", which will provide climate data and information to influence policy and decision-making. Although the project focuses on climate science, the technologies and concepts are very general and could be applied to other fields.