955 resultados para Information Retrieval, Document Databases, Digital Libraries


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Discusses the technological changes that affects learning organizations as well as the human, technical, legal and sustainable aspects regarding learning objects repositories creation, maintenance and use. It presents concepts of information objects and learning objects, the functional requirements needed to their storage at Learning Management Systems. The role of Metadata is reviewed concerning learning objects creation and retrieval, followed by considerations about learning object repositories models, community participation/collaborative strategies and potential derived metrics/indicators. As a result of this desktop research, it can be said that not only technical competencies are critical to any learning objects repository implementation, but it urges that an engaged community of interest be establish as a key to support a learning object repository project. On that matter, researchers are applying Activity Theory (Vygostky, Luria y Leontiev) in order to seek joint perceptions and actions involving learning objects repository users, curators and managers, perceived as critical assets to a successful proposal.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A distinctive feature of Chinese test is that a Chinese document is a sequence of Chinese with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult since a retrieved document which contains the query term as a sequence of Chinese characters may not be really relevant to the query since the query term (as a sequence Chinese characters) may not be a valid Chinese word in that documents. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose a hybrid Chinese information retrieval model by incorporating word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents based on the relevancy to the query calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if it incorporates only Chinese segmentation with the traditional character-based approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The current research activities of the Institute of Mathematics and Informatics at the Bulgarian Academy of Sciences (IMI—BAS) include the study and application of knowledge-based methods for the creation, integration and development of multimedia digital libraries with applications in cultural heritage. This report presents IMI-BAS’s developments at the digital library management systems and portals, i.e. the Bulgarian Iconographical Digital Library, the Bulgarian Folklore Digital Library and the Bulgarian Folklore Artery, etc. developed during the several national and international projects: - "Digital Libraries with Multimedia Content and its Application in Bulgarian Cultural Heritage" (contract 8/21.07.2005 between the IMI–BAS, and the State Agency for Information Technologies and Communications; - FP6/IST/P-027451 PROJECT LOGOS "Knowledge-on-Demand for Ubiquitous Learning", EU FP6, IST, Priority 2.4.13 "Strengthening the Integration of the ICT research effort in an Enlarged Europe" - NSF project D-002-189 SINUS "Semantic Technologies for Web Services and Technology Enhanced Learning". - NSF project IO-03-03/2006 ―Development of Digital Libraries and Information Portal with Virtual Exposition "Bulgarian Folklore Heritage". The presented prototypes aims to provide flexible and effective access to the multimedia presentation of the cultural heritage artefacts and collections, maintaining different forms and format of the digitized information content and rich functionality for interaction. The developments are a result of long- standing interests and work in the technological developments in information systems, knowledge processing and content management systems. The current research activities aims at creating innovative solutions for assembling multimedia digital libraries for collaborative use in specific cultural heritage context, maintaining their semantic interoperability and creating new services for dynamic aggregation of their resources, access improvement, personification, intelligent curation of content, and content protection. The investigations are directed towards the development of distributed tools for aggregating heterogeneous content and ensuring semantic compatibility with the European digital library EUROPEANA, thus providing possibilities for pan- European access to rich digitalised collections of Bulgarian cultural heritage.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The aim of this work is to improve retrieval and navigation services on bibliographic data held in digital libraries. This paper presents the design and implementation of OntoBib¸ an ontology-based bibliographic database system that adopts ontology-driven search in its retrieval. The presented work exemplifies how a digital library of bibliographic data can be managed using Semantic Web technologies and how utilizing the domain specific knowledge improves both search efficiency and navigation of web information and document retrieval.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Information and communication technologies are the tools that underpin the emerging “Knowledge Society”. Exchange of information or knowledge between people and through networks of people has always taken place. But the ICT has radically changed the magnitude of this exchange, and thus factors such as timeliness of information and information dissemination patterns have become more important than ever.Since information and knowledge are so vital for the all round human development, libraries and institutions that manage these resources are indeed invaluable. So, the Library and Information Centres have a key role in the acquisition, processing, preservation and dissemination of information and knowledge. ln the modern context, library is providing service based on different types of documents such as manuscripts, printed, digital, etc. At the same time, acquisition, access, process, service etc. of these resources have become complicated now than ever before. The lCT made instrumental to extend libraries beyond the physical walls of a building and providing assistance in navigating and analyzing tremendous amounts of knowledge with a variety of digital tools. Thus, modern libraries are increasingly being re-defined as places to get unrestricted access to information in many formats and from many sources.The research was conducted in the university libraries in Kerala State, India. lt was identified that even though the information resources are flooding world over and several technologies have emerged to manage the situation for providing effective services to its clientele, most of the university libraries in Kerala were unable to exploit these technologies at maximum level. Though the libraries have automated many of their functions, wide gap prevails between the possible services and provided services. There are many good examples world over in the application of lCTs in libraries for the maximization of services and many such libraries have adopted the principles of reengineering and re-defining as a management strategy. Hence this study was targeted to look into how effectively adopted the modern lCTs in our libraries for maximizing the efficiency of operations and services and whether the principles of re-engineering and- redefining can be applied towards this.Data‘ was collected from library users, viz; student as well as faculty users; library ,professionals and university librarians, using structured questionnaires. This has been .supplemented by-observation of working of the libraries, discussions and interviews with the different types of users and staff, review of literature, etc. Personal observation of the organization set up, management practices, functions, facilities, resources, utilization of information resources and facilities by the users, etc. of the university libraries in Kerala have been made. Statistical techniques like percentage, mean, weighted mean, standard deviation, correlation, trend analysis, etc. have been used to analyse data.All the libraries could exploit only a very few possibilities of modern lCTs and hence they could not achieve effective Universal Bibliographic Control and desired efficiency and effectiveness in services. Because of this, the users as well as professionals are dissatisfied. Functional effectiveness in acquisition, access and process of information resources in various formats, development and maintenance of OPAC and WebOPAC, digital document delivery to remote users, Web based clearing of library counter services and resources, development of full-text databases, digital libraries and institutional repositories, consortia based operations for e-journals and databases, user education and information literacy, professional development with stress on lCTs, network administration and website maintenance, marketing of information, etc. are major areas need special attention to improve the situation. Finance, knowledge level on ICTs among library staff, professional dynamism and leadership, vision and support of the administrators and policy makers, prevailing educational set up and social environment in the state, etc. are some of the major hurdles in reaping the maximum possibilities of lCTs by the university libraries in Kerala. The principles of Business Process Re-engineering are found suitable to effectively apply to re-structure and redefine the operations and service system of the libraries. Most of the conventional departments or divisions prevailing in the university libraries were functioning as watertight compartments and their existing management system was more rigid to adopt the principles of change management. Hence, a thorough re-structuring of the divisions was indicated. Consortia based activities and pooling and sharing of information resources was advocated to meet the varied needs of the users in the main campuses and off campuses of the universities, affiliated colleges and remote stations. A uniform staff policy similar to that prevailing in CSIR, DRDO, ISRO, etc. has been proposed by the study not only in the university libraries in kerala but for the entire country.Restructuring of Lis education,integrated and Planned development of school,college,research and public library systems,etc.were also justified for reaping maximum benefits of the modern ICTs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In any data mining applications, automated text and text and image retrieval of information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on the latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use Monte Carlo method to sample and to build much smaller size term-by-document matrix (e.g. we build k x k matrix) from where we then find the first "k" triplets using standard deterministic methods. Second, we investigate how we can reduce the problem to finding the "k"-largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms are running on a cluster of workstations under MPI and results of the experiments arising in textual retrieval of Web documents as well as comparison of the stochastic methods proposed are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Domain specific information retrieval has become in demand. Not only domain experts, but also average non-expert users are interested in searching domain specific (e.g., medical and health) information from online resources. However, a typical problem to average users is that the search results are always a mixture of documents with different levels of readability. Non-expert users may want to see documents with higher readability on the top of the list. Consequently the search results need to be re-ranked in a descending order of readability. It is often not practical for domain experts to manually label the readability of documents for large databases. Computational models of readability needs to be investigated. However, traditional readability formulas are designed for general purpose text and insufficient to deal with technical materials for domain specific information retrieval. More advanced algorithms such as textual coherence model are computationally expensive for re-ranking a large number of retrieved documents. In this paper, we propose an effective and computationally tractable concept-based model of text readability. In addition to textual genres of a document, our model also takes into account domain specific knowledge, i.e., how the domain-specific concepts contained in the document affect the document’s readability. Three major readability formulas are proposed and applied to health and medical information retrieval. Experimental results show that our proposed readability formulas lead to remarkable improvements in terms of correlation with users’ readability ratings over four traditional readability measures.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Search technologies are critical to enable clinical sta to rapidly and e ectively access patient information contained in free-text medical records. Medical search is challenging as terms in the query are often general but those in rel- evant documents are very speci c, leading to granularity mismatch. In this paper we propose to tackle granularity mismatch by exploiting subsumption relationships de ned in formal medical domain knowledge resources. In symbolic reasoning, a subsumption (or `is-a') relationship is a parent-child rela- tionship where one concept is a subset of another concept. Subsumed concepts are included in the retrieval function. In addition, we investigate a number of initial methods for combining weights of query concepts and those of subsumed concepts. Subsumption relationships were found to provide strong indication of relevant information; their inclusion in retrieval functions yields performance improvements. This result motivates the development of formal models of rela- tionships between medical concepts for retrieval purposes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis presents new methods for classification and thematic grouping of billions of web pages, at scales previously not achievable. This process is also known as document clustering, where similar documents are automatically associated with clusters that represent various distinct topic. These automatically discovered topics are in turn used to improve search engine performance by only searching the topics that are deemed relevant to particular user queries.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cost-effective semantic description and annotation of shared knowledge resources has always been of great importance for digital libraries and large scale information systems in general. With the emergence of the Social Web and Web 2.0 technologies, a more effective semantic description and annotation, e.g., folksonomies, of digital library contents is envisioned to take place in collaborative and personalised environments. However, there is a lack of foundation and mathematical rigour for coping with contextualised management and retrieval of semantic annotations throughout their evolution as well as diversity in users and user communities. In this paper, we propose an ontological foundation for semantic annotations of digital libraries in terms of flexonomies. The proposed theoretical model relies on a high dimensional space with algebraic operators for contextualised access of semantic tags and annotations. The set of the proposed algebraic operators, however, is an adaptation of the set theoretic operators selection, projection, difference, intersection, union in database theory. To this extent, the proposed model is meant to lay the ontological foundation for a Digital Library 2.0 project in terms of geometric spaces rather than logic (description) based formalisms as a more efficient and scalable solution to the semantic annotation problem in large scale.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

"Papers presented at the 35th annual Clinic on Library Applications of Data Processing, March 22-24, 1998."

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Document ranking is an important process in information retrieval (IR). It presents retrieved documents in an order of their estimated degrees of relevance to query. Traditional document ranking methods are mostly based on the similarity computations between documents and query. In this paper we argue that the similarity-based document ranking is insufficient in some cases. There are two reasons. Firstly it is about the increased information variety. There are far too many different types documents available now for user to search. The second is about the users variety. In many cases user may want to retrieve documents that are not only similar but also general or broad regarding a certain topic. This is particularly the case in some domains such as bio-medical IR. In this paper we propose a novel approach to re-rank the retrieved documents by incorporating the similarity with their generality. By an ontology-based analysis on the semantic cohesion of text, document generality can be quantified. The retrieved documents are then re-ranked by their combined scores of similarity and the closeness of documents’ generality to the query’s. Our experiments have shown an encouraging performance on a large bio-medical document collection, OHSUMED, containing 348,566 medical journal references and 101 test queries.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This is an extended version of an article presented at the Second International Conference on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010.