14 results for Open source information retrieval
in Cochin University of Science
Abstract:
Newspapers cover a large amount of information every day on topics of varied interest. To a university, newspapers are an essential channel of communication because they report on happenings at the university. These items of information are neither stored properly nor placed in retrieval systems for future use. The news and views that appear in newspapers can be organized effectively in a digital library using open source software. The CUSAT digital library (http://dspace.cusat.ac.in/dspace/) has organized some news items about the university that appeared in local newspapers under a special community named “CUSAT-News”. This article describes the methods of collecting, selecting, organizing, providing access to and preserving the news items a university requires, using DSpace open source software.
Abstract:
The goal of this work is to develop an Open Agent Architecture for multilingual information retrieval from a relational database. The query can be given in plain Hindi or Malayalam, two prominent regional languages of India. The system supports distributed processing of user requests through collaborating agents. Natural language processing techniques are used to extract meaning from the plain query, and information is returned to the user in his/her native language. The system architecture is designed in a structured way so that it can be adapted to other regional languages of India.
Abstract:
The Free/Open Source Software (FOSS) concept is very important in the academic community. The open philosophy of FOSS is consistent with academic freedom and the open dissemination of knowledge and information in academia. FOSS can lower the barriers to ICT access by reducing the cost of software. This article discusses the success story of CUSAT's adoption of Free/Open Source Software.
Abstract:
In the last few years, open source integrated library systems have been gaining the attention of library and information science professionals. This paper tries to identify the extent of adoption of Koha, an open source ILS, in libraries around the world through a Web-based study. The study found that Koha adoption in libraries is still in its infancy.
Abstract:
The Central Library of Cochin University of Science and Technology (CUSAT) has been automated with proprietary software (Adlib Library) since 2000. After 11 years, in 2011, the university authorities decided to shift to Koha, an open source software (OSS) integrated library management system (ILMS), for automating the library housekeeping operations. In this context, this study shares the experience of cataloging with both types of software. The features of the cataloging modules of both are analysed on the basis of certain checkpoints. It is found that the cataloging module of Koha is almost on par with that of proven proprietary software that has been on the market for the past 25 years. Some suggestions made by this study may be incorporated for the further development and perfection of Koha.
Abstract:
The assessment of software maturity is an important area in the general software sector, and the OSS field also applies various models to measure it. However, the maturity of OSS used for library applications has not been researched so far. This study attempts to fill that gap. Measuring the maturity of software contributes knowledge about its sustainability over the long term, and maturity is one of the factors that positively influence adoption. The investigator measured the maturity of DSpace software using Woods and Guliani's Open Source Maturity Model-2005. The study is significant because it addresses the maturity of OSS for libraries. In this sense it opens new avenues in the field of library and information science by providing an additional tool for librarians in the selection and adoption of OSS. Measuring maturity brings in-depth knowledge of an OSS product, which contributes to the perceived usefulness and perceived ease of use described in the Technology Acceptance Model theory.
Abstract:
This paper describes an English-Malayalam Cross-Lingual Information Retrieval system. The system retrieves Malayalam documents in response to a query given in English or Malayalam, so monolingual information retrieval is also supported. Malayalam is one of the most prominent regional languages of the Indian subcontinent. It is spoken by more than 37 million people and is the native language of Kerala state in India. Since we had neither a full-fledged online bilingual dictionary nor any parallel corpora from which to build a statistical lexicon, we used a bilingual dictionary developed in house for translation. Other language-specific resources developed in house, such as a Malayalam stemmer and a Malayalam morphological root analyzer, were also used in this work.
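The dictionary-based approach described above can be illustrated with a minimal sketch. It is not the authors' system: the three-entry dictionary `EN_ML`, the toy document collection `DOCS`, and the overlap-count ranking are all assumptions made for illustration, and the documents are treated as bags of root words (the stemming step is omitted).

```python
# Hypothetical toy bilingual dictionary (English -> Malayalam root words).
EN_ML = {
    "rain": ["മഴ"],
    "book": ["പുസ്തകം"],
    "kerala": ["കേരളം"],
}

# Hypothetical collection: each document is a space-separated bag of root words.
DOCS = {
    "d1": "കേരളം മഴ",
    "d2": "പുസ്തകം",
}


def translate_query(query):
    """Map each query word through the dictionary; unknown words pass through."""
    terms = []
    for word in query.lower().split():
        terms.extend(EN_ML.get(word, [word]))
    return terms


def retrieve(query):
    """Rank documents by how many translated query terms they contain."""
    terms = set(translate_query(query))
    scores = {}
    for doc_id, text in DOCS.items():
        overlap = len(terms & set(text.split()))
        if overlap:
            scores[doc_id] = overlap
    # Highest overlap first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because words missing from the dictionary pass through unchanged, a query typed directly in Malayalam is matched as is, which mirrors the monolingual mode mentioned in the abstract.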
Abstract:
Purpose – The purpose of this paper is to describe the design and development of a digital library at Cochin University of Science and Technology (CUSAT), India, using DSpace open source software. The study covers the structure, contents and usage of the CUSAT digital library. Design/methodology/approach – This paper examines the possibilities of applying open source in libraries. An evaluative approach is used to explore the features of the CUSAT digital library. The Google Analytics service is employed to measure how much the digital library is used by users across the world. Findings – CUSAT has successfully applied DSpace open source software to build a digital library. The digital library has had visits from 78 countries, with the major share from India. The distribution of documents in the digital library is uneven. Past exam question papers make up the major part of the collection. The number of research papers, articles and rare documents is smaller. Originality/value – The study is the first of its type that tries to understand digital library design and development using DSpace open source software in a university environment, with a focus on analysing the distribution of items and measuring value through usage statistics from the Google Analytics service. The digital library model can be useful for designing similar systems.
Abstract:
Software systems are progressively being deployed in many facets of human life. The failure of such systems has an assorted impact on their customers. The fundamental aspect that supports a software system is a focus on quality. Reliability describes the ability of the system to function in a specified environment for a specified period of time and is used to measure quality objectively. Evaluating the reliability of a computing system involves computing both hardware and software reliability. Most earlier works focused on software reliability with no consideration for hardware parts, or vice versa. However, a complete estimate of the reliability of a computing system requires these two elements to be considered together, and thus demands a combined approach. The present work focuses on this and presents a model for evaluating the reliability of a computing system. The method involves identifying the failure data for hardware and software components and building a model based on it to predict reliability. To develop such a model, focus is given to systems based on Open Source Software, since its use is increasing and only a few studies have been reported on modeling and measuring the reliability of such products. The present work includes a thorough study of the role of Free and Open Source Software and an evaluation of reliability growth models, and presents an integrated model for predicting the reliability of a computational system. The developed model is compared with existing models and its usefulness is discussed.
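As a rough illustration of why hardware and software have to be considered together, the simplest textbook combination is sketched below. It is not the integrated model developed in the work, whose details the abstract does not give. The assumptions are that the system fails if either part fails (a series arrangement), that hardware and software failures are independent, and that hardware has a constant failure rate; the symbols $R_{\text{sys}}$, $R_{\text{hw}}$, $R_{\text{sw}}$ and $\lambda_{\text{hw}}$ are introduced here only for the example.

```latex
% Series system with independent hardware and software failures (assumed).
R_{\text{sys}}(t) = R_{\text{hw}}(t)\, R_{\text{sw}}(t)

% Constant hardware failure rate (assumed): exponential reliability.
R_{\text{hw}}(t) = e^{-\lambda_{\text{hw}} t}

% Illustrative numbers only: R_hw = 0.99 and R_sw = 0.95 over the same period.
R_{\text{sys}} = 0.99 \times 0.95 = 0.9405
```

Under these assumptions the combined figure is lower than either component's reliability, so a software-only or hardware-only estimate overstates the reliability of the whole system.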
Abstract:
The present work focuses on various facets of the open access movement for managing intellectual output that eventually becomes available and accessible in the public domain. The purpose of this paper is to document and share the real-time experience of managing and sharing the intellectual wealth of the academic community of Cochin University of Science & Technology using open source platforms. The paper explores different intellectual information resources in the current era and suggests a cost-effective strategy for implementing new open access tools and technology for the effective management of intellectual informatics.
Abstract:
This work is aimed at building an adaptable frame-based system for processing Dravidian languages. There are about 17 languages in this family, spoken by the people of South India. Karaka relations are one of the most important features of Indian languages. They are the semantico-syntactic relations between verbs and other related constituents in a sentence. The karaka relations and surface case endings are analyzed for meaning extraction. This approach is comparable with the broad class of case-based grammars. The efficiency of the approach is tested in two applications: machine translation, and a natural language interface (NLI) for information retrieval from databases. The system mainly consists of a morphological analyzer, a local word grouper, a parser for the source language and a sentence generator for the target language. The work's main contribution is an elegant and compact account of the relation between vibhakthi and karaka roles in Dravidian languages. The same basic mapping also explains simple and complex sentences in these languages, which suggests that the solution is not just ad hoc but has a deeper underlying unity. This methodology could be extended to other free word order languages. Since the frames designed for meaning representation are general, they are adaptable to other languages in this group and to other applications.
Abstract:
Sharing information with those who need it has always been an idealistic goal of networked environments. With the proliferation of computer networks, information is so widely distributed among systems that well-organized schemes for retrieval and discovery are imperative. This thesis investigates the problems associated with such schemes and suggests a software architecture aimed at achieving meaningful discovery. The use of information elements as a modelling base for efficient information discovery in distributed systems is demonstrated with the aid of a novel conceptual entity called the infotron. The investigations focus on distributed systems and their associated problems. The study was directed towards identifying a suitable software architecture and incorporating it in an environment where information growth is phenomenal, so that a proper mechanism for information discovery becomes feasible. An empirical study undertaken with the aid of an election database of geographically distributed constituencies provided the insights required. This is manifested in the Election Counting and Reporting Software (ECRS) system, a distributed software system designed to prepare reports on the election counting process for district administrators and to generate other miscellaneous statutory reports. Most distributed systems of the nature of ECRS possess a "fragile architecture" that makes them liable to collapse when minor faults occur. This is resolved with the proposed penta-tier architecture, which contains five different technologies at different tiers. The results of the experiment conducted, and their analysis, show that such an architecture helps keep the different components of the software intact and insulated from internal or external faults.
The architecture thus evolved needed a mechanism to support information processing and discovery. This necessitated the introduction of the novel concept of infotrons. Further, when a computing machine has to perform any meaningful extraction of information, it is guided by what is termed an infotron dictionary. The other empirical study was to find out which of the two prominent markup languages, HTML and XML, is better suited to the incorporation of infotrons. A comparative study of 200 documents in HTML and XML was undertaken, and the result was in favor of XML. The concepts of the infotron and the infotron dictionary were then applied to implement an Information Discovery System (IDS). IDS is essentially a system that starts with the infotron(s) supplied as clue(s) and produces the information required to satisfy the need of the information discoverer, using the documents at its disposal as the information space. The various components of the system and their interactions follow the penta-tier architectural model and can therefore be considered fault-tolerant. IDS is generic in nature, and its characteristics and specifications were drawn up accordingly. Many subsystems interact with the multiple infotron dictionaries maintained in the system. To demonstrate the working of the IDS, and to discover information without modifying a typical Library Information System (LIS), an Information Discovery in Library Information System (IDLIS) application was developed. IDLIS is essentially a wrapper for the LIS, which maintains all the databases of the library. The purpose was to demonstrate that the functionality of a legacy system could be enhanced by augmenting it with IDS, leading to an information discovery service. IDLIS demonstrates IDS in action.
IDLIS proves that any legacy system could be augmented effectively with IDS to provide the additional functionality of an information discovery service. Possible applications of IDS and the scope for further research in the field are covered.
Abstract:
This is a Named Entity Based Question Answering System for the Malayalam language. Although a vast amount of information is available today in digital form, no effective mechanism exists to give humans convenient access to it. Information Retrieval and Question Answering systems are the two mechanisms now available for information access. Information Retrieval systems typically return a long list of documents in response to a user’s query, which the user must skim to determine whether they contain an answer. A Question Answering System, by contrast, allows the user to state his/her information need as a natural language question and returns the most appropriate answer as a word, a sentence or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents, which is used in finding the answer to the question. Question Classification extracts useful information from the question to determine its type and the way in which it is to be answered. Various Machine Learning methods are used to tag the documents. A Rule-Based Approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India, with official language status in the state of Kerala. It is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language with relatively free word order. Malayalam also has a productive morphology that allows the creation of complex words, which are often highly ambiguous. Document tagging tools such as a Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger and Compound Word Splitter were developed as part of this research work. No such tools were previously available for Malayalam.
Finite State Transducers, High Order Conditional Random Fields, Artificial Immunity System Principles and Support Vector Machines are the techniques used in the design of these document preprocessing tools. This research work describes how Named Entities are used to represent the documents. Single-sentence questions are used to test the system. The overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions. The coverage of non-factoid questions can be increased, and the system can be extended to open domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as future enhancements.
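The pairing of rule-based question classification with named entity tags can be sketched as follows. This is a minimal illustration, not the system described: it uses English question words instead of Malayalam, and the rule table `RULES`, the tag names, and the helper `candidate_answers` are hypothetical.

```python
import re

# Hypothetical rules: a question-word pattern mapped to the expected entity type.
RULES = [
    (re.compile(r"^\s*who\b", re.IGNORECASE), "PERSON"),
    (re.compile(r"^\s*where\b", re.IGNORECASE), "LOCATION"),
    (re.compile(r"^\s*when\b", re.IGNORECASE), "DATE"),
    (re.compile(r"^\s*how\s+(many|much)\b", re.IGNORECASE), "NUMBER"),
]


def classify_question(question):
    """Return the expected answer type using the first rule that matches."""
    for pattern, label in RULES:
        if pattern.search(question):
            return label
    return "OTHER"


def candidate_answers(question, tagged_entities):
    """Keep entities whose tag equals the answer type the question asks for.

    tagged_entities is a list of (text, tag) pairs, standing in for the
    output of a named entity tagger run over the documents.
    """
    wanted = classify_question(question)
    return [text for text, tag in tagged_entities if tag == wanted]
```

For example, with entities `[("Kochi", "LOCATION"), ("2011", "DATE")]`, the question "Where is the university?" is classified as `LOCATION` and only "Kochi" is kept as a candidate. A real system would also rank candidates by their context in the document, which this sketch leaves out.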
Abstract:
The open access movement and the open source software movement play an important role in the creation, management and dissemination of knowledge. Scholarly communication and publishing are increasingly taking place in the electronic environment. With a growing proportion of the scholarly record now existing only in digital format, serious issues regarding access and preservation are being raised that are central to future scholarship. Institutional Repositories provide access to past, present and future scholarly literature and research documentation; ensure its preservation; assist users in discovery and use; and offer educational programs to enable users to develop lifelong literacy. This paper explores how the IR of Cochin University of Science & Technology supports the scientific community in knowledge creation, knowledge management and knowledge dissemination.