7 results for Processing of Manuscripts
in the Bulgarian Digital Mathematics Library at IMI-BAS
Abstract:
The paper describes the history of manuscript digitization at the National Library of the Czech Republic, as well as other issues concerning the processing of manuscripts. The main consequence of massive digitization and record-level and/or full-text processing is a paradigm shift leading toward digital history.
Abstract:
This article presents the principal results of the doctoral thesis “Recognition of neume notation in historical documents” by Lasko Laskov (Institute of Mathematics and Informatics at the Bulgarian Academy of Sciences), successfully defended before the Specialized Academic Council for Informatics and Mathematical Modelling on 7 June 2010.
Abstract:
Mixed-content miscellanies (very frequent in the Byzantine and mediaeval Slavic written heritage) are usually defined as collections of works with non-occupational, non-liturgical application, whose texts are selected and arranged according to no identifiable principle. They are a “readable” type of miscellany, compiled mainly on the basis of the cognitive interests of compilers and readers. Just like the occupational ones, they appeared to satisfy public needs, but were intended for individual use. My textological comparison showed that mixed-content miscellanies often give evidence of a stable content: some of them include the same constituent works in the same order, even though the manuscripts have no obvious genetic relationship. These correspondences were sufficiently numerous and distinctive that they could not be merely fortuitous, and the only sensible interpretation was that even when the operative organizational principle was not based on independently identifiable criteria, such as the church calendar, liturgical function, or thematic considerations, mixed-content miscellanies (or at least portions of their contents) nonetheless fell into types. In this respect, the apparently free selection and arrangement of texts in mixed-content miscellanies turns out to be illusory. The problem was that, as the corpus of manuscripts that my colleagues and I needed to examine grew, our ability to keep track of the structure of each one, and to identify structural correspondences among manuscripts within the corpus, diminished. So, at the end of 1993 I addressed a letter to Prof. David Birnbaum (University of Pittsburgh, PA) asking for his help with the problem. He and my colleague Andrey Boyadzhiev (Sofia University) pointed out that computers are well suited to recording, processing, and analyzing large amounts of data, and to identifying patterns within the data, and proposed that we try to develop a computer system for the description of manuscripts, for their analysis and, of course, for searching the data. Our collaboration in this project is now ten years old, and our talk today presents an overview of that collaboration.
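The kind of structural correspondence described in this abstract can be made concrete with a minimal sketch. Assuming each manuscript is catalogued as an ordered list of work identifiers (all names below are hypothetical illustrations, not taken from the actual system), contiguous runs of works shared by two manuscripts in the same order can be found with a standard sequence matcher:

    # A minimal sketch of structural comparison between manuscripts,
    # assuming each is catalogued as an ordered list of work identifiers.
    from difflib import SequenceMatcher

    def shared_runs(ms_a, ms_b, min_len=2):
        """Return contiguous runs of works the two manuscripts share in order."""
        matcher = SequenceMatcher(a=ms_a, b=ms_b, autojunk=False)
        return [ms_a[m.a:m.a + m.size]
                for m in matcher.get_matching_blocks()
                if m.size >= min_len]

    # Two miscellanies with no known genetic relationship may still
    # share a stable block of constituent works (hypothetical titles):
    ms1 = ["VitaConstantini", "SermonOnLaw", "Hexaemeron", "QuestionsAnswers"]
    ms2 = ["Apocalypsis", "SermonOnLaw", "Hexaemeron", "QuestionsAnswers"]
    print(shared_runs(ms1, ms2))
    # [['SermonOnLaw', 'Hexaemeron', 'QuestionsAnswers']]

Repeated over a whole corpus, runs like this are exactly the recurring blocks whose frequency rules out coincidence and lets the miscellanies be grouped into types.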
Abstract:
A technology for the recording, storage, and processing of texts, based on the creation of integer index cycles, is discussed. Algorithms for exact-match search and for similarity search based on natural-language queries are considered. The software implementing the proposed approaches is described, and examples of electronic archives with intelligent-search capabilities are presented.
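The abstract does not specify how its integer index cycles are built, so the sketch below only illustrates the two search modes it mentions, exact-match lookup and similarity search on a natural-language query, using a plain inverted index; all names are illustrative, not the paper's:

    # A generic text archive with the two search modes named in the abstract.
    from collections import defaultdict

    class Archive:
        def __init__(self):
            self.docs = []                  # stored document texts
            self.index = defaultdict(set)   # term -> ids of docs containing it

        def add(self, text):
            doc_id = len(self.docs)
            self.docs.append(text)
            for term in set(text.lower().split()):
                self.index[term].add(doc_id)

        def exact_match(self, phrase):
            # Documents containing the query string verbatim.
            return [i for i, doc in enumerate(self.docs) if phrase in doc]

        def similar(self, query):
            # Documents ranked by how many query terms they share.
            scores = defaultdict(int)
            for term in set(query.lower().split()):
                for doc_id in self.index[term]:
                    scores[doc_id] += 1
            return sorted(scores, key=scores.get, reverse=True)

An "intelligent" archive in the paper's sense would refine the similarity ranking, but the split between verbatim lookup and ranked term overlap is the core of both search modes.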
Abstract:
The article describes the structure of an ontology model for the optimization of a sequential program. The components of an intelligent modeling system for program optimization are described, and the functions of the intelligent modeling system are defined.
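The abstract gives no detail of the ontology's structure; the sketch below is only a guess at what such an ontology might encode, representing optimizations as concepts with preconditions and effects, and every class and instance name here is hypothetical:

    # A toy encoding of one ontology concept for program optimization.
    from dataclasses import dataclass, field

    @dataclass
    class Optimization:
        name: str
        applies_to: str                 # program construct being transformed
        preconditions: list = field(default_factory=list)
        effects: list = field(default_factory=list)

    loop_unrolling = Optimization(
        name="LoopUnrolling",
        applies_to="CountedLoop",
        preconditions=["trip count known", "no loop-carried early exit"],
        effects=["fewer branch instructions", "larger code size"],
    )

An intelligent modeling system could then reason over such concepts, checking preconditions against a program model before suggesting a transformation.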
Abstract:
After many years of scholarly study, manuscript collections continue to be an important source of novel information for scholars, concerning both the history of earlier times and the development of cultural documentation over the centuries. The D-SCRIBE project aims to support and facilitate current and future efforts in manuscript digitization and processing. It strives toward the creation of a comprehensive software product that can assist content holders in turning an archive of manuscripts into a digital collection using automated methods. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts. We propose a novel digital image binarization scheme for low-quality historical documents, allowing further content exploitation in an efficient way. Based on the existence of closed cavity regions in the majority of characters and character ligatures in these scripts, we propose a novel, segmentation-free, fast, and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures.
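The paper's binarization scheme for degraded manuscripts is not given in the abstract; as a baseline illustration of what binarization does (separating ink from background so that cavity regions can be traced), here is Otsu's classic global threshold in plain NumPy, assuming an 8-bit grayscale page image:

    # Baseline global binarization; real historical documents usually need
    # a locally adaptive scheme, which is what the paper proposes.
    import numpy as np

    def otsu_binarize(gray):
        """Return a boolean ink mask for an 8-bit grayscale image: True where
        the pixel is darker than the threshold that maximizes the
        between-class variance."""
        hist = np.bincount(gray.ravel(), minlength=256).astype(float)
        probs = hist / hist.sum()
        omega = np.cumsum(probs)                    # background-class probability
        mu = np.cumsum(probs * np.arange(256))      # cumulative intensity mean
        mu_total = mu[-1]
        with np.errstate(divide="ignore", invalid="ignore"):
            sigma_b = (mu_total * omega - mu) ** 2 / (omega * (1.0 - omega))
        threshold = np.nanargmax(sigma_b)           # ignore degenerate bins
        return gray <= threshold

A single global threshold fails on stained or unevenly lit parchment, which is precisely why a dedicated scheme for low-quality historical documents is needed before cavity tracing can work reliably.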
Abstract:
* The following text was originally published in the Proceedings of the Language Resources and Evaluation Conference held in Lisbon, Portugal, in 2004, under the title "Towards Intelligent Written Cultural Heritage Processing - Lexical Processing". I present here a revised version of the aforementioned paper and add the latest efforts of the Center for Computational Linguistics in Prague in the field under discussion.