4 resultados para Content processing
em Bulgarian Digital Mathematics Library at IMI-BAS
Resumo:
During the MEMORIAL project time an international consortium has developed a software solution called DDW (Digital Document Workbench). It provides a set of tools to support the process of digitisation of documents from the scanning up to the retrievable presentation of the content. The attention is focused to machine typed archival documents. One of the important features is the evaluation of quality in each step of the process. The workbench consists of automatic parts as well as of parts which request human activity. The measurable improvement of 20% shows the approach is successful.
Resumo:
After many years of scholar study, manuscript collections continue to be an important source of novel information for scholars, concerning both the history of earlier times as well as the development of cultural documentation over the centuries. D-SCRIBE project aims to support and facilitate current and future efforts in manuscript digitization and processing. It strives toward the creation of a comprehensive software product, which can assist the content holders in turning an archive of manuscripts into a digital collection using automated methods. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts. We propose a novel digital image binarization scheme for low quality historical documents allowing further content exploitation in an efficient way. Based on the existence of closed cavity regions in the majority of characters and character ligatures in these scripts, we propose a novel, segmentation-free, fast and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures.
Resumo:
Mixed-content miscellanies (very frequent in the Byzantine and mediaeval Slavic written heritage) are usually defined as collections of works with non-occupational, non-liturgical application, and texts in them are selected and arranged according to no identifiable principle. It is a “readable” type of miscellanies which were compiled mainly on the basis of the cognitive interests of compilers and readers. Just like the occupational ones, they also appeared to satisfy public needs but were intended for individual usage. My textological comparison had shown that mixed- content miscellanies often showed evidence of a stable content – some of them include the same constituent works in the same order, regardless that the manuscripts had no obvious genetic relationship. These correspondences were sufficiently numerous and distinctive that they could not be merely fortuitous, and the only sensible interpretation was that even when the operative organizational principle was not based on independently identifiable criteria, such as the church calendar, liturgical function, or thematic considerations, mixed-content miscellanies (or, at least, portions of their contents) nonetheless fell into types. In this respect, the apparent free selection and arrangement of texts in mixed-content miscellanies turns out to be illusory. The problem was – as the corpus of manuscripts that I and my colleagues needed to examine grew – our ability to keep track of the structure of each one, and to identify structural correspondences among manuscripts within the corpus, diminished. So, at the end of 1993 I addressed a letter to Prof. David Birnbaum (University of Pittsburgh, PA) with a request to help me to solve the problem. He and my colleague Andrey Boyadzhiev (Sofia University) pointed out to me that computers are well suited to recording, processing, and analyzing large amounts of data, and to identifying patterns within the data, and their proposal was that we try to develop a computer system for description of manuscripts, for their analysis and of course, for searching the data. Our collaboration in this project is now ten years old, and our talk today presents an overview of that collaboration.
Resumo:
A solar power satellite is paid attention to as a clean, inexhaustible large- scale base-load power supply. The following technology related to beam control is used: A pilot signal is sent from the power receiving site and after direction of arrival estimation the beam is directed back to the earth by same direction. A novel direction-finding algorithm based on linear prediction technique for exploiting cyclostationary statistical information (spatial and temporal) is explored. Many modulated communication signals exhibit a cyclostationarity (or periodic correlation) property, corresponding to the underlying periodicity arising from carrier frequencies or baud rates. The problem was solved by using both cyclic second-order statistics and cyclic higher-order statistics. By evaluating the corresponding cyclic statistics of the received data at certain cycle frequencies, we can extract the cyclic correlations of only signals with the same cycle frequency and null out the cyclic correlations of stationary additive noise and all other co-channel interferences with different cycle frequencies. Thus, the signal detection capability can be significantly improved. The proposed algorithms employ cyclic higher-order statistics of the array output and suppress additive Gaussian noise of unknown spectral content, even when the noise shares common cycle frequencies with the non-Gaussian signals of interest. The proposed method completely exploits temporal information (multiple lag ), and also can correctly estimate direction of arrival of desired signals by suppressing undesired signals. Our approach was generalized over direction of arrival estimation of cyclostationary coherent signals. In this paper, we propose a new approach for exploiting cyclostationarity that seems to be more advanced in comparison with the other existing direction finding algorithms.