Biblioteca Digital

6 resultados para Correction of texts

em Bulgarian Digital Mathematics Library at IMI-BAS

On Logical Correction of Neural Network Algorithms for Pattern Recognition

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The paper is devoted to the description of hybrid pattern recognition method developed by research groups from Russia, Armenia and Spain. The method is based upon logical correction over the set of conventional neural networks. Output matrices of neural networks are processed according to the potentiality principle which allows increasing of recognition reliability.

Veja mais

Experiments in Detection and Correction of Russian Malapropisms by Means of the WEB

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Malapropism is a semantic error that is hardly detectable because it usually retains syntactical links between words in the sentence but replaces one content word by a similar word with quite different meaning. A method of automatic detection of malapropisms is described, based on Web statistics and a specially defined Semantic Compatibility Index (SCI). For correction of the detected errors, special dictionaries and heuristic rules are proposed, which retains only a few highly SCI-ranked correction candidates for the user’s selection. Experiments on Web-assisted detection and correction of Russian malapropisms are reported, demonstrating efficacy of the described method.

Veja mais

Paronyms for Accelerated Correction of Semantic Errors

Relevância:

100.00% 100.00%

Publicador:

Resumo:

* Work done under partial support of Mexican Government (CONACyT, SNI), IPN (CGPI, COFAA) and Korean Government (KIPA Professorship for Visiting Faculty Positions). The second author is currently on Sabbatical leave at Chung-Ang University.

Veja mais

Classification of Texts' Authorship Using a Regression Model on Compressed Data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

2010 Mathematics Subject Classification: 68T50,62H30,62J05.

Veja mais

Computer Processing of Medieval Slavic Sources in the Institute of Literature at BAS Repertorium Project (1994–2004)

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Mixed-content miscellanies (very frequent in the Byzantine and mediaeval Slavic written heritage) are usually defined as collections of works with non-occupational, non-liturgical application, and texts in them are selected and arranged according to no identifiable principle. It is a “readable” type of miscellanies which were compiled mainly on the basis of the cognitive interests of compilers and readers. Just like the occupational ones, they also appeared to satisfy public needs but were intended for individual usage. My textological comparison had shown that mixed- content miscellanies often showed evidence of a stable content – some of them include the same constituent works in the same order, regardless that the manuscripts had no obvious genetic relationship. These correspondences were sufficiently numerous and distinctive that they could not be merely fortuitous, and the only sensible interpretation was that even when the operative organizational principle was not based on independently identifiable criteria, such as the church calendar, liturgical function, or thematic considerations, mixed-content miscellanies (or, at least, portions of their contents) nonetheless fell into types. In this respect, the apparent free selection and arrangement of texts in mixed-content miscellanies turns out to be illusory. The problem was – as the corpus of manuscripts that I and my colleagues needed to examine grew – our ability to keep track of the structure of each one, and to identify structural correspondences among manuscripts within the corpus, diminished. So, at the end of 1993 I addressed a letter to Prof. David Birnbaum (University of Pittsburgh, PA) with a request to help me to solve the problem. He and my colleague Andrey Boyadzhiev (Sofia University) pointed out to me that computers are well suited to recording, processing, and analyzing large amounts of data, and to identifying patterns within the data, and their proposal was that we try to develop a computer system for description of manuscripts, for their analysis and of course, for searching the data. Our collaboration in this project is now ten years old, and our talk today presents an overview of that collaboration.

Veja mais

Topic Segmentation: How Much Can We Do by Counting Words and Sequences of Words

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we present an innovative topic segmentation system based on a new informative similarity measure that takes into account word co-occurrence in order to avoid the accessibility to existing linguistic resources such as electronic dictionaries or lexico-semantic databases such as thesauri or ontology. Topic segmentation is the task of breaking documents into topically coherent multi-paragraph subparts. Topic segmentation has extensively been used in information retrieval and text summarization. In particular, our architecture proposes a language-independent topic segmentation system that solves three main problems evidenced by previous research: systems based uniquely on lexical repetition that show reliability problems, systems based on lexical cohesion using existing linguistic resources that are usually available only for dominating languages and as a consequence do not apply to less favored languages and finally systems that need previously existing harvesting training data. For that purpose, we only use statistics on words and sequences of words based on a set of texts. This solution provides a flexible solution that may narrow the gap between dominating languages and less favored languages thus allowing equivalent access to information.

Veja mais

6 resultados para Correction of texts

em Bulgarian Digital Mathematics Library at IMI-BAS

Filtro por publicador

On Logical Correction of Neural Network Algorithms for Pattern Recognition

Experiments in Detection and Correction of Russian Malapropisms by Means of the WEB

Paronyms for Accelerated Correction of Semantic Errors

Classification of Texts' Authorship Using a Regression Model on Compressed Data

Computer Processing of Medieval Slavic Sources in the Institute of Literature at BAS Repertorium Project (1994–2004)

Topic Segmentation: How Much Can We Do by Counting Words and Sequences of Words