3 results for Co-occurrence Relation

in Bulgarian Digital Mathematics Library at IMI-BAS


Relevance: 100.00%

Abstract:

Word sense disambiguation, the process of identifying the meaning of a word in a sentence when the word has multiple meanings, is a critical problem in machine translation. Selecting the correct meaning of a word in a sentence is generally very difficult, especially when the syntactic difference between the source and target languages is large, as in English-Korean machine translation. To achieve high accuracy in noun sense selection in machine translation, we introduced a statistical method based on the co-occurrence relations of words in sentences and applied it to the English-Korean machine translator RyongNamSan. ACM Computing Classification System (1998): I.2.7.
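The abstract does not give the details of the statistical method, so the following is only a minimal sketch of the general idea of choosing a noun sense from co-occurrence statistics. The sense labels, the toy counts in `COOCCURRENCE`, and the function `select_sense` are all hypothetical and are not taken from the paper or from RyongNamSan; a real system would estimate the counts from a corpus.

```python
from collections import Counter

# Hypothetical toy co-occurrence counts: sense -> {context word: frequency}.
# The numbers are invented for illustration, not estimated from any corpus.
COOCCURRENCE = {
    "bank/finance": Counter({"money": 8, "loan": 5, "account": 6}),
    "bank/river": Counter({"river": 7, "water": 4, "fishing": 3}),
}


def select_sense(context_words, cooccurrence=COOCCURRENCE):
    """Score each candidate sense by summing its co-occurrence counts
    with the words found in the sentence, then pick the highest score."""
    scores = {
        sense: sum(counts[word] for word in context_words)  # missing words count as 0
        for sense, counts in cooccurrence.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


best, scores = select_sense(["he", "deposited", "money", "account"])
print(best, scores)
```

Summing raw counts is the simplest possible scoring rule; a normalized or probabilistic score could be substituted without changing the overall shape of the sketch.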

Relevance: 80.00%

Abstract:

In this paper, we present an innovative topic segmentation system based on a new informative similarity measure that takes word co-occurrence into account in order to avoid reliance on existing linguistic resources such as electronic dictionaries or lexico-semantic databases such as thesauri or ontologies. Topic segmentation is the task of breaking documents into topically coherent multi-paragraph subparts. It has been used extensively in information retrieval and text summarization. In particular, our architecture proposes a language-independent topic segmentation system that solves three main problems evidenced by previous research: systems based solely on lexical repetition, which show reliability problems; systems based on lexical cohesion using existing linguistic resources, which are usually available only for dominant languages and consequently do not apply to less favored languages; and systems that require previously harvested training data. For that purpose, we use only statistics on words and sequences of words computed from a set of texts. This approach provides a flexible solution that may narrow the gap between dominant and less favored languages, thus allowing equivalent access to information.
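The abstract does not define its similarity measure, so the sketch below substitutes a plain cosine similarity over word counts to illustrate the general mechanism of resource-free topic segmentation: compare adjacent text units using word statistics only, and place a boundary where similarity drops. The function names and the `threshold` value are assumptions made for illustration and are not the paper's method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def segment(sentences, threshold=0.1):
    """Return the indices at which a new topic segment is assumed to start.

    A boundary is placed before sentence i + 1 when its similarity to
    sentence i falls below the (hypothetical) threshold.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    return [
        i + 1
        for i in range(len(bags) - 1)
        if cosine(bags[i], bags[i + 1]) < threshold
    ]


text = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "stock markets fell sharply",
    "markets and stock prices recovered",
]
print(segment(text))
```

Note that this stand-in relies on lexical repetition alone, which is exactly the weakness the abstract attributes to earlier systems; it is shown only because it needs no dictionaries, thesauri, or training data.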

Relevance: 30.00%

Abstract:

This paper presents a way to quantitatively measure the information security of objects that are exposed to information attacks and processed with compression methods. A coefficient of information security is proposed, which reflects the influence of the level of compression obtained after applying compression methods to objects and of the time required by an attack to gain access to the corresponding object. The groups of methods with the highest and, respectively, the lowest values of the coefficient of information security are determined across all groups of methods in relation to all groups of attacks. Assessments and conclusions for future investigations are proposed.