3 resultados para Language Resources

em Bulgarian Digital Mathematics Library at IMI-BAS


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article briefly reviews multilingual language resources for Bulgarian, developed in the frame of some international projects: the first-ever annotated Bulgarian MTE digital lexical resources, Bulgarian-Polish corpus, Bulgarian-Slovak parallel and aligned corpus, and Bulgarian-Polish-Lithuanian corpus. These resources are valuable multilingual dataset for language engineering research and development for Bulgarian language. The multilingual corpora are large repositories of language data with an important role in preserving and supporting the world's cultural heritage, because the natural language is an outstanding part of the human cultural values and collective memory, and a bridge between cultures.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The paper describes three software packages - the main components of a software system for processing and web-presentation of Bulgarian language resources – parallel corpora and bilingual dictionaries. The author briefly presents current versions of the core components “Dictionary” and “Corpus” as well as the recently developed component “Connection” that links both “Dictionary” and “Corpus”. The components main functionalities are described as well. Some examples of the usage of the system’s web-applications are included.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Universal Networking Language (UNL) is an interlingua designed to be the base of several natural language processing systems aiming to support multilinguality in internet. One of the main components of the language is the dictionary of Universal Words (UWs), which links the vocabularies of the different languages involved in the project. As any NLP system, coverage and accuracy in its lexical resources are crucial for the development of the system. In this paper, the authors describes how a large coverage UWs dictionary was automatically created, based on an existent and well known resource like the English WordNet. Other aspects like implementation details and the evaluation of the final UW set are also depicted.