16 results for Automatic extraction of lexical information

in Bulgarian Digital Mathematics Library at IMI-BAS


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Universal Networking Language (UNL) is an interlingua designed to be the base of several natural language processing systems that aim to support multilinguality on the Internet. One of the main components of the language is the dictionary of Universal Words (UWs), which links the vocabularies of the different languages involved in the project. As in any NLP system, the coverage and accuracy of its lexical resources are crucial for the development of the system. In this paper, the authors describe how a large-coverage UW dictionary was automatically created from an existing and well-known resource, the English WordNet. Implementation details and the evaluation of the final UW set are also described.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal differs from previous research in that it tries to combine the best of two different methodologies, namely semantic space models and information extraction models. It can be applied to extract close semantic relations, it limits the search space, and it is unsupervised.
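As a rough illustration of what evaluating similarity in a semantic space can look like, the sketch below builds simple context-count vectors and compares them with cosine similarity. It is not the authors' method: the window size, the raw counts, and the function names `context_vector` and `cosine` are all assumptions made for this example.

```python
import math
from collections import Counter


def context_vector(sentences, target, window=2):
    """Count the words that appear within `window` tokens of `target`.

    `sentences` is a list of token lists. The window size and the use of
    raw counts are assumptions for illustration only.
    """
    vec = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok == target:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                vec.update(left + right)
    return vec


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus: "cat" and "dog" share exactly the same contexts.
corpus = [["the", "cat", "sleeps"], ["the", "dog", "sleeps"]]
similarity = cosine(context_vector(corpus, "cat"), context_vector(corpus, "dog"))
```

Words that occur in similar contexts receive a score close to 1, and words with no shared context receive 0.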

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Development engineers work with languages intended for the design of software or hardware systems, while test engineers use languages suited to verification, analysis of system properties, and testing. Automatic interfaces between languages of these kinds are necessary in order to avoid ambiguous interpretation of system model specifications and inconsistencies in the initial requirements for system development. This paper proposes an algorithm for the automatic translation of MSC (Message Sequence Chart) diagrams compliant with the MSC’2000 standard into Petri Nets. Each input MSC diagram is translated into a Petri Net (PN), and the resulting PNs are sequentially composed in order to synthesize the whole system in one final combined PN. The principle of this composition is defined through conditions, the basic element of the MSC language. During translation, a reference table is built to maintain consistent coordination between the descriptions of the input system in the MSC language and in PN format. This table is necessary to present the results of analysis and verification on the PN in the MSC diagram format, which is suitable for the development engineer. The proof of the algorithm's correctness is based on the process algebra ACP. The most significant feature of the algorithm is the way it handles conditions. The direction for future work is the development of an integrated, partially or completely automated technological process that will allow a system to be designed, tested, and its various properties verified within a single framework.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes the methodology followed to automatically generate titles for a corpus of questions belonging to sociological opinion polls. Titles for questions have a twofold function: (1) they are the input of user searches and (2) they inform about the whole content of the question and its possible answer options. Thus, the generation of titles can be considered a case of automatic summarization. However, the fact that summarization had to be performed over very short texts, together with the aforementioned quality conditions imposed on the newly generated titles, led the authors to follow knowledge-rich and domain-dependent summarization strategies rather than the more common extractive techniques.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article is dedicated to the vital problem of creating GIS systems for the monitoring, prognostication, and control of technogenic natural catastrophes. Reducing risks, protecting economic objects, and preventing human casualties caused by the dynamics of avalanche centers depend on the effectiveness of the procedures used to forecast avalanche danger. The article develops the structure of the information input of a prognostication subsystem and proposes a technology for the complex forecasting of avalanche-prone situations.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

False friends are pairs of words in two languages that are perceived as similar but have different meanings. We present an improved algorithm for acquiring false friends from a sentence-level aligned parallel corpus, based on statistical observations of word occurrences and co-occurrences in the parallel sentences. The results are compared with an entirely semantic measure of cross-lingual similarity between words, which uses the Web as a corpus by analyzing the words’ local contexts extracted from the text snippets returned by Google searches. The statistical and semantic measures are then combined into an improved algorithm for identifying false friends that achieves results almost twice as good as those of previously known algorithms. The evaluation is performed on identifying cognates between Bulgarian and Russian, but the proposed methods could be adapted to other language pairs for which parallel corpora and bilingual glossaries are available.
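To make the idea of occurrence and co-occurrence statistics over aligned sentences concrete, here is a minimal sketch. The abstract does not give the paper's formulas, so the Dice coefficient used here is a stand-in chosen for illustration, and the function name `cooccurrence_score` and the toy tokens are hypothetical.

```python
def cooccurrence_score(aligned_pairs, word_a, word_b):
    """Dice coefficient of two words over sentence-aligned pairs.

    `aligned_pairs` is a list of (tokens_lang_a, tokens_lang_b) tuples.
    The Dice coefficient is an assumption for illustration, not the
    measure used in the paper.
    """
    count_a = count_b = count_both = 0
    for sent_a, sent_b in aligned_pairs:
        has_a = word_a in sent_a
        has_b = word_b in sent_b
        count_a += has_a              # sentence pairs containing word_a
        count_b += has_b              # sentence pairs containing word_b
        count_both += has_a and has_b  # pairs containing both words
    if count_a + count_b == 0:
        return 0.0
    return 2.0 * count_both / (count_a + count_b)


# Toy aligned corpus with placeholder tokens.
pairs = [
    (["x", "a"], ["y", "b"]),
    (["a"], ["c"]),
    (["z"], ["b"]),
]
score = cooccurrence_score(pairs, "a", "b")
```

Under this sketch, a pair of similar-looking words that rarely appear together in aligned sentences gets a low score, which makes it a candidate false friend; a high score suggests the words translate each other.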

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This report examines important issues concerning the different ways in which compression methods affect the information security of file objects under information attacks. Accordingly, the report analyzes the three-way relationships that may exist among a selected set of attacks, methods, and objects. A methodology for evaluating information security is proposed, and a coefficient of information security is defined. With respect to this coefficient, and using different criteria and methods for the evaluation and selection of alternatives, the lowest-risk compression methods are selected.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The approaches to the semantic-level analysis of various information resources pertinent to user requirements are determined by the thesauri of the corresponding subject domains. Algorithms for the formation and normalization of a multilingual thesaurus, as well as methods for comparing such thesauri, are given.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

* This work was financially supported by RFBF-04-01-00858.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A classification of the types of information redundancy in symbolic and graphical representations of information is given. A general classification of compression technologies for graphical information is also presented. Design principles, tasks, and implementation variants for a semantic compression technology for graphical information are proposed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

AMS Subj. Classification: 68U05, 68P30

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present the Hungarian National Scientific Bibliography project: the MTMT. We argue that presently available commercial systems cannot be used as a comprehensive national bibliometric tool. The new database was created from existing databases of the Hungarian Academy of Sciences, but is expected to be re-engineered in the future. The data curation model includes harvesting, the work of expert bibliographers, and author feedback. MTMT will work together with the other services in the web of scientific information, using standard protocols and formats, and act as a hub. It will present the scientific output of Hungary together with the repositories containing the full text, wherever available. The database will be open, but not freely harvestable, and only for non-commercial use.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The purpose of this article is to evaluate the effectiveness of learning by doing as a practical tool for managing the training of students in "Library Management" at the ULSIT, Sofia, Bulgaria. The case studied is the creation of the project Data Base “Bulgarian Revival Towns” (CD), financed by the Bulgarian Ministry of Education, Youth and Science (1/D002/144/13.10.2011) and headed by Prof. DSc Ivanka Yankova, which aims to create a new information resource on the towns that will serve the needs of scientific research. By helping to generate an array in the database through searching for, selecting, and digitizing documents from this period, students get an opportunity to expand their skills in working effectively in a team, in finding interdisciplinary and causal connections between the studied items, objects, and subjects, and above all to gain practical experience in the field of digitization, information behavior, strategies for information search, etc. This method achieves good results in the accumulation of sustainable knowledge and generates motivation to work in the library and information professions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62H30, 62M10, 62M20, 62P20, 94A13.