5 results for vignette in-text
in Bulgarian Digital Mathematics Library at IMI-BAS
Abstract:
The activities of the Institute of Information Technologies in the area of automatic text processing are outlined. Major problems related to different steps of processing are pointed out together with the shortcomings of the existing solutions.
Abstract:
2000 Mathematics Subject Classification: 62P99, 68T50
Abstract:
This work has been partially supported by Grant No. DO 02-275, 16.12.2008, Bulgarian NSF, Ministry of Education and Science.
Abstract:
Search engines sometimes search the full text of documents or web pages, and sometimes only selected parts of the documents, e.g. their titles. Full-text search may consume a lot of computing resources and time. It may be possible to save resources by searching document titles only, assuming that the title of a document provides a concise representation of its content. We tested this assumption using the Google search engine. We ran search queries defined by users, distinguishing between two types of queries/users: queries from users who are familiar with the search area, and queries from users who are not. We found that title searches provide similar, and sometimes even (slightly) better, results compared to full-text searches. These results hold for both types of queries/users. Moreover, we found an advantage for title search in unfamiliar areas, because the general terms used in queries about unfamiliar areas match better with the general terms that tend to appear in document titles.
Abstract:
This paper presents an algorithmic solution for managing related text objects. It integrates algorithms for extracting the objects from paper or electronic sources and for storing and processing them in a relational database. The developed algorithms for data extraction and data analysis make it possible to find specific features of, and relations between, the text objects in the database. The algorithmic solution is applied to data from the field of phytopharmacy in Bulgaria. It can be used as a tool and methodology for other subject areas where there are complex relationships between text objects.