2 results for Cross-lingual Link Discovery
in Bulgarian Digital Mathematics Library at IMI-BAS
Abstract:
False friends are pairs of words in two languages that are perceived as similar but have different meanings. We present an improved algorithm for acquiring false friends from a sentence-level aligned parallel corpus, based on statistical observations of word occurrences and co-occurrences in the parallel sentences. The results are compared with an entirely semantic measure of cross-lingual similarity between words, which uses the Web as a corpus by analyzing the words’ local contexts extracted from the text snippets returned by Google searches. The statistical and semantic measures are then combined into an improved algorithm for identifying false friends that achieves results almost twice as good as those of previously known algorithms. The evaluation is performed on identifying cognates between Bulgarian and Russian, but the proposed methods could be adapted to other language pairs for which parallel corpora and bilingual glossaries are available.
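The statistical idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formula, so the Dice-style score below is a stand-in chosen for illustration only; the function name `cooccurrence_score` and the toy tokens are hypothetical. The intuition is that a pair of similar-looking words that rarely appear together in aligned sentences is a candidate false friend.

```python
def cooccurrence_score(pairs, src_word, tgt_word):
    """Dice-style co-occurrence score in [0, 1] over aligned sentence pairs.

    Illustrative only: NOT the formula from the paper. `pairs` is an
    iterable of (source_tokens, target_tokens), one per aligned sentence.
    """
    n_src = n_tgt = n_both = 0
    for src_tokens, tgt_tokens in pairs:
        in_src = src_word in src_tokens
        in_tgt = tgt_word in tgt_tokens
        n_src += in_src               # sentences containing the source word
        n_tgt += in_tgt               # sentences containing the target word
        n_both += in_src and in_tgt   # aligned pairs containing both words
    if n_src + n_tgt == 0:
        return 0.0                    # neither word occurs in the corpus
    return 2 * n_both / (n_src + n_tgt)


# Toy aligned corpus with placeholder tokens (not real Bulgarian or Russian).
toy_pairs = [
    ({"w1", "a"}, {"w2", "b"}),
    ({"w1", "c"}, {"d"}),
    ({"e"}, {"w2", "f"}),
    ({"w1"}, {"w2"}),
]

score = cooccurrence_score(toy_pairs, "w1", "w2")
```

Under this assumption, a low score for an orthographically similar pair would point toward a false friend and a high score toward a likely translation pair. Any decision threshold would have to be tuned on labeled data.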
Abstract:
OpenAIREplus builds on the outcomes of the OpenAIRE project, which implements the EC Open Access (OA) pilot. Capitalizing on the OpenAIRE infrastructure, built for managing FP7 and ERC funded articles, and on the associated supporting mechanism of the European Helpdesk System, OpenAIREplus will “develop an open access, participatory infrastructure for scientific information”. It will significantly expand its base of harvested publications to include all OA publications indexed by the DRIVER infrastructure (more than 270 validated institutional repositories) and any other repository containing “peer-reviewed literature” that complies with certain standards. It will also generically harvest and index the metadata of scientific datasets in selected, diverse OA thematic data repositories. It will support the concept of linked publications by deploying novel services for “linking peer-reviewed literature and associated data sets and collections”, ranging from link discovery based on diverse forms of mining (textual, usage, etc.) to storage, visual representation, and online exploration. It will offer user-level services to experts and “non-scientists” alike, as well as programming interfaces for “providers of value-added services” to build applications on its content. Deposited articles and data will be openly accessible through an enhanced version of the OpenAIRE portal, together with any available relevant information on associated project funding and usage statistics. OpenAIREplus will retain its European footprint, engaging people and scientific repositories in almost all 27 EU member states and beyond. The technical work will be complemented by a suite of studies and associated research efforts that will partly proceed in collaboration with “different European initiatives” and will investigate issues of “intellectual property rights, efficient financing models, and standards”.