Estrazione terminologica automatica: sistemi a confronto


Author(s): Ferri, Veronica
Contributor(s)

Castagnoli, Sara

Date(s)

12/03/2015

Abstract

In any terminological study, candidate term extraction is a very time-consuming task. Corpus analysis tools have automated some processes, allowing the detection of relevant data within texts and facilitating the selection of term candidates. Nevertheless, these tools are normally not designed specifically for terminology research; therefore, the units they extract automatically need manual evaluation. Over the last few years, some software products have been developed specifically for automatic term extraction. They are based on corpus analysis but use linguistic and statistical information to filter data more precisely, which reduces the time needed for manual evaluation. In this framework, we tried to understand whether and how these new tools can really be an advantage. To carry out our project, we simulated a terminology study: we chose a domain (the legal framework for medicinal products for human use) and compiled a corpus from which we extracted terms and phraseologisms using AntConc, a corpus analysis tool. Afterwards, we compared our list with the lists extracted automatically by three different tools (TermoStat Web, TaaS and Sketch Engine) in order to evaluate their performance. In the first chapter we describe some principles of terminology and phraseology in languages for special purposes and show the advantages offered by corpus linguistics. In the second chapter we illustrate some of the main concepts of the selected domain, as well as some of the main features of legal texts. In the third chapter we describe automatic term extraction and the main criteria for evaluating it; moreover, we introduce the term-extraction tools used for this project. In the fourth chapter we describe our research method and, in the fifth chapter, we present our results and draw some preliminary conclusions on the performance and usefulness of term-extraction tools.

Format

application/pdf

Identifier

http://amslaurea.unibo.it/8190/1/ferri_veronica_tesi.pdf

Ferri, Veronica (2015) Estrazione terminologica automatica: sistemi a confronto. [Laurea magistrale], Università di Bologna, Corso di Studio in Traduzione specializzata [LM-DM270] - Forli' <http://amslaurea.unibo.it/view/cds/CDS8061/>

Relation

http://amslaurea.unibo.it/8190/

Rights

info:eu-repo/semantics/restrictedAccess

Keywords

#corpora, terminology, terms, automatic extraction #scuola :: 843894 :: Lingue e Letterature, Traduzione e Interpretazione #cds :: 8061 :: Traduzione specializzata [LM-DM270] - Forli' #sessione :: terza

Type

PeerReviewed