2 results for alignment, corpora, translation technology, English as a Lingua Franca, academic course descriptions

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance:

100.00%

Publisher:

Abstract:

Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.
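The abstract does not spell out how MWE candidates are read off the word alignment, so the following is only a minimal sketch of one plausible reading: a source token linked to two or more contiguous target tokens yields a target-side MWE candidate, and the source token serves as a rough semantic gloss. The function name `mwe_candidates`, the link format (pairs of token indices), and the toy sentence pair are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def mwe_candidates(src_tokens, tgt_tokens, alignment):
    """Return (source word, target phrase) pairs for 1-to-n alignments.

    `alignment` is a list of (source_index, target_index) links, as a
    word aligner might produce. This format is an assumption.
    """
    links = defaultdict(list)
    for s, t in alignment:
        links[s].append(t)

    candidates = []
    for s, targets in sorted(links.items()):
        targets = sorted(set(targets))
        # Keep only links to two or more contiguous target positions.
        contiguous = targets[-1] - targets[0] == len(targets) - 1
        if len(targets) >= 2 and contiguous:
            phrase = " ".join(tgt_tokens[t] for t in targets)
            candidates.append((src_tokens[s], phrase))
    return candidates


# Hypothetical Portuguese-English sentence pair with hand-written links.
src = ["ele", "desistiu", "cedo"]
tgt = ["he", "gave", "up", "early"]
alignment = [(0, 0), (1, 1), (1, 2), (2, 3)]

print(mwe_candidates(src, tgt, alignment))
```

In this toy pair, "desistiu" links to both "gave" and "up", so "gave up" surfaces as a candidate paired with its single-word counterpart. A real pipeline would also need frequency filtering across the corpus, which this sketch omits.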

Relevance:

30.00%

Publisher:

Abstract:

Establishing metrics to assess machine translation (MT) systems automatically is now crucial owing to the widespread use of MT over the web. In this study we show that such evaluation can be done by modeling text as complex networks. Specifically, we extend our previous work by employing additional metrics of complex networks, whose results were used as input for machine learning methods and allowed MT texts of distinct qualities to be distinguished. Also shown is that the node-to-node mapping between source and target texts (English-Portuguese and Spanish-Portuguese pairs) can be improved by adding further hierarchical levels for the metrics out-degree, in-degree, hierarchical common degree, cluster coefficient, inter-ring degree, intra-ring degree and convergence ratio. The results presented here amount to a proof-of-principle that the possible capturing of a wider context with the hierarchical levels may be combined with machine learning methods to yield an approach for assessing the quality of MT systems. (C) 2010 Elsevier B.V. All rights reserved.
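Two of the metrics named in the abstract, out-degree and in-degree, can be illustrated with a small sketch. The abstract does not state how the network is built from text, so this example assumes a directed word-adjacency network in which each distinct word is a node and an edge runs from a word to the word that follows it. The function names and the sample sentence are hypothetical, and the hierarchical extensions of these metrics are not attempted here.

```python
from collections import defaultdict


def adjacency_network(tokens):
    """Map each word to the set of distinct words that directly follow it."""
    successors = defaultdict(set)
    for current, following in zip(tokens, tokens[1:]):
        if current != following:  # ignore self-loops
            successors[current].add(following)
    return successors


def degree_metrics(tokens):
    """Return (out_degree, in_degree) dictionaries keyed by word."""
    successors = adjacency_network(tokens)
    nodes = set(tokens)
    out_degree = {n: len(successors.get(n, ())) for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for targets in successors.values():
        for target in targets:
            in_degree[target] += 1
    return out_degree, in_degree


# Hypothetical sample text; a real study would use full source and MT texts.
tokens = "the cat saw the dog and the dog saw the cat".split()
out_degree, in_degree = degree_metrics(tokens)

for word in sorted(out_degree):
    print(word, out_degree[word], in_degree[word])
```

Per-node values like these, computed for a source text and its translation, are the kind of features that could be passed to a classifier; how the authors aggregate them is not described in the abstract.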