920 results for Newspaper translation
Abstract:
The search for translation universals has been an important topic in translation studies over the past decades. In this paper, we focus on the notion of explicitation through a multifaceted study of causal connectives, integrating four different variables: the role of the source and the target languages, the influence of specific connectives and the role of the discourse relation they convey. Our results indicate that while source and target languages do not globally influence explicitation, specific connectives have a significant impact on this phenomenon. We also show that in English and French, the most frequently used connectives for explicitation share a similar semantic profile. Finally, we demonstrate that explicitation also varies across different discourse relations, even when they are conveyed by a single connective.
Annotating discourse connectives by looking at their translation: The translation-spotting technique
Abstract:
The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to a reliable annotation of connectives, making use of the information provided by their translation in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation with an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid constitution of large annotated datasets.
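The idea of grouping occurrences of a connective by the translation they receive can be illustrated with a minimal sketch. It is not the procedure used in the paper: the sentence pairs, the list of French markers, and the function names below are all hypothetical, and a plain substring lookup stands in for the step of identifying the translation in a real parallel corpus.

```python
import re
from collections import defaultdict

# Hypothetical toy parallel data: (English source, French translation).
PAIRS = [
    ("He stayed home while she went out.",
     "Il est resté à la maison pendant qu'elle sortait."),
    ("While I agree, I have doubts.",
     "Bien que je sois d'accord, j'ai des doutes."),
    ("She reads while he cooks.",
     "Elle lit tandis qu'il cuisine."),
    ("While the plan is sound, it is costly.",
     "Bien que le plan soit solide, il est coûteux."),
]

# Hypothetical list of French markers that may translate "while".
MARKERS = ["pendant qu", "tandis qu", "bien que"]


def spot_translation(target, markers):
    """Return the first marker found in the target sentence, or None."""
    lowered = target.lower()
    for marker in markers:
        if marker in lowered:
            return marker
    return None


def cluster_by_translation(pairs, connective, markers):
    """Group source sentences containing `connective` by spotted translation.

    Sentences whose translation matches no marker are stored under None
    so that they can be reviewed by hand.
    """
    pattern = re.compile(r"\b" + re.escape(connective) + r"\b", re.IGNORECASE)
    clusters = defaultdict(list)
    for source, target in pairs:
        if pattern.search(source):
            clusters[spot_translation(target, markers)].append(source)
    return dict(clusters)


if __name__ == "__main__":
    for marker, sources in cluster_by_translation(PAIRS, "while", MARKERS).items():
        print(marker, len(sources))
```

Each resulting cluster collects the occurrences that share a translation, which is the raw material from which sense distinctions would then be drawn. Any real use would need word alignment or manual spotting rather than substring matching.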
Abstract:
This paper describes methods and results for the annotation of two discourse-level phenomena, connectives and pronouns, over a multilingual parallel corpus. Excerpts from Europarl in English and French have been annotated with disambiguation information for connectives and pronouns, for about 3600 tokens. This data is then used in several ways: for cross-linguistic studies, for training automatic disambiguation software, and ultimately for training and testing discourse-aware statistical machine translation systems. The paper presents the annotation procedures and their results in detail, and gives an overview of the first systems trained on the annotated resources and their use for machine translation.