73 results for Romance languages -- Connectives
Abstract:
I argue that the communication of given information is part of the procedural instructions conveyed by some connectives like the French puisque. I submit in addition that the encoding of givenness has cognitive implications that are visible during online processing. I assess this hypothesis empirically by comparing the way the clauses introduced by two French causal connectives, puisque and parce que, are processed during online reading when the following segment is ‘given’ or ‘new’. I complement these results with an acceptability judgement task using the same sentences. These experiments confirm that introducing a clause conveying given information is a core feature characterizing puisque, as the segment following it is read faster when it contains given rather than new information, and puisque is rated as more acceptable than parce que in such contexts. I discuss the implications of these results for future research on the description of the meaning of connectives.
Annotating discourse connectives by looking at their translation: The translation-spotting technique
Abstract:
The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to reach a reliable annotation of connectives, by making use of the information provided by their translation in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation with an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid constitution of large annotated datasets.
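The translation-spotting idea described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the spotted pairs, the translation-to-sense mapping, and the function name `sense_profile` are hypothetical toy values, not data or code from the paper.

```python
from collections import Counter

# Hypothetical hand-spotted pairs: (English connective, French translation
# observed in the aligned sentence). Toy data, not taken from the paper.
spotted = [
    ("while", "tandis que"),
    ("while", "alors que"),
    ("while", "pendant que"),
    ("while", "tandis que"),
    ("while", "bien que"),
]

# Hypothetical mapping from a translation to a coarse sense label.
# This mapping is an illustrative assumption; real translations can
# themselves be ambiguous.
SENSE_OF_TRANSLATION = {
    "tandis que": "contrast",
    "alors que": "contrast",
    "pendant que": "temporal",
    "bien que": "concession",
}


def sense_profile(pairs, connective):
    """Count the senses suggested by the translations of one connective."""
    counts = Counter()
    for source, translation in pairs:
        if source == connective:
            counts[SENSE_OF_TRANSLATION.get(translation, "unknown")] += 1
    return counts


profile = sense_profile(spotted, "while")
```

Here the annotator only records which target-language expression renders each occurrence; the sense groupings then emerge from tallying those translations rather than from explicit sense judgements.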
Abstract:
Discourse connectives are often said to be language specific, and therefore not easily paired with a translation equivalent in a target language. However, few studies have assessed the magnitude and the causes of these divergences. In this paper, we provide an overview of the similarities and discrepancies between causal connectives in two typologically related languages: English and French. We first discuss two criteria used in the literature to account for these differences: the notion of domains of use and the information status of the cause segment. We then test the validity of these criteria through an empirical contrastive study of causal connectives in English and French, performed on a bidirectional corpus. Our results indicate that French and English connectives have only partially overlapping profiles and that translation equivalents are adequately predicted by these two criteria.
Abstract:
In French, a causal relation is often conveyed by the connectives car, parce que or puisque. Since the seminal work of the Lambda-l Group (1975), it has generally been assumed that parce que, used to relate semantic content, contrasts with car and puisque, both used to connect either speech act or epistemic content. However, this analysis leaves a number of questions unanswered. In this paper, I present a reanalysis of this trio, using empirical methods such as corpus analysis and constrained elicitation. Results indicate that car and parce que are interchangeable in many contexts, even if they are still prototypically used in their respective domains in writing. As for puisque, its distribution does not overlap with that of car, despite their similar domains of use. I argue that the specificity of puisque with respect to the other two connectives is to introduce a cause with an echoic meaning.
Abstract:
Discourse connectives are lexical items indicating coherence relations between discourse segments. Even though many languages possess a whole range of connectives, important divergences exist cross-linguistically in the number of connectives that are used to express a given relation. For this reason, connectives are not easily paired with a univocal translation equivalent across languages. This paper is a first attempt to design a reliable method to annotate the meaning of discourse connectives cross-linguistically using corpus data. We present the methodological choices made to reach this aim and report three annotation experiments using the framework of the Penn Discourse Treebank.
Abstract:
The concept of theory of mind (ToM), a hot topic in cognitive psychology for the past twenty-five years, has gained increasing importance in the fields of linguistics and pragmatics. However, even though the relationship between ToM and verbal communication is now recognized, the extent, causality and full implications of this connection remain mostly to be explored. This book presents a comprehensive discussion of the interface between language, communication, and theory of mind, and puts forward an innovative proposal regarding the role of discourse connectives for this interface. The proposed analysis of connectives is tested from the perspective of their acquisition, using empirical methods such as corpus analysis and controlled experiments, thus placing the study of connectives within the emerging framework of experimental pragmatics.
Abstract:
This paper describes methods and results for the annotation of two discourse-level phenomena, connectives and pronouns, over a multilingual parallel corpus. Excerpts from Europarl in English and French have been annotated with disambiguation information for connectives and pronouns, for about 3600 tokens. This data is then used in several ways: for cross-linguistic studies, for training automatic disambiguation software, and ultimately for training and testing discourse-aware statistical machine translation systems. The paper presents the annotation procedures and their results in detail, and overviews the first systems trained on the annotated resources and their use for machine translation.
Abstract:
In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We compare results obtained using a general measure of lexical similarity based on χ² and by counting the number of discourse connectives. We argue that discourse connectives provide a more sensitive measure, revealing differences that are not visible with the general measure. We also provide evidence for the existence of specific characteristics defining translated texts as opposed to non-translated ones, due to a universal tendency for explicitation.
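The two kinds of measure contrasted in this abstract can be sketched as follows. This is one common way to compute a χ²-style lexical comparison of two sub-corpora, shown alongside a simple connective count; the abstract does not give the paper's exact formulas, so the function names, the `top_n` cutoff, and the per-1000-token normalisation are all assumptions.

```python
from collections import Counter


def chi_square(tokens_a, tokens_b, top_n=500):
    """Chi-square statistic over the top_n most frequent words of both
    sub-corpora combined (assumed formulation; lower means more similar)."""
    if not tokens_a or not tokens_b:
        raise ValueError("both sub-corpora must be non-empty")
    freq_a, freq_b = Counter(tokens_a), Counter(tokens_b)
    size_a, size_b = len(tokens_a), len(tokens_b)
    total = size_a + size_b
    stat = 0.0
    for word, joint_count in (freq_a + freq_b).most_common(top_n):
        for observed, size in ((freq_a[word], size_a), (freq_b[word], size_b)):
            # Expected count if the word were spread proportionally to size.
            expected = joint_count * size / total
            stat += (observed - expected) ** 2 / expected
    return stat


def connective_rate(tokens, connectives):
    """Connectives per 1000 tokens. Only single-token connectives are
    matched here, which is a simplifying assumption."""
    if not tokens:
        raise ValueError("token list must be non-empty")
    hits = sum(1 for token in tokens if token in connectives)
    return 1000 * hits / len(tokens)
```

Comparing `connective_rate` across sub-parts targets one class of words directly, whereas `chi_square` aggregates over the whole frequent vocabulary, which is one way to picture why the two measures could differ in sensitivity.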