35 results for lexical
Abstract:
In this paper, we question the homogeneity of a large parallel corpus by measuring the similarity between various sub-parts. We compare results obtained using a general measure of lexical similarity based on χ² with those obtained by counting the number of discourse connectives. We argue that discourse connectives provide a more sensitive measure, revealing differences that are not visible with the general measure. We also provide evidence for the existence of specific characteristics defining translated texts as opposed to non-translated ones, due to a universal tendency for explicitation.
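The two kinds of measure compared in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a chi-square statistic computed over the most frequent words of two non-empty token lists, and a per-1000-token rate for single-word connectives. The function names, the `top_n` cut-off and the connective list are all hypothetical.

```python
from collections import Counter


def chi_square_similarity(tokens_a, tokens_b, top_n=500):
    """Chi-square statistic over the top_n most frequent words of both samples.

    Assumption: both token lists are non-empty. Lower values mean the two
    samples are lexically closer; 0.0 means identical relative frequencies.
    """
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    size_a, size_b = len(tokens_a), len(tokens_b)
    total = size_a + size_b
    joint = counts_a + counts_b
    chi2 = 0.0
    for word, joint_count in joint.most_common(top_n):
        # Expected counts if both samples were drawn from the same population.
        expected_a = joint_count * size_a / total
        expected_b = joint_count * size_b / total
        chi2 += (counts_a[word] - expected_a) ** 2 / expected_a
        chi2 += (counts_b[word] - expected_b) ** 2 / expected_b
    return chi2


def connective_rate(tokens, connectives):
    """Occurrences of single-word connectives per 1000 tokens."""
    hits = sum(1 for token in tokens if token in connectives)
    return 1000 * hits / len(tokens)
```

Comparing sub-parts of a corpus would then amount to calling both functions on each pair of sub-parts and checking whether the connective rates separate pairs that the chi-square statistic does not.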
Abstract:
Research has mainly focussed on the perceptual nature of synaesthesia. However, synaesthetic experiences are also semantically represented. It was our aim to develop a task to investigate the semantic representation of the concurrent and its relation to the inducer in grapheme-colour synaesthesia. Non-synaesthetes were tested with either a lexical-decision (i.e., word / non-word) or a semantic-classification (i.e., edibility decision) task. Targets consisted of words which were strongly associated with a specific colour (e.g., banana - yellow) and words which were neutral and not associated with a specific colour (e.g., aunt). Target words were primed with colours: the prime-target relationship was either intramodal (i.e., word - word) or crossmodal (i.e., colour patch - word). Each of the four task versions consisted of three conditions: congruent (same colour for prime and target), incongruent (different colour), and unrelated (neutral target). For both tasks (i.e., lexical and semantic) and both versions of the task (i.e., intramodal and crossmodal), we expected faster reaction times (RTs) in the congruent condition than in the neutral condition and slower RTs in the incongruent condition than in the neutral condition. Stronger effects were expected in the intramodal condition due to the overlap in prime and target modality. The results partly confirmed the hypotheses. We conclude that the tasks and hypotheses can be readily adopted to investigate the nature of the representation of synaesthetic experiences.
Abstract:
The goal of the present thesis was to investigate the production of code-switched utterances in bilingual speech. The study investigates the availability of grammatical-category information during bilingual language processing. The specific aim is to examine the processes involved in the production of Persian-English bilingual compound verbs (BCVs). A bilingual compound verb is formed when the nominal constituent of a compound verb is replaced by an item from the other language. In the BCVs examined here, the nominal constituents are replaced by a verb from the other language. The main question addressed is how a lexical element corresponding to a verb node can be placed in a slot that corresponds to a noun lemma. This study also investigates how the production of BCVs might be captured within a model of BCVs and how such a model may be integrated within incremental network models of speech production. Both naturalistic and experimental data were used to investigate the processes involved in the production of BCVs. In the first part of the study, I collected 2298 minutes of a popular Iranian TV program and found 962 code-switched utterances. In 83 (8%) of the switched cases, insertions occurred within the Persian compound verb structure, resulting in BCVs. In the second part, a picture-word interference experiment was conducted. This experiment addressed whether, in the production of Persian-English BCVs, English verbs compete with the corresponding Persian compound verbs as a whole, or whether English verbs compete with the nominal constituents of Persian compound verbs only. Persian-English bilinguals named pictures depicting actions in 4 conditions in Persian (L1). In condition 1, participants named pictures of actions using the whole Persian compound verb in the context of its English equivalent distractor verb.
In condition 2, only the nominal constituent was produced in the presence of the light verb of the target Persian compound verb and in the context of a semantically closely related English distractor verb. In condition 3, the whole Persian compound verb was produced in the context of a semantically unrelated English distractor verb. In condition 4, only the nominal constituent was produced in the presence of the light verb of the target Persian compound verb and in the context of a semantically unrelated English distractor verb. The main effect of linguistic unit was significant by participants and items. Naming latencies were longer in the nominal linguistic unit compared to the compound verb (CV) linguistic unit. That is, participants were slower to produce the nominal constituent of compound verbs in the context of a semantically closely related English distractor verb compared to producing the whole compound verbs in the context of a semantically closely related English distractor verb. The three-way interaction between version of the experiment (CV and nominal versions), linguistic unit (nominal and CV linguistic units), and relation (semantically related and unrelated distractor words) was significant by participants. In both versions, naming latencies were longer in the semantically related nominal linguistic unit compared to the response latencies in the semantically related CV linguistic unit. In both versions, naming latencies were longer in the semantically related nominal linguistic unit compared to response latencies in the semantically unrelated nominal linguistic unit. 
Both the analysis of the naturalistic data and the results of the experiment revealed that in the case of the production of the nominal constituent of BCVs, a verb from the other language may compete with a noun from the base language, suggesting that grammatical category does not necessarily provide a constraint on lexical access during the production of the nominal constituent of BCVs. There was a minimal context in condition 2 (the nominal linguistic unit) in which the nominal constituent was produced in the presence of its corresponding light verb. The results suggest that generating words within a context may not guarantee that the effect of grammatical class becomes available. A model is proposed in order to characterize the processes involved in the production of BCVs. Implications for models of bilingual language production are discussed.
Abstract:
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a language engineer has to create water tailored to each individual island. Such an approach is fragile, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by an engineer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing: bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. Our work focuses on applications of island parsing to data extraction from source code. We have integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
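To make the island/water distinction concrete, the following is a minimal sketch of the naive approach the abstract criticises, in which water is simply "anything that is not an island". It does not implement bounded seas and is not taken from the paper; the island pattern (a Python-like function header) and the name `parse_islands` are assumptions chosen for illustration.

```python
import re

# Hypothetical island: a Python-like function header such as "def name(".
ISLAND = re.compile(r"def\s+([A-Za-z_]\w*)\s*\(")


def parse_islands(source, island=ISLAND):
    """Extract island matches, treating everything else as water.

    Water is defined naively as the negation of the island: whenever the
    island does not match at the current position, one character is skipped.
    """
    found = []
    pos = 0
    while pos < len(source):
        match = island.match(source, pos)
        if match:
            found.append(match.group(1))  # keep the function name
            pos = match.end()
        else:
            pos += 1  # water: skip one character and retry
    return found
```

Because the water here is hard-wired to this one island, adding a second kind of island would require rewriting the skipping logic, which is the composition problem the abstract describes.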
Abstract:
Reading strategies vary across languages according to orthographic depth - the complexity of the grapheme in relation to phoneme conversion rules - notably at the level of eye movement patterns. We recently demonstrated that a group of early bilinguals, who learned both languages equally before the age of seven, presented a first fixation location (FFL) closer to the beginning of words when reading in German as compared with French. Since German is known to be orthographically more transparent than French, this suggested that different strategies were being engaged depending on the orthographic depth of the language used. Opaque languages induce a global reading strategy, and transparent languages force a local/serial strategy. Thus, pseudo-words were processed using a local strategy in both languages, suggesting that the link between word forms and their lexical representation may also play a role in selecting a specific strategy. In order to test whether corresponding effects appear in late bilinguals with low proficiency in their second language (L2), we present a new study in which we recorded eye movements while two groups of late German-French and French-German bilinguals read aloud isolated French and German words and pseudo-words. Since a transparent reading strategy is local and serial, with a high number of fixations per stimulus, and the level of the bilingual participants' L2 is low, the impact of language opacity should be observed in L1. We therefore predicted a global reading strategy if the bilinguals' L1 was French (FFL close to the middle of the stimulus, with fewer fixations per stimulus) and a local and serial reading strategy if it was German. Thus, the L2 of each group, as well as pseudo-words, should also require a local and serial reading strategy.
Our results confirmed these hypotheses, suggesting that global word processing is only achieved by bilinguals with an opaque L1 when reading in an opaque language; the low level in the L2 gives way to a local and serial reading strategy. These findings stress the fact that reading behavior is influenced not only by the linguistic mode but also by top-down factors, such as readers' proficiency.