14 results for Language Models
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
The objective was to describe the contributions of Joseph Jules Dejerine and his wife Augusta Dejerine-Klumpke to our understanding of cerebral association fiber tracts and language processing. The Dejerines (and not Constantin von Monakow) were the first to describe the superior longitudinal fasciculus/arcuate fasciculus (SLF/AF) as an association fiber tract uniting Broca's area, Wernicke's area, and a visual image center in the angular gyrus of a left hemispheric language zone. They were also the first to attribute language-related functions to the fasciculi occipito-frontalis (FOF) and the inferior longitudinal fasciculus (ILF) after describing aphasia patients with degeneration of the SLF/AF, ILF, uncinate fasciculus (UF), and FOF. These fasciculi belong to a functional network known as the Dejerines' language zone, which exceeds the borders of the classically defined cortical language centers. The Dejerines provided the first descriptions of the anatomical pillars of present-day language models (such as the SLF/AF). Their anatomical descriptions of fasciculi in aphasia patients provided a foundation for our modern concept of the dorsal and ventral streams in language processing.
Abstract:
The goal of the present thesis was to investigate the production of code-switched utterances in bilinguals’ speech production. This study investigates the availability of grammatical-category information during bilingual language processing. The specific aim is to examine the processes involved in the production of Persian-English bilingual compound verbs (BCVs). A bilingual compound verb is formed when the nominal constituent of a compound verb is replaced by an item from the other language. In the present cases of BCVs the nominal constituents are replaced by a verb from the other language. The main question addressed is how a lexical element corresponding to a verb node can be placed in a slot that corresponds to a noun lemma. This study also investigates how the production of BCVs might be captured within a model of BCVs and how such a model may be integrated within incremental network models of speech production. In the present study, both naturalistic and experimental data were used to investigate the processes involved in the production of BCVs. In the first part of the present study, I collected 2298 minutes of a popular Iranian TV program and found 962 code-switched utterances. In 83 (8%) of the switched cases, insertions occurred within the Persian compound verb structure, hence, resulting in BCVs. As to the second part of my work, a picture-word interference experiment was conducted. This study addressed whether in the case of the production of Persian-English BCVs, English verbs compete with the corresponding Persian compound verbs as a whole, or whether English verbs compete with the nominal constituents of Persian compound verbs only. Persian-English bilinguals named pictures depicting actions in 4 conditions in Persian (L1). In condition 1, participants named pictures of action using the whole Persian compound verb in the context of its English equivalent distractor verb. 
In condition 2, only the nominal constituent was produced in the presence of the light verb of the target Persian compound verb and in the context of a semantically closely related English distractor verb. In condition 3, the whole Persian compound verb was produced in the context of a semantically unrelated English distractor verb. In condition 4, only the nominal constituent was produced in the presence of the light verb of the target Persian compound verb and in the context of a semantically unrelated English distractor verb. The main effect of linguistic unit was significant by participants and items. Naming latencies were longer in the nominal linguistic unit compared to the compound verb (CV) linguistic unit. That is, participants were slower to produce the nominal constituent of compound verbs in the context of a semantically closely related English distractor verb compared to producing the whole compound verbs in the context of a semantically closely related English distractor verb. The three-way interaction between version of the experiment (CV and nominal versions), linguistic unit (nominal and CV linguistic units), and relation (semantically related and unrelated distractor words) was significant by participants. In both versions, naming latencies were longer in the semantically related nominal linguistic unit compared to the response latencies in the semantically related CV linguistic unit. In both versions, naming latencies were longer in the semantically related nominal linguistic unit compared to response latencies in the semantically unrelated nominal linguistic unit. 
Both the analysis of the naturalistic data and the results of the experiment revealed that in the case of the production of the nominal constituent of BCVs, a verb from the other language may compete with a noun from the base language, suggesting that grammatical category does not necessarily provide a constraint on lexical access during the production of the nominal constituent of BCVs. There was a minimal context in condition 2 (the nominal linguistic unit) in which the nominal constituent was produced in the presence of its corresponding light verb. The results suggest that generating words within a context may not guarantee that the effect of grammatical class becomes available. A model is proposed in order to characterize the processes involved in the production of BCVs. Implications for models of bilingual language production are discussed.
Abstract:
Object-oriented meta-languages such as MOF or EMOF are often used to specify domain-specific languages. However, these meta-languages lack the ability to describe behavior or operational semantics. Several approaches have used a subset of Java mixed with OCL as executable meta-languages. In this paper, we report our experience of using Smalltalk as an executable and integrated meta-language. We validated this approach by incrementally building, over the last decade, Moose, a meta-described reengineering environment. The reflective capabilities of Smalltalk support a uniform way of letting the base developer focus on his tasks while at the same time allowing him to meta-describe his domain model. The advantage of this approach is that the developer uses the same tools and environment
Abstract:
Object-oriented modelling languages such as EMOF are often used to specify domain-specific meta-models. However, these modelling languages lack the ability to describe behavior or operational semantics. Several approaches have used a subset of Java mixed with OCL as executable meta-languages. In this experience report we show how we use Smalltalk as an executable meta-language in the context of the Moose reengineering environment. We present how we implemented EMOF and its behavioral aspects. Over the last decade we validated this approach through incrementally building a meta-described reengineering environment. Such an approach bridges the gap between a code-oriented view and a meta-model driven one. It avoids the creation of yet another language and reuses the infrastructure and run-time of the underlying implementation language. It offers a uniform way of letting developers focus on their tasks while at the same time allowing them to meta-describe their domain model. The advantage of our approach is that developers use the same tools and environment they use for their regular tasks. Still, the approach is not Smalltalk specific but can be applied to any language offering an introspective API, such as Ruby, Python, CLOS, Java and C#.
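As a rough illustration of the last point, the sketch below shows how an introspective API can produce a meta-description of a domain model in Python, one of the languages the abstract names. It is not the Moose or Smalltalk implementation: the entity classes `MethodEntity` and `ClassEntity` and the helper `describe` are hypothetical names chosen for this example, and only the standard `dataclasses` module is used.

```python
from dataclasses import dataclass, fields


# Hypothetical domain model, written as ordinary classes in the host language.
@dataclass
class MethodEntity:
    name: str
    lines_of_code: int


@dataclass
class ClassEntity:
    name: str
    methods: list


def describe(model_class):
    """Build a simple meta-description of a dataclass via introspection.

    Returns the class name and a mapping from attribute name to the name
    of its declared type.
    """
    return {
        "name": model_class.__name__,
        "attributes": {
            f.name: getattr(f.type, "__name__", str(f.type))
            for f in fields(model_class)
        },
    }
```

Calling `describe(MethodEntity)` yields the attribute names and declared types without a separate modelling language, which is the sense in which the meta-description reuses the host language's own infrastructure.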
Abstract:
Virtual machines emulating hardware devices are generally implemented in low-level languages and using a low-level style for performance reasons. This trend results in systems that are largely difficult to understand, difficult to extend, and unmaintainable. As new general techniques for virtual machines arise, it gets harder to incorporate or test these techniques because of early design and optimization decisions. In this paper we show how such decisions can be postponed to later phases by separating virtual machine implementation issues from the high-level machine-specific model. We construct compact models of whole-system VMs in a high-level language, which exclude all low-level implementation details. We use the pluggable translation toolchain PyPy to translate those models to executables. During the translation process, the toolchain reintroduces the VM implementation and optimization details for specific target platforms. As a case study we implement an executable model of a hardware gaming device. We show that our approach to VM building increases understandability, maintainability and extendability while preserving performance.
Abstract:
Systems must co-evolve with their context. Reverse engineering tools are a great help in this process of required adaptation. In order for these tools to be flexible, they work with models, abstract representations of the source code. The extraction of such information from source code can be done using a parser. However, it is fairly tedious to build new parsers, and this is made worse by the fact that it has to be done over and over again for every language we want to analyze. In this paper we propose a novel approach which minimizes the knowledge required of a certain language for the extraction of models implemented in that language by reflecting on the implementation of preparsed ASTs provided by an IDE. In a second phase we use a technique referred to as Model Mapping by Example to map platform-dependent models onto a domain-specific model.
Abstract:
New Zealand English first emerged at the beginning of the 19th century as a result of the dialect contact of British (51%), Scottish (27.3%) and Irish (22%) migrants (Hay and Gordon 2008:6). This variety has subsequently developed into an autonomous and legitimised national variety and enjoys a distinct socio-political status, recognition and codification. In fact, a number of dictionaries of New Zealand English have been published and the variety is routinely used as the official medium on TV, radio and other media. This, however, has not always been the case, as for a long time only British standard norms were deemed suitable for media broadcasting. While there is some work already on lay commentary about New Zealand English (see for example Gordon 1983, 1994; Hundt 1998), there is much more to be done, especially concerning more recent periods of the history of this variety and the ideologies underlying its development and legitimisation. Consequently, the current project aims at investigating the metalinguistic discourses during the period of transition from a British norm to a New Zealand norm in the media context. This will be done by focusing on debates about language in light of the advent of radio and television. The main purpose of this investigation is thus to examine the (language) ideologies that have shaped and underlain these discourses (e.g. discussions about the appropriateness of New Zealand English vis à vis external, British models of language) and their related practices in these media (e.g. broadcasting norms). The sociolinguistic and pragmatic effects of these ideologies will also be taken into account. Furthermore, a comparison will be carried out, at a later stage in the project, between New Zealand English and a more problematic and less legitimised variety: Estuary English.
Despite plenty of evidence of media and other public discourses on Estuary English, there has in fact been very little metalinguistic analysis of this evidence, nor examination of the underlying ideologies in these discourses. The comparison will seek to discover whether similar themes emerge in the ideologies played out in public discourses about these varieties, themes which serve to legitimise one variety whilst denying such legitimacy to the other.
Abstract:
Software corpora facilitate reproducibility of analyses; however, static analysis for an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for single languages, increasing the effort for cross-language analysis. To address these aspects we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object model snapshots that can be directly loaded into memory and queried without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the-box support for: creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus, using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.
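The idea of deploying one analysis over a whole corpus with a parallel runner can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not Pangea's actual interface: the function `run_on_corpus`, the in-memory `corpus` dictionary standing in for preloaded model snapshots, and the example analysis are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_corpus(analysis, corpus, max_workers=4):
    """Apply `analysis` to every model in `corpus` in parallel.

    `corpus` maps a project name to an already-loaded model (assumed here
    to be a plain list of entity dictionaries). Returns a dictionary that
    maps each project name to the analysis result for that project.
    """
    names = list(corpus)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(analysis, (corpus[name] for name in names)))
    return dict(zip(names, results))


def count_classes(model):
    """Example analysis (hypothetical): number of class entities in a model."""
    return sum(1 for entity in model if entity.get("kind") == "class")
```

Because the models are assumed to be loaded already, each analysis only queries in-memory objects, which mirrors the abstract's point about avoiding parsing overhead.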
Abstract:
Crowdsourcing linguistic phenomena with smartphone applications is relatively new. In linguistics, apps have predominantly been developed to create pronunciation dictionaries, to train acoustic models, and to archive endangered languages. This paper presents the first account of how apps can be used to collect data suitable for documenting language change: we created an app, Dialäkt Äpp (DÄ), which predicts users’ dialects. For 16 linguistic variables, users select a dialectal variant from a drop-down menu. DÄ then geographically locates the user’s dialect by suggesting a list of communes where dialect variants most similar to their choices are used. Underlying this prediction are 16 maps from the historical Linguistic Atlas of German-speaking Switzerland, which documents the linguistic situation around 1950. Where users disagree with the prediction, they can indicate what they consider to be their dialect’s location. With this information, the 16 variables can be assessed for language change. Thanks to the playfulness of its functionality, DÄ has reached many users; our linguistic analyses are based on data from nearly 60,000 speakers. Results reveal a relative stability for phonetic variables, while lexical and morphological variables seem more prone to change. Crowdsourcing large amounts of dialect data with smartphone apps has the potential to complement existing data collection techniques and to provide evidence that traditional methods cannot, with normal resources, hope to gather. Nonetheless, it is important to emphasize a range of methodological caveats, including sparse knowledge of users’ linguistic backgrounds (users only indicate age, sex) and users’ self-declaration of their dialect. These are discussed and evaluated in detail here. Findings remain intriguing nevertheless: as a means of quality control, we report that traditional dialectological methods have revealed trends similar to those found by the app. 
This underlines the validity of the crowdsourcing method. We are presently extending DÄ architecture to other languages.
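The prediction step described above (suggesting communes whose recorded variants are most similar to the user's choices) can be sketched in Python as a simple match count. This is a minimal sketch, not the app's actual algorithm: the abstract does not specify the similarity measure, and the commune names, variable names, and variant labels below are placeholders rather than atlas data.

```python
def rank_communes(choices, atlas, top_n=3):
    """Rank communes by how many of the user's chosen variants they share.

    `choices` maps a linguistic variable to the variant the user selected.
    `atlas` maps a commune to the variant recorded there for each variable.
    Ties are broken alphabetically so the result is deterministic.
    """
    scored = []
    for commune, recorded in atlas.items():
        matches = sum(
            1 for variable, variant in choices.items()
            if recorded.get(variable) == variant
        )
        scored.append((matches, commune))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [commune for _, commune in scored[:top_n]]


# Placeholder data (hypothetical): three variables instead of the app's 16.
ATLAS = {
    "commune_A": {"var1": "a", "var2": "b", "var3": "a"},
    "commune_B": {"var1": "a", "var2": "b", "var3": "c"},
    "commune_C": {"var1": "c", "var2": "c", "var3": "c"},
}
USER_CHOICES = {"var1": "a", "var2": "b", "var3": "a"}
```

A user who then reports a different home commune than the top-ranked suggestion provides the kind of disagreement signal the abstract uses to assess language change.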