15 results for Bilingual Corpus
in Queensland University of Technology - ePrints Archive
Abstract:
Introduction: Many bilinguals will have had the experience of unintentionally reading something in a language other than the intended one (e.g. MUG to mean mosquito in Dutch rather than a receptacle for a hot drink, as one of the possible intended English meanings), of finding themselves blocked on a word for which many alternatives suggest themselves (but, somewhat annoyingly, not in the right language), of their accent changing when stressed or tired and, occasionally, of starting to speak in a language that is not understood by those around them. These instances where lexical access appears compromised and control over language behavior is reduced hint at the intricate structure of the bilingual lexical architecture and the complexity of the processes by which knowledge is accessed and retrieved. While bilinguals might tend to blame word finding and other language problems on their bilinguality, these difficulties per se are not unique to the bilingual population. However, what is unique, and yet far more common than is appreciated by monolinguals, is the cognitive architecture that subserves bilingual language processing. With bilingualism (and multilingualism) the rule rather than the exception (Grosjean, 1982), this architecture may well be the default structure of the language processing system. As such, it is critical that we understand more fully not only how the processing of more than one language is subserved by the brain, but also how this understanding furthers our knowledge of the cognitive architecture that encapsulates the bilingual mental lexicon. The neurolinguistic approach to bilingualism focuses on determining the manner in which the two (or more) languages are stored in the brain and how they are differentially (or similarly) processed.
The underlying assumption is that the acquisition of more than one language requires at the very least a change to or expansion of the existing lexicon, if not the formation of language-specific components, and this is likely to manifest in some way at the physiological level. There are many sources of information, ranging from data on bilingual aphasic patients (Paradis, 1977, 1985, 1997) to lateralization (Vaid, 1983; see Hull & Vaid, 2006, for a review), recordings of event-related potentials (ERPs) (e.g. Ardal et al., 1990; Phillips et al., 2006), and positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) studies of neurologically intact bilinguals (see Indefrey, 2006; Vaid & Hull, 2002, for reviews). Following the consideration of methodological issues and interpretative limitations that characterize these approaches, the chapter focuses on how the application of these approaches has furthered our understanding of (1) selectivity of bilingual lexical access, (2) distinctions between word types in the bilingual lexicon and (3) control processes that enable language selection.
Abstract:
The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. To construct the final mixed-speech database, over 10 hours of background noise was collected across 10 unique locations covering 5 common noise scenarios, forming the QUT-NOISE corpus. This background noise corpus was then mixed with speech events chosen from the TIMIT clean speech corpus over a wide variety of noise lengths, signal-to-noise ratios (SNRs) and active speech proportions to form the mixed-speech QUT-NOISE-TIMIT corpus. Five baseline VAD systems are evaluated on the QUT-NOISE-TIMIT corpus to validate the data and to show that the variety of noise available allows for better evaluation of VAD systems than existing approaches in the literature.
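The mixing step described in this abstract can be illustrated with a minimal sketch. It assumes the common approach of scaling the noise so that the ratio of mean speech power to mean noise power matches a target SNR; the abstract does not specify the procedure actually used to build the corpus, and the function names `mean_power`, `mix_at_snr` and `measured_snr_db` are hypothetical.

```python
import math


def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to reach `target_snr_db` relative to `speech`, then add.

    Assumption: SNR is defined over the whole segment as
    10 * log10(speech power / noise power). Returns the mixed signal
    and the scaled noise.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    # Noise power required for the requested SNR.
    target_noise_power = mean_power(speech) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


def measured_snr_db(speech, noise):
    """SNR in dB between a speech segment and a noise segment."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))
```

Calling `mix_at_snr(speech, noise, 5.0)` and then `measured_snr_db(speech, scaled)` on the returned noise should recover the requested 5 dB up to floating-point error.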
Abstract:
Extracellular matrix regulates many cellular processes likely to be important for development and regression of corpora lutea. Therefore, we identified the types and components of the extracellular matrix of the human corpus luteum at different stages of the menstrual cycle. Two different types of extracellular matrix were identified by electron microscopy: subendothelial basal laminas and an interstitial matrix located as aggregates at irregular intervals between the non-vascular cells. No basal laminas were associated with luteal cells. At all stages, collagen type IV α1 and laminins α5, β2 and γ1 were localized by immunohistochemistry to subendothelial basal laminas, and collagen type IV α1 and laminins α2, α5, β1 and β2 localized in the interstitial matrix. Laminin α4 and β1 chains occurred in the subendothelial basal lamina from mid-luteal stage to regression; at earlier stages, a punctate pattern of staining was observed. Therefore, human luteal subendothelial basal laminas potentially contain laminin 11 during early luteal development and, additionally, laminins 8, 9 and 10 at the mid-luteal phase. Laminin α1 and α3 chains were not detected in corpora lutea. Versican localized to the connective tissue extremities of the corpus luteum. Thus, during the formation of the human corpus luteum, remodelling of extracellular matrix does not result in basal laminas like those present in the adrenal cortex or ovarian follicle. Instead, novel aggregates of interstitial matrix of collagen and laminin are deposited within the luteal parenchyma, and it remains to be seen whether this matrix is important for maintaining the luteal cell phenotype.
Abstract:
Studies of orthographic skills transfer between languages focus mostly on working memory (WM) ability in alphabetic first language (L1) speakers when learning another, often alphabetically congruent, language. We report two studies that, instead, explored the transferability of L1 orthographic processing skills in WM in logographic-L1 and alphabetic-L1 speakers. English-French bilingual and English monolingual (alphabetic-L1) speakers, and Chinese-English (logographic-L1) speakers, learned a set of artificial logographs and associated meanings (Study 1). The logographs were used in WM tasks with and without concurrent articulatory or visuo-spatial suppression. The logographic-L1 bilinguals were markedly less affected by articulatory suppression than alphabetic-L1 monolinguals (who did not differ from their bilingual peers). Bilinguals overall were less affected by spatial interference, reflecting superior phonological processing skills or, conceivably, greater executive control. A comparison of span sizes for meaningful and meaningless logographs (Study 2) replicated these findings. However, the logographic-L1 bilinguals’ spans in L1 were measurably greater than those of their alphabetic-L1 (bilingual and monolingual) peers, a finding unaccounted for by faster articulation rates or differences in general intelligence. The overall pattern of results suggests an advantage (possibly perceptual) for logographic-L1 speakers, over and above the bilingual advantage also seen elsewhere in third language (L3) acquisition.
Abstract:
The advent of eLearning has seen online discussion forums widely used in both undergraduate and postgraduate nursing education. This paper reports an Australian university's experience of the design, delivery and redevelopment of a distance education module developed for Vietnamese nurse academics. The teaching experience of Vietnamese nurse academics is mixed and frequently limited. It was decided that the distance module should attempt to utilise the experience of senior Vietnamese nurse academics; asynchronous online discussion groups were used to facilitate this. Online discussion occurred in both Vietnamese and English and was moderated by an Australian academic working alongside a Vietnamese translator. This paper will discuss the design of an online learning environment for foreign correspondents, the resources and translation required to maximise the success of asynchronous online discussion groups, as well as the rationale for delivering complex content in a foreign language. While specifically addressing the first iteration of the first distance module designed, this paper will also address subsequent changes made for the second iteration of the module and comment on their success. While a translator is clearly a key component of success, the elements of simplicity and clarity combined with supportive online moderation must not be overlooked.
Abstract:
In this paper, we describe a machine-translated parallel English corpus for the NTCIR Chinese, Japanese and Korean (CJK) Wikipedia collections. This document collection is named the CJK2E Wikipedia XML corpus. The corpus could be used by the information retrieval research community and for knowledge sharing in Wikipedia in many ways; for example, it could be used for experimentation in cross-lingual information retrieval, cross-lingual link discovery, or omni-lingual information retrieval research. Furthermore, the translated CJK articles could be used to further expand the current coverage of the English Wikipedia.
Abstract:
Measures of semantic similarity between medical concepts are central to a number of techniques in medical informatics, including query expansion in medical information retrieval. Previous work has mainly considered thesaurus-based path measures of semantic similarity and has not compared different corpus-driven approaches in depth. We evaluate the effectiveness of eight common corpus-driven measures in capturing semantic relatedness and compare these against human-judged concept pairs assessed by medical professionals. Our results show that certain corpus-driven measures correlate strongly (approximately 0.8) with human judgements. An important finding is that performance was significantly affected by the choice of corpus used in priming the measure, i.e., used as evidence from which corpus-driven similarities are drawn. This paper provides guidelines for the implementation of semantic similarity measures for medical informatics and concludes with implications for medical information retrieval.
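The evaluation idea in this abstract, scoring concept pairs with a corpus-driven measure and correlating those scores with human judgements, can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: cosine similarity over co-occurrence vectors stands in for the eight measures (which the abstract does not name), Pearson correlation is assumed as the coefficient, and every vector and human score below is invented toy data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical co-occurrence vectors (toy data, not from any real corpus).
vectors = {
    "heart attack": [4, 1, 0],
    "myocardial infarction": [5, 1, 0],
    "aspirin": [2, 3, 0],
    "fracture": [0, 1, 6],
}

# Hypothetical human relatedness ratings for three concept pairs.
judged_pairs = [
    ("heart attack", "myocardial infarction", 4.0),
    ("heart attack", "aspirin", 2.5),
    ("heart attack", "fracture", 1.0),
]

model_scores = [cosine(vectors[a], vectors[b]) for a, b, _ in judged_pairs]
human_scores = [score for _, _, score in judged_pairs]
correlation = pearson(model_scores, human_scores)
```

Swapping the `vectors` table for one built from a different corpus and recomputing `correlation` mirrors, in miniature, the abstract's point that the choice of corpus affects how well a measure agrees with human judgements.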
Abstract:
This paper evaluates the efficiency of a number of popular corpus-based distributional models in performing discovery on very large document sets, including online collections. Literature-based discovery is the process of identifying previously unknown connections from text, often published literature, that could lead to the development of new techniques or technologies. Literature-based discovery has attracted growing research interest ever since Swanson's serendipitous discovery of the therapeutic effects of fish oil on Raynaud's disease in 1986. The successful application of distributional models in automating the identification of indirect associations underpinning literature-based discovery has been widely demonstrated in the medical domain. However, we wish to investigate the computational complexity of distributional models for literature-based discovery on much larger document collections, as they may provide computationally tractable solutions to tasks such as predicting future disruptive innovations. In this paper we perform a computational complexity analysis on four successful corpus-based distributional models to evaluate their fit for such tasks. Our results indicate that corpus-based distributional models that store their representations in fixed dimensions provide superior efficiency on literature-based discovery tasks.
Abstract:
The majority of the world speaks more than one language, yet the impact of learning a second language has rarely been studied from a child’s perspective. This paper describes monolingual children’s insights into becoming bilingual at four time points: two months before moving to another country (while living in Australia), as well as one, six, and twelve months after moving to Germany. The participants were two monolingual English-speaking siblings (a male aged 7 to 8 years and a female aged 9 to 10 years) who subsequently learned to speak German. At each of the four time points, interviews were undertaken with each child using child-friendly drawing and questionnaire techniques. Three themes were identified: (1) the children’s awareness of language competence; (2) inclusion factors; and (3) exclusion factors that influenced friendship formation. The impact of language ability on making friends was a dominant theme that arose across the four time points and was triangulated across data collection methods. The children made friends with others who had similar language competence in German, even though they were younger and did not share the same first language. Age-matched peers who were more competent in German were less likely to be described as friends. Across all three themes, the playground was highlighted by both children as the key site where becoming bilingual most strongly impacted initiation and negotiation of friendships. Becoming bilingual impacted the children’s friendship formation and socialization opportunities with more competent language users.