2 results for cross-language speaker recognition
in CORA - Cork Open Research Archive - University College Cork - Ireland
Abstract:
Users seeking information may not find material relevant to their need in a given language. The information may be available in another language, but one the user does not know, so users may have difficulty accessing information that exists in other languages. Because cross-language retrieval depends on translating the user query, obtaining the right translation raises many issues. For the language pair chosen by a user, the available resources, such as an incomplete dictionary or an inaccurate machine translation system, may be insufficient to map query terms in one language to their equivalents in the other. A given query may also have multiple correct translations. Evidence from the underlying corpus can help select a probable set of translations that eventually leads to better retrieval. In this paper, we present a cross-language information retrieval approach that uses corpus-driven query suggestion to retrieve information in a language other than that of the user query. The idea is to use corpus-based evidence from one language to improve the retrieval and re-ranking of news documents in the other language. We use the FIRE corpora, Tamil and English news collections, in our experiments and illustrate the effectiveness of the proposed cross-language information retrieval approach.
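The abstract does not spell out how corpus evidence selects among candidate translations, so the following is only a minimal sketch of one plausible reading, not the authors' method. It swaps in a simple co-occurrence heuristic: among all combinations of dictionary translations, prefer the one whose terms appear together most often in target-language documents. The function names, the placeholder source terms (`t_edge`, `t_river`), the toy dictionary, and the toy corpus are all hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def cooccurrence_counts(documents):
    """Count how often each unordered pair of terms shares a document."""
    counts = Counter()
    for doc in documents:
        terms = sorted(set(doc.lower().split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
    return counts


def pick_translation(query_terms, dictionary, counts):
    """Pick the translation combination with the most co-occurrence support.

    Terms missing from the dictionary are passed through unchanged.
    """
    candidate_lists = [dictionary.get(term, [term]) for term in query_terms]

    def support(combo):
        # Sum document co-occurrence over every pair of chosen translations.
        return sum(counts[tuple(sorted(pair))] for pair in combinations(combo, 2))

    return max(product(*candidate_lists), key=support)


# Hypothetical toy data: one ambiguous source term, one unambiguous one.
toy_dictionary = {"t_edge": ["bank", "shore"], "t_river": ["river"]}
toy_corpus = [
    "the river shore flooded",
    "river shore erosion",
    "bank interest rates rise",
    "the bank by the river",
]

counts = cooccurrence_counts(toy_corpus)
best = pick_translation(["t_edge", "t_river"], toy_dictionary, counts)
```

In this toy corpus, "shore" co-occurs with "river" in two documents and "bank" in only one, so the sketch prefers the "shore" reading. A real system would need proper tokenization, smoothing, and a search strategy that avoids enumerating every combination for long queries.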
Using parent report to assess early lexical production in children exposed to more than one language
Abstract:
Limited expressive vocabulary skills in young children are considered to be the first warning signs of a potential Specific Language Impairment (SLI) (Ellis & Thal, 2008). In bilingual language learning environments, the expressive vocabulary size in each of the child’s developing languages is usually smaller than that of monolingual peers (e.g. De Houwer, 2009). Nonetheless, evidence shows that children’s total productive lexicon size across both languages is comparable to that of monolingual peers (e.g. Pearson et al., 1993; Pearson & Fernandez, 1994). Since little is known about which level of bilingual vocabulary size should be considered a risk factor for SLI, the effects of bilingualism and language-learning difficulties on early lexical production are often confounded. The compilation of profiles for early vocabulary production in children exposed to more than one language, and their comparison across language pairs, should enable more accurate identification of vocabulary delays that signal a risk for SLI in bilingual populations. These considerations prompted the design of a methodology for assessing early expressive vocabulary in children exposed to more than one language, which is described in the present chapter. The implementation of this methodological framework is then outlined by presenting the design of a study that measured the productive lexicons of children aged 24-36 months who were exposed to different language pairs, namely Maltese and English, Irish and English, Polish and English, French and Portuguese, Turkish and German, and English and Hebrew. These studies were designed and coordinated in COST Action IS0804 Working Group 3 (WG3) and will be described in detail in a series of subsequent publications.
Expressive vocabulary size was measured through parental report, by employing the vocabulary checklist of the MacArthur-Bates Communicative Development Inventory: Words and Sentences (CDI: WS) (Fenson et al., 1993, 2007) and its adaptations to the participants’ languages. Here we describe the novelty of the study’s methodological design, which lies in its attempt to harmonize the use of vocabulary checklist adaptations, together with parental questionnaires addressing language exposure and developmental history, across participant groups characterized by different language exposure variables. This chapter outlines the various methodological considerations that paved the way for meaningful cross-linguistic comparison of the participants’ expressive lexicon sizes. In so doing, it hopes to provide a template for and encourage further research directed at establishing a threshold for SLI risk in children exposed to more than one language.