845 results for Turkish language--Orthography and spelling
Abstract:
N-gram language models and lexicon-based word recognition are popular methods in the literature for improving the recognition accuracy of online and offline handwritten data. However, very few works deal with the application of these techniques to online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus, and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word (8%) recognition accuracies, and that while lexicon methods offer much greater improvements (30%) in word recognition, they depend heavily on choosing the right lexicon. For comparison with lexicon and language-model based methods, we have also explored re-evaluation techniques, which use expert classifiers to improve symbol and word recognition accuracies.
Abstract:
Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high-quality corpora, especially in Indian languages, makes this problem hard, e.g. state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other "auxiliary" languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.
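The MRR figures quoted above can be checked with a short sketch. This is only an illustration of the standard metric, not the authors' evaluation code: the function names and the toy ranks are hypothetical, and only the values 0.187 and 0.419 come from the abstract.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries.

    `ranks` holds the 1-based rank of the first correct translation for
    each query, or None when no correct translation was returned
    (such a query contributes 0 to the sum).
    """
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def relative_improvement(old, new):
    """Relative gain of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


# Hypothetical ranks for four source words (not data from the paper).
toy_ranks = [1, 2, None, 4]
toy_mrr = mean_reciprocal_rank(toy_ranks)

# Figures reported in the abstract for Telugu-Kannada.
gain = relative_improvement(0.187, 0.419)
print(f"toy MRR = {toy_mrr:.4f}, reported gain = {gain:.0%}")
```

Under these assumptions the reported baseline and final MRR are consistent with the stated 124% relative improvement.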
Abstract:
[EN] In the new European higher education space, universities in Europe are exhorted to cultivate and develop multilingualism. The European Commission's 2004–2006 action plan for promoting language learning and diversity speaks of the need to build an environment which is favourable to languages. Yet reality indicates that it is English which reigns supreme and has become the main foreign language used as a means of instruction at European universities. Internationalisation has played a key role in this process, becoming one of the main drivers of the linguistic hegemony exerted by English. In this paper we examine the opinions of teaching staff involved in English-medium instruction, from pedagogical, ecology-of-language and personal viewpoints. Data were gathered using group discussion. The study was conducted at a multilingual Spanish university where majority (Spanish), minority (Basque) and foreign (English) languages coexist, resulting in some unavoidable linguistic strains. The implications for English-medium instruction are discussed at the end of this paper.
Abstract:
Roughly one half of the world's languages are in danger of extinction. The endangered languages, spoken by minorities, typically compete with powerful languages such as English or Spanish. Consequently, the speakers of minority languages have to consider that not everybody can speak their language, which turns language choice into a strategic, coordination-like situation. We show experimentally that the displacement of minority languages may be partially explained by imperfect information about the linguistic type of the partner, leading to frequent failure to coordinate on the minority language even between two speakers who can and prefer to use it. The extent of miscoordination correlates with how minoritarian a language is and with the real-life linguistic condition of subjects: the more endangered a language, the harder it is to coordinate on its use, and the people on whom the language's survival relies the most acquire behavioral strategies that lower its use. Our game-theoretical treatment of the issue provides a new perspective for linguistic policies.
Abstract:
Does language-specific orthography help language detection and lexical access in naturalistic bilingual contexts? This study investigates how L2 orthotactic properties influence bilingual language detection in bilingual societies and the extent to which they modulate lexical access and single word processing. Language specificity of naturalistically learnt L2 words was manipulated by including bigram combinations that could be either L2 language-specific or common in the two languages known by bilinguals. A group of balanced bilinguals and a group of highly proficient but unbalanced bilinguals who grew up in a bilingual society were tested, together with a group of monolinguals (for control purposes). All the participants completed a speeded language detection task and a progressive demasking task. Results showed that the use of the information of orthotactic rules across languages depends on the task demands at hand, and on participants' proficiency in the second language. The influence of language orthotactic rules during language detection, lexical access and word identification is discussed according to the most prominent models of bilingual word recognition.
Abstract:
We present a new online psycholinguistic resource for Greek based on analyses of written corpora combined with text processing technologies developed at the Institute for Language & Speech Processing (ILSP), Greece. The "ILSP PsychoLinguistic Resource" (IPLR) is a freely accessible service via a dedicated web page, at http://speech.ilsp.gr/iplr. IPLR provides analyses of user-submitted letter strings (words and nonwords) as well as frequency tables for important units and conditions such as syllables, bigrams, and neighbors, calculated over two word lists based on printed text corpora and their phonetic transcription. Online tools allow retrieval of words matching user-specified orthographic or phonetic patterns. All results and processing code (in the Python programming language) are freely available for noncommercial educational or research use. © 2010 Springer Science+Business Media B.V.
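A letter-bigram frequency table of the kind mentioned above can be sketched in a few lines of Python. This is a minimal illustration and not IPLR's actual processing code: the function name and the three-word toy list are hypothetical, and real tables would be computed over the resource's full word lists.

```python
import unicodedata
from collections import Counter


def bigram_frequencies(words):
    """Count adjacent letter pairs (bigrams) across a list of words.

    Words are normalized to NFC first so that an accented Greek letter
    is treated as a single character rather than letter + combining mark.
    """
    counts = Counter()
    for word in words:
        word = unicodedata.normalize("NFC", word)
        for i in range(len(word) - 1):
            counts[word[i:i + 2]] += 1
    return counts


# Hypothetical toy word list (not taken from the IPLR corpora).
toy_words = ["λόγος", "λόγια", "γάλα"]
table = bigram_frequencies(toy_words)

for bigram, count in table.most_common(3):
    print(bigram, count)
```

This sketch counts each word once (a type-based count); weighting each word by its corpus frequency would give a token-based table instead.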
Abstract:
This paper explores the relationships between a computational theory of temporal representation (as developed by James Allen) and a formal linguistic theory of tense (as developed by Norbert Hornstein) and aspect. It aims to provide explicit answers to four fundamental questions: (1) what is the computational justification for the primitives of a linguistic theory; (2) what is the computational explanation of the formal grammatical constraints; (3) what are the processing constraints imposed on the learnability and markedness of these theoretical constructs; and (4) what are the constraints that a linguistic theory imposes on representations. We show that one can effectively exploit the interface between the language faculty and the cognitive faculties by using linguistic constraints to determine restrictions on the cognitive representation and vice versa. Three main results are obtained: (1) We derive an explanation of an observed grammatical constraint on tense (the Linear Order Constraint) from the information monotonicity property of the constraint propagation algorithm of Allen's temporal system; (2) We formulate a principle of markedness for the basic tense structures based on the computational efficiency of the temporal representations; and (3) We show Allen's interval-based temporal system is not arbitrary, but it can be used to explain independently motivated linguistic constraints on tense and aspect interpretations. We also claim that the methodology of research developed in this study, a "cross-level" investigation of independently motivated formal grammatical theory and computational models, is a powerful paradigm with which to attack representational problems in basic cognitive domains, e.g., space, time, causality, etc.
Abstract:
Form-focused instruction is usually based on traditional practical/pedagogical grammar descriptions of grammatical features. The comparison of such traditional accounts with cognitive grammar (CG) descriptions seems to favor CG as a basis of pedagogical rules. This is due to the insistence of CG on the meaningfulness of grammar and its detailed analyses of the meanings of particular grammatical features. The differences between traditional and CG rules/descriptions are exemplified by juxtaposing the two kinds of principles concerning the use of the present simple and present progressive to refer to situations happening or existing at speech time. The descriptions provided the bases for the instructional treatment in a quasi-experimental study exploring the effectiveness of using CG descriptions of the two tenses, and of their interplay with stative (imperfective) and dynamic (perfective) verbs, and comparing this effectiveness with the value of grammar teaching relying on traditional accounts found in standard pedagogical grammars. The study involved 50 participants divided into three groups, with one of them constituting the control group and the other two being experimental ones. One of the latter received treatment based on CG descriptions and the other on traditional accounts. CG-based instruction was found to be at least moderately effective in terms of fostering mostly explicit grammatical knowledge, and its effectiveness turned out to be comparable to that of teaching based on traditional descriptions.
Abstract:
This paper analyzes the relationship between communication apprehension and language anxiety from the perspective of gender. As virtually no empirical studies have addressed the explicit influence of gender on language anxiety in communication apprehensives, this paper proposes that females are generally more sensitive to anxiety, as reflected in various spheres of communication. For this reason, language anxiety levels in communication apprehensive females should be higher than those of communication apprehensive males. Comparisons between them were made using a Student's t-test, two-way ANOVA, and post-hoc Tukey test. The results revealed that Polish communication apprehensive secondary grammar school males and females do not differ in their levels of language anxiety, although nonapprehensive males experience significantly lower language anxiety than their female peers. It is argued that the finding can be attributed to developmental patterns, gender socialization processes, classroom practices, and the uniqueness of the FL learning process, which is a stereotypically female domain.
Abstract:
This publication attempts to analyze the policy Turkey pursues toward foreigners seeking refuge on its territory. The importance of the issue in recent years stems above all from the ongoing civil war in Syria, as a result of which more than 700,000 Syrians have found themselves on Turkish territory. Particularly controversial in this context is Turkey's application of double standards in granting immigrants Convention refugee status. Turkey is one of four states in the world that, upon acceding to the Convention Relating to the Status of Refugees and the New York Protocol, reserved the right to apply the so-called geographical criterion in this matter. As a result, while refugee status may be granted to persons arriving from beyond Turkey's western borders, those fleeing states such as Syria, Iran, or Iraq are, from a formal point of view, "asylum seekers" (Turkish: sığınmacı). This in turn means that they lack Convention protection. The aim of the article, however, is not only to analyze the legal and actual situation of the victims of the Syrian civil war arriving on Turkish territory, but also to attempt to forecast how this situation will develop. To make the analysis as reliable as possible, both English-language and Turkish-language source materials were consulted.
Abstract:
Based on the experience that today's students find it more difficult than students of previous decades to relate to literature and appreciate its high cultural value, this paper argues that too little is known about the actual teaching and learning processes which take place in literature courses and that, in order to ensure the survival of literary studies in German curricula, future research needs to elucidate for students, the wider public and, most importantly, educational policy makers, why the study of literature should continue to have an important place in modern language curricula. Contending that students' willingness to engage with literature will, in the future, depend to a great extent on the use of imaginative methodology on the part of the teacher, we give a detailed account of an action research project carried out at University College Cork from October to December 2002 which set out to explore the potential of a drama in education approach to the teaching and learning of foreign language literature. We give concrete examples of how this approach works in practice, situate our approach within the subject debate surrounding Drama and the Language Arts and evaluate in detail the learning processes which are typical of performance-based literature learning. Based on converging evidence from different data sources and overall very positive feedback from students, we conclude by recommending that modern language departments introduce courses which offer a hands-on experience of literature that is different from that encountered in lectures and teacher-directed seminars.
Abstract:
This longitudinal study tracked third-level French (n=10) and Chinese (n=7) learners of English as a second language (L2) during an eight-month study abroad (SA) period at an Irish university. The investigation sought to determine whether there was a significant relationship between length of stay (LoS) abroad and gains in the learners' oral complexity, accuracy and fluency (CAF), what the relationship was between these three language constructs and whether the two learner groups would experience similar paths to development. Additionally, the study investigated whether specific reported out-of-class contact with the L2 was implicated in oral CAF gains. Oral data were collected at three equidistant time points: at the beginning of SA (T1), midway through the SA sojourn (T2) and at the end (T3), allowing for a comparison of CAF gains arising during one semester abroad to those arising during a subsequent semester. Data were collected using Sociolinguistic Interviews (Labov, 1984) and adapted versions of the Language Contact Profile (Freed et al., 2004). Overall, the results point to LoS abroad as a highly influential variable in gains to be expected in oral CAF during SA. While one semester in the TL country was not enough to foster statistically significant improvement in any of the CAF measures employed, significant improvement was found during the second semester of SA. Significant differences were also revealed between the two learner groups. Finally, significant correlations, some positive, some negative, were found between gains in CAF and specific usage of the L2. All in all, the disaggregation of the group data clearly illustrates, in line with other recent enquiries (e.g. Wright and Cong, 2014), that each individual learner's path to CAF development was unique and highly individualised, thus providing strong evidence for the recent claim that SLA is "an individualized nonlinear endeavor" (Polat and Kim, 2014: 186).
Abstract:
The Leaving Certificate (LC) is the national, standardised state examination in Ireland necessary for entry to third-level education. It presents a massive, raw corpus of data with the potential to yield invaluable insight into the phenomena of learner interlanguage. With samples of official LC Spanish examination data, this project has compiled a digitised corpus of learner Spanish comprising the written and oral production of 100 candidates. This corpus was then analysed using a specific investigative corpus technique, Computer-aided Error Analysis (CEA, Dagneaux et al., 1998). CEA is a powerful apparatus in that it greatly facilitates the quantification and analysis of a large learner corpus in digital format. The corpus was both compiled and analysed with the use of UAM Corpus Tool (O'Donnell 2013). This tool allows for the recording of candidate-specific variables such as grade, examination level, task type and gender, therefore allowing for critical analysis of the corpus as one unit, as separate written and oral subcorpora, and also of performance per task, level and gender. This is an interdisciplinary work combining aspects of Applied Linguistics, Learner Corpus Research and Foreign Language (FL) Learning. Beginning with a review of the context of FL learning in Ireland and Europe, I go on to discuss the disciplinary context and theoretical framework for this work and outline the methodology applied. I then perform detailed quantitative and qualitative analyses before combining all research findings and outlining the principal conclusions. This investigation does not make a priori assumptions about the data set, the LC Spanish examination, the context of FLs or any aspect of learner competence. It undertakes to provide the linguistic research community and the domain of Spanish language learning and pedagogy in Ireland with an empirical, descriptive profile of real learner performance, characterising learner difficulty.
Abstract:
Japanese Language Teaching examines the practical aspects of the acquisition of Japanese as a second language, underpinned by current theory and research. Each chapter examines the theory and practice of language teaching, and progresses to a consideration of the practical design of tasks for teaching. The final section applies theory and practice to an empirical case study, drawn from a classroom with Japanese as a second language. With its emphasis on practice underpinned by contemporary theory, this book will be of interest to postgraduates studying second language acquisition and applied linguistics. [Source: publisher's description].