87 results for Fribourg language question
in Helda - Digital Repository of University of Helsinki
Abstract:
In this thesis we present and evaluate two pattern matching based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering systems are an important research problem because, as the amount of natural language text in digital format grows all the time, the need for novel methods for pinpointing important knowledge in vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is also developed. The pattern matching based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusion of the research is that answer extraction patterns consisting of the most important words of the question and of the following information extracted from the answer context: plain words, part-of-speech tags, punctuation marks and capitalization patterns, can be used in the answer extraction module of a question answering system. This type of pattern and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand crafted and based on a system-specific and fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns. 
As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and provided already in the publicly available data.
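As an illustration of the edit-distance-based similarity mentioned in the abstract above, the following minimal Python sketch computes a Levenshtein distance over token sequences and converts it into a similarity score. The function names, the `<ANSWER>` placeholder and the length normalisation are assumptions made for illustration only; the abstract does not state how the thesis defines its metric.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    # prev[j] holds the distance between the first i-1 tokens of a
    # and the first j tokens of b.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Similarity in [0, 1]; normalising by the longer length is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    # Hypothetical answer contexts with the answer replaced by a placeholder.
    ctx1 = ["was", "born", "in", "<ANSWER>"]
    ctx2 = ["was", "born", "on", "<ANSWER>"]
    print(edit_distance(ctx1, ctx2), similarity(ctx1, ctx2))
```

A pairwise similarity of this kind could serve as the input to a hierarchical clustering step, which is the other technique the abstract names.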
Abstract:
The common focus of the studies brought together in this work is the prosodic segmentation of spontaneous speech. The theoretically most central aspect is the introduction and further development of the IJ-model of intonational chunking. The study consists of a general introduction and five detailed studies that approach prosodic chunking from different perspectives. The data consist of recordings of face-to-face interaction in several spoken varieties of Finnish and Finland Swedish; the methodology is usage-based and qualitative. The term “speech prosody” refers primarily to the melodic and rhythmic characteristics of speech. Both speaking and understanding speech require the ability to segment the flow of speech into suitably sized prosodic chunks. In order to be usage-based, a study of spontaneous speech consequently needs to be based on material that is segmented into prosodic chunks of various sizes. The segmentation is seen to form a hierarchy of chunking. The prosodic models that have so far been developed and employed in Finland have been based on sentences read aloud, which has made it difficult to apply these models in the analysis of spontaneous speech. The prosodic segmentation of spontaneous speech has not previously been studied in detail in Finland. This research focuses mainly on the following three questions: (1) What are the factors that need to be considered when developing a model of prosodic segmentation of speech, so that the model can be employed regardless of the language or dialect under analysis? (2) What are the characteristics of a prosodic chunk, and what are the similarities in the ways chunks of different languages and varieties manifest themselves that will make it possible to analyze different data according to the same criteria? (3) How does the IJ-model of intonational chunking introduced as a solution to question (1) function in practice in the study of different varieties of Finnish and Finland Swedish? 
The boundaries of the prosodic chunks were manually marked in the material according to context-specific acoustic and auditory criteria. On the basis of the data analyzed, the IJ-model was further elaborated and implemented, thus allowing comparisons between different language varieties. On the basis of the empirical comparisons, a prosodic typology is presented for the dialects of Swedish in Finland. The general contention is that the principles of the IJ-model can readily be used as a methodological tool for prosodic analysis irrespective of language varieties.
Abstract:
The book consists of an Introduction and four articles published both in Finland and abroad, written in English or Russian. They present studies of eight Finnish and Russian idiomatic constructions that appear in the following examples: Ikkuna rikki / Окно сломано, lit.: ‘the window broken’, Äiti täällä / Мама здесь, lit.: ‘mother here’, Kaikki myymälöihin! / Все в магазин, lit.: ‘all to the shops’, Пить так пить! ≈ ‘When I drink, I drink (a lot)!’, etc. The aim of the studies is to reconstruct the origins and to trace the development of the above-mentioned constructions up to their modern usages. To this end, the constructions are investigated from both historical and comparative perspectives. Finally, the case studies make it possible to develop a more general account of how these 'ungrammatical' items develop. By attempting to answer the question of why such constructions develop even though they destroy the harmonious structure of a language, some principles of idiomatization are postulated in the Conclusion.
Abstract:
"The Art of Sympathy: Forms of Moral and Emotional Persuasion in Fiction" is an interdisciplinary study that looks closely at the ways that stories evoke sympathy, and the significance of this emotion for the development of moral attitudes and awareness. By linking readers' emotional responses to fiction with the potential impact of such responses on "the moral imagination," the study builds on empirical research conducted by literary scholars and psychologists into the emotional effects of reading fiction, as well as social psychological research into the connections between empathy/sympathy and moral development. I first investigate the dynamics of readers' beliefs regarding characters in fictional narratives, and the nature of the emotions that they may experience as a result of those beliefs. The analysis demonstrates that there are important similarities between real emotions and emotions generated by fiction. Recognizing these similarities, I claim, can help us to conceptualize the nature of sympathetic responses to fictional characters. Building on these assertions, I then draw on research from social psychology and philosophy to develop a comprehensive definition of sympathy and to clarify the ways in which sympathy operates, both in people's daily lives and in readers' sympathetic responses to fictional characters. Having established this definition and delineated its practical implications, I then examine how particular stories, through a variety of narrative techniques, persuade readers to feel sympathy for characters who are unsympathetic in certain ways. In order to verify my claims about the impact of these stories on readers' emotions, I also review the results of tests that I conducted with nearly 200 adolescent readers. 
Through these tests, which were constructed and scored according to methods prevalent in social psychological research, it was determined that a majority of readers felt sympathy for the protagonists in two of the stories included in the study. These results were combined with data from an additional test, a standard measure of empathy and sympathy in the field of social psychology. The cross-tabulation of these results suggests that there was not a strong connection between readers' responses and their general tendencies to feel sympathy for others. This finding would appear to support my hypotheses regarding the sympathetic persuasiveness of the stories in question. In light of these results, finally, I consider the potential contribution that fiction can make to adolescent emotional and moral development and the implications of that potential for future language arts curricula in the schools. In particular, I suggest the pedagogical importance of providing adolescents with opportunities to engage with the lives of fictional characters, and especially to experience feelings of sympathy for individuals towards whom they ordinarily might feel aversion.
Abstract:
The present dissertation analyses 36 local vernaculars of villages surrounding the northern Russian city of Vologda with respect to the system of vowels in stressed syllables and in the syllables preceding the stressed syllables, drawing on the available dialectological research. The system in question differs from the corresponding standard Russian system in that the palatalisation of the surrounding consonants affects the vowels much more significantly in the vernaculars, whereas the phonetic difference between stressed and unstressed vowels is less obvious in them. The detailed information on the local vernaculars is retrieved from the dialect atlas Dialektologičeskij Atlas Russkogo Jazyka, the data for which were collected, for the most part, in the 1940s and 1950s. The theoretical framework of the research consists of a brief cross-section of western sociolinguistic theory related to language change and of historical linguistics related to Slavonic vowel development, including some new theories concerning the development of the Russian vowel phonemes. The author has collected dialect data in one of the 36 villages and in three villages surrounding it. During the fieldwork, the speech of nine elderly persons and ten school children was recorded. The speech data were then transcribed with coded information on the corresponding etymological vowels, the phonetic position, and the actual pronunciation at each occurrence of a vowel in the phonetic positions named above. The data from both dialect strata were then systematised into two corresponding systems, which were compared with the information retrievable from the dialect atlas and other dialectological literature on the vowel phoneme system of the traditional local vernacular. As a result, it was found (as hypothesised) that the vernacular vowel phoneme system has approached that of the standard language but has nonetheless not become identical to it. 
The traditional vernacular has one more vowel phoneme than the standard language, whereas the number of vowel phonemes in the speech of the school children coincides with that in the standard language, although the phonetic realisations differ to some extent. The analysis of the speech of the elderly people showed that it is quite difficult to define the exact number of phonemes in this stratum, owing to the fluctuation and irregularities in the realisation of the old phoneme that has ceased to exist in the newest stratum. It was noticed that the effect of the quality of the surrounding consonants on the phonetic realisation of the vowel phonemes has diminished, and that the dependence of the phonetic realisation of a vowel phoneme on its position in a word relative to the word stress has become more and more obvious, which is the state of affairs in the standard language as well.
Abstract:
The methodology of extracting information from texts has widely been described in the current literature. However, the methodology has been developed mainly for the purposes of other fields than terminology science. In addition, the research has been English language oriented. Therefore, there are no satisfactory language-independent methods for extracting terminological information from texts. The aim of the present study is to form the basis for a further improvement of methods for extraction of terminological information. A further aim is to determine differences in term extraction between subject groups with or without knowledge of the special field in question. The study is based on the theory of terminology, and has mainly a qualitative approach. The research material consists of electronically readable specialized texts in the subject domain of maritime safety. Textbooks, conference papers, research reports and articles from professional journals in Finnish and in Russian are included. The thesis first deals with certain term extraction methods. These are manual term identification and semi-automatic term extraction, the latter of which was carried out by using three commercial computer programs. The results of term extraction were compared and the recall and precision of the methods were evaluated. The latter part of the study is dedicated to the identification of concept relations. Certain linguistic expressions, which some researchers call knowledge probes, were applied to identify concept relations. The results of the present thesis suggest that special field knowledge is an advantage in manual term identification. However, in the candidate term lists the variation between subject groups was not as remarkable as it was between individual subjects. The term extraction software tested here produces candidate term lists which can be useful, but only after some manual work. Therefore, the work emphasizes the need to further develop term extraction software. 
Furthermore, the analyses indicate that there are a certain number of terms which were extracted by all the subjects and the software. These terms we call core terms. As the result of the experiment on linguistic expressions which signal concept relations, a proposal of Finnish and Russian knowledge probes in the field of maritime safety was made. The main finding was that it would be useful to combine the use of knowledge probes with semi-automatic term extraction since knowledge probes usually occur in the vicinity of terms.
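The evaluation described in the abstract above (recall and precision of extraction methods, and "core terms" extracted by all subjects and the software) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example term lists are hypothetical, and treating a manually identified term list as the reference set is an assumption, not a description of the thesis's procedure.

```python
from typing import Iterable, Set, Tuple


def precision_recall(candidates: Iterable[str],
                     reference: Iterable[str]) -> Tuple[float, float]:
    """Precision and recall of a candidate term list against a reference list."""
    cand, ref = set(candidates), set(reference)
    hits = len(cand & ref)  # candidate terms that are also reference terms
    precision = hits / len(cand) if cand else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


def core_terms(term_lists: Iterable[Iterable[str]]) -> Set[str]:
    """Terms found in every list (all subjects and all programs)."""
    sets = [set(terms) for terms in term_lists]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    # Hypothetical term lists, invented for this example.
    manual = ["life raft", "distress signal", "muster station"]
    software = ["life raft", "distress signal", "the vessel", "page"]
    print(precision_recall(software, manual))
    print(core_terms([manual, software]))
```

Low precision with reasonable recall would correspond to the abstract's observation that the candidate lists are useful only after some manual work.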
Abstract:
This study deals with language change and variation in the correspondence of the eighteenth-century Bluestocking circle, a social network which provided learned men and women with an informal environment for the pursuit of scholarly entertainment. Elizabeth Montagu (1718-1800), a notable social hostess and a Shakespearean scholar, was one of their key figures. The study presents the reconstruction of Elizabeth Montagu's social networks from her youth to her later years with a special focus on the Bluestocking circle, and linguistic research on private correspondence between Montagu and her Bluestocking friends and family members between the years 1738 and 1778. The epistolary language use is investigated using the methods and frameworks of corpus linguistics, historical sociolinguistics, and social network analysis. The approach is diachronic and concerns real-time language change. The research is based on a selection of manuscript letters which I have edited and compiled into an electronic corpus (Bluestocking Corpus). I have also devised a network strength scale in order to quantify the strength of network ties and to compare the results of the linguistic research with the network analysis. The studies range from the reconstruction and analysis of Elizabeth Montagu's most prominent social networks to the analysis of changing morphosyntactic features and spelling variation in Montagu's and her network members' correspondence. The linguistic studies look at the use of the progressive construction, preposition stranding and pied piping, and spelling variation in terms of preterite and past participle endings in the regular paradigm (-ed, -'d, -d, -'t, -t) and full / contracted spellings of auxiliary verbs. The results are analysed in terms of social network membership, sociolinguistic variables of the correspondents, and, when relevant, aspects of eighteenth-century linguistic prescriptivism. 
The studies showed a slight diachronic increase in the use of the progressive, a significant decrease in the stigmatised preposition stranding and an increase in pied piping, and relatively informal but socially controlled epistolary spelling. Certain significant changes in Elizabeth Montagu's language use over the years could be attributed to her increasingly prominent social standing and the changes in her social networks, and the strength of ties correlated strongly with the use of the progressive in the Bluestocking Corpus. Gender, social rank, and register in terms of kinship/friendship had a significant influence on language use, and an effect of prescriptivism could also be detected. Elizabeth Montagu's network ties resulted in language variation in terms of network membership, her own position in a given network, and the social factors that controlled eighteenth-century interaction. When all the network ties are strong, linguistic variation seems to be essentially linked to the social variables of the informants.
Abstract:
This doctoral thesis focuses on the translation of Finnish prose literature into English in the United Kingdom between 1945 and 2003. The subject is approached using translation archaeology, interviews, archival material, detailed text analysis and reception material. The main theoretical framework is Descriptive Translation Studies, and certain sociological theories (Bourdieu's field theory, actor-network theory) are also used. After charting the published translations, two periods of time are selected for closer analysis: an earlier period from 1955 to 1959, involving eight translations, and a later one from 1990 to 2003, with a total of six translations. While these translation numbers may appear low, they are actually rather high in proportion to the total number of 28 one-author literary prose translations published in the UK over the approximately 60 years being studied. The two periods of time, the 1950s and 1990s, are compared in terms of the sociological context of translation activity, the reception of translations and their textual features. The comparisons show that the main changes in translation practice between these two periods are increased completeness (translations in the 1950s group often being shortened by hundreds of pages) and lesser use of indirect translation via an intermediary language (about half of the 1950s translations having been translated via Swedish). Otherwise, translation practices have not changed much: except for large omissions, which are far more frequent in the 1950s, variation within each group is larger than between groups. As to the sociological context, the main changes are an increase in long-term institution-level contacts and an increase in the promotion of foreign translation rights by Finnish publishing houses. This is in contrast to the 1950s, when translation rights were mainly sold through personal contacts by individual authors and translators. 
The reception of translations is difficult to study because of scarce material. However, the 1950s translations were aggressively marketed and therefore obtained far more reviews and reprints than the 1990s translations. Several of the 1950s books, mostly historical novels by Mika Waltari, were mainstream bestsellers at the time, while current translations are frequently made for niche markets. The thesis introduces ample new material on the translation of Finnish prose literature into English in the UK. The results are also relevant to translation from a minority literature into a majority one. As to translation theory, they lead us to question the social nature of translation norms and the assumption of a static target culture. The translations analysed here are located in a very fragmented interculture and gain a stronger position in the Finnish culture than in the British one.
Abstract:
My dissertation is a corpus-based study of non-finite constructions in Old English (OE). It revisits the question of Latin influence on OE syntax, offering a new evaluation of syntactic interference between Latin and OE, and, more generally, of the contact situation in the OE period, drawing on methods used in studying grammaticalization and language contact. I address three non-finite constructions: the absolute participial construction, the accusative-and-infinitive construction, and the nominative-and-infinitive construction, exemplified respectively in present-day English as: (a) She looked like a pixie sometimes, her eyes darting here and there, forever watchful (BNC CCM 98); (b) My first acquaintance with her was when I heard her sing (BNC CFY 2215); (c) Charles the Bald was said to resemble his grandfather physically (BNC HPT 175). This study compares data from translated texts against the background of original OE writings, establishing dependencies and differences between the two. Although the contrastive analysis of source and target texts is one of the major methods employed in the study, translation and translation strategies as such are only my secondary foci. The emphasis is rather on what source/target comparison can tell us about OE non-finite syntax and the typological differences between Latin and OE in this domain, and on whether contact-induced change can originate in translation. In terms of theoretical framework, I have adopted a functional-typological approach, which rests on the principles of iconicity and event integration and, to the best of my knowledge, has not been applied systematically to OE non-finite constructions. A further aim of the dissertation is therefore to test this framework and to see how OE fits into the cross-linguistic picture of non-finites. 
My research corpus consists of two samples: 1) written OE closely dependent on the Latin originals, based on editions of two gloss texts, five translations, and Latin originals of these texts, representing four text types: hymns, religious regulations, homily/life narrative, and biblical narrative (180,622 words); and 2) written OE as independent of Latin as possible, based on a selection from the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE) and representing five text types: laws, charters, correspondence, chronicle narrative, and homily/life narrative (274,757 words).
Abstract:
This thesis discusses the ways in which concepts and methodology developed in evolutionary biology can be applied to the explanation and study of language change. The parallel nature of the mechanisms of biological evolution and language change is explored along with the history of the exchange of ideas between these two disciplines. Against this background, computational methods developed in evolutionary biology are considered in terms of their applicability to the study of historical relationships between languages. Different phylogenetic methods are explained in common terminology, avoiding the technical language of statistics. The thesis is on the one hand a synthesis of earlier scientific discussion, and on the other an attempt to map out the problems of earlier approaches and to find new guidelines for the study of language change on their basis. The source material consists primarily of literature on the connections between evolutionary biology and language change, along with research articles describing applications of phylogenetic methods to language change. The thesis starts out by describing the initial development of the disciplines of evolutionary biology and historical linguistics, a process which right from the beginning can be seen to have involved an exchange of ideas concerning the mechanisms of language change and biological evolution. The historical discussion lays the foundation for the treatment of the generalised account of selection developed during recent decades. This account is aimed at creating a theoretical framework capable of explaining both biological evolution and cultural change as selection processes acting on self-replicating entities. This thesis focusses on the capacity of the generalised account of selection to describe language change as a process of this kind. In biology, the mechanisms of evolution are seen to form populations of genetically related organisms through time. 
One of the central questions explored in this thesis is whether selection theory makes it possible to picture languages as forming populations of a similar kind, and what a perspective like this can offer to the understanding of language in general. In historical linguistics, the comparative method and other, complementary methods have traditionally been used to study the development of languages from a common ancestral language. Computational, quantitative methods have not become widely used as part of the central methodology of historical linguistics. After the fading of the limited popularity enjoyed by the lexicostatistical method since the 1950s, only in recent years have the computational methods of phylogenetic inference used in evolutionary biology also been applied to the study of early language history. In this thesis the possibilities offered by the traditional methodology of historical linguistics and the new phylogenetic methods are compared. The methods are approached through the ways in which they have been applied to the Indo-European languages, the language family most thoroughly investigated using both the traditional and the phylogenetic methods. The problems of these applications, along with the optimal form of the linguistic data used in these methods, are explored in the thesis. The mechanisms of biological evolution are seen in the thesis as parallel only in a limited sense to the mechanisms of language change, but sufficiently so that the development of a generalised account of selection is deemed possibly fruitful for understanding language change. These similarities are also seen to support the validity of using phylogenetic methods in the study of language history, although the use of linguistic data and the models of language change employed by these methods are seen to await further development.
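To make the idea of quantitative comparison between languages more concrete, the sketch below computes a simple lexicostatistics-style distance: the share of meanings for which two languages fall into different cognate classes. It is a toy illustration under stated assumptions, not a method taken from the thesis. The language labels, meanings and cognate-class codes are invented, and real phylogenetic inference uses considerably more elaborate models than a raw distance table.

```python
from itertools import combinations
from typing import Dict, Tuple

# A language is represented as a mapping from meaning to cognate-class label.
Coding = Dict[str, int]


def cognate_distance(a: Coding, b: Coding) -> float:
    """Share of commonly attested meanings whose cognate classes differ."""
    shared = [meaning for meaning in a if meaning in b]
    if not shared:
        raise ValueError("the two languages share no coded meanings")
    differing = sum(1 for meaning in shared if a[meaning] != b[meaning])
    return differing / len(shared)


def distance_matrix(langs: Dict[str, Coding]) -> Dict[Tuple[str, str], float]:
    """Pairwise distances for every unordered pair of languages."""
    return {
        (x, y): cognate_distance(langs[x], langs[y])
        for x, y in combinations(sorted(langs), 2)
    }


if __name__ == "__main__":
    # Invented toy data: three hypothetical languages, three meanings.
    toy = {
        "A": {"hand": 1, "water": 1, "two": 1},
        "B": {"hand": 1, "water": 2, "two": 1},
        "C": {"hand": 2, "water": 3, "two": 1},
    }
    for pair, dist in distance_matrix(toy).items():
        print(pair, round(dist, 2))
```

A table of this kind is the sort of input that distance-based tree-building procedures start from, which is one reason the form of the linguistic data matters as much as the abstract suggests.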
Abstract:
This thesis is part of the HY-talk research project funded by the University of Helsinki, the aim of which is to strengthen the teaching and assessment of speech communication, and especially of oral skills in foreign languages, in general education and in higher education. The aim of this thesis is to find out what kinds of repairs speakers of English as a foreign language make to their speech and to examine the relationship between self-repair and fluency. Repair organisation and self-repair have previously been studied in both conversation analysis and psycholinguistics, and although this thesis is closer to earlier conversation-analytic than psycholinguistic research, it draws on both approaches. Self-repair has generally been regarded as a sign of a lack of fluency, particularly in non-native speakers. The purpose of this thesis is to find out how closely self-repair is actually related to fluency or the lack of it. The material consists of speech samples collected for the HY-talk project and the proficiency-level assessments based on them. The speech samples were collected in 2007 in connection with spoken-language testing sessions arranged for the project in three schools in southern Finland. Because the aim of the project is to study and improve the assessment of oral language skills, language professionals involved in the project assessed the speakers' proficiency levels using assessment scales compiled for the project (on the basis of the proficiency descriptors of the Common European Framework of Reference), and these assessments were stored as part of the project material. The thesis analyses self-repairs using a classification of repair types compiled on the basis of earlier psycholinguistic research and a classification, created for this thesis, that compares the correctness of repairs. In addition, it compares the self-repairs of two speakers who received higher and two speakers who received lower proficiency-level assessments. 
The results show that many different types of repair occur in non-native speech, and that the most common repairs are repetitions of the original utterance. Repairs in which the speaker corrects an error, or breaks off and starts an entirely new utterance, are also common. In addition, the results indicate that most of the repairs probably do not stem from the speakers' lack of fluency. The most common repair types may largely be due to the individual's speaking style, to the speaker searching for a particular word or expression, or to the speaker correcting a grammatical, lexical or phonetic error that they have noticed in their own speech. The comparison between speakers assessed at higher and lower proficiency levels shows most clearly that the majority of self-repairs are not connected with the speaker's fluency. The comparison reveals that the number of self-repairs alone does not indicate how fluently a speaker uses the language, since one of the speakers assessed at the higher level repairs their speech almost as often as the speakers assessed at the lower level. Furthermore, the results of the classification comparing the correctness of repairs suggest that neither the speakers assessed at the higher level nor those assessed at the lower level, in most cases, repair their speech because they would otherwise be unable to express their message correctly and comprehensibly.
Abstract:
A 26-hour English reading comprehension course was taught to two groups of second-year Finnish Pharmacy students: a virtual group (33 students) and a teacher-taught group (25 students). The aims of the teaching experiment were to find out: 1. What has to be taken into account when teaching English reading comprehension to students of pharmacy via the Internet and using TopClass? 2. How will the learning outcomes of the virtual group and the control group differ? 3. How will the students and the Department of Pharmacy respond to the different and new method, i.e. the virtual teaching method? 4. Will it be possible to test English reading comprehension learning material using the groupware tool TopClass? The virtual exercises were written within the Internet authoring environment TopClass. The virtual group was given the reading material and grammar booklet on paper, but they did the reading comprehension tasks (written by the teacher) autonomously via the Internet. The control group was taught by the same teacher in twelve 2-hour sessions, while the virtual group could work independently within the given six weeks. Both groups studied the same material: ten pharmaceutical articles with reading comprehension tasks as well as grammar and vocabulary exercises. Both groups took the same final test. Students in both groups were asked to evaluate the course using a 1 to 5 rating scale, and they were also asked to assess their respective courses verbally. A detailed analysis of the different aspects of the student evaluation is given. Conclusions: 1. The virtual students learned pharmaceutical English relatively well but not significantly better than the classroom students. 2. The overall student satisfaction in the virtual pharmacy English reading comprehension group was found to be higher than that in the teacher-taught control group. 3. Virtual learning is easier for linguistically more able students; less able students need more time with the teacher. 
4. The sample in this study is rather small, but it is a pioneering study. 5. The Department of Pharmacy in the University of Helsinki wishes to incorporate virtual English reading comprehension teaching in its curriculum. 6. The sophisticated and versatile TopClass system is relatively easy for a traditional teacher and quite easy for the students to learn. It can be used e.g. for automatic checking of routine answers and for document transfer, both of which lighten the workloads of both parties. It is especially convenient for teaching reading comprehension. Key words: English reading comprehension, teacher-taught class, virtual class, attitudes of students, learning outcomes