40 results for Literatures of Germanic languages
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Research on language equations has been active during the last decades. Compared to equations on words, equations on languages are much more difficult to solve. Even very simple equations that are easy to solve for words can be very hard for languages. In this thesis we study two such equations, namely the commutation and conjugacy equations. We study these equations in some limited special cases and compare some of the results to the solutions of the corresponding equations on words. For both equations we study the maximal solutions, the centralizer and the conjugator. We present a fixed point method that can be used to search for these maximal solutions and analyze the reasons why this method is not successful for all languages. We also give several examples to illustrate the behaviour of this method.
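The two equations named in the abstract can be made concrete for finite languages. The sketch below is only an illustration of the equations themselves, not of the thesis's fixed point method, which the abstract does not spell out. It assumes the usual definitions: languages X and Y commute when XY = YX, and X and Y are conjugated by Z when XZ = ZY. The function names `concat`, `commutes` and `conjugates` are hypothetical.

```python
def concat(X, Y):
    """Concatenation of two finite languages given as sets of strings."""
    return {x + y for x in X for y in Y}


def commutes(X, Y):
    """True if XY = YX (commutation equation, assumed definition)."""
    return concat(X, Y) == concat(Y, X)


def conjugates(X, Y, Z):
    """True if XZ = ZY (conjugacy equation, assumed definition)."""
    return concat(X, Z) == concat(Z, Y)


L = {"a", "ab"}
L2 = concat(L, L)                 # L commutes with its own powers
print(commutes(L, L2))            # L * L2 and L2 * L are both L^3
print(commutes(L, {"ba"}))        # {"aba", "abba"} vs {"baa", "baab"}
print(conjugates({"ab"}, {"ba"}, {"a"}))  # "ab" + "a" == "a" + "ba"
```

For single words these checks reduce to plain string equality; for languages, every product is a set of words, which hints at why maximal solutions are harder to characterize.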
Abstract:
Presentation by Jussi-Pekka Hakkarainen at the Institute of the Estonian Language (Eesti Keele Instituut) in Tallinn on 23 October 2013.
Abstract:
Presentation at "Soome-ugri keelte andmebaasid ja e-leksikograafia" at Eesti Keele Instituut (Institute of the Estonian Language) in Tallinn on the 18th of November 2014.
Abstract:
Presentation by Jussi-Pekka Hakkarainen at the Lower Saxony State and University Library in Göttingen on 28 May 2013.
Abstract:
Presentation by Jussi-Pekka Hakkarainen, held at the Emtacl15 conference on the 20th of April 2015 in Trondheim, Norway.
Abstract:
Emerging technologies have recently challenged libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially nationwide institutions like national libraries, have not always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will be drawing a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspaper titles in various, and in some cases endangered, Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, which was the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical and orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced into the languages. 
This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used for creating the morphological analyzers and online dictionaries at the Helsinki and Tromsø Universities, for instance. In order to reach the targets, we will produce not only the digitized materials but also their development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with the research of language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in linguistic research. How does the library meet the objectives, which appear to be beyond its traditional playground? 
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded text (OCR) often contains too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often included characters that have since fallen out of use, which are sadly neglected by the modern OCR software developers, but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor of the ALTO XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface, that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. Also, a remarkable downside is the lack of a shared goal or social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012). 
Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of traditional crowdsourcing become more apparent when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced satisfactory results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources from, their specific richness in skill is suited for complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can respond to the needs of research more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand out assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting words in those fields of vocabulary where the researchers require more information. For instance, there is a lack of Hill Mari words and terminology in anatomy. 
We have digitized the books in medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing’s perspective, it is essential that altruism play a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.
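The OCR editor described in this abstract works on ALTO XML, where each recognized word is a `String` element with a `CONTENT` attribute and, optionally, a word-confidence attribute `WC`. The following is a minimal sketch of how low-confidence words could be picked out for a citizen scientist to review and then corrected in place. It is not the project's actual editor code: the sample document, the threshold of 0.5, and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ALTO page; real files carry layout
# coordinates and many more elements.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String ID="w1" CONTENT="kala" WC="0.95"/>
    <SP/>
    <String ID="w2" CONTENT="k4la" WC="0.40"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Strip the XML namespace so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def words_needing_review(root, threshold=0.5):
    """Return String elements whose word confidence is below threshold."""
    flagged = []
    for el in root.iter():
        if local_name(el.tag) != "String":
            continue
        wc = el.get("WC")
        if wc is not None and float(wc) < threshold:
            flagged.append(el)
    return flagged


def correct_word(element, new_content):
    """Overwrite the OCR reading with the reviewer's correction."""
    element.set("CONTENT", new_content)


root = ET.fromstring(SAMPLE_ALTO)
for el in words_needing_review(root):
    print(el.get("ID"), el.get("CONTENT"))
```

A front-end editor would present the flagged words next to the page image and send corrections back to the server; here `correct_word` simply stands in for that step.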
Abstract:
There are more than 7000 languages in the world, and many of these have emerged through linguistic divergence. While questions related to the drivers of linguistic diversity have been studied before, including studies with quantitative methods, there is no consensus as to which factors drive linguistic divergence, and how. In the thesis, I have studied linguistic divergence with a multidisciplinary approach, applying the framework and quantitative methods of evolutionary biology to language data. With quantitative methods, large datasets may be analyzed objectively, while approaches from evolutionary biology make it possible to revisit old questions (related to, for example, the shape of the phylogeny) with new methods, and adopt novel perspectives to pose novel questions. My chief focus was on the effects exerted on the speakers of a language by environmental and cultural factors. My approach was thus an ecological one, in the sense that I was interested in how the local environment affects humans and whether this human-environment connection plays a possible role in the divergence process. I studied this question in relation to the Uralic language family and to the dialects of Finnish, thus covering two different levels of divergence. However, as the Uralic languages have not previously been studied using quantitative phylogenetic methods, nor have population genetic methods been previously applied to any dialect data, I first evaluated the applicability of these biological methods to language data. I found the biological methodology to be applicable to language data, as my results were rather similar to traditional views as to both the shape of the Uralic phylogeny and the division of Finnish dialects. I also found environmental conditions, or changes in them, to be plausible inducers of linguistic divergence: whether in the first steps in the divergence process, i.e. dialect divergence, or on a large scale with the entire language family. 
My findings concerning Finnish dialects led me to conclude that the functional connection between linguistic divergence and environmental conditions may arise through human cultural adaptation to varying environmental conditions. This is also one possible explanation on the scale of the Uralic language family as a whole. The results of the thesis bring insights on several different issues in both a local and a global context. First, they shed light on the emergence of the Finnish dialects. If the approach used in the thesis is applied to the dialects of other languages, broader generalizations may be drawn as to the inducers of linguistic divergence. This again brings us closer to understanding the global patterns of linguistic diversity. Secondly, the quantitative phylogeny of the Uralic languages, with estimated times of language divergences, yields another hypothesis as to the shape and age of the language family tree. In addition, the Uralic languages can now be added to the growing list of language families studied with quantitative methods. This will allow broader inferences as to global patterns of language evolution, and more language families can be included in constructing the tree of the world’s languages. Studying history through language, however, is only one way to illuminate the human past. Therefore, thirdly, the findings of the thesis, when combined with studies of other language families, and those for example in genetics and archaeology, bring us again closer to an understanding of human history.
Abstract:
The purpose of this comparative study is to profile second language learners by exploring the factors which have an impact on their learning. The subjects come from two different countries: one group comes from Milwaukee, US, and the other from Turku, Finland. The subjects have attended bilingual classes from elementary school to senior high school in their respective countries. In the United States, the subjects (N = 57) started in one elementary school from where they moved on to two high schools in the district. The Finnish subjects (N = 39) attended the same school from elementary to high school. The longitudinal study was conducted during 1994-2004 and combines both qualitative and quantitative research methods. A Pilot Study carried out in 1990-1991 preceded the two subsequent studies that form the core material of this research. The theoretical part of the study focuses first on language policies in the United States and Finland: special emphasis is given to the history, development and current state of bilingual education, and the factors that have affected policy-making in the provision of language instruction. Current language learning theories and models form the theoretical foundation of the research, and underpin the empirical studies. Cognitively-labeled theories are at the forefront, but sociocultural theory and the ecological approach are also accounted for. The research methods consist of questionnaires, compositions and interviews. A combination of statistical methods as well as content analysis were used in the analysis. The attitude of the bilingual learners toward L1 and L2 was generally positive: the subjects enjoyed learning through two languages and were motivated to learn both. The knowledge of L1 and parental support, along with early literacy in L1, facilitated the learning of L2. This was particularly evident in the American subject group. 
The American subjects’ L2 learning was affected by the attitudes of the learners to the L1 culture and its speakers. Furthermore, the negative attitudes taken by L1 speakers toward L2 speakers and the lack of opportunities to engage in activities in the L1 culture affected the American subjects’ learning of L2, English. The research showed that many American L2 learners were isolated from the L1 culture and were even afraid to use English in everyday communication situations. In light of the research results, a politically neutral linguistic environment, which the Finnish subjects inhabited, was seen to be more favorable for learning. The Finnish subjects were learning L2, English, in a neutral zone where their own attitudes and motivation dictated their learning. The role of L2 as a means of international communication in Finland, as opposed to a means of exercising linguistic power, provided a neutral atmosphere for learning English. In both the American and Finnish groups, the learning of other languages was facilitated when the learner had a good foundation in their L1, and the learning of L1 and L2 were in balance. Learning was also fostered when the learners drew positive experiences from their surroundings and were provided with opportunities to engage in activities where L2 was used.
Abstract:
The focus of this study is to examine relations between the police and immigrants, as little is known about this process in the country. The studies were approached in two different ways. Firstly, an attempt was made to examine how immigrants view their encounters with the police. Secondly, the studies explored how aware the police are of immigrants’ experiences in their various encounters and interactions on the street level. An ancillary aim of the studies is to clarify, analyse and discuss how prejudice and stereotypes can be tackled, thereby contributing to the general debate about racism and discrimination for better ethnic relations in the country. The analysis is based on data from a group of adults (n=88) out of the total of 120 Africans questioned for the entire study, together with police cadets (n=45) and serving police officers (n=6) from Turku. The present thesis is a compilation of five articles. A summary of each article’s findings follows; the same data was used in all five studies. In the first study, a theoretical model was developed to examine the perceived knowledge of bias by immigrants resulting from race, culture and belief. This was also an attempt to explore whether this knowledge was predetermined, as part of my attempt to classify, discuss and analyse the factors that may be influencing immigrants’ allegations of unfair treatment by the police in Turku. The main finding of the first paper is that there was ignorance and naivety on the part of the police in their attitudes towards the African immigrants’ prior experiences with the police, and this may have resulted from stereotypes or from their lack of experience as well as prior training with immigrants where these kinds of experience are rampant in the country (Egharevba, 2003 and 2004a). 
In exploring what leads to stereotypes, a working definition is the assumption that is prevalent among some segments of the population, including the police, that Finland is a homogenous country by employing certain conducts and behaviour towards ethnic and immigrant groups in the country. This to my understanding is stereotype. Historically this was true, but today the social topography of the country is changing and becoming even more complex. It is true that, on linguistic grounds, the country is multilingual, as there are a few recognised national minority languages (Swedish, Sami and Russian) as well as a number of immigrant languages including English. Apparently it is vital for the police to have a line of communication open when addressing the problem associated with immigrants in the country. The second paper moved a step further by examining African immigrants’ understanding of human rights as well as what human rights violation means or entails in their views as a result of their experiences with the police, both in Finland and in their country of origin. This approach became essential during the course of the study, especially when the participants were completing the questionnaire (N=88), where volunteers were solicited for a later date for an in-depth interview with the author. Many of the respondents came from countries where human rights are not well protected and seldom discussed publicly, therefore understanding their views on the subject can help to explain why some of the immigrants are sceptical about coming forward to report cases of batteries and assaults to the police, or even their experiences of being monitored in shopping malls in their new home and the reason behind their low level of trust in public authorities in Finland. The study showed that knowledge of human rights is notably low among some of the participants. The study also found that female respondents were less aware of human rights when compared with their male counterparts. 
This has resulted in some of the male participants focussing more on their traditional ways of thinking by not realising that they are in a new country where there is equality between the sexes and lack of respect on gender terms is not condoned. The third paper focussed on the respondents’ experiences with the police in Turku and tried to explore police attitudes towards African immigrant clients, in addition to the role stereotypes play in police views of different cultures and how these views have impacted on immigrants’ views of discriminatory policing in Turku. The data is the same throughout the studies (n=88), except that a subset of thirty-five participants was interviewed for the third paper. The results showed that there is some bias in mass-media reports on immigrants’ issues, due to selective portrayal of biases without much investigation being carried out before jumping to conclusions, especially when the issues at stake involve an immigrant (Egharevba, 2005a; Egharevba, 2004a and 2004b). In this vein, there was an allegation that the police are even biased while investigating cases of theft, especially if the stolen property is owned by an immigrant (Egharevba, 2006a; Egharevba, 2006b). One vital observation from the respondents’ various comments was that race has meaning in their encounters and interaction with the police in the country. This result led the author to conclude that the relation between the police and immigrants is still a challenge, as there is rampant fear and distrust towards the police among some segments of the respondents participating in the study. In the fourth paper the focus was on examining the respondents’ view of the police, with special emphasis on race and culture as well as the respondents’ perspective on police behaviour in Turku. This is because race, as it was relayed to me in the study, is a significant predictor of police perception (Egharevba, 2005a; Egharevba and Hannikainen, 2005). 
It is a known scientific fact that inter-group racial attitudes are the representation of group competition and perceived threat to power and status (Group-position theory). According to Blumer (1958) a sense of group threat is an essential element for the emergence of racial prejudice. Consequently, it was essential that we explored the existing relationship between the respondents and the police in order to have an understanding of this concept. The result indicates some local and international contextual issues and assumptions that were of importance in tackling prejudice and discrimination as it exists within the police in the country. Moreover, we also have to remember that, for years, many of these African immigrants have been on the receiving end of unjust law enforcement in their various countries of origin, which has resulted in many of them feeling inferior and distrustful of the police even in their own country of origin. While discussing the issues of cultural difference and how it affects policing, we must also keep in mind the socio-cultural background of the participants, their level of language proficiency and educational background. The research data analysed in this study also confirmed the difficulties associated with cultural misunderstandings in interpreting issues and how these misunderstandings have affected police and immigrant relations in Finland. Finally, the fifth paper focussed on cadets’ attitudes towards African immigrants as well as serving police officers’ interaction with African clients. Secondly, the police level of awareness of African immigrants’ distrustfulness of their profession was unclear. For this reason, my questions in this fifth study examined the experiences and attitudes of police cadets and serving police officers as well as those of African immigrants in understanding how to improve this relationship in the country. 
The data was based on immigrant participants (n=88), police cadets (n=45) and serving police officers (n=6) from the Turku police department. The result suggests that there is distrust of the police in the respondents’ interaction; this tends to have galvanised a heightened tension resulting from the lack of language proficiency (Egharevba and White, 2007; Egharevba and Hannikainen, 2005; Egharevba, 2006b). The result also shows that the allegation of immigrants being belittled by the police stems from the misconceptions of both parties as well as the notion of stop and search by the police in Turku. All these factors were observed to have contributed to the alleged police evasiveness and the lack of regular contact between the respondents and the police in their dealings. In other words, the police have only had job-related contact with many of the participants in the present study. The results also demonstrated the complexities caused by the low level of education among some of the African immigrants in their understanding of the Finnish culture, norms and values in the country. Thus, the framework constructed in these studies embodies diversity in national culture as well as the need for a further research study with a greater number of respondents (both from the police and immigrant/majority groups), in order to explore the different role cultures play in immigrant and majority citizens’ understanding of police work.
Abstract:
Software plays an important role in our society and economy. Software development is an intricate process, and it comprises many different tasks: gathering requirements, designing new solutions that fulfill these requirements, as well as implementing these designs using a programming language into a working system. As a consequence, the development of high quality software is a core problem in software engineering. This thesis focuses on the validation of software designs. The issue of the analysis of designs is of great importance, since errors originating from designs may appear in the final system. It is considered economical to rectify the problems as early in the software development process as possible. Practitioners often create and visualize designs using modeling languages, one of the more popular being the Unified Modeling Language (UML). The analysis of the designs can be done manually, but in the case of large systems, the need arises for mechanisms that automatically analyze these designs. In this thesis, we propose an automatic approach to analyze UML-based designs using logic reasoners. This approach firstly proposes the translations of the UML-based designs into a language understandable by reasoners in the form of logic facts, and secondly shows how to use the logic reasoners to infer the logical consequences of these logic facts. We have implemented the proposed translations in the form of a tool that can be used with any standard-compliant UML modeling tool. Moreover, we evaluate the proposed approach by automatically validating hundreds of UML-based designs that consist of thousands of model elements available in an online model repository. The proposed approach is limited in scope, but is fully automatic and does not require any expertise in logic languages from the user. We exemplify the proposed approach with two applications, which include the validation of domain specific languages and the validation of web service interfaces.
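The two steps named in this abstract, translating a design into logic facts and then inferring their consequences, can be sketched on a toy scale. The example below is an assumption-laden illustration, not the thesis's translation or tool: the dictionary-based class model, the fact tuples, the single transitivity rule and all function names are hypothetical, and a real approach would hand the facts to an existing reasoner.

```python
def to_facts(model):
    """Translate a toy class model {class: [parents]} into logic facts."""
    facts = set()
    for cls, parents in model.items():
        facts.add(("class", cls))
        for parent in parents:
            facts.add(("subclass", cls, parent))
    return facts


def infer(facts):
    """Naive forward chaining with one rule:
    subclass(A, B) and subclass(B, C) imply subclass(A, C)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        subs = [f for f in derived if f[0] == "subclass"]
        for (_, a, b) in subs:
            for (_, c, d) in subs:
                if b == c and ("subclass", a, d) not in derived:
                    derived.add(("subclass", a, d))
                    changed = True
    return derived


def cyclic_classes(facts):
    """A class that is inferred to be its own subclass signals a design error."""
    return sorted(f[1] for f in infer(facts)
                  if f[0] == "subclass" and f[1] == f[2])


good = {"Vehicle": [], "Car": ["Vehicle"], "SportsCar": ["Car"]}
bad = {"A": ["B"], "B": ["A"]}
print(("subclass", "SportsCar", "Vehicle") in infer(to_facts(good)))
print(cyclic_classes(to_facts(bad)))
```

The point of the sketch is the division of labour: the translation is mechanical, and the inconsistency (here, cyclic inheritance) only becomes visible once consequences are inferred.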
Abstract:
Presentation by Jussi-Pekka Hakkarainen at the 24th International Congress of History of Science, Technology and Medicine in Manchester on 26 July 2013.
Abstract:
Presentation at the 12th Bibliotheca Baltica Symposium at Södertörn University Library
Abstract:
The National Library of Finland is implementing the Digitization Project of Kindred Languages in 2012–16. Within the project we will digitize materials in the Uralic languages as well as develop tools to support linguistic research and citizen science. Through this project, researchers will gain access to new corpora, to which all users will have open access regardless of their place of residence. Our objective is to make sure that the new corpora are made available for the open and interactive use of both the academic community and the language societies as a whole. The project seeks to digitize and publish approximately 1200 monograph titles and more than 100 newspaper titles in various Uralic languages. The digitization will be completed by early 2015, when the Fenno-Ugrica collection will contain around 200 000 pages of editable text. The researchers cannot spend so much time with the material that they could retrieve a satisfactory amount of edited words, so the participation of a crowd in editing work is needed. Often the targets in crowdsourcing have been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguistic research are not necessarily met. Also, the number of pages is too high to deal with. A remarkable downside is the lack of a shared goal or social affinity. There is no reward in traditional methods of crowdsourcing. Nichesourcing is a specific type of crowdsourcing where tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources from, their specific richness in skill is suited for the complex tasks with high-quality product expectations found in nichesourcing. 
Communities have a purpose and identity, and their regular interactions engender social trust and reputation. These communities can respond to the needs of research more precisely. Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. Some selection must be made, since we are not aiming to correct all 200,000 pages which we have digitized, but to give citizen scientists assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting words in those fields of vocabulary where the researchers require more information. For instance, there is a lack of Hill Mari words in anatomy. We have digitized the books in medicine and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing perspective, it is essential that altruism plays a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of “two masters”, the research and the society.
Abstract:
We have investigated Russian children’s reading acquisition during an intermediate period in their development: after literacy onset, but before they have acquired well-developed decoding skills. The results of our study suggest that Russian first graders rely primarily on phonemes and syllables as reading grain-size units. Phonemic awareness seems to have reached the metalinguistic level more rapidly than syllabic awareness after the onset of reading instruction, a reversal that is typical for the initial stages of formal reading instruction, which creates external demand for phonemic awareness. Another reason might be the inherent instability of syllabic boundaries in Russian. We have shown that body-coda is a more natural representation of subsyllabic structure in Russian than onset-rime. We also found that Russian children displayed variability in syllable onset and offset decisions, which can be attributed to the lack of congruence between syllabic and morphemic word division in Russian. We suggest that the fuzziness of syllable boundary decisions is a sign of the transitional nature of this stage in reading development and that it indicates progress towards an awareness of morphologically determined closed syllables. Our study also showed that orthographic complexity exerts an influence on reading in Russian from the very start of reading acquisition. In addition, we found that Russian first graders experience fluency difficulties in reading orthographically simple words and nonwords of two or more syllables. The transition from monosyllabic to bisyllabic lexical items constitutes a certain threshold, for which the syllabic structure seemed to make no difference. 
When we compared the outcomes of the Russian children with the ones produced by speakers of other languages, we discovered that in the tasks which could be performed with the help of alphabetic recoding Russian children’s accuracy was comparable to that of children learning to read in relatively shallow orthographies. In tasks where this approach works only partially, Russian children demonstrated accuracy results similar to those in deeper orthographies. This pattern of moderate results in accuracy and excellent performance in terms of reaction times is an indication that children apply phonological recoding as their dominant strategy to various reading tasks and are only beginning to develop suitable multiple strategies in dealing with orthographically complex material. The development of these strategies is not completed during Grade 1 and the shift towards diversification of strategies apparently continues in Grade 2.