24 results for regular languages

in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

60.00%

Publisher:

Abstract:

The three main topics of this work are independent systems and chains of word equations, parametric solutions of word equations on three unknowns, and unique decipherability in the monoid of regular languages. The most important result about independent systems is a new method giving an upper bound for their sizes in the case of three unknowns. The bound depends on the length of the shortest equation. This result has generalizations for decreasing chains and for more than three unknowns. The method also leads to shorter proofs and generalizations of some old results. Hmelevskii’s theorem states that every word equation on three unknowns has a parametric solution. We give a significantly simplified proof for this theorem. As a new result we estimate the lengths of parametric solutions and get a bound for the length of the minimal nontrivial solution and for the complexity of deciding whether such a solution exists. The unique decipherability problem asks whether given elements of some monoid form a code, that is, whether they satisfy no nontrivial equation. We give characterizations for when a collection of unary regular languages is a code. We also prove that it is undecidable whether a collection of binary regular languages is a code.
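To make the notion of a code concrete, here is a minimal sketch of the classical Sardinas-Patterson test for a finite set of words. It covers only the word case, not the monoid of regular languages studied in the abstract, and the function names `dangling` and `is_code` are illustrative assumptions rather than anything taken from the thesis.

```python
def dangling(a, b):
    """Return every w such that x + w == y for some x in a and y in b."""
    return {y[len(x):] for x in a for y in b if y.startswith(x)}


def is_code(words):
    """Sardinas-Patterson test: True if the finite word set is uniquely decipherable."""
    code = set(words)
    if "" in code:
        # The empty word satisfies the nontrivial equation e = ee.
        return False
    # First round: nonempty leftovers when one codeword is a prefix of another.
    frontier = dangling(code, code) - {""}
    seen = set()
    while frontier:
        if "" in frontier:
            # An empty leftover means two different factorizations meet exactly.
            return False
        seen |= frontier
        # Next round: leftovers between codewords and the newest dangling suffixes.
        frontier = (dangling(code, frontier) | dangling(frontier, code)) - seen
    return True
```

For example, `{"0", "01", "10"}` is not a code because `"010"` factors both as `0·10` and as `01·0`, whereas the prefix-free set `{"0", "10", "110"}` is a code. The loop terminates because every dangling suffix is a suffix of some codeword, so only finitely many can occur.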

Relevance:

60.00%

Publisher:

Abstract:

This thesis introduces an extension of Chomsky’s context-free grammars equipped with operators for referring to left and right contexts of strings. The new model is called grammar with contexts. The semantics of these grammars are given in two equivalent ways: by language equations and by logical deduction, where a grammar is understood as a logic for the recursive definition of syntax. The motivation for grammars with contexts comes from an extensive example that completely defines the syntax and static semantics of a simple typed programming language. Grammars with contexts maintain most important practical properties of context-free grammars, including a variant of the Chomsky normal form. For grammars with one-sided contexts (that is, either left or right), there is a cubic-time tabular parsing algorithm, applicable to an arbitrary grammar. The time complexity of this algorithm can be improved to quadratic, provided that the grammar is unambiguous, that is, it only allows one parse for every string it defines. A tabular parsing algorithm for grammars with two-sided contexts has fourth power time complexity. For these grammars there is a recognition algorithm that uses a linear amount of space. For certain subclasses of grammars with contexts there are low-degree polynomial parsing algorithms. One of them is an extension of the classical recursive descent for context-free grammars; the version for grammars with contexts still works in linear time like its prototype. Another algorithm, with time complexity varying from linear to cubic depending on the particular grammar, adapts deterministic LR parsing to the new model. If all context operators in a grammar define regular languages, then such a grammar can be transformed to an equivalent grammar without context operators at all. This allows one to represent the syntax of languages in a more succinct way by utilizing context specifications.
Linear grammars with contexts turned out to be non-trivial already over a one-letter alphabet. This fact leads to some undecidability results for this family of grammars.
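As background for the tabular parsing algorithms mentioned in this abstract, the following sketch shows the classical cubic-time CYK recognizer for an ordinary context-free grammar in Chomsky normal form. It does not handle context operators, so it illustrates the prototype technique rather than the thesis's algorithm. The rule encoding and the example grammar for the language of strings a^n b^n (n ≥ 1) are assumptions made for illustration.

```python
def cyk(word, terminal_rules, binary_rules, start="S"):
    """Classical CYK recognition for a grammar in Chomsky normal form.

    terminal_rules: iterable of (A, a) pairs for rules A -> a
    binary_rules:   iterable of (A, B, C) triples for rules A -> B C
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][l] holds the nonterminals that derive word[i:i + l].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {A for A, a in terminal_rules if a == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for A, B, C in binary_rules:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]


# Hypothetical example grammar: S -> A B | A T, T -> S B, A -> a, B -> b.
AN_BN_TERMINALS = [("A", "a"), ("B", "b")]
AN_BN_BINARY = [("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")]
```

The three nested loops over substring length, start position and split point give the cubic dependence on the input length. With the example grammar, `cyk("aabb", AN_BN_TERMINALS, AN_BN_BINARY)` accepts and `cyk("aab", AN_BN_TERMINALS, AN_BN_BINARY)` rejects.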

Relevance:

30.00%

Publisher:

Abstract:

Emerging technologies have recently challenged libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially nationwide institutions like national libraries, haven’t always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will be drawing a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspaper titles in various, and in some cases endangered, Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, during the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s to 1930s suddenly challenged the lexical and orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language.
This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used for creating the morphological analyzers and online dictionaries at the Helsinki and Tromsø Universities, for instance. In order to reach the targets, we will produce not only the digitized materials but also development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with the research of language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in linguistic research. How does the library meet the objectives, which appear to be beyond its traditional playground?
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded text (OCR) often contains too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often included obsolete characters, which are sadly neglected by the modern OCR software developers, but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor of the ALTO XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface, that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. Another notable downside is the lack of a shared goal or social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012).
Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of traditional crowdsourcing become more evident when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced satisfactory results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources, their specific richness in skill is suited for complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can correspond to research more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand in such assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in such fields of vocabularies where the researchers do require more information. For instance, there is a lack of Hill Mari words and terminology in anatomy.
We have digitized books on medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the nichesourcing perspective, it is essential that altruism play a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.

Relevance:

20.00%

Publisher:

Abstract:

Software development tools use information from the source code produced by the developer. This information is used in different phases of a software project and for different purposes. In modern software projects the amount of information used can grow very large. Software tools have their own information models and access mechanisms. The amount of information and the separate tool information models make it very difficult to build a flexible tool environment, especially for a domain-specific software development process. In this work, basic information metamodels from the Unified Modeling Language, the Python programming language and the C++ programming language have been analyzed. The level of metainformation is restricted to the structural level. Executable constructs have been left out. The ModelBase metamodel has been combined from the existing analyzed metamodels. This metamodel can be used in the future for the development of software tools.

Relevance:

20.00%

Publisher:

Abstract:

The research on language equations has been active during the last decades. Compared to equations on words, equations on languages are much more difficult to solve. Even very simple equations that are easy to solve for words can be very hard for languages. In this thesis we study two such equations, namely the commutation and conjugacy equations. We study these equations in some limited special cases and compare some of these results to the solutions of the corresponding equations on words. For both equations we study the maximal solutions, the centralizer and the conjugator. We present a fixed point method that we can use to search for these maximal solutions and analyze the reasons why this method is not successful for all languages. We also give several examples to illustrate the behaviour of this method.
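For finite languages, candidate solutions of the commutation equation XL = LX and the conjugacy equation XZ = ZY can be checked by direct computation. The sketch below does only that; it is not the fixed point method described in the abstract, and the helper names `concat`, `commutes` and `conjugates` are assumptions introduced for illustration.

```python
def concat(xs, ys):
    """Concatenation of two finite languages, represented as sets of strings."""
    return {x + y for x in xs for y in ys}


def commutes(x, l):
    """Check the commutation equation XL = LX for finite languages."""
    return concat(x, l) == concat(l, x)


def conjugates(x, y, z):
    """Check the conjugacy equation XZ = ZY for finite languages."""
    return concat(x, z) == concat(z, y)
```

For instance, `{"", "ab", "abab"}` commutes with `{"ab"}`, while `{"ab"}` does not commute with `{"a", "b"}`. The languages `{"ab"}` and `{"ba"}` are conjugated by `{"a"}`, since both products equal `{"aba"}`.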

Relevance:

20.00%

Publisher:

Abstract:

The use of domain-specific languages (DSLs) has been proposed as an approach to cost-effectively develop families of software systems in a restricted application domain. Domain-specific languages, in combination with the accumulated knowledge and experience of previous implementations, can in turn be used to generate new applications with unique sets of requirements. For this reason, DSLs are considered to be an important approach for software reuse. However, the toolset supporting a particular domain-specific language is also domain-specific and is by definition not reusable. Therefore, creating and maintaining a DSL requires additional resources that could be even larger than the savings associated with using it. As a solution, different tool frameworks have been proposed to simplify and reduce the cost of developing DSLs. Developers of tool support for DSLs need to instantiate, customize or configure the framework for a particular DSL. There are different approaches for this. One approach is to use an application programming interface (API) and to extend the basic framework using an imperative programming language. An example of a tool based on this approach is Eclipse GEF. Another approach is to configure the framework using declarative languages that are independent of the underlying framework implementation. We believe this second approach can bring important benefits, as it brings focus to specifying what the tool should be like instead of writing a program specifying how the tool achieves this functionality. In this thesis we explore this second approach. We use graph transformation as the basic approach to customize a domain-specific modeling (DSM) tool framework.
The contributions of this thesis include a comparison of different approaches for defining, representing and interchanging software modeling languages and models, and a tool architecture for an open domain-specific modeling framework that efficiently integrates several model transformation components and visual editors. We also present several specific algorithms and tool components for the DSM framework. These include an approach for graph query based on region operators and the star operator, and an approach for reconciling models and diagrams after executing model transformation programs. We exemplify our approach with two case studies, MICAS and EFCO. In these studies we show how our experimental modeling tool framework has been used to define tool environments for domain-specific languages.

Relevance:

20.00%

Publisher:

Abstract:

The focus of this study is to examine relations between the police and immigrants, as little is known about this process in the country. The studies were approached in two different ways. Firstly, an attempt was made to examine how immigrants view their encounters with the police. Secondly, the studies explored how aware the police are of immigrants’ experiences in their various encounters and interactions on the street level. An ancillary aim of the studies is to clarify, analyse and discuss how prejudice and stereotypes can be tackled, thereby contributing to the general debate about racism and discrimination for better ethnic relations in the country. The data on which this analysis is based come from a group of adults (n=88) out of a total of 120 Africans questioned for the entire study, together with police cadets (n=45) and serving police officers (n=6) from Turku. The present thesis is a compilation of five articles. A summary of each article’s findings follows, as the same data was used in all five studies. In the first study, a theoretical model was developed to examine the perceived knowledge of bias by immigrants resulting from race, culture and belief. This was also an attempt to explore whether this knowledge was predetermined, in order to classify, discuss and analyse the factors that may be influencing immigrants’ allegations of unfair treatment by the police in Turku. The main finding of the first paper is that there was ignorance and naivety on the part of the police in their attitudes towards the African immigrants’ prior experiences with the police, and this may have resulted from stereotypes or from their lack of experience and prior training with immigrants, where these kinds of experience are rampant in the country (Egharevba, 2003 and 2004a).
In exploring what leads to stereotypes, a working definition is the assumption, prevalent among some segments of the population, including the police, that Finland is a homogeneous country, which is expressed through certain conduct and behaviour towards ethnic and immigrant groups in the country. This, to my understanding, is a stereotype. Historically this was true, but today the social topography of the country is changing and becoming even more complex. It is true that, on linguistic grounds, the country is multilingual, as there are a few recognised national minority languages (Swedish, Sami and Russian) as well as a number of immigrant languages including English. Apparently it is vital for the police to have a line of communication open when addressing the problems associated with immigrants in the country. The second paper moved a step further by examining African immigrants’ understanding of human rights as well as what human rights violation means or entails in their views as a result of their experiences with the police, both in Finland and in their country of origin. This approach became essential during the course of the study, especially when the participants were completing the questionnaire (N=88), where volunteers were solicited for a later date for an in-depth interview with the author. Many of the respondents came from countries where human rights are not well protected and seldom discussed publicly; therefore understanding their views on the subject can help to explain why some of the immigrants are sceptical about coming forward to report cases of battery and assault to the police, or even their experiences of being monitored in shopping malls in their new home, and the reason behind their low level of trust in public authorities in Finland. The study showed that knowledge of human rights is notably low among some of the participants. The study also found that female respondents were less aware of human rights when compared with their male counterparts.
This has resulted in some of the male participants focussing more on their traditional ways of thinking by not realising that they are in a new country where there is equality between the sexes and lack of respect on gender terms is not condoned. The third paper focussed on the respondents’ experiences with the police in Turku and tried to explore police attitudes towards African immigrant clients, in addition to the role stereotypes play in police views of different cultures and how these views have impacted on immigrants’ views of discriminatory policing in Turku. The data is the same throughout the studies (n=88), except that a subset of thirty-five participants was interviewed for the third paper. The results showed that there is some bias in mass-media reports on immigrants’ issues, due to selective portrayal of biases without much investigation being carried out before jumping to conclusions, especially when the issues at stake involve an immigrant (Egharevba, 2005a; Egharevba, 2004a and 2004b). In this vein, there was an allegation that the police are even biased while investigating cases of theft, especially if the stolen property is owned by an immigrant (Egharevba, 2006a; Egharevba, 2006b). One vital observation from the respondents’ various comments was that race has meaning in their encounters and interaction with the police in the country. This result led the author to conclude that the relation between the police and immigrants is still a challenge, as there is rampant fear and distrust towards the police among some segments of the participating respondents in the study. In the fourth paper the focus was on examining the respondents’ view of the police, with special emphasis on race and culture as well as the respondents’ perspective on police behaviour in Turku. This is because race, as it was relayed to me in the study, is a significant predictor of police perception (Egharevba, 2005a; Egharevba and Hannikainen, 2005).
According to group-position theory, inter-group racial attitudes are the representation of group competition and perceived threat to power and status. According to Blumer (1958), a sense of group threat is an essential element for the emergence of racial prejudice. Consequently, it was essential that we explored the existing relationship between the respondents and the police in order to have an understanding of this concept. The result indicates some local and international contextual issues and assumptions that were of importance in tackling prejudice and discrimination as it exists within the police in the country. Moreover, we also have to remember that, for years, many of these African immigrants have been on the receiving end of unjust law enforcement in their various countries of origin, which has resulted in many of them feeling inferior and distrustful of the police even in their own country of origin. While discussing the issues of cultural difference and how it affects policing, we must also keep in mind the socio-cultural background of the participants, their level of language proficiency and educational background. The research data analysed in this study also confirmed the difficulties associated with cultural misunderstandings in interpreting issues and how these misunderstandings have affected police and immigrant relations in Finland. Finally, the fifth paper focussed on cadets’ attitudes towards African immigrants as well as serving police officers’ interaction with African clients. Secondly, the police level of awareness of African immigrants’ distrustfulness of their profession was unclear. For this reason, my questions in this fifth study examined the experiences and attitudes of police cadets and serving police officers as well as those of African immigrants in understanding how to improve this relationship in the country.
The data was based on immigrant participants (n=88), police cadets (n=45) and serving police officers (n=6) from the Turku police department. The result suggests that there is distrust of the police in the respondents’ interaction; this appears to have been heightened by tension resulting from the lack of language proficiency (Egharevba and White, 2007; Egharevba and Hannikainen, 2005; Egharevba, 2006b). The result also shows that the allegation of immigrants being belittled by the police stems from the misconceptions of both parties as well as the notion of stop and search by the police in Turku. All these factors were observed to have contributed to the alleged police evasiveness and the lack of regular contact between the respondents and the police in their dealings. In other words, the police have only had job-related contact with many of the participants in the present study. The results also demonstrated the complexities caused by the low level of education among some of the African immigrants in their understanding of Finnish culture, norms and values. Thus, the framework constructed in these studies embodies diversity in national culture as well as the need for a further research study with a greater number of respondents (both from the police and immigrant/majority groups), in order to explore the different roles cultures play in immigrant and majority citizens’ understanding of police work.

Relevance:

20.00%

Publisher:

Abstract:

The thesis presents results obtained during the author’s PhD studies. First, systems of language equations of a simple form consisting of just two equations are proved to be computationally universal. These are systems over a unary alphabet, which are seen as systems of equations over sets of natural numbers. The systems contain only an equation X+A=B and an equation X+X+C=X+X+D, where A, B, C and D are eventually periodic constants. It is proved that for every recursive set S there exist natural numbers p and d, and eventually periodic sets A, B, C and D such that a number n is in S if and only if np+d is in the unique solution of the above-mentioned system of two equations, so all recursive sets can be represented in an encoded form. It is also proved that not all recursive sets can be represented as they are, so the encoding is really needed. Furthermore, it is proved that the family of languages generated by Boolean grammars is closed under injective gsm-mappings and inverse gsm-mappings. The arguments apply also for the families of unambiguous Boolean languages, conjunctive languages and unambiguous languages. Finally, characterizations for morphisms preserving subfamilies of context-free languages are presented. It is shown that the families of deterministic and LL context-free languages are closed under codes if and only if the codes are of bounded deciphering delay. These families are also closed under non-codes, if they map every letter into a submonoid generated by a single word. The family of unambiguous context-free languages is closed under all codes and under the same non-codes as the families of deterministic and LL context-free languages.
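To illustrate how unary language equations read as equations over sets of natural numbers, the sketch below treats + as elementwise addition of sets and checks the two equation shapes X+A=B and X+X+C=X+X+D on a finite initial segment. The bound, the helper name `set_sum` and the example sets are assumptions chosen for illustration. A bounded check like this can refute a candidate solution, but it cannot prove that the equation holds for the infinite sets.

```python
def set_sum(xs, ys, bound):
    """Elementwise sum {x + y : x in xs, y in ys}, truncated to values <= bound."""
    return {x + y for x in xs for y in ys if x + y <= bound}


BOUND = 20  # hypothetical cut-off for the finite check

# Example sets: X is the even numbers, restricted to 0..BOUND.
X = set(range(0, BOUND + 1, 2))
A = {1}
B = set(range(1, BOUND + 1, 2))  # odd numbers up to BOUND

# X + A = B holds on the initial segment 0..BOUND.
first_holds = set_sum(X, A, BOUND) == B

# X + X + C = X + X + D with C = {0} and D = {0, 2}.
XX = set_sum(X, X, BOUND)
second_holds = set_sum(XX, {0}, BOUND) == set_sum(XX, {0, 2}, BOUND)
```

Because all numbers are non-negative, the sum truncated at `BOUND` depends only on elements of the operands that are at most `BOUND`, so truncating the operands first does not change the comparison.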