948 results for Maximal Rewriting of a Regular Language at a Regular Substitution


Abstract:

Emerging technologies have recently challenged libraries to reconsider their role as a mere mediator between the collections, researchers, and wider audiences (Sula, 2013), and libraries, especially nationwide institutions like national libraries, haven’t always managed to face the challenge (Nygren et al., 2014). In the Digitization Project of Kindred Languages, the National Library of Finland has become a node that connects the partners to interplay and work for shared goals and objectives. In this paper, I will draw a picture of the crowdsourcing methods that have been established during the project to support both linguistic research and lingual diversity. The National Library of Finland has been executing the Digitization Project of Kindred Languages since 2012. The project seeks to digitize and publish approximately 1,200 monograph titles and more than 100 newspaper titles in various, and in some cases endangered, Uralic languages. Once the digitization has been completed in 2015, the Fenno-Ugrica online collection will consist of 110,000 monograph pages and around 90,000 newspaper pages to which all users will have open access regardless of their place of residence. The majority of the digitized literature was originally published in the 1920s and 1930s in the Soviet Union, during the genesis and consolidation period of literary languages. This was the era when many Uralic languages were converted into media of popular education, enlightenment, and dissemination of information pertinent to the developing political agenda of the Soviet state. The ‘deluge’ of popular literature in the 1920s and 1930s suddenly challenged the lexical and orthographic norms of the limited ecclesiastical publications from the 1880s onward. Newspapers were now written in orthographies and in word forms that the locals would understand. Textbooks were written to address the separate needs of both adults and children. New concepts were introduced in the language.
This was the beginning of a renaissance and period of enlightenment (Rueter, 2013). The linguistically oriented population can also find writings to their delight, especially lexical items specific to a given publication, and orthographically documented specifics of phonetics. The project is financially supported by the Kone Foundation in Helsinki and is part of the Foundation’s Language Programme. One of the key objectives of the Kone Foundation Language Programme is to support a culture of openness and interaction in linguistic research, but also to promote citizen science as a tool for the participation of the language community in research. In addition to sharing this aspiration, our objective within the Language Programme is to make sure that old and new corpora in Uralic languages are made available for the open and interactive use of the academic community as well as the language societies. Wordlists are available in 17 languages, but without tokenization, lemmatization, and so on. This approach was verified with the scholars, and we consider the wordlists as raw data for linguists. Our data is used, for instance, for creating the morphological analyzers and online dictionaries at the Universities of Helsinki and Tromsø. In order to reach the targets, we will produce not only the digitized materials but also development tools for supporting linguistic research and citizen science. The Digitization Project of Kindred Languages is thus linked with research in language technology. The mission is to improve the usage and usability of digitized content. During the project, we have advanced methods that will refine the raw data for further use, especially in linguistic research. How does the library meet these objectives, which appear to lie beyond its traditional playground?
The written materials from this period are a gold mine, so how could we retrieve these hidden treasures of languages out of the stack that contains more than 200,000 pages of literature in various Uralic languages? The problem is that the machine-encoded text (OCR) often contains too many mistakes to be used as such in research. The mistakes in OCRed texts must be corrected. For enhancing the OCRed texts, the National Library of Finland developed an open-source OCR editor that enabled the editing of machine-encoded text for the benefit of linguistic research. This tool was necessary to implement, since these rare and peripheral prints often included characters that have since fallen out of use, which are sadly neglected by the modern OCR software developers, but belong to the historical context of kindred languages and thus are an essential part of the linguistic heritage (van Hemel, 2014). Our crowdsourcing tool application is essentially an editor of the ALTO XML format. It consists of a back-end for managing users, permissions, and files, communicating through a REST API with a front-end interface, that is, the actual editor for correcting the OCRed text. The enhanced XML files can be retrieved from the Fenno-Ugrica collection for further purposes. Could the crowd do this work to support the academic research? The challenge in crowdsourcing lies in its nature. The targets in traditional crowdsourcing have often been split into several microtasks that do not require any special skills from the anonymous people, a faceless crowd. This way of crowdsourcing may produce quantitative results, but from the research’s point of view, there is a danger that the needs of linguists are not necessarily met. A further notable downside is the lack of a shared goal or social affinity. There is no reward in the traditional methods of crowdsourcing (de Boer et al., 2012).
Also, there has been criticism that digital humanities makes the humanities too data-driven and oriented towards quantitative methods, losing the values of critical qualitative methods (Fish, 2012). And on top of that, the downsides of traditional crowdsourcing become more evident when you leave the Anglophone world. Our potential crowd is geographically scattered in Russia. This crowd is linguistically heterogeneous, speaking 17 different languages. In many cases languages are close to extinction or longing for language revitalization, and the native speakers do not always have Internet access, so an open call for crowdsourcing would not have produced satisfactory results for linguists. Thus, one has to identify carefully the potential niches to complete the needed tasks. When using the help of a crowd in a project that is aiming to support both linguistic research and survival of endangered languages, the approach has to be a different one. In nichesourcing, the tasks are distributed amongst a small crowd of citizen scientists (communities). Although communities provide smaller pools to draw resources from, their specific richness in skill is suited for the complex tasks with high-quality product expectations found in nichesourcing. Communities have a purpose and identity, and their regular interaction engenders social trust and reputation. These communities can respond to research needs more precisely (de Boer et al., 2012). Instead of repetitive and rather trivial tasks, we are trying to utilize the knowledge and skills of citizen scientists to provide qualitative results. In nichesourcing, we hand out assignments that would precisely fill the gaps in linguistic research. A typical task would be editing and collecting the words in those fields of vocabulary where the researchers require more information. For instance, there is a lack of Hill Mari words and terminology in anatomy.
We have digitized the books in medicine, and we could try to track the words related to human organs by assigning the citizen scientists to edit and collect words with the OCR editor. From the perspective of nichesourcing, it is essential that altruism plays a central role when the language communities are involved. In nichesourcing, our goal is to reach a certain level of interplay, where the language communities would benefit from the results. For instance, the corrected words in Ingrian will be added to an online dictionary, which is made freely available for the public, so the society can benefit, too. This objective of interplay can be understood as an aspiration to support the endangered languages and the maintenance of lingual diversity, but also as a servant of ‘two masters’: research and society.
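
To make the editing step in the abstract above more concrete, the following is a minimal Python sketch of what correcting one OCRed word in an ALTO XML file could look like. It is not the National Library of Finland's editor: the namespace version, the sample page, the IDs, and the helper names `correct_string` and `line_text` are all assumptions made for illustration, and the sample tokens are placeholders rather than real Uralic data.

```python
import xml.etree.ElementTree as ET

# Assumption: ALTO v2 namespace; real files may declare a different version.
ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"

# Hypothetical one-line page; "t3xt" stands in for an OCR misreading.
SAMPLE = f"""<alto xmlns="{ALTO_NS}">
  <Layout><Page ID="P1"><PrintSpace><TextBlock ID="TB1"><TextLine ID="TL1">
    <String ID="S1" CONTENT="sample" WC="0.91"/>
    <SP/>
    <String ID="S2" CONTENT="t3xt" WC="0.42"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""


def correct_string(root, string_id, new_content):
    """Set CONTENT on the <String> element with the given ID.

    Returns True if the element was found, False otherwise.
    """
    for node in root.iter(f"{{{ALTO_NS}}}String"):
        if node.get("ID") == string_id:
            node.set("CONTENT", new_content)
            return True
    return False


def line_text(root):
    """Join the CONTENT of all <String> elements in document order."""
    return " ".join(s.get("CONTENT") for s in root.iter(f"{{{ALTO_NS}}}String"))


root = ET.fromstring(SAMPLE)
correct_string(root, "S2", "text")  # the correction a volunteer would submit
print(line_text(root))
```

In a setup like the one the abstract describes, a function of this kind would sit behind the REST API, with the front-end editor sending the element ID and the corrected text; layout attributes such as the word-confidence value are left untouched.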

Abstract:

The current study investigated the effects that barriers (both real and perceived) had on participation in and completion of speech and language programs for preschool children with communication delays. I compared 36 families of preschool children with an identified communication delay that have completed services (completers) to 13 families that have not completed services (non-completers) prescribed by Speech and Language professionals. The findings reported were drawn from an interview with the mother, a speech and language assessment of the child, and an extensive package of measures completed by the mother. Children ranged in age from 32 to 71 months. These data were collected as part of a project funded by the Canadian Language and Literacy Research Networks of Centres of Excellence. Findings suggest that completers and non-completers shared commonalities in a number of parenting characteristics but differed significantly in two areas. Mothers in the non-completing group were more permissive and had lower maternal education than mothers in the completing families. From a systemic standpoint, families also differed in the number of perceived barriers to treatment experienced during their time with Speech Services Niagara. Mothers in the non-completing group experienced more perceived barriers to treatment than completing mothers. Specifically, these mothers perceived more stressors and obstacles that competed with treatment, perceived more treatment demands, and perceived the relevance of treatment as less important than the completing group. Despite this, the findings suggest that non-completing families were 100% satisfied with services. Contrary to predictions, there were no significant differences in child characteristics and economic characteristics between completers and non-completers. The findings in this study are considered exploratory and tentative due to the small sample size.

Abstract:

This study examined the effects that a training program in phonological awareness had on the early writing skills of children in a Grade One class in the Lincoln County Separate school system. The intent of the training program was to provide consistent and systematic practice in the manipulation of the phonological structure of language. The games and activities of the training program were related to a framework of developmental phonological skills and practised in a group setting during an unstructured period of the regular classroom schedule. The training program operated three days in a six-day cycle for approximately twenty minutes a day, from November until mid-March. All children were tested at the outset and conclusion of the study to determine level of functioning in letter identification, word recognition, verbal intelligence, phonological awareness and spelling. Results of the pre-tests and post-tests were compared to determine differences between the experimental and control groups over time. In addition, a systematic analysis of the children's writing looked at the development of the spelling of regular and irregular words. The results of this study provided strong support for the hypothesis that the treatment group would progress through the stages of early writing development more quickly than children without such training. On the basis of differences between the groups over time, it was evident that training in phonological awareness had a direct positive effect on the spelling of regular words for children during the early stages of writing. The training program did not have a significant effect on the spelling of irregular words. Test results evaluating phonological awareness indicated a significant difference within each group over time but no significance between the groups during the experimental period. 
It would appear that the results of these tests reflect maturational changes in the child rather than causal effects of the training program. Nor did the effects of the training program transfer significantly to other aspects of language. Although some of the hypotheses considered were not supported by the study, the results do indicate that children during the early stages of writing development can benefit from a training program in phonological awareness. The theoretical direction for effective programming as a result of this study is discussed. The educational implications of training phonological awareness concurrently with beginning efforts in writing are considered.

Abstract:

The energy of a graph G is the sum of the absolute values of its eigenvalues. In this paper, we study the energies of some classes of non-regular graphs. The spectra of some non-regular graphs and their complements are also discussed.
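
As a small worked illustration of the definition (not taken from the paper itself), assume the eigenvalues are those of the adjacency matrix of G, written λ_1, ..., λ_n, which is the usual convention. The star K_{1,n-1} is a standard non-regular graph whose spectrum is well known, so its energy can be read off directly:

```latex
E(G) = \sum_{i=1}^{n} |\lambda_i|, \qquad
\operatorname{spec}(K_{1,n-1}) = \bigl\{ \sqrt{n-1},\; 0^{(n-2)},\; -\sqrt{n-1} \bigr\}
\;\Longrightarrow\;
E(K_{1,n-1}) = 2\sqrt{n-1}.
```

For n = 4, for example, this gives E = 2√3 ≈ 3.46.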

Abstract:

Restarting automata can be seen as analytical variants of classical automata as well as of regulated rewriting systems. We study a measure for the degree of nondeterminism of (context-free) languages in terms of deterministic restarting automata that are (strongly) lexicalized. This measure is based on the number of auxiliary symbols (categories) used for recognizing a language as the projection of its characteristic language onto its input alphabet. This type of recognition is typical for analysis by reduction, a method used in linguistics for the creation and verification of formal descriptions of natural languages. Our main results establish a hierarchy of classes of context-free languages and two hierarchies of classes of non-context-free languages that are based on the expansion factor of a language.
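
Analysis by reduction, as mentioned in the abstract above, can be sketched in a few lines of Python. The example below is a simplified illustration and not the lexicalized model studied in the paper: it uses no auxiliary symbols, and the function names `reduce_once` and `accepts` are made up for this sketch. It recognizes the textbook language { a^n b^n : n ≥ 0 } by repeatedly scanning the word, deleting the factor `ab` at the boundary, and restarting on the shorter word.

```python
import re


def reduce_once(word):
    """Perform one cycle: check the shape a*b*, delete the boundary 'ab'.

    Returns the shortened word, or None if no reduction applies.
    """
    # The shape check a*b* is something a finite control can do during the scan.
    if not re.fullmatch(r"a*b*", word):
        return None
    i = word.find("ab")
    if i == -1:
        return None  # only a's or only b's remain: nothing to delete
    return word[:i] + word[i + 2:]  # the rewrite step, followed by a restart


def accepts(word):
    """Return (accepted, list of intermediate words) for the reduction sequence."""
    steps = [word]
    while word:
        word = reduce_once(word)
        if word is None:
            return False, steps
        steps.append(word)
    return True, steps  # reduced to the empty word


print(accepts("aabb"))
print(accepts("aab"))
```

Each step shortens the word and preserves membership in the language in both directions, which is the correctness-preserving property that makes reduction sequences useful for checking formal descriptions.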

Abstract:

This dissertation introduces and investigates systems of parallel communicating restarting automata (PCRA systems). Two well-known concepts from formal languages and automata theory are combined: the model of restarting automata and the so-called PC systems (systems of parallel communicating components). A PCRA system consists of finitely many restarting automata that, on the one hand, carry out local computations in parallel and independently of one another and, on the other hand, may communicate with each other. Communication follows a fixed communication protocol that is realized by means of special communication states. An essential feature of the communication structure in systems of cooperating components is whether communication is centralized or non-centralized. Whereas in a non-centralized communication structure every component may communicate with every other component, in a centralized communication structure all communication takes place exclusively with a designated master component. One of the most important results of this work shows that centralized and non-centralized systems have the same computational power (which is not the case for PC systems in general). Moreover, using multicast or broadcast communication in addition to point-to-point communication does not increase the computational power either. Furthermore, the expressive power of PCRA systems is investigated and compared with that of PC systems of finite automata and with that of multi-head automata.
PC systems of finite automata are known to have the same expressive power as one-way multi-head automata, and they form a lower bound for the expressive power of PCRA systems with one-way components. In fact, PCRA systems are more powerful than PC systems of finite automata even when the components taken by themselves have the same expressive power, that is, when they characterize the regular languages. For PCRA systems with two-way components, the language classes of two-way multi-head automata in the deterministic and in the nondeterministic case are shown to be lower bounds; these in turn correspond to the well-known complexity classes L (deterministic logarithmic space) and NL (nondeterministic logarithmic space). The class of context-sensitive languages is shown to be an upper bound. In addition, extensions of restarting automata are considered (the nonforgetting property, the shrinking property) that increase the computational power of individual components but do not increase the power of systems. The language classes characterized by PCRA systems are closed under various language operations, and some of them are even abstract families of languages (AFLs). Finally, problems specific to PCRA systems are examined with respect to their decidability. It is shown that emptiness, universality, inclusion, equality, and finiteness are not semi-decidable even for systems with two restarting automata of the weakest type. For the word problem, it is shown to be decidable in quadratic time in the deterministic case and in exponential time in the nondeterministic case.

Abstract:

The central thesis of this report is that human language is NP-complete. That is, the process of comprehending and producing utterances is bounded above by the class NP, and below by NP-hardness. This constructive complexity thesis has two empirical consequences. The first is to predict that a linguistic theory outside NP is unnaturally powerful. The second is to predict that a linguistic theory easier than NP-hard is descriptively inadequate. To prove the lower bound, I show that the following three subproblems of language comprehension are all NP-hard: decide whether a given sound is a possible sound of a given language; disambiguate a sequence of words; and compute the antecedents of pronouns. The proofs are based directly on the empirical facts of the language user's knowledge, under an appropriate idealization. Therefore, they are invariant across linguistic theories. (For this reason, no knowledge of linguistic theory is needed to understand the proofs, only knowledge of English.) To illustrate the usefulness of the upper bound, I show that two widely accepted analyses of the language user's knowledge (of syntactic ellipsis and phonological dependencies) lead to complexity outside of NP (PSPACE-hard and undecidable, respectively). Next, guided by the complexity proofs, I construct alternate linguistic analyses that are strictly superior on descriptive grounds, as well as being less complex computationally (in NP). The report also presents a new framework for linguistic theorizing that resolves important puzzles in generative linguistics and guides the mathematical investigation of human language.

Abstract:

This paper reviews a study to examine the feasibility of using elicited language samples as a basis for planning language instruction and as a measure of progress in language development.

Abstract:

France is known for being a champion of individual rights as well as for its overt hostility to any form of group rights. Linguistic pluralism in the public sphere is rejected for fear of babelization and Balkanization of the country. Over recent decades the Conseil Constitutionnel (CC) has, together with the Conseil d’État, remained arguably the strongest defender of this Jacobin ideal in France. In this article, I will discuss the role of France’s restrictive language policy through the prism of the CC’s jurisprudence. Overall, I will argue that the CC made reference to the (Jacobin) state-nation concept, a concept that is discussed in the first part of the paper, in order to fight the revival of regional languages in France over recent decades. The clause making French the official language in 1992 was instrumental to this policy. The intriguing aspect is that the CC managed to standardise France’s policy vis-à-vis regional and minority languages through its jurisprudence, an issue discussed in the second part of the paper. But in those regions with a stronger tradition of identity, particularly in the French overseas territories, the third part of the paper argues, normative reality has increasingly come under pressure. Therefore, a discrepancy between the ‘law in courts’ and the compliance with these decisions (‘law in action’) has been emerging over recent years. Amid some signs of France opening up to minorities, this contradiction delineates a trend that might well continue in future.

Abstract:

This article reports on an investigation into the language learning beliefs of students of French in England, aged 16 to 18. It focuses on qualitative data from two groups of learners (10 in total). While both groups had broadly similar levels of achievement in French in terms of examination success, they differed greatly in the self-image they had of themselves as language learners, with one group displaying low levels of self-efficacy beliefs regarding the possibility of future success. The implications of such beliefs for students' levels of motivation and persistence are discussed, together with their possible causes. The article concludes by suggesting changes in classroom practice that might help students develop a more positive image of themselves as language learners.

Abstract:

In recognizing 11 official languages, the 1996 South African Constitution provides a context for the management of diversity with important implications for the redistribution of wealth and power. The development and implementation of the language-in-education policies which might be expected to flow from the Constitution, however, have been slow and ineffective. One of the casualties of government procrastination has been African language publishing. In the absence of well-resourced bilingual education, most learners continue to be taught through the medium of English as a second language. Teachers are reluctant to use more innovative pedagogies without the support of adequate African language materials and publishers are cautious about producing such materials. Nonetheless, activity in this sector offers many opportunities for African language speakers. This paper explores the challenges and constraints for African language publishing for children and argues that market forces and language policy need to work in mutually reinforcing ways. Further progress is necessarily dependent on the political will to implement language-in-education policies that promote additive bilingualism and, in the process, guarantee sales for risk-averse publishers.

Abstract:

Recent research shows that speakers of languages with obligatory plural marking (English) preferentially categorize objects based on common shape, whereas speakers of nonplural-marking classifier languages (Yucatec and Japanese) preferentially categorize objects based on common material. The current study extends that investigation to the domain of bilingualism. Japanese and English monolinguals, and Japanese–English bilinguals were asked to match novel objects based on either common shape or color. Results showed that English monolinguals selected shape significantly more than Japanese monolinguals, whereas the bilinguals shifted their cognitive preferences as a function of their second language proficiency. The implications of these findings for conceptual representation and cognitive processing in bilinguals are discussed.

Abstract:

This article elucidates the Typological Primacy Model (TPM; Rothman, 2010, 2011, 2013) for the initial stages of adult third language (L3) morphosyntactic transfer, addressing questions that stem from the model and its application. The TPM maintains that structural proximity between the L3 and the L1 and/or the L2 determines L3 transfer. In addition to demonstrating empirical support for the TPM, this article articulates a proposal for how the mind unconsciously determines typological (structural) proximity based on linguistic cues from the L3 input stream used by the parser early on to determine holistic transfer of one previous (the L1 or the L2) system. This articulated version of the TPM is motivated by argumentation appealing to cognitive and linguistic factors. Finally, in line with the general tenets of the TPM, I ponder if and why L3 transfer might obtain differently depending on the type of bilingual (e.g. early vs. late) and proficiency level of bilingualism involved in the L3 process.

Abstract:

Theoretical accounts of second language (L2) acquisition differ with respect to whether or not advanced learners are predicted to show native-like processing for features not instantiated in the native language (L1). We examined how native speakers of English, a language with number but not gender agreement, process number and gender agreement in Spanish. We compare agreement within a determiner phrase (órgano muy complejo “[DP organ-MASC-SG very complex-MASC-SG]”) and across a verb phrase (cuadro es auténtico “painting-MASC-SG [VP is authentic-MASC-SG]”) in order to investigate whether native-like processing is limited to local domains (e.g. within the phrase), in line with Clahsen and Felser (2006). We also examine whether morphological differences in how the L1 and L2 realize a shared feature impact processing by comparing number agreement between nouns and adjectives, where only Spanish instantiates agreement, and between demonstratives and nouns, where English also instantiates agreement. Similar to Spanish natives, advanced learners showed a P600 for both number and gender violations overall, in line with the Full Transfer/Full Access Hypothesis (Schwartz and Sprouse, 1996), which predicts that learners can show native-like processing for novel features. Results also show that learners can establish syntactic dependencies outside of local domains, as suggested by the presence of a P600 for both within-phrase and across-phrase violations. Moreover, similar to native speakers, learners were impacted by the structural distance (number of intervening phrases) between the agreeing elements, as suggested by the more positive waveforms for within-phrase than across-phrase agreement overall. These results are consistent with the proposal that learners are sensitive to hierarchical structure.