932 results for 080404 Markup Languages
Abstract:
A comparison is made between German and Russian terminological derivations in chemistry and the methods used by Germans and Russians to solve problems related to the formation of scientific words. A study of this comparison, it is believed, can help us in the development of scientific words in Indian languages.
Abstract:
Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high-quality corpora, especially in Indian languages, makes this problem hard: state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other "auxiliary" languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.
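The MRR figures quoted in this abstract can be made concrete with a minimal sketch. The function names and the toy rankings below are hypothetical and are not taken from the paper; the sketch only assumes the standard definition of mean reciprocal rank (the average of 1/rank of the correct translation, counting 0 when it is missing from the candidate list).

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranked: Sequence[Hashable], gold: Hashable) -> float:
    """Return 1/rank (1-based) of the gold item, or 0.0 if it is not ranked."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate == gold:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]], golds: Sequence[Hashable]
) -> float:
    """Average the reciprocal rank over all queries (source words)."""
    if len(rankings) != len(golds):
        raise ValueError("need exactly one gold answer per ranking")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, g) for r, g in zip(rankings, golds))
    return total / len(rankings)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, as a fraction."""
    return (improved - baseline) / baseline


# Hypothetical toy data: three source words, each with ranked candidates.
toy_rankings = [["a", "b", "c"], ["x", "y", "z"], ["p", "q"]]
toy_golds = ["a", "y", "r"]  # ranks 1, 2, and "not found"

toy_mrr = mean_reciprocal_rank(toy_rankings, toy_golds)  # (1 + 1/2 + 0) / 3

# The abstract's Telugu-Kannada figures: 0.187 -> 0.419 is a gain of about 1.24,
# i.e. the 124% improvement it reports.
telugu_kannada_gain = relative_improvement(0.187, 0.419)
```

An MRR of 0.187 therefore corresponds, roughly, to the correct translation sitting around fifth place on average, while 0.419 places it between second and third.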
Abstract:
Eguíluz, Federico; Merino, Raquel; Olsen, Vickie; Pajares, Eterio; Santamaría, José Miguel (eds.)
Abstract:
The First SPARK-STREAM Workshop on Livelihoods and Languages took place in Bangkok, Thailand, from 9-11 April 2003. It was the first activity in a SPARK-STREAM learning and communications process around livelihoods and languages. (PDF contains 53 pages)
Abstract:
The Second SPARK-STREAM Workshop on Livelihoods and Languages took place in Tagaytay City, Philippines, from 12-14 June 2003. Outputs were intended to be: drafts of language-specific "Guide to Learning and Communicating about Livelihoods"; drafts of articles for STREAM Journal and SPARK Newsletter; priorities and practical follow-up for capacity-building in carrying out participatory livelihoods analysis; and follow-up plans. [PDF contains 30 pages]
Abstract:
The present corpus study aimed to examine whether Basque (OV) resorts more often than Spanish (VO) to certain grammatical operations in order to minimize the number of arguments to be processed before the verb. Ueno & Polinsky (2009) argue that VO/OV languages use certain grammatical resources with different frequencies in order to facilitate real-time processing. They observe that both OV and VO languages in their sample (Japanese, Turkish and Spanish) have a similar frequency of use of subject pro-drop; however, they find that OV languages (Japanese, Turkish) use more intransitive sentences than VO languages (English, Spanish), and conclude this is an OV-specific strategy to facilitate processing. We conducted a comparative corpus study of Spanish (VO) and Basque (OV). Results show (a) that the frequency of use of subject pro-drop is higher in Basque than in Spanish; and (b) that Basque does not use more intransitive sentences than Spanish; both languages have a similar frequency of intransitive sentences. Based on these findings, we conclude that the frequency of use of grammatical resources to facilitate processing does not depend on a single typological trait (VO/OV) but is modulated by the concurrence of other grammatical features.
Abstract:
A general definition of interpreted formal language is presented. The notion "is a part of" is formally developed and models of the resulting part theory are used as universes of discourse of the formal languages. It is shown that certain Boolean algebras are models of part theory.
With this development, the structure imposed upon the universe of discourse by a formal language is characterized by a group of automorphisms of the model of part theory. If the model of part theory is thought of as a static world, the automorphisms become the changes which take place in the world. Using this formalism, we discuss a notion of abstraction and the concept of definability. A Galois connection between the groups characterizing formal languages and a language-like closure over the groups is determined.
It is shown that a set theory can be developed within models of part theory such that certain strong formal languages can be said to determine their own set theory. This development is such that for a given formal language whose universe of discourse is a model of part theory, a set theory can be imbedded as a submodel of part theory so that the formal language has parts which are sets as its discursive entities.
Abstract:
The main objective of this doctoral thesis is, first, to offer an alternative reconstruction of Proto-Ainu and, second, to apply concepts of holistic diachronic typology in order to discern some evolutionary pattern that helps answer the question: why is the Ainu language the way it is in its geolinguistic context (an AOV language with prefixes), when the norm in the Eurasian region is the profile 'AOV language with suffixes'? In short, the aim is to explore the possibilities offered by holistic diachronic typology, combined with more traditional methods, for investigating the prehistoric stages of language isolates, that is, languages with no known relatives, such as Ainu, Basque, Zuni or Burushaski. The work is divided into three main parts with a total of eight chapters, an appendix with the new Proto-Ainu reconstructions, and the bibliography. The first part opens with chapter 1, which gives a brief presentation of the Ainu languages and their philology. Chapter 2 is devoted to the reconstruction of Proto-Ainu phonology. The pioneering reconstruction is due to A. Vovin (1992), which in fact serves as the basis on which to extend, correct or modify elements. Chapter 3 describes the historical morphology of the Ainu languages. Chapter 4 investigates this option within a broader framework whose aim is to analyse the elementary patterns of word formation. Chapter 5, which opens the second part, presents a diachronic typological hypothesis, due to P. Donegan and D. Stampe, with which specialists in Munda and Mon-Khmer languages have been able to arrive at a reconstruction of Proto-Austroasiatic according to which the agglutinative type of the Munda languages would be secondary, as opposed to the original monosyllabic type of the Mon-Khmer languages.
Chapter 6 returns to the traditional perspective of geographical linguistics, without losing sight of some of the typological considerations raised in the previous chapter (the fact that Donegan and Stampe's hypothesis does not work for Ainu does not mean that diachronic typology cannot still be useful). Chapter 7 presents some inconsistencies that arise when the supposed archaeological evidence is combined with the linguistic scenario described in earlier chapters. The general conclusions are presented in chapter 8. The appendix is a comparative table of the two reconstructions available to date for Proto-Ainu, namely those proposed by A. Vovin in his seminal 1992 study and in chapter 3 of the present thesis. The table contains 686 reconstructions (cross-referencing with Vovin is straightforward, since both are ordered alphabetically).
Abstract:
In this paper we study a simple mathematical model of a bilingual community in which all agents are fluent in the majority language but only a fraction of the population has some degree of proficiency in the minority language. We investigate how different distributions of proficiency, combined with the speakers' attitudes towards or against the minority language, may influence its use in pair conversations.
Abstract:
Does language-specific orthography help language detection and lexical access in naturalistic bilingual contexts? This study investigates how L2 orthotactic properties influence bilingual language detection in bilingual societies and the extent to which they modulate lexical access and single word processing. Language specificity of naturalistically learnt L2 words was manipulated by including bigram combinations that could be either L2 language-specific or common in the two languages known by bilinguals. A group of balanced bilinguals and a group of highly proficient but unbalanced bilinguals who grew up in a bilingual society were tested, together with a group of monolinguals (for control purposes). All the participants completed a speeded language detection task and a progressive demasking task. Results showed that the use of the information of orthotactic rules across languages depends on the task demands at hand, and on participants' proficiency in the second language. The influence of language orthotactic rules during language detection, lexical access and word identification is discussed according to the most prominent models of bilingual word recognition.
Abstract:
We consider systems of equations of the form $X_i = \bigcup_{a \in A} a \cdot P_{i,a} \cup \delta_i$, where $A$ is the underlying alphabet, the $X_i$ are variables, the $P_{i,a}$ are boolean functions in the variables $X_i$, and each $\delta_i$ is either the empty word or the empty set. The symbols $\cdot$ and $\cup$ denote concatenation and union of languages over $A$, respectively. We show that any such system has a unique solution which, moreover, is regular. These equations correspond to a type of automaton, called a boolean automaton, which is a generalization of a nondeterministic automaton. The equations are then used to determine the language accepted by a sequential network; they are obtainable directly from the network.
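As an illustration of how such a system can be read as an acceptor, here is a minimal sketch. It assumes that a word a·w belongs to X_i exactly when the boolean function P_{i,a} is true on the memberships of w in the variables, and that the empty word belongs to X_i exactly when δ_i is the empty word. The function `accepts`, the data layout and the three-variable example system are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

# A valuation records, for one fixed word w, whether w belongs to each X_i.
Valuation = Tuple[bool, ...]
BoolFn = Callable[[Valuation], bool]


def accepts(
    word: str,
    var: int,
    transitions: Sequence[Dict[str, BoolFn]],
    has_empty_word: Sequence[bool],
) -> bool:
    """Decide whether `word` belongs to X_var.

    transitions[i][a] plays the role of P_{i,a}; has_empty_word[i] is True
    when delta_i is the empty word and False when it is the empty set.
    """
    # Memberships of the empty word come straight from the delta_i.
    valuation: Valuation = tuple(has_empty_word)
    # Read the word right to left: the memberships of a.w follow from
    # the memberships of w by evaluating every P_{i,a}.
    for letter in reversed(word):
        valuation = tuple(row[letter](valuation) for row in transitions)
    return valuation[var]


# Hypothetical system over the alphabet {a, b}:
#   X_0: words with an even number of a's
#   X_1: words with an even number of b's
#   X_2: words with an even number of a's and an even number of b's
example_transitions: Sequence[Dict[str, BoolFn]] = [
    {"a": lambda v: not v[0], "b": lambda v: v[0]},
    {"a": lambda v: v[1], "b": lambda v: not v[1]},
    {"a": lambda v: (not v[0]) and v[1], "b": lambda v: v[0] and (not v[1])},
]
example_has_empty_word = [True, True, True]

in_x2 = accepts("abab", 2, example_transitions, example_has_empty_word)
not_in_x2 = accepts("aab", 2, example_transitions, example_has_empty_word)
```

A nondeterministic automaton is the special case in which every P_{i,a} is a plain disjunction of variables; the negations and conjunctions in the example are what the boolean generalization adds.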