889 results for Corpus parallèles


Relevance: 10.00%

Abstract:

This dissertation is an onomastic study of variation in women's name phrases in official documents in Finland during the period 1780−1930. The aim is to discuss, from a socio-onomastic perspective, both the changeover from patronymics to inherited family names and the use of surnames after marriage (i.e. whether women adopted their husbands' family names or retained their maiden names) before new laws in this area entered into force in Finland in the early 20th century. In 1920, a law on family names that required fixed names put an end to the use of the patronymic as a person's only surname. After 1929, it was no longer possible for a married woman to retain her maiden name. Methodologically, to explain this development from a socio-onomastic perspective, I have based my study on a syntactic-semantic analysis of the actual name phrases. To present the extensive material, I have devised a scheme that divides the 115 different types of name phrases into 13 main categories. The analysis of the material for Helsinki is based on frequency calculations of the different types of name phrases every thirtieth year, as well as on a description of variation in the structure and semantic content of the name phrases, e.g. social variation in the use of titles and epithets. In addition, by applying a biographic-genealogical method, I have conducted two case studies of the usage of women's name phrases in two selected families. The study is based on parish registers from the period 1780−1929, estate inventory documents from the period 1780−1928, registration forms for liberty of trade from the period 1880−1908, family announcements in newspapers from the period 1829−1888, gravestones from the period 1796−1929 and diaries from the periods 1799−1801 and 1818−1820, together providing a corpus of 5 950 name phrases. The syntactic-semantic analysis has revealed the overall picture of the various ways of denoting women in official documents.
In Helsinki, towards the end of the 19th century, the use of inherited family names seems to be almost fully developed in official contexts. In the late 19th century, a patronymic still appears as the only surname of some working-class women, whereas in the early 20th century patronymics were only entered in the parish register as a kind of middle name. At the beginning of the 19th century, most married women were still registered under their maiden names, with a few exceptions among the bourgeoisie and upper class. The comparative analysis of name phrases in diaries, however, indicates that the use of the husband's family name by married women was a much earlier phenomenon in private contexts than in official documents.
Keywords: socio-onomastics, syntactic-semantic analysis, name phrase, patronymic, maiden name, husband's family name

Relevance: 10.00%

Abstract:

The multifaceted passive present participle in Finnish
This study investigates the uses of the passive present participle in Finnish. The participle occurs in a variety of syntactic environments and exhibits a rich polysemy. Earlier descriptions have treated it as a mainly modal element, but it has several non-modal uses as well. The present study provides an overview of its uses and meanings, with the main focus on the factors that trigger the modal reading. In addition, the study contains two case studies on modal periphrastic constructions consisting of the verb 'to be' and the passive present participle: the Obligation construction, e.g., on men-tä-vä [is go-pass-ptc], and the Possibility construction, e.g., on pelaste-tta-v-i-ssa [is save-pass-ptc-pl-ine]. The study is based on empirical data of 9000 sentences obtained from i) large collections of transcribed material from Finnish dialects, ii) a corpus of modern Finnish newspaper texts, iii) corpora of Old Finnish texts. In both colloquial and standard Finnish, the reading of the participle is highly dependent on the context and is determined by such factors as the overall syntactic environment and other co-occurring elements. One of the main findings is that the Finnish passive present participle is not modal per se. The contextual modal reading arises whenever the state of affairs is conceptualized from the viewpoint of the implied subject of the participle, and the meaning of possibility or obligation depends mostly on whether the situation is pleasant or undesirable. In the sections examining the grammaticalization of the Possibility and Obligation constructions, the perspective is diachronic. Both constructions derive from copula constructions with the passive present participle as a predicate (adjective or adverb). These sections show how a linguistic change can be investigated on the basis of the patterns of usage in the empirical data.
The Possibility construction is currently being restructured into a passive verbal complex. The source of this construction is reflected in its present-day use by the fact that it is heavily biased towards a small set of verbs. The Obligation construction has grammaticalized into a construction comparable to a compound tense. Patterns of use of the construction show that grammaticalization originates in specific syntactic constructions with an implication of practical necessity. Furthermore, it is shown that the Obligation construction has grammaticalized in different directions in standard and colloquial Finnish. Because its subject differs from the phenomena most typically investigated in the literature on the grammaticalization of modality, the present study opens new perspectives and methods for discussion of these questions.

Relevance: 10.00%

Abstract:

Sodium dodecyl sulphate-polyacrylamide gel electrophoresis of Percoll-purified Leydig cell proteins from 20- and 120-day-old rats revealed a significant decrease in a low molecular weight peptide in the adult rats. Administration of human chorionic gonadotropin to immature rats resulted in a decrease in the low molecular weight peptide along with an increase in testosterone production. Modulation of the peptide by human chorionic gonadotropin could be confirmed by Western blotting. The presence of a similar peptide could be detected by Western blotting in the testes of immature mouse, hamster and guinea pig, but not in adrenal, placenta or corpus luteum. Administration of testosterone propionate, which is known to inhibit pituitary luteinizing hormone levels in adult rats, resulted in an increase in the low molecular weight peptide, as checked by Western blotting. It is suggested that this peptide may have a role in regulating the acquisition of responsiveness to luteinizing hormone by immature rat Leydig cells.

Relevance: 10.00%

Abstract:

Analisi contrastiva delle modalità di traduzione in finnico dei Tempi verbali e delle perifrasi aspettuali dell'italiano (Italian Philology)
The topic of this research is a contrastive study of tenses and aspect in Italian and in Finnish. The study aims to develop a research method for analyzing translations and comparable texts (non-translations) written in a target language. Thus, the analysis is based on empirical data consisting of translations of novels from Italian to Finnish and vice versa. In addition, for the section devoted to the solutions adopted in Finnish for translating the Italian tenses Perfetto Semplice and Perfetto Composto, 39 Finnish native speakers were asked to answer questions concerning the choice of Perfekti and Imperfekti in Finnish. The responses given by the Finnish informants were compared to the choices made by translators in the target language. In this way it was possible both to benefit from the motivation provided by native speakers to explain the selection of a tense (Imperfekti/Perfekti) in a specific context compared with the Italian formal equivalents (Perfetto Composto/Perfetto Semplice), and to define the specific features of the Finnish verb tenses. The research aims to develop a qualitative method for the analysis of formal equivalents and translational changes ("shifts"). However, since the choice of Italian and Finnish progressive forms is optional and depends on speaker preferences, I also considered it necessary to complement the qualitative analysis with a quantitative one, in order to find out whether the two forms show the same degree of correspondence in frequency of use. In this study I explain translation choices in light of cognitive grammar, suggesting that particular translation relationships derive from so-called construal operations.
I use the concepts of cognitive linguistics not only to analyze the convergences and divergences of the two aspectual systems, but also to redefine some general procedures related to the phenomenon of translation. The practical analysis of the corpus mostly employed theoretical categories developed in a framework proposed by Pier Marco Bertinetto. Following this approach, the notions of aspect (the morphological or morphosyntactic, subjective level) and actionality (the lexical aspect or objective level, traditionally Aktionsart) are carefully distinguished. This also allowed me to test the applicability of these distinctions to two typologically different languages. The data made it possible both to analyze the semantic and pragmatic features that determine tense and aspect choices in these two languages, and to discover the correspondences between the two language systems and the strategies that translators are forced to resort to in particular situations. The research provides not only a detailed and analytically argued inventory of possible solutions for translating Italian tenses and aspectual devices into Finnish, which could be of pedagogical relevance, but also new contributions on the specific uses of temporal-aspectual devices in the two languages in question.

Relevance: 10.00%

Abstract:

The habit of "drinking smoke", meaning tobacco smoking, caused a true controversy in early modern England. The new substance was used both for its alleged therapeutic properties and for its narcotic effects. The dispute over tobacco continues the line of written controversies which were an important means of communication in sixteenth- and seventeenth-century Europe. The tobacco controversy is special among medical controversies because the recreational use of tobacco soon spread and outweighed its medicinal use, ultimately causing a social and cultural crisis in England. This study examines how language is used in polemic discourse and argumentation. The material consists of medical texts arguing for and against tobacco in early modern England. The texts were compiled into an electronic corpus of tobacco texts (1577–1670) representing different genres and styles of writing. With the help of the corpus, the tobacco controversy is described and analyzed in the context of early modern medicine. A variety of methods suitable for the study of conflict discourse were used to assess internal and external text variation. The linguistic features examined include personal pronouns, intertextuality, structural components, and statistically derived keywords. A common thread in the work is persuasive language use manifested, for example, in the form of emotive adjectives and the generic use of pronouns; the latter is especially pronounced in the dichotomy between "us" and "them". Controversies have not been studied in this manner before, but the methods applied have supplemented each other and proven their suitability for the study of conflictive discourse. These methods can also be applied to present-day materials.

Relevance: 10.00%

Abstract:

This work is a case study of applying nonparametric statistical methods to corpus data. We show how to use ideas from permutation testing to answer linguistic questions related to morphological productivity and type richness. In particular, we study the use of the suffixes -ity and -ness in the 17th-century part of the Corpus of Early English Correspondence within the framework of historical sociolinguistics. Our hypothesis is that the productivity of -ity, as measured by type counts, is significantly low in letters written by women. To test such hypotheses, and to facilitate exploratory data analysis, we take the approach of computing accumulation curves for types and hapax legomena. We have developed an open source computer program which uses Monte Carlo sampling to compute the upper and lower bounds of these curves for one or more levels of statistical significance. By comparing the type accumulation from women’s letters with the bounds, we are able to confirm our hypothesis.
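The permutation idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function names, the representation of each letter as a list of word types, and the toy data are assumptions made for illustration. The sketch shuffles all texts repeatedly, reads off the type count once a random ordering has accumulated as many tokens as the subcorpus of interest, and reports how often that count is as low as the observed one.

```python
import random


def count_types(texts):
    """Number of distinct word types in a collection of tokenized texts."""
    return len({word for text in texts for word in text})


def types_at_size(texts, target_tokens):
    """Distinct types seen once at least target_tokens tokens have accumulated."""
    seen, tokens = set(), 0
    for text in texts:
        seen.update(text)
        tokens += len(text)
        if tokens >= target_tokens:
            break
    return len(seen)


def permutation_p_low(all_texts, subset, n_permutations=1000, seed=0):
    """Monte Carlo estimate of how often a random sample of comparable size
    has as few types as the observed subset (a one-sided 'low' test)."""
    rng = random.Random(seed)
    target = sum(len(text) for text in subset)
    observed = count_types(subset)
    pool = list(all_texts)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pool)  # whole texts are permuted, not individual tokens
        if types_at_size(pool, target) <= observed:
            hits += 1
    return observed, hits / n_permutations


# Hypothetical toy data: each text is the list of -ity words found in one letter.
subset = [["ability"], ["ability"], ["ability"]]
others = [["clarity"], ["density"], ["gravity"], ["purity"], ["rarity"], ["vanity"]]
observed, p_value = permutation_p_low(subset + others, subset)
```

Because whole texts are accumulated, a random sample can slightly overshoot the target token count; a more careful version would interpolate along the accumulation curve.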

Relevance: 10.00%

Abstract:

This study is a pragmatic description of the evolution of the genre of English witchcraft pamphlets from the mid-sixteenth century to the end of the seventeenth century. Witchcraft pamphlets were produced for a new kind of readership (semi-literate, uneducated masses), and the central hypothesis of this study is that publishing for the masses entailed rethinking the ways of writing and printing texts. Analysis of the use of typographical variation and illustrations indicates how printers and publishers catered to the tastes and expectations of this new audience. Analysis of the language of witchcraft pamphlets shows how pamphlet writers took the new readership into account by transforming formal written source materials (trial proceedings) into more immediate ways of writing. The material for this study comes from the Corpus of Early Modern English Witchcraft Pamphlets, which has been compiled by the author. The multidisciplinary analysis incorporates both visual and linguistic aspects of the texts, with methodologies and theoretical insights adopted eclectically from historical pragmatics, genre studies, book history, corpus linguistics, systemic functional linguistics and cognitive psychology. The findings are anchored in the socio-historical context of early modern publishing, reading, literacy and witchcraft beliefs. The study shows not only how consideration of a new audience by both authors and printers influenced the development of a genre, but also the value of combining visual and linguistic features in pragmatic analyses of texts.

Relevance: 10.00%

Abstract:

The present study is an attempt to describe and analyse the use of the preposition "de" on the basis of a diachronic corpus, with emphasis on the different semantic relations it establishes. Starting from a total of more than 16,000 instances of "de", we have established 48 categories of use, which correspond to four types of syntactic construction, namely the use of "de" as a complement of nouns (CN), verbs (CV) and adjectives (CA) and, finally, its use as the head of independent adverbial expressions (CI). The study consists of three main parts. Part I introduces Cognitive Linguistics, which constitutes the essential theoretical basis of the work. More precisely, it introduces concepts such as prototype theory, conceptual metaphor theory and cognitive grammar, especially the ideas of "reference point" and "intrinsic relationship" (Langacker 1995, 1999). Part II comprises the analysis of the 48 categories. In this part, almost 2,000 examples of the contextual use of "de" drawn from the diachronic corpus are presented and discussed. The most important results of the analysis can be summarised in the following points: The use of "de" remains essentially the same today as it was 800 years ago, in the sense that all 48 categories are found in every period of the corpus. The use of "de" as a nominal complement is increasing, in contrast to its use as a verbal complement. In the nominal context, it is especially the more abstract possessive relations that become more frequent, whereas in the verbal context the relations that become less frequent are those of separation/distancing, cause, agent and indefinite partitive.
The 18th century stands out as a period of transition between an earlier state of affairs and a later one, especially with regard to the increasingly abstract character of possessive relations and the decline of the adverbal categories of cause, agent and partitive. Despite variation in the immediate context of use, the semantic core of "de" remains unchanged. Part III takes the results of the analysis in Part II as its starting point and tries to separate the semantic contribution of the preposition "de" to its context of use from the value of the relation as a whole. Thus, drawing on the methodology for determining the basic meaning and the methodology for determining what constitute distinct meanings of a preposition (Tyler and Evans 2003a, 2003b), we arrive at the hypothesis that "de" has four basic meanings, namely 'starting point', 'theme/subject matter', 'part/whole' and 'possession'. This hypothesis, based on the methodologies of Tyler and Evans and on the results of the corpus analysis, is then tested empirically by means of two questionnaires designed to find out to what extent the semantic distinctions arrived at theoretically are recognised by native speakers of the language (cf. Raukko 2003). The combined result of the two approaches both reinforces and refines the hypothesis. The data from the analysis of the questionnaires appear to support the idea that the semantic core of "de" is complex, consisting of the four values mentioned. However, each of these basic values constitutes a local prototype, around which a complex of semantic nuances derived from the prototype is built. The final idea is that speakers are aware of the four postulated basic values, but that they also distinguish more detailed nuances, such as the ideas of 'cause', 'agent', 'instrument', 'purpose', 'quality', etc.
In other words, "de" is a complex polysemous element whose semantic structure can be described as a family resemblance centred on four basic values, around which there is a series of more specific nuances that also constitute values of the preposition in their own right. We further believe that this semantic characterisation is valid for all periods in the history of Spanish, with small modifications in the relative weight of the different nuances, which is related to the observed diachronic variation in the use of "de".

Relevance: 10.00%

Abstract:

The aim of the study is to investigate the use of finlandisms from a historical perspective, how they have been viewed from the mid-19th century to this day, and the effect of language planning on their use. A finlandism is a word, a phrase, or a structure that is used only in the Swedish varieties used in Finland (i.e. in Finland Swedish), or used in these varieties with a different meaning than in the Swedish used in Sweden. Various aspects of Finland-Swedish language planning are discussed in relation to language planning generally; in addition, the relation of Finland Swedish to Standard Swedish and standard regional varieties is discussed, and various types of finlandisms are analysed in detail. A comprehensive picture is provided of the emergence and evolution of the ideology of language planning from the mid-19th century up until today. A theoretical model of corpus planning is presented and its effect on linguistic praxis described. One result of the study is that the belief among Finland-Swedish language planners that the Swedish language in Finland must not be allowed to become distanced from Standard Swedish has been widely adopted by the average Finland Swede, particularly during the interwar period, following the publication of Hugo Bergroth's work Finlandssvenska in 1917. Criticism of this language-planning ideology started to appear in the 1950s, and intensified in the 1970s. However, language planning and the basis for this conception of language continue to enjoy strong support among Swedish-speaking Finns. I show that the editing of Finnish literary texts written in Swedish has often been somewhat amateurish and the results not always linguistically appropriate, and that Swedish publishers have in fact adopted a rather liberal attitude towards finlandisms. My conclusion is that language planning has achieved rather modest results in its resistance to finlandisms. Most of the finlandisms used in 1915 were still in use in 2005.
Finlandisms occur among speakers of all ages, and even among academically educated people despite their more elevated style. The most common finlandisms were used by informants of all ages. The most firmly rooted finlandisms are those that are stylistically neutral and seemingly genuinely Swedish, but which are nevertheless strongly supported by Finnish and display a shift in meaning compared with Standard Swedish.

Relevance: 10.00%

Abstract:

Language Documentation and Description as Language Planning: Working with Three Signed Minority Languages
Sign languages are minority languages that typically have a low status in society. Language planning has traditionally been controlled from outside the sign-language community. Even though signed languages lack a written form, dictionaries have played an important role in language description and as tools in foreign language learning. The background to the present study on sign language documentation and description as language planning is empirical research in three dictionary projects in Finland-Swedish Sign Language, Albanian Sign Language, and Kosovar Sign Language. The study consists of an introductory article and five detailed studies which address language planning from different perspectives. The theoretical basis of the study is sociocultural linguistics. The research methods used were participant observation, interviews, focus group discussions, and document analysis. The primary research questions are the following: (1) What is the role of dictionary and lexicographic work in language planning, in research on undocumented signed languages, and in relation to the language community as such? (2) What factors are particular challenges in the documentation of a sign language and should therefore be given special attention during lexicographic work? (3) Is a conventional dictionary a valid tool for describing an undocumented sign language? The results indicate that lexicographic work has a central part to play in language documentation, both as part of basic research on undocumented sign languages and for status planning. Existing dictionary work has contributed new knowledge about the languages and the language communities.
The lexicographic work adds to the linguistic advocacy work done by the community itself with the aim of vitalizing the language, empowering the community, receiving governmental recognition for the language, and improving the linguistic (human) rights of the language users. The history of signed languages as low-status languages has consequences for language planning and lexicography. One challenge that the study discusses is the relationship between the sign-language community and the hearing sign linguist. In order to make it possible for the community itself to take the lead in a language planning process, raising linguistic awareness within the community is crucial. The results raise the question of whether lexicographic work is of more importance for status planning than for corpus planning. The use of a conventional dictionary as a tool for describing an undocumented sign language is criticised. The study discusses differences between signed and spoken/written languages that are challenging for lexicographic presentation. Alternative electronic lexicographic approaches including both lexicon and grammar are also discussed.
Keywords: sign language, Finland-Swedish Sign Language, Albanian Sign Language, Kosovar Sign Language, language documentation and description, language planning, lexicography

Relevance: 10.00%

Abstract:

In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.
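As a rough illustration of corpus-driven spelling correction, the sketch below ranks correction candidates by their corpus frequency. It is deliberately not finite-state: it swaps in a plain edit-distance-1 candidate generator over a word-frequency table, and the function names, the Latin alphabet and the toy corpus are all assumptions. In the setting the abstract describes, the lexicon would instead be a finite-state automaton and the error model a transducer.

```python
import string
from collections import Counter


def build_lexicon(corpus_text):
    """Word-frequency table from raw corpus text (whitespace tokenization)."""
    return Counter(corpus_text.lower().split())


def edits1(word, alphabet=string.ascii_lowercase):
    """All strings one deletion, transposition, replacement or insertion away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right
                for c in alphabet]
    inserts = [left + c + right for left, right in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def suggest(word, lexicon):
    """Return the word itself if known, else known neighbours by frequency."""
    if word in lexicon:
        return [word]
    candidates = [w for w in edits1(word) if w in lexicon]
    # Most frequent first; ties are broken alphabetically for determinism.
    return sorted(candidates, key=lambda w: (-lexicon[w], w))


lexicon = build_lexicon("the cat sat on the mat the cat")
```

With this toy lexicon, `suggest("cta", lexicon)` returns `["cat"]` and `suggest("hat", lexicon)` returns `["cat", "mat", "sat"]`.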

Relevance: 10.00%

Abstract:

Language software applications encounter new words, e.g., acronyms, technical terminology, names or compounds of such words. In order to add new words to a lexicon, we need to indicate their inflectional paradigm. We present a new generally applicable method for creating an entry generator, i.e. a paradigm guesser, for finite-state transducer lexicons. As a guesser tends to produce numerous suggestions, it is important that the correct suggestions be among the first few candidates. We prove some formal properties of the method and evaluate it on Finnish, English and Swedish full-scale transducer lexicons. We use the open-source Helsinki Finite-State Technology to create finite-state transducer lexicons from existing lexical resources and automatically derive guessers for unknown words. The method has a recall of 82–87% and a precision of 71–76% for the three test languages. The model needs no external corpus and can therefore serve as a baseline.
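The idea of a paradigm guesser that ranks its suggestions can be sketched with a simple longest-suffix heuristic. This is an assumption-laden illustration and not the finite-state method of the paper: the function names, the paradigm labels (`N_s`, `N_es`, `N_ies`, `V_reg`) and the toy lexicon are hypothetical. The sketch derives everything from the existing lexicon, so, like the model in the abstract, it needs no external corpus.

```python
from collections import Counter, defaultdict


def train_guesser(lexicon, max_suffix=4):
    """Count how often each word-final suffix co-occurs with each paradigm."""
    table = defaultdict(Counter)
    for word, paradigm in lexicon:
        for n in range(1, min(max_suffix, len(word)) + 1):
            table[word[-n:]][paradigm] += 1
    return table


def guess(table, word, max_suffix=4):
    """Rank paradigms for an unknown word, longest matching suffix first."""
    ranked = []
    for n in range(min(max_suffix, len(word)), 0, -1):
        for paradigm, _count in table.get(word[-n:], Counter()).most_common():
            if paradigm not in ranked:
                ranked.append(paradigm)
    return ranked


# Hypothetical toy lexicon of (word, paradigm label) pairs.
toy_lexicon = [
    ("cat", "N_s"), ("dog", "N_s"),
    ("box", "N_es"), ("fox", "N_es"),
    ("city", "N_ies"), ("party", "N_ies"),
    ("walk", "V_reg"), ("talk", "V_reg"),
]
table = train_guesser(toy_lexicon)
```

Here `guess(table, "tax")` puts `N_es` first and `guess(table, "charity")` puts `N_ies` first, which matches the requirement that correct suggestions appear among the first few candidates.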

Relevance: 10.00%

Abstract:

N-gram language models and lexicon-based word recognition are popular methods in the literature for improving recognition accuracy on online and offline handwritten data. However, very few works apply these techniques to online Tamil handwritten data. In this paper, we explore methods of developing symbol-level language models and a lexicon from a large Tamil text corpus, and their application to improving symbol and word recognition accuracies. On a test database of around 2000 words, we find that bigram language models improve symbol (3%) and word (8%) recognition accuracies. Lexicon methods offer much greater improvements (30%) in word recognition, but they depend heavily on choosing the right lexicon. For comparison with lexicon and language-model-based methods, we have also explored re-evaluation techniques, which use expert classifiers to improve symbol and word recognition accuracies.
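A symbol-level bigram model used to rescore recognizer hypotheses might look like the following sketch. The function names, the add-one smoothing, the log-linear combination with weight `lm_weight`, and the Latin-script toy data are assumptions, not details from the paper. Splitting a string with `list(word)` yields Unicode code points, which for Tamil would not coincide with the recognizer's symbol set, so a real system would need its own symbol segmentation.

```python
import math
from collections import Counter


def train_bigram(corpus_words):
    """Collect symbol bigram counts, with start and end markers per word."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for word in corpus_words:
        symbols = ["<s>"] + list(word) + ["</s>"]
        vocab.update(symbols)
        for a, b in zip(symbols, symbols[1:]):
            contexts[a] += 1
            bigrams[(a, b)] += 1
    return contexts, bigrams, len(vocab)


def log_prob(word, model):
    """Add-one smoothed bigram log-probability of a symbol sequence."""
    contexts, bigrams, vocab_size = model
    symbols = ["<s>"] + list(word) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(symbols, symbols[1:])
    )


def rescore(candidates, model, lm_weight=1.0):
    """Pick the best (string, recognizer log-score) pair after adding LM evidence."""
    best = max(candidates, key=lambda c: c[1] + lm_weight * log_prob(c[0], model))
    return best[0]


# Toy stand-in for a text corpus and for two recognizer hypotheses.
model = train_bigram(["amma", "appa", "akka", "anna"])
hypotheses = [("amma", -1.2), ("amna", -1.0)]
```

On this toy data the recognizer alone prefers `amna`, while `rescore(hypotheses, model)` returns `amma` because the bigram `mm` is attested in the corpus and `mn` is not.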

Relevance: 10.00%

Abstract:

In this paper, we present a novel approach that uses topic models based on Latent Dirichlet Allocation (LDA) to generate single-document summaries. Our approach differs from other LDA-based approaches in that we identify the summary topics which best describe a given document and extract sentences only from those paragraphs within the document that are highly correlated with the summary topics. This ensures that our summaries always highlight the crux of the document without paying any attention to the grammar and the structure of the documents. Finally, we evaluate our summaries on the DUC 2002 single-document summarization corpus using ROUGE measures. Our summaries had higher ROUGE values and better semantic similarity with the documents than the DUC summaries.
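For readers unfamiliar with ROUGE, the sketch below computes a simplified ROUGE-N score as clipped n-gram overlap between a candidate summary and a single reference. It is an illustration only: the function name, lowercasing and whitespace tokenization are assumptions, and the official ROUGE toolkit offers further options (stemming, stopword removal, multiple references) that are omitted here.

```python
from collections import Counter


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) from clipped n-gram overlap."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = rouge_n("the cat sat", "the cat sat on the mat")
```

For this pair, all three candidate unigrams occur in the six-word reference, so precision is 1.0 and recall is 0.5.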

Relevance: 10.00%

Abstract:

When a document corpus is very large, we often need to reduce the number of features. However, conventional Non-negative Matrix Factorization (NMF) cannot be applied to a billion-by-million matrix, as the matrix may not fit in memory. Here we present a novel Online NMF algorithm. Using Online NMF, we reduce the original high-dimensional space to a low-dimensional space and then cluster all the documents in the reduced space using the k-means algorithm. We show experimentally that good performance can be achieved by processing small subsets of documents. The proposed method outperforms existing algorithms.