883 results for "corpus diacrônico" (diachronic corpus)
Abstract:
A straightforward computation of the list of the words (the `tail words' of the list) that are distributionally most similar to a given word (the `head word' of the list) leads to the question: How semantically similar to the head word are the tail words; that is: how similar are their meanings to its meaning? And can we do better? The experiment was done on nearly 18,000 most frequent nouns in a Finnish newsgroup corpus. These nouns are considered to be distributionally similar to the extent that they occur in the same direct dependency relations with the same nouns, adjectives and verbs. The extent of the similarity of their computational representations is quantified with the information radius. The semantic classification of head-tail pairs is intuitive; some tail words seem to be semantically similar to the head word, some do not. Each such pair is also associated with a number of further distributional variables. Individually, their overlap for the semantic classes is large, but the trained classification-tree models have some success in using combinations to predict the semantic class. The training data consists of a random sample of 400 head-tail pairs with the tail word ranked among the 20 distributionally most similar to the head word, excluding names. The models are then tested on a random sample of another 100 such pairs. The best success rates range from 70% to 92% of the test pairs, where a success means that the model predicted my intuitive semantic class of the pair. This seems somewhat promising when distributional similarity is used to capture semantically similar words. This analysis also includes a general discussion of several different similarity formulas, arranged in three groups: those that apply to sets with graded membership, those that apply to the members of a vector space, and those that apply to probability mass functions.
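The information radius mentioned above can be illustrated with a minimal sketch. It assumes the common definition as the sum of the two Kullback-Leibler divergences from each distribution to their midpoint (some authors halve this sum and call it the Jensen-Shannon divergence). The toy distributions and function names are hypothetical and are not taken from the study.

```python
import math


def kl_divergence(p, q):
    """KL divergence D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def information_radius(p, q):
    """Sum of the KL divergences from p and from q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so no division by zero occurs.
    return kl_divergence(p, m) + kl_divergence(q, m)


# Hypothetical probability mass functions over three dependency contexts.
head = [0.6, 0.3, 0.1]
tail = [0.5, 0.2, 0.3]
print(information_radius(head, tail))
```

Under this definition the value is 0 for identical distributions and reaches its maximum of 2 bits when the two distributions share no context, so a smaller value means greater distributional similarity.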
Abstract:
Research on translation universals has its roots in the need to make generalizations about the features that distinguish translations from non-translations. Such generalizations go back to the old tradition of negative comments about the failings of typical translations. These comments concern the relations between translations and the target language, and between translations and their source texts. With the rise of descriptive studies, and the use of corpus research methods borrowed from linguistics, the search for the typical features of translations became more systematic. A number of hypotheses about potential universals have been proposed, and tested on different languages and language pairs. Some of them are evidently false; on others, the jury is still out. If some hypotheses continue to be supported by empirical evidence, the question then arises of how they might best be explained. There has been fierce criticism of some of the assumptions underlying the search for universals, including the use of the term 'universal' itself, but the approach has also brought clear methodological benefits.
Abstract:
The impact of Greek-Egyptian bilingualism on language use and linguistic competence is the key issue in this dissertation. The language use in a corpus of 148 Greek notarial contracts is analyzed on phonological, morphological and syntactic levels. The texts were written by bilingual notaries (agoranomoi) in Upper Egypt in the later Hellenistic period. They present, for the most part, very good administrative Greek. On the other hand, their language contains variation and idiosyncrasies that were earlier condemned as ungrammatical and bad Greek, and were not subjected to closer analysis. In order to reach plausible explanations for those phenomena, thorough research into the sociohistorical and linguistic context was needed before the linguistic analysis. The general linguistic landscape, the population pattern and the status and frequency of Greek literacy in Ptolemaic Egypt in general, and in Upper Egypt in particular, are presented. Through a detailed examination of the notaries themselves (their names, families and handwriting), it became evident that there were one to three persons at the notarial office writing under the signature of one notary. Often the documents under one notary's name were written in the same hand. We get, therefore, exceptionally close to studying idiolects in written material from antiquity. The qualitative linguistic analysis revealed that the notaries made relatively few orthographic mistakes that reflect the ongoing phonological changes, and that they mastered the morphological forms. The problems arose at the syntactic level, for example, with the pattern of agreement between the noun groups or a noun with its modifiers. The significant structural differences between Greek and Egyptian may lie behind the innovative strategies used by some of the notaries. Moreover, certain syntactic structures were clearly transferred from the notaries' first language, Egyptian. This is obvious in the relative clause structure.
Transfer can be found in other structures as well, although we must not forget the influence of parallel Greek structures. Sometimes these can act simultaneously. The interesting linguistic strategies and transfer features come mostly from the hand of one notary, Hermias. Some other notaries show similar patterns, for example, Hermias' cousin, Ammonios. Hermias' texts reveal that he probably spoke Greek more than his predecessors. It is possible to conclude, then, that the notaries of the later generations were more fluently bilingual; their two languages were partly integrated in their minds as an interlanguage combining elements from both languages. The earlier notaries had the two languages functionally separated and they followed the standardized contract formulae more rigidly.
Abstract:
In this study we explore the concurrent, combined use of three research methods, statistical corpus analysis and two psycholinguistic experiments (a forced-choice and an acceptability rating task), using verbal synonymy in Finnish as a case in point. In addition to supporting conclusions from earlier studies concerning the relationships between corpus-based and experimental data (e.g., Featherston 2005), we show that each method adds to our understanding of the studied phenomenon, in a way which could not be achieved through any single method by itself. Most importantly, whereas relative rareness in a corpus is associated with dispreference in selection, such infrequency does not categorically entail substantially lower acceptability. Furthermore, we show that forced-choice and acceptability rating tasks pertain to distinct linguistic processes, with category-wise incommensurable scales of measurement, and should therefore be merged with caution, if at all.
Abstract:
The modern food system and sustainable development form a conceptual combination that suggests sustainability deficits in the ways we deal with food consumption and production - in terms of economic relations, environmental impacts and nutritional status of the western population. This study explores actors’ orientations towards sustainability by taking into account actors’ embedded positions within structures of the food system, actors’ economic relations and views about sustainability as well as their possibilities for progressive activities. The study looks particularly at social dynamics for sustainability within primary production and public consumption. If actors within these two worlds were to express converging orientations for sustainability, the system dynamics of the market would enable more sustainable growth in terms of production dictated by consumption. The study is based on a constructivist research approach with qualitative text analyses. The data consisted of three text corpora, the ‘local food corpus’, the ‘catering corpus’ and the ‘mixed corpus’. The local food actors were interviewed about their economic exchange relations. The caterers’ interviews dealt with their professional identity for sustainability. Finally, the mixed corpus assembled a dialogue as a participatory research approach, which was applied in order to enable researcher and caterer learning about the use of organic milk in public catering. The data were analysed for theoretically conceptualised relations, expressing behavioural patterns in actors’ everyday work as interpreted by the researcher. The findings were corroborated by the internal and external communities of food system actors. The interpretations have some validity, although they only present abstractions of everyday life and its rich, even opaque, fabric of meanings and aims.
The key findings included primary producers’ social skilfulness, which enabled networking with other actors in very different paths of life, learning in order to promote one’s trade, and trusting reflectively in partners in order to extend business. These activities expanded the supply chain in a spiral fashion by horizontal and vertical forward integration, until large retailers were met for negotiations on a more equal or ‘other regarding’ basis. This kind of chain level coordination, typically building around the core of social and partnership relations, was coined as a socially overlaid network. It supported market access of local farmers, rooted in their farms, who were able to draw on local capital and labour in promotion of competitive business; the growth was endogenous. These kinds of chains – one conventional and one organic – were different from the strategic chain, which was more profit based and while highly competitive, presented exogenous growth as it depended on imported capital and local employees. However, the strategic chain offered learning opportunities and support for the local economy. The caterers exhibited more or less committed professional identity for sustainability within their reach. The facilitating and balanced approaches for professional identities dealt successfully with local and organic food in addition to domestic food, and also imported food. The co-operation with supply chains created innovative solutions and savings for the business parties to be shared. The rule-abiding approach for sustainability only made choices among organic supply chains without extending into co-operation with actors. 
There were also more complicated and troubled identities as juggling, critical and delimited approaches for sustainability, with less productive efforts due to restrictions such as absence of organisational sustainability strategy, weak presence of local and organic suppliers, limited understanding about sustainability and no organisational resources to develop changes towards a sustainable food system. Learning in the workplace about food system reality in terms of supply chain co-operation may prove to be a change engine that leads to advanced network operations and a more sustainable food system. The convergence between primary producers and caterers existed to an extent allowing suggestion that increased clarity about sustainable consumption and production by actors could be approached using advanced tools. The study looks for introduction of more profound environmental and socio-economic knowledge through participatory research with supply chain actors in order to promote more sustainable food systems.
Summary of original publications and the authors’ contribution
I Mikkola, M. & Seppänen, L. 2006. Farmers’ new participation in food chains: making horizontal and vertical progress by networking. In: Langeveld, H. & Röling, N. (Eds.). Changing European farming systems for a better future. New visions for rural areas. Wageningen, The Netherlands. Wageningen Academic Publishers: 267–271.
II Mikkola, M. 2008. Coordinative structures and development of food supply chains. British Food Journal 110 (2): 189–205.
III Mikkola, M. 2009. Shaping professional identity for sustainability. Evidence in Finnish public catering. Appetite 53 (1): 56–65.
IV Mikkola, M. 2009. Catering for sustainability: building a dialogue on organic milk. Agronomy Research 7 (Special issue 2): 668–676.
Minna Mikkola has been responsible for developing the generic research frame, particular research questions, the planning and collection of the data, their qualitative analysis and writing the articles I, II, III and IV. Dr Laura Seppänen has contributed to the development of the generic research frame and article I by introducing the author to the basic concepts of economic sociology and by supporting the writing of article II with her critical comments. Articles are printed with permission from the publishers.
Abstract:
It is important to identify the "correct" number of topics in mechanisms like Latent Dirichlet Allocation (LDA), as it determines the quality of the features that are presented to classifiers like SVM. In this work we propose a measure to identify the correct number of topics and offer empirical evidence in its favor in terms of classification accuracy and the number of topics that are naturally present in the corpus. We show the merit of the measure by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, we view LDA as a matrix factorization mechanism, wherein a given corpus $C$ is split into two matrix factors $M_1$ and $M_2$, as given by $C_{d \times w} = (M_1)_{d \times t} \times (M_2)_{t \times w}$, where $d$ is the number of documents present in the corpus, $w$ is the size of the vocabulary and $t$ is the number of topics. The quality of the split depends on $t$, the right number of topics chosen. The measure is computed in terms of the symmetric KL-divergence of salient distributions that are derived from these matrix factors. We observe that the divergence values are higher for non-optimal numbers of topics; this is shown by a "dip" at the right value for $t$.
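The selection step described above can be sketched as follows. The abstract does not say which "salient distributions" are derived from the matrix factors, so this sketch simply takes one pair of distributions per candidate topic count as given and picks the count at which the symmetric KL-divergence dips. The function names, the smoothing constant `eps` and the toy numbers are all assumptions made for illustration.

```python
import math


def _normalize(values, eps):
    """Clip at eps (an assumed smoothing constant) and rescale to sum to 1."""
    clipped = [max(v, eps) for v in values]
    total = sum(clipped)
    return [v / total for v in clipped]


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL-divergence D(p || q) + D(q || p), in nats."""
    p = _normalize(p, eps)
    q = _normalize(q, eps)
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))


def best_topic_count(candidates):
    """candidates maps a topic count t to a pair (p, q) of distributions
    derived from the two factors; return the t with the lowest divergence."""
    return min(candidates, key=lambda t: symmetric_kl(*candidates[t]))


# Hypothetical distributions for t = 2, 3, 4; the dip is at t = 3.
toy = {
    2: ([0.9, 0.1], [0.2, 0.8]),
    3: ([0.4, 0.3, 0.3], [0.4, 0.3, 0.3]),
    4: ([0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]),
}
print(best_topic_count(toy))
```

In practice each pair would be recomputed from a separate LDA fit for every candidate t, which is the expensive part and is not shown here.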
Abstract:
This dissertation is an onomastic study of variation in women's name phrases in official documents in Finland during the period 1780–1930. The aim is to discuss from a socio-onomastic perspective both the changeover from patronymics to inherited family names and the use of surnames after marriage (i.e. whether women adopted their husbands' family names or retained their maiden names), before new laws in this area entered into force in Finland in the early 20th century. In 1920, a law on family names that required fixed names put an end to the use of the patronymic as a person's only surname. After 1929, it was no longer possible for a married woman to retain her maiden name. Methodologically, to explain this development from a socio-onomastic perspective, I have based my study on a syntactic-semantic analysis of the actual name phrases. To be able to demonstrate the extensive material, I have elaborated a scheme to divide the 115 different types of name phrases into 13 main categories. The analysis of the material for Helsinki is based on frequency calculations of the different types of name phrases every thirtieth year, as well as on describing variation in the structure and semantic content of the name phrases, e.g. social variation in the use of titles and epithets. In addition to this, by applying a biographic-genealogical method, I have conducted two case studies of the usage of women's name phrases in two chosen families. The study is based on parish registers from the period 1780–1929, estate inventory documents from the period 1780–1928, registration forms for liberty of trade from the period 1880–1908, family announcements in newspapers from the period 1829–1888, gravestones from the period 1796–1929 and diaries from the periods 1799–1801 and 1818–1820, providing a corpus of 5,950 name phrases. The syntactic-semantic analysis has revealed the overall picture of various ways of denoting women in official documents.
In Helsinki, towards the end of the 19th century, the use of inherited family names seems to be almost fully developed in official contexts. In the late 19th century, a patronymic still appears as the only surname of some working-class women, whereas in the early 20th century patronymics were only entered in the parish register as a kind of middle name. At the beginning of the 19th century, most married women were still registered under their maiden names, with a few exceptions among the bourgeoisie and upper class. The comparative analysis of name phrases in diaries, however, indicates that the use of the husband's family name by married women was a much earlier phenomenon in private contexts than in official documents. Keywords: socio-onomastics, syntactic-semantic analysis, name phrase, patronymic, maiden name, husband's family name
Abstract:
The multifaceted passive present participle in Finnish
This study investigates the uses of the passive present participle in Finnish. The participle occurs in a variety of syntactic environments and exhibits a rich polysemy. Former descriptions have treated it as a mainly modal element, but it has several non-modal uses as well. The present study provides an overview of its uses and meanings, with the main focus on the factors which trigger the modal reading. In addition, the study contains two case studies on modal periphrastic constructions consisting of the verb 'to be' and the passive present participle: the Obligation construction, e.g., on men-tä-vä [is go-pass-ptc], and the Possibility construction, e.g., on pelaste-tta-v-i-ssa [is save-pass-ptc-pl-ine]. The study is based on empirical data of 9000 sentences obtained from i) large collections of transcribed material from Finnish dialects, ii) a corpus of modern Finnish newspaper texts, and iii) corpora of Old Finnish texts. Both in colloquial and standard Finnish the reading of the participle is highly dependent on the context and determined by such factors as the overall syntactic environment and other co-occurring elements. One of the main findings here is that the Finnish passive present participle is not modal per se. The contextual modal reading arises whenever the state of affairs is conceptualized from the viewpoint of the implied subject of the participle, and the meaning of possibility or obligation depends mostly on whether the situation is pleasant or undesirable. In sections examining the grammaticalization of the Possibility and Obligation constructions, the perspective is diachronic. Both constructions have derived from copula constructions with the passive present participle as a predicate (adjective or adverb). These sections show how a linguistic change can be investigated on the basis of the patterns of usage in the empirical data.
The Possibility construction is currently being restructured into a passive verbal complex. The source of this construction is reflected in its present-day use by the fact that it is heavily biased towards a small set of verbs. The Obligation construction has grammaticalized to a construction comparable to a compound tense. Patterns of use of the construction show that grammaticalization originates in specific syntactic constructions with an implication of practical necessity. Furthermore, it is shown that the Obligation construction has grammaticalized in different directions in standard and colloquial Finnish. Unlike studies of the phenomena most typically investigated in the literature on the grammaticalization of modality, the present study opens new perspectives on, and methods for discussing, these questions.
Abstract:
Sodium dodecyl sulphate-polyacrylamide gel electrophoresis of Percoll purified Leydig cell proteins from 20- and 120-day-old rats revealed a significant decrease in a low molecular weight peptide in the adult rats. Administration of human chorionic gonadotropin to immature rats resulted in a decrease in the low molecular weight peptide along with increase in testosterone production. Modulation of the peptide by human chorionic gonadotropin could be confirmed by Western blotting. The presence of a similar peptide could be detected by Western blotting in testes of immature mouse, hamster, guinea pig but not in adrenal, placenta and corpus luteum. Administration of testosterone propionate which is known to inhibit the pituitary luteinizing hormone levels in adult rats resulted in an increase in the low molecular weight peptide, as checked by Western blotting. It is suggested that this peptide may have a role in regulation of acquisition of responsiveness to luteinizing hormone by immature rat Leydig cells.
Abstract:
Analisi contrastiva delle modalità di traduzione in finnico dei Tempi verbali e delle perifrasi aspettuali dell'italiano (Italian Philology)
The topic of this research is a contrastive study of tenses and aspect in Italian and in Finnish. The study aims to develop a research method for analyzing translations and comparable texts (non-translations) written in a target language. Thus, the analysis is based on empirical data consisting of translations of novels from Italian to Finnish and vice versa. In addition to this, for the section devoted to solutions adopted in Finnish for translating the Italian tenses Perfetto Semplice and Perfetto Composto, 39 Finnish native speakers were asked to answer questions concerning the choice of Perfekti and Imperfekti in Finnish. The responses given by the Finnish informants were compared to the choices made by translators in the target language, and in this way it was possible both to benefit from the motivation provided by native speakers to explain the selection of a tense (Imperfekti/Perfekti) in a specific context compared with the Italian formal equivalents (Perfetto Composto/Perfetto Semplice), and to define the specific features of the Finnish verb tenses. The research aims to develop a qualitative method for the analysis of formal equivalents and translational changes ("shifts"). However, since the choice of Italian and Finnish progressive forms is optional and related to speaker preferences, I also considered it necessary to carry out a quantitative analysis alongside the qualitative one, in order to find out whether the two items share the same degree of correspondence in frequency of use. In this study I explain translation choices in light of cognitive grammar, suggesting that particular translation relationships derive from so-called construal operations.
I use the concepts of cognitive linguistics not only to analyze the convergences and divergences of the two aspectual systems, but also to redefine some general procedures related to the phenomenon of translation. For the practical analysis of the corpus, I mostly employed theoretical categories developed in a framework proposed by Pier Marco Bertinetto. Following this approach, the notions of aspect (the morphologic or morphosyntactic, subjective level) and actionality (the lexical aspect or objective level, traditionally Aktionsart) are carefully distinguished. This also allowed me to test the applicability of these distinctions to two languages typologically different from each other. The data made it possible both to analyze the semantic and pragmatic features that determine tense and aspect choices in these two languages, and to discover the correspondences between the two language systems and the strategies that translators are forced to resort to in particular situations. The research provides not only a detailed and analytically argued inventory of possible solutions for translating Italian tenses and aspectual devices into Finnish that could be of pedagogical relevance, but also new contributions about the specific uses of time-aspectual devices in the two languages in question.
Abstract:
The habit of "drinking smoke", meaning tobacco smoking, caused a true controversy in early modern England. The new substance was used both for its alleged therapeutic properties and for its narcotic effects. The dispute over tobacco continues the line of written controversies which were an important means of communication in sixteenth- and seventeenth-century Europe. The tobacco controversy is special among medical controversies because the recreational use of tobacco soon spread and outweighed its medicinal use, ultimately causing a social and cultural crisis in England. This study examines how language is used in polemic discourse and argumentation. The material consists of medical texts arguing for and against tobacco in early modern England. The texts were compiled into an electronic corpus of tobacco texts (1577–1670) representing different genres and styles of writing. With the help of the corpus, the tobacco controversy is described and analyzed in the context of early modern medicine. A variety of methods suitable for the study of conflict discourse were used to assess internal and external text variation. The linguistic features examined include personal pronouns, intertextuality, structural components, and statistically derived keywords. A common thread in the work is persuasive language use manifested, for example, in the form of emotive adjectives and the generic use of pronouns; the latter is especially pronounced in the dichotomy between us and them. Controversies have not been studied in this manner before, but the methods applied have supplemented each other and proven their suitability in the study of conflictive discourse. These methods can also be applied to present-day materials.
Abstract:
This work is a case study of applying nonparametric statistical methods to corpus data. We show how to use ideas from permutation testing to answer linguistic questions related to morphological productivity and type richness. In particular, we study the use of the suffixes -ity and -ness in the 17th-century part of the Corpus of Early English Correspondence within the framework of historical sociolinguistics. Our hypothesis is that the productivity of -ity, as measured by type counts, is significantly low in letters written by women. To test such hypotheses, and to facilitate exploratory data analysis, we take the approach of computing accumulation curves for types and hapax legomena. We have developed an open source computer program which uses Monte Carlo sampling to compute the upper and lower bounds of these curves for one or more levels of statistical significance. By comparing the type accumulation from women’s letters with the bounds, we are able to confirm our hypothesis.
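The permutation idea above can be sketched in a few lines. This is not the authors' program. It is a simplified illustration that measures accumulation in whole texts rather than running words, and it estimates only a one-sided p-value for a single subcorpus size instead of full upper and lower bounds. The function names and the toy letters are hypothetical.

```python
import random


def type_accumulation(texts):
    """Cumulative number of distinct types after each text, in the given order."""
    seen, curve = set(), []
    for tokens in texts:
        seen.update(tokens)
        curve.append(len(seen))
    return curve


def permutation_p_value(all_texts, subset_texts, n_perm=2000, seed=0):
    """Monte Carlo estimate of the share of random selections of
    len(subset_texts) texts that show at most as many types as the subset."""
    rng = random.Random(seed)
    observed = type_accumulation(subset_texts)[-1]
    k = len(subset_texts)
    pool = list(all_texts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pool)  # a random ordering of whole texts
        if type_accumulation(pool[:k])[-1] <= observed:
            hits += 1
    return hits / n_perm


# Hypothetical letters, each reduced to the -ity types it contains.
letters = [["x"], ["x"], ["a", "b"], ["c", "d"]]
subcorpus = [["x"], ["x"]]
print(permutation_p_value(letters, subcorpus))
```

A small estimate means that few random selections of the same size are as poor in types as the subcorpus. Shuffling whole texts rather than individual tokens keeps words from the same letter together, which matters because types tend to cluster within a text.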
Abstract:
This study is a pragmatic description of the evolution of the genre of English witchcraft pamphlets from the mid-sixteenth century to the end of the seventeenth century. Witchcraft pamphlets were produced for a new kind of readership (semi-literate, uneducated masses), and the central hypothesis of this study is that publishing for the masses entailed rethinking the ways of writing and printing texts. Analysis of the use of typographical variation and illustrations indicates how printers and publishers catered to the tastes and expectations of this new audience. Analysis of the language of witchcraft pamphlets shows how pamphlet writers took into account the new readership by transforming formal written source materials (trial proceedings) into more immediate ways of writing. The material for this study comes from the Corpus of Early Modern English Witchcraft Pamphlets, which has been compiled by the author. The multidisciplinary analysis incorporates both visual and linguistic aspects of the texts, with methodologies and theoretical insights adopted eclectically from historical pragmatics, genre studies, book history, corpus linguistics, systemic functional linguistics and cognitive psychology. The findings are anchored in the socio-historical context of early modern publishing, reading, literacy and witchcraft beliefs. The study shows not only how consideration of a new audience by both authors and printers influenced the development of a genre, but also the value of combining visual and linguistic features in pragmatic analyses of texts.
Abstract:
The aim of the study is to investigate the use of finlandisms in a historical perspective, how they have been viewed from the mid-19th century to this day, and the effect of language planning on their use. A finlandism is a word, a phrase, or a structure that is used only in the Swedish varieties used in Finland (i.e. in Finland Swedish), or used in these varieties in a different meaning than in the Swedish used in Sweden. Various aspects of Finland-Swedish language planning are discussed in relation to language planning generally; in addition, the relation of Finland Swedish to Standard Swedish and standard regional varieties is discussed, and various types of finlandisms are analysed in detail. A comprehensive picture is provided of the emergence and evolution of the ideology of language planning from the mid-19th century up until today. A theoretical model of corpus planning is presented and its effect on linguistic praxis described. One result of the study is that the belief among Finland-Swedish language planners that the Swedish language in Finland must not be allowed to become distanced from Standard Swedish has been widely adopted by the average Finland Swede, particularly during the interwar period, following the publication of Hugo Bergroth's work Finlandssvenska in 1917. Criticism of this language-planning ideology started to appear in the 1950s, and intensified in the 1970s. However, language planning and the basis for this conception of language continue to enjoy strong support among Swedish-speaking Finns. I show that the editing of Finnish literary texts written in Swedish has often been somewhat amateurish and the results not always linguistically appropriate, and that Swedish publishers have in fact adopted a rather liberal attitude towards finlandisms. My conclusion is that language planning has achieved rather modest results in its resistance to finlandisms. Most of the finlandisms used in 1915 were still in use in 2005.
Finlandisms occur among speakers of all ages, and even among academically educated people despite their more elevated style. The most common finlandisms were used by informants of all ages. The ones that are firmly rooted are the most established, in other words those that are stylistically neutral, seemingly genuinely Swedish, but which are nevertheless strongly supported by Finnish, and display a shift in meaning as compared with Standard Swedish.
Abstract:
Language Documentation and Description as Language Planning: Working with Three Signed Minority Languages
Sign languages are minority languages that typically have a low status in society. Language planning has traditionally been controlled from outside the sign-language community. Even though signed languages lack a written form, dictionaries have played an important role in language description and as tools in foreign language learning. The background to the present study on sign language documentation and description as language planning is empirical research in three dictionary projects in Finland-Swedish Sign Language, Albanian Sign Language, and Kosovar Sign Language. The study consists of an introductory article and five detailed studies which address language planning from different perspectives. The theoretical basis of the study is sociocultural linguistics. The research methods used were participant observation, interviews, focus group discussions, and document analysis. The primary research questions are the following: (1) What is the role of dictionary and lexicographic work in language planning, in research on undocumented signed language, and in relation to the language community as such? (2) What factors are particular challenges in the documentation of a sign language and should therefore be given special attention during lexicographic work? (3) Is a conventional dictionary a valid tool for describing an undocumented sign language? The results indicate that lexicographic work has a central part to play in language documentation, both as part of basic research on undocumented sign languages and for status planning. Existing dictionary work has contributed new knowledge about the languages and the language communities.
The lexicographic work adds to the linguistic advocacy work done by the community itself with the aim of vitalizing the language, empowering the community, receiving governmental recognition for the language, and improving the linguistic (human) rights of the language users. The history of signed languages as low status languages has consequences for language planning and lexicography. One challenge that the study discusses is the relationship between the sign-language community and the hearing sign linguist. In order to make it possible for the community itself to take the lead in a language planning process, raising linguistic awareness within the community is crucial. The results give rise to questions of whether lexicographic work is of more importance for status planning than for corpus planning. A conventional dictionary as a tool for describing an undocumented sign language is criticised. The study discusses differences between signed and spoken/written languages that are challenging for lexicographic presentations. Alternative electronic lexicographic approaches including both lexicon and grammar are also discussed. Keywords: sign language, Finland-Swedish Sign Language, Albanian Sign Language, Kosovar Sign Language, language documentation and description, language planning, lexicography