Digital Library

62 results for Corpus bruit

in Helda - Digital Repository of University of Helsinki

Agreement Patterns in English : Diachronic Corpus Studies on Common-Number Pronouns

Relevance: 20.00%

Abstract:

This study reports a diachronic corpus investigation of common-number pronouns used to convey unknown or otherwise unspecified reference. The study charts agreement patterns in these pronouns in various diachronic and synchronic corpora. The objective is to provide base-line data on variant frequencies and distributions in the history of English, as there are no previous systematic corpus-based observations on this topic. This study seeks to answer the questions of how pronoun use is linked with the overall typological development in English and how their diachronic evolution is embedded in the linguistic and social structures in which they are used. The theoretical framework draws on corpus linguistics and historical sociolinguistics, grammaticalisation, diachronic typology, and multivariate analysis of modelling sociolinguistic variation. The method employs quantitative corpus analyses from two main electronic corpora, one from Modern English and the other from Present-day English. The Modern English material is the Corpus of Early English Correspondence, and the time frame covered is 1500-1800. The written component of the British National Corpus is used in the Present-day English investigations. In addition, the study draws supplementary data from other electronic corpora. The material is used to compare the frequencies and distributions of common-number pronouns between these two time periods. The study limits the common-number uses to two subsystems, one anaphoric to grammatically singular antecedents and one cataphoric, in which the pronoun is followed by a relative clause. Various statistical tools are used to process the data, ranging from cross-tabulations to multivariate VARBRUL analyses in which the effects of sociolinguistic and systemic parameters are assessed to model their impact on the dependent variable. 
This study shows how one pronoun type has extended its uses in both subsystems, an increase linked with grammaticalisation and the changes in other pronouns in English through the centuries. The variationist sociolinguistic analysis charts how grammaticalisation in the subsystems is embedded in the linguistic and social structures in which the pronouns are used. The study suggests a scale of two statistical generalisations of various sociolinguistic factors which contribute to grammaticalisation and its embedding at various stages of the process.

Les formes d’adresse dans un corpus de films français et leur traduction en finnois

Relevance: 20.00%

Abstract:

The use of forms of address in French films and their Finnish translations

The use of forms of address constitutes an integral part of speakers’ communicative competence. They are used not only to indicate to whom speech is addressed, but also to construct the relationship between speakers. However, the choice of a suitable form is not necessarily self-evident in a modern, pluralistic society. By form of address, I refer to pronouns of address (tu vs. vous) and to nouns of address such as names, titles (Monsieur, Madame, Mademoiselle), kinship terms, occupational terms, terms of endearment and insults. The purpose of the present thesis is, first, to study the semantic and pragmatic values of forms of address in the dialogues of modern French films and, second, to study their translation in Finnish subtitles. Film language is not spontaneous but only a representation of authentic speech, and subtitles are a written version of the original spoken language. Consequently, this thesis studies fictional spoken dialogues and their written translations. The methods applied in the study are the Interactional and Pragmatic Approach as well as Translatology. The role of forms of address in an interpersonal relationship is studied along the dimensions of distance and power (Brown and Gilman 1960, Kerbrat-Orecchioni 1992), whereas the pragmatic dimension makes it possible to study in particular the use of forms of address in speech acts (Kerbrat-Orecchioni 2001). The translation strategies are studied with the help of Venuti’s (1995) notions of foreignizing and domesticating strategies. The results of the thesis suggest that pronoun use in the studied films is usually reciprocal. However, relations of power have not disappeared but are expressed in a more discreet manner with nouns of address (for instance vous + Docteur vs. vous + Anita).
The use of the pronoun of address vous seems still to be common, but increased intimacy is expressed by accompanying familiar nouns of address like first names. The nominal forms of address accompany different speech acts, but not in a systematic manner. In a dialogue they appear usually in the first speech act, and more rarely in the response, but not in both. In addition, they have an important role in the mechanics of conversation. The translators here face multiple demands, and their translations seem mostly to be a compromise between foreignizing and domesticating strategies.

Univariate, bivariate, and multivariate methods in corpus-based lexicography : A study of synonymy

Relevance: 20.00%

Abstract:

In this dissertation, I present an overall methodological framework for studying linguistic alternations, focusing specifically on lexical variation in denoting a single meaning, that is, synonymy. As the practical example, I employ the synonymous set of the four most common Finnish verbs denoting THINK, namely ajatella, miettiä, pohtia and harkita ‘think, reflect, ponder, consider’. As a continuation to previous work, I describe in considerable detail the extension of statistical methods from dichotomous linguistic settings (e.g., Gries 2003; Bresnan et al. 2007) to polytomous ones, that is, concerning more than two possible alternative outcomes. The applied statistical methods are arranged into a succession of stages with increasing complexity, proceeding from univariate via bivariate to multivariate techniques in the end. As the central multivariate method, I argue for the use of polytomous logistic regression and demonstrate its practical implementation to the studied phenomenon, thus extending the work by Bresnan et al. (2007), who applied simple (binary) logistic regression to a dichotomous structural alternation in English. The results of the various statistical analyses confirm that a wide range of contextual features across different categories are indeed associated with the use and selection of the selected think lexemes; however, a substantial part of these features are not exemplified in current Finnish lexicographical descriptions. The multivariate analysis results indicate that the semantic classifications of syntactic argument types are on the average the most distinctive feature category, followed by overall semantic characterizations of the verb chains, and then syntactic argument types alone, with morphological features pertaining to the verb chain and extra-linguistic features relegated to the last position. 
In terms of overall performance of the multivariate analysis and modeling, the prediction accuracy seems to reach a ceiling at a Recall rate of roughly two-thirds of the sentences in the research corpus. The analysis of these results suggests a limit to what can be explained and determined within the immediate sentential context and applying the conventional descriptive and analytical apparatus based on currently available linguistic theories and models. The results also support Bresnan’s (2007) and others’ (e.g., Bod et al. 2003) probabilistic view of the relationship between linguistic usage and the underlying linguistic system, in which only a minority of linguistic choices are categorical, given the known context – represented as a feature cluster – that can be analytically grasped and identified. Instead, most contexts exhibit degrees of variation as to their outcomes, resulting in proportionate choices over longer stretches of usage in texts or speech.
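As a rough illustration of the polytomous setting described in this abstract, the sketch below computes outcome probabilities for the four THINK lexemes with a multinomial (softmax) model. The softmax formulation is only one of several ways to fit polytomous logistic regression and is assumed here, not taken from the dissertation; the feature names and weights are entirely hypothetical.

```python
import math

# Hypothetical per-lexeme weights for a few invented contextual features.
# None of these values come from the dissertation.
WEIGHTS = {
    "ajatella": {"bias": 0.5, "human_agent": 0.2},
    "miettiä": {"bias": 0.0, "indirect_question": 1.0},
    "pohtia": {"bias": -0.2, "abstract_patient": 0.8},
    "harkita": {"bias": -0.5, "activity_patient": 1.2},
}


def predict_probabilities(weights, features):
    """Return a probability for each lexeme given the features present in a context."""
    # Linear score per lexeme: sum of the weights of the features that are present.
    scores = {
        lexeme: sum(w.get(feature, 0.0) for feature in features)
        for lexeme, w in weights.items()
    }
    # Softmax, shifted by the maximum score for numerical stability.
    top = max(scores.values())
    exps = {lexeme: math.exp(score - top) for lexeme, score in scores.items()}
    total = sum(exps.values())
    return {lexeme: value / total for lexeme, value in exps.items()}


probs = predict_probabilities(WEIGHTS, ["bias", "indirect_question"])
best = max(probs, key=probs.get)
```

The output is a probability distribution over all four lexemes, not a single categorical choice, which matches the probabilistic view of usage that the abstract argues for.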

Grammar and disciplinary culture : a corpus-based study

Relevance: 20.00%

Abstract:

The present study provides a usage-based account of how three grammatical structures, declarative content clauses, interrogative content clauses and as-predicative constructions, are used in academic research articles. These structures may be used in both knowledge claims and citations, and they often express evaluative meanings. Using the methodology of quantitative corpus linguistics, I investigate how the culture of the academic discipline influences the way in which these constructions are used in research articles. The study compares the rates of occurrence of these grammatical structures and investigates their co-occurrence patterns in articles representing four different disciplines (medicine, physics, law, and literary criticism). The analysis is based on a purpose-built 2-million-word corpus, which has been part-of-speech tagged. The analysis demonstrates that the use of these grammatical structures varies between disciplines, and further shows that the differences observed in the corpus data are linked with differences in the nature of knowledge and the patterns of enquiry. The constructions in focus tend to be more frequently used in the soft disciplines, law and literary criticism, where their co-occurrence patterns are also more varied. This reflects both the greater variety of topics discussed in these disciplines and the higher frequency of references to statements made by other researchers. Knowledge-building in the soft fields normally requires a careful contextualisation of the arguments, giving rise to statements reporting earlier research employing the constructions in focus. In contrast, knowledge-building in the hard fields is typically a cumulative process, based on agreed-upon methods of analysis. This characteristic is reflected in the structure and contents of research reports, which offer fewer opportunities for using these constructions.

Automatic indexing : an approach using an index term corpus and combining linguistic and statistical methods

Relevance: 20.00%

Modality as portrayed in upper secondary school textbooks : A corpus-based approach

Relevance: 20.00%

Abstract:

The subject of this Master's thesis is the so-called core set of English modal auxiliaries: will, would, can, could, shall, should, may, might and must. Semantically, these auxiliaries are particularly complex: their interpretation often involves considerable nuance, even though traditional grammars suggest that each has two or three clearly distinct meanings. They therefore pose particular challenges in a foreign-language learning environment. Recent developments in the methods of corpus linguistics have produced increasingly accurate descriptions of how modal auxiliaries are used in present-day English and of the direction in which they have developed even within a short period of time. The aim of this thesis was to compare the results of these recent studies with the reality that English-language teaching material at upper secondary level in Finland offers the student. My starting point was that the authenticity and communicativeness required of language teaching by the curriculum should be reflected in an even-handed treatment of the modal auxiliaries. My initial hypothesis, however, was that there are discrepancies between how modality manifests itself in an authentic environment and how it is presented in textbooks. My approach in this thesis was corpus-based. From two upper secondary school textbook series, I selected the books in which modal auxiliaries were mentioned explicitly. I scanned every (complete) text found in the four books and built a small corpus from this material. Using a program designed for corpus analysis, I retrieved from this corpus all the sentences in which modal auxiliaries occurred. I then analysed each modal auxiliary semantically in its sentence context. As a result of this analysis, I was able to compile tables and compare the results with those of the most recent studies. On the basis of this thesis, discrepancies do exist.
In general, the relative frequencies of the modal auxiliaries were along the right lines: no auxiliary was used significantly more or less than would be desirable in the light of recent research. In the semantic distribution of the auxiliaries, by contrast, there were in places large differences between the meanings emphasised in the textbooks and those that appear to be more frequent in present-day English. Can and must in particular stood out in that the picture the textbooks give of their use is the opposite of what one would expect: the use of can was clearly weighted towards ‘ability’ rather than ‘possibility’, which according to current research is its principal use. Must, for its part, was over-represented in the material in the sense of ‘obligation’, whereas nowadays it generally means ‘inference’ as often as ‘obligation’. In addition, ‘permission’ was requested remarkably rarely in the material. On the basis of the results, I propose that textbook authors should, at a general level, abandon the ossified conceptions of grammar books and dare to expose students to the full range of meanings of the modal auxiliaries.
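The retrieval step in this abstract (pulling every sentence that contains a core modal out of a small corpus) can be approximated with a few lines of Python. This is a minimal sketch, not the corpus-analysis program the thesis used: the sentence splitter is naive, the sample text is invented, and the semantic classification of each hit (for example ‘ability’ vs. ‘possibility’) would still have to be done by hand.

```python
import re
from collections import Counter

# The nine core modals named in the abstract.
MODALS = ("will", "would", "can", "could", "shall", "should", "may", "might", "must")
MODAL_RE = re.compile(r"\b(" + "|".join(MODALS) + r")\b", re.IGNORECASE)


def modal_sentences(text):
    """Return (sentence, [modals found]) pairs for sentences containing a core modal."""
    # Naive split on ., ! or ? followed by whitespace (an assumption, not a real tokenizer).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        # Note: this also matches homographs such as the noun "can" or the month "May".
        found = [match.lower() for match in MODAL_RE.findall(sentence)]
        if found:
            hits.append((sentence, found))
    return hits


def modal_frequencies(text):
    """Count how often each core modal occurs in the text."""
    counts = Counter()
    for _, found in modal_sentences(text):
        counts.update(found)
    return counts


# Invented sample text, for illustration only.
sample = (
    "You must be tired. Can I open the window? "
    "She left early. It may rain, but we could still go."
)
hits = modal_sentences(sample)
freqs = modal_frequencies(sample)
```

The frequency counts correspond to the first comparison reported in the abstract (relative frequency of each modal); the list of matching sentences is the input to the manual semantic analysis.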

The SCOTS corpus and TEFL: Discovering and Using Strategies of Spoken English in Finnish Upper Secondary Schools

Relevance: 20.00%

Abstract:

The authorities responsible for Finnish education policy have responded to the challenges posed by international communication needs and have changed the content of one upper secondary school A-level foreign language course to meet the needs of oral communication. This study examines how English speaking strategies can be taught to Finnish upper secondary school students and what methods are available for assessing the learning of speaking strategies. I answer the questions I have posed with the help of earlier research literature and of data I collected from upper secondary school English teaching. Key elements of the thesis are, in particular, pragmatic competence and three general-level speaking strategies (initiating a conversation, holding the floor, and maintaining the conversation). The data include 65 first-year upper secondary school students (classes A and B) from Helsinki and Espoo. The SCOTS corpus was used as teaching material; more precisely, the speech file entitled Conversation 20: Four secondary school girls in the North East. The phrases, words and structures in the file that relate to the three speaking strategies were illustrated to the students with, among other tools, the AntConc concordance program. The students also did written and oral exercises related to the speaking strategies. A second type of oral task, with free-form conversations and aimed at four volunteer students, was recorded and transcribed, and the results were assessed with the help of, among other things, the Common European Framework of Reference. In addition, class B answered a questionnaire asking for their opinions on, for example, the most useful test lesson and on their willingness to take part in a new eighth advanced course in long-syllabus English. The results are encouraging and show that speaking strategies can be taught already at upper secondary level. Although the approach used in the study was partly new to the students, the majority of them acknowledged having learned something new about English conversational structures.
In addition, the recorded and transcribed conversations of the volunteer students provide a good starting point for possible further research.

Happiness and Joy in Corpus Contexts: A Cognitive Semantic Analysis

Relevance: 20.00%

Corpus-based lexeme ranking for morphological guessers

Relevance: 20.00%

Corpus-based paradigm Selection for morphological entries

Relevance: 20.00%

Designing a Dependency Representation and Grammar Definition Corpus for Finnish

Relevance: 20.00%

Abstract:

We outline the design and creation of a syntactically and morphologically annotated corpus of Finnish for use by the research community. We motivate a definitional, systematic “grammar definition corpus” as a first step in a three-year annotation effort to help create higher-quality, better-documented extensive parsebanks at a later stage. The syntactic representation, consisting of a dependency structure and a basic set of dependency functions, is outlined with examples. Reference is made to double-blind annotation experiments to measure the applicability of the new grammar definition corpus methodology.

The Phraseology and Structure of Latin Building Inscriptions in Roman North Africa

Relevance: 10.00%

Abstract:

This study analyses the diction of Latin building inscriptions. Despite its importance, this topic has rarely been discussed before: the most substantial contribution on the subject is a short dissertation by Klaus Gast (1965) that focuses on 100 inscriptions dating mostly from the Republican period. Marietta Horster (2001) also touched upon this theme in her thesis on imperial building inscriptions. I have collected my source material in North Africa because more Latin building inscriptions dating from the Imperial period have survived there than in any other area of the Roman Empire. By means of a thorough and independent survey, I have assembled all relevant African Latin building inscriptions datable to the Roman period (between 146 BC and AD 425), 1002 texts, into a corpus. These inscriptions are all fully edited in Appendix 1; Appendix 2 contains references to earlier editions. To facilitate search operations, both are also available in electronic form. They are downloadable from the address http://www.helsinki.fi/hum/kla/htm/jatkoopinnot.htm. Chapter one is an introduction dealing with the nature of building inscriptions as source material. Chapter two offers a statistical overview of the material. The following main section of the work falls into five chapters, each of which analyses one main part of a building inscription. An average building inscription can be divided into five parts: the starting phrase opens the inscription (a dedication to gods, for example), the subject part identifies the builder, the object part describes the constructed or repaired building, the predicate part records the building activity and the supplement part offers additional information on the project (it can specify the funding, for instance). These chapters are systematic and chronological and their purpose is to register and interpret the phrases used, to analyse reasons for their use and for their popularity among the different groups of builders. 
Chapter eight, which follows the main section of the work, creates a typology of building inscriptions based on their structure. It also presents the most frequently attested types of building inscriptions. The conclusion describes, on a general level, how the diction of building inscriptions developed during the period of study and how this striking development resulted from socio-economic changes that took place in Romano-African society during Antiquity. This study shows that the phraseology of building inscriptions had a clear correlation both with the type of builder and with the date of carving. Private builders tended to accentuate their participation (especially its financial side) in the project; honouring the emperor received more emphasis in the building inscriptions set up by communities; the texts produced by the army were concise. The chronological development is so clear that it enables stylistic dating. At the beginning of the imperial period the phrases were clear, concrete, formal and stereotyped, but by Late Antiquity they had become vague, subjective, flexible, varied and even rhetorically or poetically coloured.

Die Personennamen und Titel der mittelmongolischen Dokumente : Eine lexikalische Untersuchung

Relevance: 10.00%

Abstract:

The present research is an investigation into the corpus of personal names and titles that are found in sources from the Middle Mongolian period, that is, the period from the 13th to the beginning of the 15th century. The entry for every name or title has been divided into three parts: occurrence(s) of a given name in Middle Mongolian sources (primary sources), etymology, and occurrence(s) in sources other than Middle Mongolian (secondary sources). Culturally and linguistically the corpus can be divided into six sub-groups: Mongolian, Turkic (Old, Middle and Modern), Arabo-Persian (Islamic), Indo-Iranian and Tibetan (Buddhist), as well as Chinese. Among these, the largest group is formed by Mongolian and Turkic, followed by Chinese (mostly titles), Indo-Iranian, Arabo-Persian and Tibetan. With regard to the primary and secondary occurrences, the research is based mainly on primary sources, including text publications and dictionaries. Every name or title is documented as completely as possible within a Central Asian framework. However, due to the divergence of the sources available as well as their diachronic importance, each sub-group has been dealt with slightly differently, but consistently. The corpus of investigated names and titles gives a fairly accurate picture of the multi-ethnic composition of the Mongolian world-empire. It also shows the foreign influences on Mongolian names and titles, being in this respect a mirror of the influences that are visible in other parts of Middle Mongolian culture too. Furthermore, the investigated corpus reflects the transitional stage of the 13th to 15th centuries in Central Asian history, and thus includes material from the past (Indo-Iranian, Old and Middle Turkic) and material that points to the future (Arabo-Persian, Tibetan, Modern Turkic).

Dancing and Professional Dancers in Roman Egypt

Relevance: 10.00%

Abstract:

The main aim of the study is to create a many-sided view of dancing in Roman Egypt (1st - early 4th centuries AD) and especially of the dancers who earned their living by dancing as hired performers. Even though dancers and other performers played a central part in many kinds of festivities throughout the ancient world, research on ancient professional dancers is rare and tends to rest on the ancient literature, which reflects the opinions of the elite. Documentary written sources (i.e., papyri, ostraka), the core of the present study, are mentioned rather superficially, which easily results in a stereotypical view of the dancers. This study will balance the picture of professional dancers in antiquity and of ancient dancing in a more general sense. The second aim characterizes this study as basic research: to provide a corpus of written sources from Greco-Roman Egypt on dancing and to discuss pictorial sources contemporary with the texts. The study also takes into account the theoretical discussion that centres on dancing as a nonverbal communicative mode. Dancers are seen as significant conveyors of social and cultural matters. This study shows that dancers were hired to perform especially in religious contexts, where the local associations on the village level also played an important part as the employers of the performers. These performers had a better standard of living in economic terms than the average hired worker, and dancers were better paid than other performers. In the Egyptian villages and towns where the dancers performed and lived, the dancers do not seem to have been marginal because they were professionals or because of some ethnic or social background. However, their possible marginality may have arisen for reasons related to the practicalities of their profession (e.g., the itinerant lifestyle).
The oriental background of performers was a literary topos reflecting partly the situation in the centres of the empire, especially Rome, where many performers were of other than Roman origin. The connection of dancing, prostitution and slavery reflects the essential link between dance, body and gender: dancers are equated with such professions or socio-legal statuses where the body is the focus of attention, a commodity and a source of sensual pleasure; this dimension is clearly observable in ancient literature. According to the Egyptian documentary sources, there is no watertight evidence that professional dancers would have been engaged in prostitution and very little, if any, evidence that the disapproval of the professional dancers expressed by the ancient authors was shared by the Egyptians. From the 4th century onwards the dancers almost disappear from the documentary sources, reflecting the political and religious changes in the Mediterranean east.

Die Eigenschaften der Benennungen und ihr Einfluss auf deren Verwendung am Beispiel der Euro-Währung : Eine quantitative und kontrastive Analyse der Terminologie im Finnischen und im Deutschen

Relevance: 10.00%

Abstract:

This dissertation deals with the terminology of the Euro currency. Its aims are to determine the characteristics of the designations in a certain LSP and to discover whether the recommendations and rules that have been given to the formation of designations and 'ideal' designations have any influence on the usage of the designations. The characteristics analysed include length of the designation, part of speech, form, formation method, constancy, monosemy, suitability to a concept system and degree of specialty. The study analyses the actual usage of the designations in texts and the implementation of the designations. The study is an adaptation of a terminometric survey and uses concept analysis and quantitative analysis as its basic methods. The frequency of each characteristic is measured in terms of statistics. It is assumed that the 'ideality' of a designation influences its usage, for example that if a designation is short, it is used more than its longer rivals (synonyms). The results are analysed in a corpus consisting of a compilation of different texts concerning the Euro. The corpus is divided according to three features: year (1998-2003), genre (judicial texts, annual reports and brochures) and language (Finnish and German). Each analysis is performed according to each of these features and compared with the others. The results indicate that some of the characteristics of the designations indeed seem to have an influence on the usage of the designations. For example, monosemy and suitability to the concept system often lead to the implementation of the designation having the ideal or certain value in these characteristics in the analysed Finnish material. In German material, an 'ideal' value in the characteristics leads to the implementation of the designations more often than in Finnish. 
The contrastive study indicates that, for example, suitability to a concept system leads to implementation of a designation in judicial texts more often than in other genres. The recommendations given for an 'ideal' designation are thus often acceptable, but they cannot be generalized to all languages to the same extent.
