Biblioteca Digital

969 results for linguistic corpora

Lingue meno diffuse e corpora: studio empirico sulla terminologia amministrativa ladina

Relevance:

20.00%

Publisher:

Abstract:

This study aims to develop legal and administrative terminology for the Ladin language, specifically for the Ladin variety spoken in Val Badia. The need for this study stems from the fact that in South Tyrol the Ladin language is not only safeguarded: the drafting of administrative and normative texts in Ladin is guaranteed by law. A unified terminology is therefore needed to support translators and editors of specialised texts. The study starts from two points: on the one hand, the need for a unified terminology and, on the other, the translation work done so far into Ladin by public administration employees. To document their efforts, a corpus of digitised administrative and normative documents was built. The first two chapters focus on the state of the art of terminology and corpus linguistics projects for lesser-used languages. The information was collected with the help of institutes, universities and researchers working on lesser-used languages. The third chapter focuses on the development of administrative language in Ladin, and the fourth on the creation of the trilingual Italian-German-Ladin corpus of administrative and normative documents. The last chapter deals with the methodologies applied to develop terminology entries in Ladin through the use of the trilingual corpus. Starting from the terminology entry, all steps are described: term extraction, the extraction of equivalents, contexts and definitions, and the formulation of translation proposals where no equivalent was found. Finally, the problems involved in developing terminology in Ladin are illustrated.

Language and Embodiment: sensory-motor and linguistic-social experience. Evidence on sentence comprehension

Relevance:

20.00%

Publisher:

Abstract:

In this work I address the study of language comprehension in an “embodied” framework. First, I show behavioral evidence supporting the idea that language modulates the motor system in a specific way, both at a proximal level (sensitivity to the effectors) and at a distal level (sensitivity to the goal of the action in which the single motor acts are embedded). I present two studies in which the method is basically the same: we manipulated the linguistic stimuli (the kind of sentence: hand action vs. foot action vs. mouth action) and the effector with which participants had to respond (hand vs. foot vs. mouth; dominant hand vs. non-dominant hand). Response time analyses showed a specific modulation depending on the kind of sentence: participants were facilitated in executing the task (sentence sensibility judgment) when the effector they had to use to respond was the same one to which the sentences referred. That is, a pre-activation of the motor system seems to take place during language comprehension. This activation is analogous to (even if less intense than) the one detectable when we actually execute the action described by the sentence. Beyond this effector-specific modulation, we also found an effect of the goal suggested by the sentence: the hand effector was pre-activated not only by hand-action-related sentences but also by sentences describing mouth actions, consistent with the fact that to act on an object with the mouth we first have to bring it to the mouth with the hand. After reviewing the evidence on simulation specificity directly referring to the body (for instance, the kind of effector activated by language), I focus on the specific properties of the object to which the words refer, particularly its weight. Here the hypothesis tested was whether both the perception and the execution of lifting movements are modulated by language comprehension.
We used behavioral and kinematic methods, and we manipulated the linguistic stimuli (the kind of sentence: the lifting of heavy objects vs. the lifting of light objects). To study movement perception, we measured the correlations between the weight of the objects lifted by an actor (heavy objects vs. light objects) and the estimates provided by the participants. To study movement execution, we measured the variance of kinematic parameters (velocity, acceleration, time to the first peak of velocity) during the actual lifting of objects (heavy objects vs. light objects). Both kinds of measures revealed that language had a specific effect on the motor system, at both a perceptual and a motor level. Finally, I address the issue of abstract words. Several studies in the “embodied” framework have tried to explain the meaning of abstract words. The limitation of these works is that they account only for subsets of phenomena, so their results are difficult to generalize. We tried to circumvent this problem by contrasting transitive verbs (abstract and concrete) and nouns (abstract and concrete) in different combinations. The behavioral study was conducted with both German and Italian participants, as the two languages are syntactically different. We found that response times were faster for both compatible pairs (concrete verb + concrete noun; abstract verb + abstract noun) than for the mixed ones. Interestingly, for the mixed combinations the analyses showed a modulation due to the specific language (German vs. Italian): when the concrete word preceded the abstract one, responses were faster, regardless of the word's grammatical class. Results are discussed in the framework of current views on abstract words. They highlight the important role of developmental and social aspects of language use, and confirm theories assigning a crucial role to both sensorimotor and linguistic experience for abstract words.

Peer and group relationships in preschoolers: the role of social and linguistic skills

Relevance:

20.00%

Publisher:

Abstract:

Being able to interact positively and build relationships with playmates in the preschool years is crucial to achieving positive adjustment. An updated review and two studies on these topics are provided. Study 1 is observational; it investigates the type of social experience in groups (N = 443) of children (N = 120) of preschool age in child-led vs. teacher-led contexts. The results revealed that in child-led contexts children were more likely to be alone, in dyads, and in small peer groups; groups were mostly characterized by same-gender playmates who engaged in joint interactions, with few social interactions with teachers. In teacher-led contexts, on the other hand, children were more likely to be involved in small, medium and large groups; groups were mostly characterized by other-gender playmates involved in parallel interactions, with teachers playing a more active role. The purpose of Study 2 was to describe the development of socio-emotional competence, temperamental traits and linguistic skill. It examined the role of children’s reciprocated nominations (RNs) with peers, assessed via sociometric interview, in relation to socio-emotional competence, temperamental traits and linguistic skill. Finally, the similarity-homophily tendency was investigated. Socio-emotional competence and temperamental traits were assessed via teacher ratings, linguistic skill via test administration. Eighty-four preschool children (M age = 62.53) were recruited from 4 preschool settings. These children were fairly representative of the preschool population. The results revealed that children with higher RNs showed higher social competence (tendency), social orientation, positive emotionality, motor activity and linguistic skill. They exhibited lower anxiety-withdrawal.
The results also showed that children prefer playmates with similar features: social competence, anger-aggression (tendency), social orientation, positive emotionality, inhibition to innovation, attention, motor activity (tendency) and linguistic skill. Implications for future research are suggested.

La violenza contro le donne e il sessismo implicito nel discorso giornalistico scritto. Analisi di due micro-corpora in lingua italiana e francese

Relevance:

20.00%

Publisher:

Abstract:

The thesis is divided into four parts. The first, written from a feminist perspective, offers an overview of femicide as a social phenomenon and of the related international legal situation. The second deals in general with the quality press, the medium chosen for the linguistic analysis. The third part presents a micro-corpus of the Italian press on the topic of femicide and the fourth a micro-corpus of the French press on the "Affaire DSK", each accompanied by an analysis of the lexical and discursive components (Analyse du discours). It is a comparative study whose results made it possible to highlight and demonstrate how the Italian and French quality press tend to implicitly convey a sexist, stereotyped and discriminatory image of femicide and of the victim of violence.

Comparing genetic and linguistic diversity in African populations with a focus on the Khoisan of southern Africa

Relevance:

20.00%

Publisher:

Abstract:

The interaction between disciplines in the study of human population history is of primary importance, as it draws on both the biological and the cultural characteristics of humankind. Data from genetics, linguistics, archaeology and cultural anthropology can be combined to allow for a broader research perspective. This multidisciplinary approach is applied here to the study of the prehistory of sub-Saharan African populations: on this continent, where Homo sapiens originally began its evolution and diversification, understanding the patterns of human variation is of crucial relevance. For this dissertation, molecular data are interpreted and complemented with a major contribution from linguistics: linguistic data are compared to the genetic data and the research questions are contextualized within a linguistic perspective. In the four articles presented, we analyze Y chromosome SNP and STR profiles and full mtDNA genomes in a representative number of samples to investigate key questions of African human variability. Some of these questions address i) the amount of genetic variation on a continental scale and the effects of the widespread migration of Bantu speakers, ii) the extent of ancient population structure, which has been lost in present-day populations, iii) the colonization of the southern edge of the continent together with the degree of population contact/replacement, and iv) the prehistory of the diverse Khoisan ethnolinguistic groups, who have traditionally been understudied despite representing one of the most ancient divergences of modern human phylogeny. Our results uncover a deep level of genetic structure within the continent and a multilayered pattern of contact between populations. These case studies represent a valuable contribution to the debate on our prehistory and open up further research threads.

Automatic induction of lexical features

Relevance:

20.00%

Publisher:

Abstract:

This thesis concerns artificially intelligent natural language processing systems that are capable of learning the properties of lexical items (properties like verbal valency or inflectional class membership) autonomously while they are fulfilling the tasks for which they were deployed in the first place. Many of these tasks require a deep analysis of language input, which can be characterized as a mapping of utterances in a given input C to a set S of linguistically motivated structures with the help of linguistic information encoded in a grammar G and a lexicon L: G + L + C → S (1). The idea that underlies intelligent lexical acquisition systems is to modify this schematic formula in such a way that the system is able to exploit the information encoded in S to create a new, improved version of the lexicon: G + L + S → L' (2). Moreover, the thesis claims that a system can only be considered intelligent if it not only makes maximum use of the learning opportunities in C but is also able to revise falsely acquired lexical knowledge. One of the central elements of this work is therefore the formulation of a set of criteria for intelligent lexical acquisition systems subsumed under one paradigm: the Learn-Alpha design rule. The thesis describes the design and quality of a prototype for such a system, whose acquisition components have been developed from scratch and built on top of one of the state-of-the-art Head-driven Phrase Structure Grammar (HPSG) processing systems. The quality of this prototype is investigated in a series of experiments in which the system is fed with extracts of a large English corpus. While the idea of using machine-readable language input to automatically acquire lexical knowledge is not new, we are not aware of a system that fulfills Learn-Alpha and is able to deal with large corpora.
To name four major challenges of constructing such a system: a) the high number of possible structural descriptions caused by highly underspecified lexical entries demands a parser with a very effective ambiguity management system, b) the automatic construction of concise lexical entries out of a bulk of observed lexical facts requires a special technique of data alignment, c) the reliability of these entries depends on the system's decision on whether it has seen 'enough' input and d) general properties of language might render some lexical features indeterminable if the system tries to acquire them with too high a precision. The cornerstone of this dissertation is the motivation and development of a general theory of automatic lexical acquisition that is applicable to every language and independent of any particular theory of grammar or lexicon. This work is divided into five chapters. The introductory chapter first contrasts three different and mutually incompatible approaches to (artificial) lexical acquisition: cue-based queries, head-lexicalized probabilistic context-free grammars and learning by unification. Then the postulation of the Learn-Alpha design rule is presented. The second chapter outlines the theory that underlies Learn-Alpha and sets out all the related notions and concepts required for a proper understanding of artificial lexical acquisition. Chapter 3 develops the prototyped acquisition method, called ANALYZE-LEARN-REDUCE, a framework which implements Learn-Alpha. The fourth chapter presents the design and results of a bootstrapping experiment conducted on this prototype: lexeme detection, learning of verbal valency, categorization into nominal count/mass classes, selection of prepositions and sentential complements, among others. The thesis concludes with a review of the conclusions and motivation for further improvements as well as proposals for future research on the automatic induction of lexical features.

Spagnolo Tecnico Semplificato

Relevance:

20.00%

Publisher:

Abstract:

The aim of this dissertation is to create a new controlled language, called Español Técnico Simplificado (ETS). Based on the Simplified Technical English (STE) technical specification, officially known as ASD-STE100, the controlled Spanish ETS takes the form of a metalinguistic document able to provide a technical writer or translator with specific rules for producing a technical document. The implementation strategy involves a preliminary study of several controlled languages similar to STE English, such as Français Rationalisé and Simplified Technical Spanish. Using an approach typical of corpus linguistics, the proposed solution derives the new controlled language by extracting specific information from an ad hoc Spanish-language corpus created and queried for the purpose. The results point to a (controlled) linguistic method useful for producing technical documentation free of any possible ambiguity. The ETS system is in fact founded on the concept of intelligibility as a necessary condition to be satisfied when producing a controlled text. Through its macrostructure, the ETS document provides the tools needed to make the controlled text unambiguous. This two-part structure divides the rules logically: the first part contains syntactic and stylistic rules; the second part contains a dictionary of a limited number of carefully selected lemmas. All of this supports the principle of one-to-one correspondence of signs, in this case those of the Spanish language. The project as a whole opens the door to a new language as an alternative to existing ones, created entirely in academia, which serves as a prototype to be followed by further research projects.

The role of language in tv shows and its impact on the audience: A linguistic analysis of Game of Thrones

Relevance:

20.00%

Publisher:

Abstract:

Through an analysis of the American TV show Game of Thrones, this dissertation focuses on the linguistic issues involved in the adaptation from books to television, the power of language over the audience, and the creation of two languages, with all the linguistic and cultural implications of this phenomenon.

Le collocazioni in traduzione e interpretazione tra italiano e francese: Uno studio su eptic_01_2011

Relevance:

20.00%

Publisher:

Abstract:

This dissertation investigates differences in phraseological patterns between translated and interpreted language, on the basis of the intermodal corpus EPTIC_01_2011 and focusing on Italian and French. First, an overview is offered of the main studies and theories on corpus linguistics and collocations: the notion of corpus is defined and a typology (focusing on intermodal corpora) is presented, before moving on to the linguistic phenomenon of collocation and its investigation through corpus linguistics methods. Second, the general structure of EPTIC_01_2011 is presented, including the ways in which its texts were assembled, edited according to ad hoc conventions and enriched with metadata. The methodology proposed by Durrant and Schmitt (2009), slightly adapted to fit the present study, was used to extract noun+adjective/adjective+noun bigrams and compare them quantitatively. A subset of these data was then extracted and analysed manually. The results of the study are presented through graphs and examples, with an in-depth discussion of the bigrams considered. Lastly, the data collected are analysed and categorised in terms of the shifts occurring in translation and in interpreting, potential causes are discussed, and ideas for further research and for the development of the EPTIC corpus are sketched.
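The bigram extraction step described in this abstract can be pictured with a minimal sketch. It is not the dissertation's procedure: the input format (a list of word/tag pairs), the tag names `NOUN` and `ADJ` (modelled on Universal POS tags) and the use of mutual information as the association score are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed tag names; a real tagger's tagset may differ.
TARGET_PATTERNS = {("NOUN", "ADJ"), ("ADJ", "NOUN")}


def score_bigrams(tagged_tokens):
    """Count adjacent noun/adjective pairs and score each with mutual information.

    tagged_tokens is a list of (word, tag) tuples in running-text order.
    Returns {(word1, word2): {"freq": int, "mi": float}}.
    """
    words = [word.lower() for word, _ in tagged_tokens]
    total = len(words)
    word_freq = Counter(words)

    # Keep only adjacent pairs whose tag sequence matches a target pattern.
    pair_freq = Counter()
    for (w1, t1), (w2, t2) in zip(tagged_tokens, tagged_tokens[1:]):
        if (t1, t2) in TARGET_PATTERNS:
            pair_freq[(w1.lower(), w2.lower())] += 1

    scores = {}
    for (w1, w2), observed in pair_freq.items():
        # Frequency expected if the two words co-occurred by chance.
        expected = word_freq[w1] * word_freq[w2] / total
        scores[(w1, w2)] = {"freq": observed, "mi": math.log2(observed / expected)}
    return scores
```

Running the same function over the translated and the interpreted subcorpus would yield two score tables that can be compared pair by pair; any frequency or score threshold for calling a pair a collocation would be a further choice of the analyst.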

Linguistic gender identity construction in political discourse : a corpus-assisted analysis of the primary speeches of Barack Obama and Hillary Clinton

Relevance:

20.00%

Publisher:

Abstract:

The present study explores the linguistic identity construction of Barack Obama and Hillary Clinton in the context of the 2008 US Democratic Party primaries. Their speeches are examined in order to detect the aspects of identity that each politician drew on in projecting a political identity. The study takes a special interest in the ways in which gender identity is projected by Obama and Clinton. The notions of gender bias and gender representation are also investigated.

Estrazione terminologica automatica: sistemi a confronto

Relevance:

20.00%

Publisher:

Abstract:

In any terminological study, candidate term extraction is a very time-consuming task. Corpus analysis tools have automated some processes, allowing the detection of relevant data within texts and facilitating term candidate selection as well. Nevertheless, these tools are (normally) not specific to terminology research; therefore, the units which are automatically extracted need manual evaluation. Over the last few years, some software products have been developed specifically for automatic term extraction. They are based on corpus analysis, but use linguistic and statistical information to filter data more precisely. As a result, the time needed for manual evaluation is reduced. In this framework, we tried to understand if and how these new tools can really be an advantage. To carry out our project, we simulated a terminology study: we chose a domain (the legal framework for medicinal products for human use) and compiled a corpus from which we extracted terms and phraseologisms using AntConc, a corpus analysis tool. Afterwards, we compared our list with the lists extracted automatically by three different tools (TermoStat Web, TaaS and Sketch Engine) in order to evaluate their performance. In the first chapter we describe some principles relating to terminology and phraseology in language for special purposes and show the advantages offered by corpus linguistics. In the second chapter we illustrate some of the main concepts of the selected domain, as well as some of the main features of legal texts. In the third chapter we describe automatic term extraction and the main criteria for evaluating it; moreover, we introduce the term-extraction tools used for this project. In the fourth chapter we describe our research method and, in the fifth chapter, we show our results and draw some preliminary conclusions on the performance and usefulness of term-extraction tools.
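Comparing a manually compiled term list with a tool's automatic output, as this abstract describes, is commonly summarised with precision, recall and F1. The sketch below is only an illustration under that assumption: the abstract does not state which evaluation criteria the thesis applied, and the function name and the case-insensitive matching are hypothetical choices.

```python
def evaluate_extraction(candidates, gold_terms):
    """Compare a tool's candidate terms against a manually validated list.

    Matching is case-insensitive and ignores surrounding whitespace
    (an assumption; stricter or lemma-based matching is equally possible).
    """
    candidate_set = {c.strip().lower() for c in candidates}
    gold_set = {g.strip().lower() for g in gold_terms}

    true_positives = len(candidate_set & gold_set)
    # Precision: share of the tool's candidates that are valid terms.
    precision = true_positives / len(candidate_set) if candidate_set else 0.0
    # Recall: share of the valid terms that the tool found.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Calling the function once per tool, each time with the same manual list as `gold_terms`, gives figures that can be set side by side. High precision means less manual filtering afterwards, which is the time saving the abstract refers to.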

Basilea 3: La ricezione della terminologia finanziaria nel settore bancario. Analisi della variazione linguistica in corpora italiani e tedeschi

Relevance:

20.00%

Publisher:

Abstract:

The starting point of this terminological research was a training placement at the Directorate-General for Translation (DGT) of the European Commission in Luxembourg. The traineeship project, namely updating and revising IATE entries in the financial domain, and the problems encountered while compiling those entries led to the definition of this thesis. The study sets out to analyse the reception of the terminology specific to the Basel III regulation, examining the phenomenon of linguistic variation in Italian and German corpora. The first chapter briefly describes the traineeship at the DGT, presents the IATE database and the terminology work carried out, and sets out the considerations that led to the development of the thesis project. The second chapter explores the domain under investigation, outlining the financial crisis that led to the drafting of the new Basel III rules, and presents the key points of the Basel III Accords. The third chapter offers an overview of the characteristics of economic and financial language and of the consequences of the new regulation from a linguistic point of view, highlighting the peculiarities of the terminology analysed. The fourth chapter describes the methodology followed and the resources used for the thesis project, namely ad hoc corpora in Italian and German for the analysis of the terms and the related terminology entries. The fifth chapter concentrates on the phenomenon of linguistic variation, providing a theoretical framework of the different approaches to terminology, followed by the analysis of the corpora and a commentary on the results obtained; the theoretical reflections are then considered in light of what emerged from the examination of the corpora.
Finally, the appendix contains the IATE terminology entries compiled during the traineeship and the terminology entries drawn up following the analysis carried out in this thesis.

Acknowledging linguistic diversity in a multicultural society: the issue of indigenous languages in Colombia.

Relevance:

20.00%

Publisher:

Abstract:

Modern society is gradually becoming multicultural. However, only in the last few years has awareness of the importance of this been raised. In the case of Colombia, multiculturalism has existed since the pre-Columbian period, and today there are more than 80 ethnic groups and 65 indigenous languages in the country. The aim of this work is to illustrate the status of indigenous languages in Colombia and to raise awareness of the importance of recognizing, protecting and strengthening the use of these native languages. It will then be pointed out that linguistic diversity should be considered a resource and not a barrier to achieving unity in diversity. Finally, ethno-education will be presented as an adequate educational program that may guarantee equal linguistic representation in the country.

"Linguistic Imperialism" di Robert Phillipson: proposta di traduzione di alcuni estratti

Relevance:

20.00%

Publisher:

Abstract:

The subject of this thesis is a proposed translation of selected passages from the third chapter of the handbook Linguistic Imperialism by Robert Phillipson. The author has written an essay in applied linguistics that discusses the existence of linguistic imperialism and its consequences for the modern linguistic landscape. The third chapter, in particular, describes the theoretical foundations on which the entire theory rests.

Studi verdiani, traduzione e nuove tecnologie: Costruzione e analisi di due corpora bilingui paralleli di libretti d'opera

Relevance:

20.00%

Publisher:

Abstract:

The aim of this work is to illustrate the creation of two annotated and indexed bilingual Italian-English corpora of Giuseppe Verdi's opera librettos and to describe the potential of these resources. The project arose from the wish to investigate whether poetic texts can effectively be managed and consulted through corpora in translation-driven studies, opting in particular for the opera libretto genre in view of its complexity, which derives in part from the fact that the textual content is strongly conditioned by the music. The first corpus, called LiVeGi, consists of five operas by Giuseppe Verdi and their English translations: Ernani, Il Trovatore, La Traviata, Aida and Falstaff; the second corpus, named FaLiVe, contains the Italian original of the opera Falstaff and two English translations produced about a century apart. An analysis of the libretto genre and of the main features of the five selected operas (Chapter 1) is followed by an overview of the translation practice for Verdi's works in the United Kingdom and the United States (Chapter 2) and by a presentation of the notions of Digital Humanities and computational linguistics, within which this study is situated (Chapter 3). The central section (Chapter 4) presents in detail all the practical stages of creating the two corpora, in particular selecting and sourcing the material, OCR, clean-up, annotation and standardisation of metacharacters, part-of-speech tagging, indexing and alignment, and ends with a description of the resulting resources. The work concludes (Chapter 5) by illustrating the potential of the two corpora and the research possibilities they offer, presenting two case studies by way of example: the language of the tragic heroines in Verdi's librettos in translation (a study carried out on the LiVeGi corpus) and the translation of insults in Falstaff (using the FaLiVe corpus).
