907 results for Text Linguistics


Relevance:

20.00% 20.00%

Publisher:

Abstract:

For a long time, the work of Bartolomeo della Pugliola, a Franciscan friar who lived in Bologna and Florence during the 13th and 14th centuries, was thought to have been lost. Recent paleographic research, however, has established that most of della Pugliola’s work, although mixed in with that of other authors, is contained in two manuscripts (1994 and 3843) currently kept at the University Library in Bologna. Pugliola’s chronicle is central to Bolognese medieval literature, not only because it was the privileged source for the important Ramponis’ chronicle, but also because its own sources include several significant works, such as Jacopo Bianchetti’s lost writings and Pietro and Floriano Villola’s chronicle (1163-1372). Ongoing historical studies and recent discoveries enabled me to reconstruct the historical chronology of Pugliola’s work as well as the Bolognese language between the 13th and 14th centuries. The original purpose of my research was to add a linguistic commentary to the edition of the text in order to fill the gaps in medieval Bolognese language studies. In addition to being a reliable source, Pugliola’s chronicle was widely disseminated and became a sort of vulgate. The chronicle tradition, through collation, allows the language to be studied from a diachronic point of view. I therefore described all the linguistic phenomena related to phonetics, morphology and syntax in Pugliola’s text and compared these results with variants in Villola’s and Ramponis’ chronicles. I did likewise with another chronicle, by the 16th-century merchant Friano Ubaldini, which I edited. This supplement helped to complete the outline of the Bolognese language from the 13th to the 16th century.
To analyze the data I collected, I took a sociolinguistic approach, because each author represents a different variety of the language: Pugliola’s language is closer to a scripta and to Florentine, whereas Ubaldini’s is closer to the dialect spoken in Bologna. Differences in handwriting in particular reveal the models the authors tried to reproduce or imitate. The glossary added at the end of this study illustrates these nuances with a number of examples.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

[EN] The use of large corpora in the study of languages is a well-established tradition. Scholarship is likewise well represented in the use of corpora for writing grammars, as in the COBUILD grammar and dictionary and the Longman Grammar of Spoken and Written English. Corpora have thus been analyzed to identify patterns in languages, which learners can later practise by following the patterns as described and exemplified with real instances.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This study concerns the development of legal and administrative terminology in Ladin, specifically in the Ladin variety spoken in Val Badia. The study is needed because in South Tyrol the Ladin language is not merely safeguarded: the drafting of administrative and normative texts in Ladin is guaranteed by law. There is therefore a need for a unified terminology to support translators and editors of specialised texts. The starting points of this study are, on the one hand, this need for a unified terminology and, on the other, the translation work done so far by public administration employees working in Ladin. To document their efforts, a corpus of digitised administrative and normative documents was built. The first two chapters focus on the state of the art of terminology and corpus linguistics projects for lesser-used languages. The information was collected with the help of institutes, universities and researchers dealing with lesser-used languages. The third chapter focuses on the development of administrative language in Ladin, and the fourth on the creation of the trilingual Italian-German-Ladin corpus of administrative and normative documents. The last chapter deals with the methodologies applied to develop the Ladin terminology entries through the use of the trilingual corpus. Starting from the terminology entry, all steps are described: term extraction; the extraction of equivalents, contexts and definitions; and the drafting of translation proposals where no equivalent was found. Finally, the problems involved in developing terminology in Ladin are illustrated.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

[EN] This paper is a proposal for teaching pragmatics following a corpus-based approach. Corpora have had a high impact on how linguistics is looked at these days. However, the teaching of linguistics is still traditional in scope and stays away from the growing tendency to incorporate authentic samples in the theoretical classroom, so lecturers keep presenting the same canonical examples that students may find in their textbooks or in other introductory monographs. Our view is that using corpus linguistics, especially corpora freely available on the World Wide Web, will result in a more engaging and fresh look at the Pragmatics course, while promoting early research among students. This way, they learn the concepts but, most importantly, how to later identify pragmatic phenomena in real text. Here, we focus on the methodology, presenting clear examples of corpus-based pragmatic activities; one clear result is that students also learn how to be autonomous in their analysis of data. In our proposal, we move from more controlled tasks to autonomy. This proposal focuses on students enrolled in the course Pragmática de la Lengua inglesa, currently part of the curriculum in Lenguas Modernas, Universidad de Las Palmas de Gran Canaria.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The construction and use of multimedia corpora has been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of audiovisual translation (AVT), similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. In order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Since one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, and thus no longer rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes.
The second chapter focuses diachronically on the contribution of various theories (i.e. Descriptive Translation Studies, Polysystem Theory) to the application of Corpus Linguistics methods and tools to the field of Translation Studies. In particular, we discussed how the use of corpora can help reduce the gap between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics in the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate translation. The third chapter reviews the main development steps of Forlixt 1 from a technical (IT design principles, data query functions) and methodological point of view, laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of a pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database from a pedagogical perspective. The fourth chapter lists the specific compilation criteria retained, as well as statistics for the two sub-corpora, presenting data broken down by language pair (French-Italian and German-Italian) and genre (film comedies, television soap operas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus.
The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature and notably concerning the nature of the filmic text, the dubbing process and Italian dubbed language’s features. This was possible by looking into some of the most problematic aspects, like the rendering of socio-linguistic variation. The corpus equally allowed us to consider so far neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macrotranslation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter intended to further examine data, by applying ad hoc models of analysis. Given the virtually infinite number of combinations of categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence on the adoption of related translation strategies of possible semiotic traits, such as gestures and facial expressions. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. 
After reviewing the main translation techniques at work, we also provided an exhaustive account of possible causes for their non-translation. By way of conclusion, the discussion of the results obtained from the distribution of annotation categories in the French-Italian corpus, together with the application of specific models of analysis, allowed us to underline possible advantages and drawbacks of adopting a corpus-based approach to AVT studies. Even though possible updates and improvements were proposed to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument that makes it possible to carry out empirically sound contrastive studies, which may be usefully replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This research focuses on defining the complex relationship between theory and project in the architectural work of Oswald Mathias Ungers. This relationship rests on several essays and publications which, though never collected in an organic text, make up an articulated corpus that can be considered the foundation of a theory. More specifically, this thesis deals with the role of metaphor in Ungers’ theory and its subsequent practical application to his projects. The path leading from theoretical analysis to architectural project is, in Ungers’ view, a slow and mediated one, where theory is an instrument without which it would not be possible to create the project's foundations. The metaphor is a figure of speech taken from disciplines such as philosophy, aesthetics and linguistics. Using a metaphor implies a transfer of meaning, as it is essentially based on the replacement of a real object with a figurative one. The research is articulated in three parts, each corresponding to a text by Ungers that is considered crucial to understanding the development of his architectural thinking. Together the texts mark three decades of Ungers’ work: the sixties, seventies and eighties. The first part of the research deals with the topic of Großform, set out by Ungers in his 1966 publication Grossformen im Wohnungsbau, where he defines four criteria by which architecture identifies with a Großform. One of the hypotheses underlying this study is that there is a relationship between the notion of Großform and the figure of metaphor. The second part of the thesis analyzes the period between the end of the sixties and the seventies, i.e. the time during which Ungers lived in the USA and taught at Cornell University in Ithaca.
The analysis focuses on the text Entwerfen und Denken in Vorstellungen, Metaphern und Analogien, written by Ungers in 1976 for the exhibition MAN transFORMS organized at the Cooper-Hewitt Museum in New York. This text, through which Ungers creates a sort of vocabulary to explain the notions of metaphor, analogy, signs, symbols and allegories, can be defined as the manifesto of his architectural theory. That theory is strictly intertwined with the metaphor as a design instrument, and is best expressed when he introduces the 11 theses with R. Koolhaas, P. Riemann, H. Kollhoff and A. Ovaska in Die Stadt in der Stadt. Berlin das grüne Stadtarchipel in 1977. The third part analyzes the indissoluble tie between the use of metaphor and the choice of the topic on which the project is based and, starting from Ungers’ 1982 publication Architecture as theme, explains the relationship between idea/theme and image/metaphor. Playing with shapes requires metaphoric thinking, i.e. taking references to create new ideas from the world of shapes and not just from architecture. The metaphor as a tool to interpret reality becomes for Ungers a method of inquiry that precedes a project and makes it possible to define the theme on which the project will be based. In Ungers’ case, the architecture of ideas matches the idea of architecture; for Ungers the notions of idea and theme, image and metaphor cannot be separated from each other. The text on the thematization of architecture is not a report of his projects; rather, it represents the need to put them in order and highlight the theme on which they are based.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Like many other languages of East and Southeast Asia, Thai is a number-neutral language, in which a noun merely names the concept and gives no indication of the number of objects. To count nouns in Thai, a classifier (CLF) is needed, which picks out and individuates the objects on the basis of their key semantic property. Alongside classification, individuation is the main function of the classifier. Further core functions of the classifier outside counting contexts are the marking of definiteness, number and contrast. The most important new findings of this work, which integrates the levels of grammar and semantics as well as those of logic and pragmatics, are the following. In Thai, the classifier can operate at both the element level and the set level. In combination with a demonstrative, the classifier can also evoke a plural interpretation when it refers to a total set presupposed to be plural or individuates the total set in a part-whole relation. In an expression that already contains an explicit numeral, the classifier-demonstrative construction brings about a contrast between sets with the same properties. Like the individual concept, the classifier has intension and extension. The intension and extension of Thai classifiers are inversely proportional, i.e. the more specific the content of a classifier, the smaller its scope. The classifier signals the key feature that, together with the intension of the noun, serves to identify the object. The classifier individuates the noun by quantifying subsets. It can refer to one object, to a specific number of objects, or to all objects. In formal logic, these functions can be represented by means of the existential and universal quantifiers. The zero slot (NST, from German Nullstelle) can also be represented in formal logic.
Reduced to their respective information content, CLF and NST yield different information values depending on their position. In the questionnaires, the opposition between CLF and NST gives rise exclusively to scalar Q-implicatures, which can be represented through the information formulas in the form of a Horn scale. In an incrementally built context, both CLF and NST convey known information in the middle of the context, which triggers implicatures of the M- or I-principle. By linking the information values with the implicatures of the Q-, M- and I-principles, one can tell directly from the position when the classifier fulfils the function of number, definiteness or contrast marking.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis analyzes Text Mining techniques and their application within processes for the self-organization of knowledge. The first part concentrates on the concept of Text Mining: it gives a definition, describes the possible fields of use and the development process involved, and sets out the different Text Mining techniques. It then analyzes some Text Mining tools and finally presents some practical usage examples. The next major topic is TuCSoN, an infrastructure for coordinating autonomous, distributed and intelligent processes, such as agents. The entities on which the model is based are described first, followed by the ways in which they interact and then the programming tools the infrastructure provides. The thesis then presents MoK, a biochemistry-based coordination model designed for the self-organization of knowledge. For MoK, as for TuCSoN, the entities underlying the model are introduced. Since MoK relies on the TuCSoN infrastructure, it is shown how the entities of the former are mapped onto those of the latter. The topic concludes with an application for the self-organization of news that uses the model. The following chapter analyzes the possible uses of Text Mining techniques within self-organization infrastructures such as MoK. The thesis then presents the experiments carried out using Text Mining techniques. All the experiments aim to cluster scientific articles by content, and the results obtained are analyzed. The thesis closes with some final considerations on the work done.
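
As a rough illustration of what clustering articles by content can look like, here is a minimal sketch. The abstract does not say which algorithm or tools the thesis used, so everything below is an assumption made for illustration: TF-IDF weighting, cosine similarity, a greedy single-pass grouping, and the names `tfidf_vectors`, `cosine` and `cluster`. The example titles are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn each document into a sparse {term: weight} TF-IDF vector."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)  # smoothed IDF
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(docs, threshold=0.2):
    """Greedy single-pass clustering: join the first cluster whose seed
    document is similar enough, otherwise start a new cluster."""
    vectors = tfidf_vectors(docs)
    clusters = []  # lists of document indices; index 0 of each list is the seed
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical article titles, used only to exercise the sketch.
titles = [
    "tuple space coordination agents",
    "coordination agents tuple centres",
    "sentiment analysis of tweets",
]
print(cluster(titles))  # [[0, 1], [2]]
```

The greedy pass depends on document order and on the threshold, which is why real experiments usually compare several algorithms and parameter settings.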

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In many scientific fields the analysis of complex networks has led to numerous recent discoveries. In this thesis we tried this approach on human language, in particular written language, where words do not interact randomly. We first presented measures capable of extracting important topological structures from linguistic networks (degree, strength, entropy, ...) and examined the software used to represent and visualize the graphs (Gephi). We then analyzed the different statistical properties of the same text in several forms (shuffled, without stopwords, and without low-frequency words): our database contains five books by five authors who lived in the 19th century. Finally, we showed how certain measures are important for distinguishing a real text from its modified versions, and why the degree distribution of a normal text and that of a shuffled one follow the same trend. These results may be useful in the increasingly active analysis of linguistic phenomena such as authorship attribution and the recognition of shuffled texts.
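
The degree and strength measures mentioned above can be sketched on a toy word-adjacency network. This is only an illustration under stated assumptions: the abstract does not specify how the networks were built, so linking words that occur next to each other, treating edges as undirected, and the function names `build_network` and `degree_and_strength` are all choices made here, not the thesis's method.

```python
import random
from collections import defaultdict


def build_network(tokens):
    """Undirected word-adjacency network.

    Keys are frozensets of the two linked words (a single-word frozenset
    is a self-loop); values count how often the pair is adjacent.
    """
    weights = defaultdict(int)
    for a, b in zip(tokens, tokens[1:]):
        weights[frozenset((a, b))] += 1
    return weights


def degree_and_strength(weights):
    """Degree = number of distinct neighbours; strength = sum of edge weights."""
    degree = defaultdict(int)
    strength = defaultdict(int)
    for edge, weight in weights.items():
        nodes = tuple(edge)
        if len(nodes) == 1:  # self-loop: both ends of the edge are this word
            degree[nodes[0]] += 1
            strength[nodes[0]] += 2 * weight
        else:
            for node in nodes:
                degree[node] += 1
                strength[node] += weight
    return dict(degree), dict(strength)


tokens = "the cat sat on the mat and the dog sat on the rug".split()
degree, strength = degree_and_strength(build_network(tokens))

# Shuffle a copy with a fixed seed so the comparison is repeatable.
shuffled = tokens[:]
random.Random(0).shuffle(shuffled)
shuffled_degree, shuffled_strength = degree_and_strength(build_network(shuffled))

print(degree["the"], strength["the"])  # 6 7
print(sum(strength.values()), sum(shuffled_strength.values()))  # 24 24
```

In this construction a word's strength is fixed by its frequency (apart from the first and last positions), and shuffling leaves word frequencies unchanged. That is one intuition, offered here as an assumption rather than as the thesis's argument, for why frequency-driven measures behave similarly on real and shuffled text.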

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The prediction problem, that is, the search for predictive patterns in data, has been studied extensively. Many robust and efficient methodologies have been developed, all of them based on the analysis of structured numerical information. Textual information, on the other hand, is highly unstructured. An immediate conclusion would therefore be that predictive analysis on textual data requires methods completely different from the well-known data mining techniques. In fact, a prediction problem can be solved using the same methods: textual data and documents can be transformed into numerical values, for example by considering the absence or presence of terms, which makes it possible to use the existing techniques efficiently. Text mining brings together concepts from extremely heterogeneous fields of application. With the immense quantity of textual data available, on the World Wide Web for instance, and continually growing because of the pervasive use of smartphones and computers, the fields of application of textual analysis become countless. The advent and spread of social networks and microblogging enable people to share opinions and moods, creating a textual corpus of incalculable size that is updated daily. The new techniques of Sentiment Analysis, or Opinion Mining, analyze the emotional state or the type of opinion expressed in a textual document. Through them one can, for example, extract indicators of the mood of an individual, or of a group of individuals, creating a representation of the social emotional state. Can the trend of the social emotional state influence the course of global events at a macroscopic level?
Studies in Behavioral Economics and Finance affirm a link between emotional state, decision-making ability and economic indicators. Thanks to the available techniques and to the mass of continuously updated textual data on the mood of millions of individuals, it becomes possible to analyze such correlations. This study builds a system for predicting changes in stock market indices, based on textual data extracted from the microblogging platform Twitter in the form of public tweets. The system includes techniques for improving the prediction that are based on the study of text similarity and categorize the texts' actual contribution to the prediction.
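
The idea of turning documents into numerical values through the presence or absence of terms can be sketched as follows. This is a minimal illustration, not the system described in the abstract: the whitespace tokenization, the Jaccard measure used as a stand-in for text similarity, the function names and the example sentences are all assumptions.

```python
def build_vocabulary(documents):
    """Sorted list of every distinct lower-cased token in the collection."""
    return sorted({token for doc in documents for token in doc.lower().split()})


def presence_vector(document, vocabulary):
    """Binary vector: 1 if the term occurs in the document, 0 otherwise."""
    tokens = set(document.lower().split())
    return [1 if term in tokens else 0 for term in vocabulary]


def jaccard(u, v):
    """Jaccard similarity of two binary vectors: shared terms / all terms used."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    union = sum(1 for a, b in zip(u, v) if a or b)
    return shared / union if union else 0.0


# Invented example messages, used only to exercise the sketch.
messages = ["markets rise today", "markets fall"]
vocabulary = build_vocabulary(messages)  # ['fall', 'markets', 'rise', 'today']
vectors = [presence_vector(m, vocabulary) for m in messages]

print(vectors)                          # [[0, 1, 1, 1], [1, 1, 0, 0]]
print(jaccard(vectors[0], vectors[1]))  # 0.25
```

Once texts are in this form, any standard classifier or regression method that accepts numerical features can be applied to them, which is the point the abstract makes about reusing existing data mining techniques.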