Biblioteca Digital

978 results for synchronic linguistics

Epistemological tensions between linguistic description and ordinary speakers' intuitive knowledge: examples from French verb morphology

Abstract:

In this article, I address epistemological questions regarding the status of linguistic rules and the pervasive--though seldom discussed--tension that arises between theory-driven object perception by linguists on the one hand, and ordinary speakers' possible intuitive knowledge on the other hand. Several issues will be discussed using examples from French verb morphology, based on the 6500 verbs from Le Petit Robert dictionary (2013).

Contrast: a concept in language and its development in Medieval Spanish

Abstract:

The purpose of this PhD thesis is to investigate a semantic relation present in the connection of sentences (more specifically: propositional units). This relation, which we refer to as contrast, includes the traditional categories of adversatives (predominantly represented by the connector but in English and pero in Modern Spanish) and concessives, prototypically verbalised through although / aunque. The aim is to describe, analyse and, as far as possible, explain the emergence and evolution of different syntactic schemes marking contrast during the first three centuries of Spanish (also referred to as Castilian) as a literary language, i.e., from the 13th to the 15th century. The starting point of this question is a commonplace in syntax, whereby the semantic and syntactic complexity of clause linkage correlates with the degree of textual elaboration. In historical linguistics, i.e., applied to the phylogeny of a language, it is commonly referred to as the parataxis hypothesis. A crucial part of the thesis is dedicated to the definition of contrast as a semantic relation. Although the label contrast has been used in this sense, mainly in functional grammar and text linguistics, mainstream grammaticography and linguistics remain attached to the traditional categories of adversatives and concessives. In opposition to this traditional view, we present our own model of contrast, based on a pragma-semantic description proposed for the analysis of adversatives by Oswald Ducrot and subsequently adopted by Ekkehard König for the analysis of concessives.
We refine and further develop this model so that it accommodates all instances of contrast in Spanish, not just the prototypical ones. We argue that the relationship between adversatives and concessives is a marked opposition, i.e., that the higher degree of semantic and syntactic integration of concessives restricts some possible readings that the adversatives may have, but that this difference is almost systematically neutralised by contextual factors, thus justifying the assumption of contrast as a comprehensive onomasiological category. This theoretical focus is complemented by a state-of-the-question overview that attempts to account for all relevant forms in which contrast is expressed in Medieval Spanish, with the aid of lexicographic and grammaticographical sources, and by an empirical corpus study of the textual functions of contrast in nine Medieval Spanish texts: Cantar de Mio Cid, Libro de Alexandre, Milagros de Nuestra Señora, Estoria de España, Primera Partida, Lapidario, Libro de buen amor, Conde Lucanor, and Corbacho. This corpus is analysed using quantitative and qualitative tools, and the study is accompanied by a series of methodological remarks on how to investigate a pragma-semantic category in historical linguistics. The corpus study shows that the parataxis hypothesis cannot be confirmed from a statistical viewpoint, although a qualitative analysis shows that the use of subordination does increase over time in some particular contexts.

L'ADN de l'édification linguistique

Ironías de la ironía: argumento dialéctico, figura retórica o categoría estética

Abstract:

This article briefly reviews the concept of irony from the point of view of rhetoric and its ramifications in some twentieth-century linguistic studies. The review begins with the traditional classification into Socratic irony, rhetorical irony and romantic irony, and then focuses on some fundamental elements of the ironic phenomenon, such as opposition, verisimilitude, complicity with the interpreter, and the role played by context.

ClInt: A bilingual Spanish-Catalan spoken corpus of clinical interviews

Abstract:

In this paper we present ClInt (Clinical Interview), a bilingual Spanish-Catalan spoken corpus that contains 15 hours of clinical interviews. It consists of audio files aligned with multiple-level transcriptions comprising orthographic, phonetic and morphological information, as well as linguistic and extralinguistic encoding. This is a previously non-existent resource for these languages and it offers a wide-ranging exploitation potential in a broad variety of disciplines such as Linguistics, Natural Language Processing and related fields.

CoCo, a web interface for corpora compilation

Abstract:

CoCo is a collaborative web interface for the compilation of linguistic resources. In this demo we present one of its possible applications: paraphrase acquisition.

EsPal: One-stop shopping for Spanish word properties

Abstract:

This article introduces EsPal: a Web-accessible repository containing a comprehensive set of properties of Spanish words. EsPal is based on an extensible set of data sources, beginning with a 300 million token written database and a 460 million token subtitle database. Properties available include word frequency, orthographic structure and neighborhoods, phonological structure and neighborhoods, and subjective ratings such as imageability. Subword structure properties are also available in terms of bigrams and trigrams, bi-phones, and bi-syllables. Lemma and part-of-speech information and their corresponding frequencies are also indexed. The website enables users to either upload a set of words to receive their properties, or to receive a set of words matching constraints on the properties. The properties themselves are easily extensible and will be added over time as they become available. It is freely available from the following website: http://www.bcbl.eu/databases/espal
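As a rough illustration of two kinds of property named in this abstract, the sketch below counts character bigrams and trigrams over a word list and finds orthographic neighbours of a word. It is not EsPal's implementation: the function names are hypothetical, counts are type-based (each word counted once, not weighted by corpus frequency), and a neighbour is assumed to be a same-length word that differs in exactly one letter, which may not match the definition the repository uses.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return the overlapping character n-grams of a word, left to right."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_counts(words, n):
    """Count character n-grams over a list of words (one count per word type)."""
    counts = Counter()
    for word in words:
        counts.update(char_ngrams(word, n))
    return counts


def substitution_neighbors(word, lexicon):
    """Same-length lexicon entries that differ from `word` in exactly one letter.

    Assumption: this substitution-only definition is used for illustration;
    the repository described above may define neighbourhoods differently.
    """
    return sorted(
        candidate
        for candidate in lexicon
        if len(candidate) == len(word)
        and sum(a != b for a, b in zip(candidate, word)) == 1
    )


if __name__ == "__main__":
    lexicon = {"casa", "cosa", "cama", "caso", "casas", "mesa"}
    print(char_ngrams("casa", 2))
    print(ngram_counts(sorted(lexicon), 3).most_common(3))
    print(substitution_neighbors("casa", lexicon))
```

Weighting each n-gram by the word's corpus frequency instead of counting types would be a small change to `ngram_counts`; which variant is appropriate depends on the study.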

Identity, non-identity, and near-identity: Addressing the complexity of coreference

Abstract:

This article examines the mainstream categorical definition of coreference as "identity of reference." It argues that coreference is best handled when identity is treated as a continuum, ranging from full identity to non-identity, with room for near-identity relations to explain currently problematic cases. This middle ground is needed to account for those linguistic expressions in real text that stand in relations that are neither full coreference nor non-coreference, a situation that has led to contradictory treatment of cases in previous coreference annotation efforts. We discuss key issues for coreference such as conceptual categorization, individuation, criteria of identity, and the discourse model construct. We redefine coreference as a scalar relation between two (or more) linguistic expressions that refer to discourse entities considered to be at the same granularity level relevant to the linguistic and pragmatic context. We view coreference relations in terms of mental space theory and discuss a large number of real life examples that show near-identity at different degrees.

On the robust measurement of inflectional diversity

Abstract:

Lexical diversity measures are notoriously sensitive to variations of sample size and recent approaches to this issue typically involve the computation of the average variety of lexical units in random subsamples of fixed size. This methodology has been further extended to measures of inflectional diversity such as the average number of wordforms per lexeme, also known as the mean size of paradigm (MSP) index. In this contribution we argue that, while random sampling can indeed be used to increase the robustness of inflectional diversity measures, using a fixed subsample size is only justified under the hypothesis that the corpora that we compare have the same degree of lexematic diversity. In the more general case where they may have differing degrees of lexematic diversity, a more sophisticated strategy can and should be adopted. A novel approach to the measurement of inflectional diversity is proposed, aiming to cope not only with variations of sample size, but also with variations of lexematic diversity. The robustness of this new method is empirically assessed and the results show that while there is still room for improvement, the proposed methodology considerably attenuates the impact of lexematic diversity discrepancies on the measurement of inflectional diversity.
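The fixed-size subsampling baseline that this abstract starts from can be sketched as follows. This is only the baseline (average MSP over random subsamples of a fixed number of tokens), not the novel method the authors propose, whose details the abstract does not give. The function names are hypothetical, tokens are assumed to be (wordform, lemma) pairs, and subsamples are assumed to be drawn without replacement.

```python
import random
from statistics import mean


def msp(tokens):
    """Mean size of paradigm: distinct wordforms per distinct lexeme.

    `tokens` is an iterable of (wordform, lemma) pairs. Assumption: a wordform
    is counted once per lemma, so homographs of different lemmas count separately.
    """
    forms = set(tokens)
    lemmas = {lemma for _, lemma in forms}
    return len(forms) / len(lemmas)


def resampled_msp(tokens, sample_size, n_samples=100, seed=0):
    """Average MSP over random subsamples of a fixed size (without replacement)."""
    tokens = list(tokens)
    if not 0 < sample_size <= len(tokens):
        raise ValueError("sample_size must be between 1 and the corpus size")
    rng = random.Random(seed)
    return mean(msp(rng.sample(tokens, sample_size)) for _ in range(n_samples))


if __name__ == "__main__":
    corpus = [
        ("walk", "walk"), ("walks", "walk"), ("walked", "walk"),
        ("cat", "cat"), ("cat", "cat"), ("walk", "walk"),
    ]
    print(msp(corpus))
    print(resampled_msp(corpus, sample_size=4, n_samples=200))
```

Comparing two corpora with the same `sample_size` is exactly the step the abstract argues is only justified when both have the same degree of lexematic diversity.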

Metàfores en venda: La metàfora conceptual en la publicitat televisiva en català

Abstract:

This work aims to broaden research in cognitive linguistics on the Catalan language, in this case in the experiential domain of television advertising, and to complement existing studies on advertising language and on communication in audiovisual media.

A hybrid approach to treebank construction

Abstract:

This article describes research on the effects of morphosyntactic disambiguation used as a preprocessing step for a deep syntactic parser based on HPSG, in the context of developing an open-source treebank of Spanish within the DELPH-IN framework. Treebank annotation is performed manually, by choosing among the options proposed by the system and ranked by a statistical module. The experiments presented show that using a tagger reduces the ambiguity of sentences, helps limit the number of sentences whose analysis exceeds the time limit, and helps the statistical module rank the correct tree among the n best. On the one hand, our results confirm the benefits already reported in the literature of such preprocessing for deep parsing with respect to speed, coverage and accuracy. On the other hand, we propose a strategy based on existing open-source tools and resources for developing highly consistent deep-syntax treebanks for languages with limited availability of linguistic resources.

El Proyecto CLARIN: Una infraestructura de investigación científica para las Humanidades y las Ciencias Sociales

Abstract:

In this article we present CLARIN (Common Language Resources and Technologies), a large-scale European collaborative project that aims to promote the use of technological tools in research in the humanities and social sciences. CLARIN is one of the thirty-five projects selected by the ESFRI Committee (European Strategy Forum on Research Infrastructures) for its list of infrastructures that, because of their importance for research, should be built within the next ten years. CLARIN seeks to bring to the humanities and social sciences the benefits of shared, collaborative access to digital resources, as well as the use of intensive computing with specific analysis and exploitation tools for intelligent access to large databases. To this end, CLARIN will create the infrastructure needed to provide generic access to large data banks and to the tools for analysing and exploiting these data. It will implement, on a grid network structure and by means of web-service and semantic-web technology, a single interface for accessing the data and the analysis tools, as well as processing tools and other necessary services. Because this interface is designed to serve the common goals of research in the humanities and social sciences, it will make these resources easier to use for researchers from different fields, without requiring knowledge of the technologies involved.

The Tibidabo Treebank

Abstract:

In this article we present the development of a new open-source resource for Spanish: the Tibidabo treebank. Annotation is being carried out semi-automatically: first, the corpus is parsed automatically with a symbolic HPSG-based grammar of Spanish implemented in the Linguistic Knowledge Builder system; second, the parsing results are disambiguated manually. The Tibidabo treebank will enable future research on the development and evaluation of a hybrid architecture that combines symbolic and statistical methods for NLP, as well as research on the hybridization of low-level and high-level NLP techniques.

Corpus digitalizados y palabras gramaticales

Abstract:

The aim of this paper is to reflect on the use of computerized corpora. The case we present is linked to an R&D project on the grammaticalization of verbal periphrases (GRAPEVERBA). To carry out this study, we extracted occurrences from the two Academy corpora, CORDE and CREA. The lack of lemmatization and tagging in both corpora posed a problem that is difficult to solve, since the number of examples obtained is excessively high. Another problem concerns the textual editions of the works included in the Academy's corpora, especially CORDE. Quite often these editions are not contemporary with the original manuscripts, which seriously compromises the conclusions drawn about the grammaticalization of some verbal periphrases, for example tener + (a/de) + infinitive.

The Bakhtinian Dialogue revisited: A (non-biosemiotic) view from historiography and epistemology of humanities
