154 results for Interrogative pronouns


Relevance:

10.00% 10.00%

Publisher:

Abstract:

The purpose of this thesis is the automatic construction of ontologies from texts, within the area known as Ontology Learning. This discipline aims to automate the construction of domain models from structured or unstructured information sources. It originated at the beginning of the millennium, as a result of the exponential growth in the volume of information accessible on the Internet. Since most information on the web is presented as text, automatic ontology learning has focused on the analysis of this type of source, drawing over the years on very diverse techniques from areas such as Information Retrieval, Information Extraction, Summarization and, in general, areas related to natural language processing.

The main contribution of this thesis is that, unlike most current techniques, the proposed method does not analyze the surface syntactic structure of language but studies its deep semantic level. Its objective, therefore, is to infer the domain model from the way in which the meanings of natural language sentences are articulated. Because the deep semantic level is language-independent, the method can operate in multilingual scenarios, where information from texts in different languages must be combined. To access this level of language, the method uses the interlingua model. These formalisms, which come from the field of machine translation, represent the meaning of sentences independently of the language. Specifically, UNL (Universal Networking Language) is used, which is considered the only standardized general-purpose interlingua.

The approach continues previous work by the author and by the research group to which he belongs, which studied how to use the interlingua model in multilingual information extraction and retrieval. Basically, the procedure defined in the method tries to identify, in the UNL representation of the texts, certain regularities from which the components of the domain ontology can be deduced. Because UNL is a formalism based on semantic networks, these regularities take the form of graphs, which are generalized into structures called linguistic patterns. UNL also retains certain discourse cohesion mechanisms from natural languages, such as anaphora. To improve the understanding of expressions, the method provides, as another relevant contribution, an algorithm for resolving pronominal anaphora within the interlingua model, limited to third-person personal pronouns whose antecedent is a proper noun.

The proposed method rests on a formal framework, developed by adapting certain definitions from graph theory and adding new ones, in order to accommodate the notions of UNL expression and linguistic pattern and the pattern matching operations on which the method's processes are based. Both the formal framework and all the processes defined by the method have been implemented for experimentation and applied to an article from the UNESCO EOLSS ("Encyclopedia of Life Support Systems") collection.
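The idea of matching a linguistic pattern against a graph-based representation can be illustrated with a minimal sketch. It is not the thesis's formal framework: the triple encoding, the `?variable` convention, the function names and the sample edges are all assumptions made for illustration, and the matcher does not require distinct variables to bind to distinct nodes.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical encoding: each edge of a semantic network is a triple
# (relation, source, target). The labels "agt" and "obj" are illustrative.
Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Pattern terms starting with '?' are variables (an assumed convention)."""
    return term.startswith("?")


def unify(pattern_term: str, graph_term: str, binding: Binding) -> Optional[Binding]:
    """Extend the binding so pattern_term matches graph_term, or return None."""
    if is_var(pattern_term):
        bound = binding.get(pattern_term)
        if bound is None:
            return {**binding, pattern_term: graph_term}
        return binding if bound == graph_term else None
    return binding if pattern_term == graph_term else None


def match(pattern: List[Triple], graph: List[Triple],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield every variable binding under which all pattern edges occur in the graph."""
    binding = {} if binding is None else binding
    if not pattern:
        yield binding
        return
    first, rest = pattern[0], pattern[1:]
    for edge in graph:
        candidate: Optional[Binding] = binding
        for p_term, g_term in zip(first, edge):
            candidate = unify(p_term, g_term, candidate)
            if candidate is None:
                break
        if candidate is not None:
            # Backtracking search: try to match the remaining pattern edges.
            yield from match(rest, graph, candidate)


graph = [
    ("agt", "produce", "plant"),
    ("obj", "produce", "oxygen"),
    ("agt", "absorb", "root"),
    ("obj", "absorb", "water"),
]
# Pattern: some action that has both an agent and an object.
pattern = [("agt", "?action", "?agent"), ("obj", "?action", "?object")]
results = list(match(pattern, graph))
for b in results:
    print(b)
```

Each binding returned is one place where the pattern fits the graph; a method of the kind described would then map such bindings to candidate ontology components.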

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This study is motivated by curiosity about how questions of pronoun placement are resolved in translations from Italian into Portuguese. In Italian, especially in the spoken language, we regularly and frequently find pronouns whose Portuguese equivalents exist in normative grammars of Portuguese but which, in practice, are not used by Brazilian speakers and writers. We also find in Italian a significant number of pronominal verbs (such as esserci, volerci, averne, etc.) and a considerable number of multiple pronominal verbs (such as andarsene, farcela, fregarsene, etc.) which, together with these pronouns, are difficult for Brazilian teachers of Italian as a foreign language (LE) to work with in the classroom. Such elements can also complicate the work of translators, who must make particular choices when rendering them in Portuguese. How are the Italian combined pronouns translated in Brazilian versions? Do the Portuguese, who do have such pronouns, for example, use them in every case in which we find them in the source texts? And are the pronominal particles simply eliminated in the target text, or are they replaced? Such aspects, once observed and organized, can lead to a better understanding of the two languages in contact and offer support to students, teachers and translators. With this difficulty in mind, this research sought out and listed some authors and works available for consultation and analyzed a corpus of one hundred and sixty-three occurrences of pronouns in Italian, plus seven additions of pronouns in Brazilian Portuguese (PB) and/or European Portuguese (PE), drawn from Luigi Pirandello's novel Uno, nessuno e centomila and its respective translations into PB and PE.
Our aim is to find answers that help reduce the strangeness felt by an Italian who hears sentences without pronouns from a Brazilian (even though the Italian understands them) and/or the sense of inadequacy, and even discomfort, felt by a Brazilian when producing sentences with all the pronouns. The analyzed corpus provides a sample of the choices and respective translations proposed by the translators for reflexive pronouns, subject personal pronouns, object personal pronouns, combined pronouns and the pronominal particles ne, ci and vi, with retentions, omissions, substitutions by other pronouns (possessive, subject, object, demonstrative) and even a kind of numerical compensation through the inclusion of a word absent from the source text.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper presents an algorithm for identifying noun-phrase antecedents of pronouns and adjectival anaphors in Spanish dialogues. We believe that anaphora resolution requires numerous sources of information in order to find the correct antecedent of the anaphor. These sources can be of different kinds, e.g., linguistic information, discourse/dialogue structure information, or topic information. For this reason, our algorithm combines several kinds of information (hybrid information). The algorithm is based on linguistic constraints and preferences and uses an anaphoric accessibility space within which it searches for the antecedent noun phrase. We present some experiments related to this algorithm and this space using a corpus of 204 dialogues. The algorithm is implemented in Prolog. According to this study, 95.9% of antecedents were located in the proposed space, a precision of 81.3% was obtained for pronominal anaphora resolution, and 81.5% for adjectival anaphora.
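The general constraints-and-preferences scheme can be sketched as follows. This is an illustration in Python rather than the paper's Prolog implementation, and every detail is an assumption: the `Mention` fields, the fixed utterance window standing in for the accessibility space, the agreement constraints and the scoring weights are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Mention:
    """A noun phrase or pronoun with the features this sketch relies on (assumed)."""
    text: str
    gender: str        # "m" or "f"
    number: str        # "sg" or "pl"
    utterance: int     # index of the utterance containing the mention
    is_subject: bool = False


def resolve(pronoun: Mention, candidates: List[Mention],
            window: int = 3) -> Optional[Mention]:
    """Pick an antecedent: restrict the search space, filter, then rank."""
    # 1. Accessibility space (assumed): only the last `window` utterances.
    accessible = [c for c in candidates
                  if 0 <= pronoun.utterance - c.utterance <= window]
    # 2. Constraints: hard filters that discard incompatible candidates.
    compatible = [c for c in accessible
                  if c.gender == pronoun.gender and c.number == pronoun.number]
    if not compatible:
        return None

    # 3. Preferences: soft criteria combined into a score (weights are arbitrary).
    def score(c: Mention) -> int:
        recency = window - (pronoun.utterance - c.utterance)
        return recency + (1 if c.is_subject else 0)

    return max(compatible, key=score)


candidates = [
    Mention("la mesa", "f", "sg", utterance=1),
    Mention("el libro", "m", "sg", utterance=2, is_subject=True),
    Mention("los cuadernos", "m", "pl", utterance=3),
]
pronoun = Mention("él", "m", "sg", utterance=3)
antecedent = resolve(pronoun, candidates)
print(antecedent.text if antecedent else None)
```

Separating hard constraints from ranked preferences keeps the two kinds of knowledge independent: a constraint can only remove candidates, while a preference only reorders the ones that remain.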

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper we present a complete Natural Language Processing (NLP) system for Spanish. The core of this system is the parser, which uses the Lexical-Functional Grammar (LFG) formalism. Another important component of this system is the anaphora resolution module. To resolve anaphora, this module uses a method based on linguistic information (lexical, morphological, syntactic and semantic), structural information (the anaphoric accessibility space in which the anaphor finds its antecedent) and statistical information. This method is based on constraints and preferences and resolves pronouns and definite descriptions. Moreover, the system handles the features of both dialogue and non-dialogue discourse. The anaphora resolution module uses several resources, such as a lexical database (Spanish WordNet) to provide semantic information and a POS tagger that provides the part of speech and the root of each word to make the resolution process easier.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The goal of my study is to investigate the relationship between selected deictic shields on the pronoun ‘I’ and the involvement/detachment dichotomy in a sample of television news interviews. I focus on the use of personal pronouns in political discourse. Drawing upon Caffi’s (2007) classification of mitigating devices into bushes, hedges and shields, I focus on deictic shields on the pronoun ‘I’: I examine the way a selection of ‘I’-related deictic shields is employed in a collection of news interviews broadcast during the electoral campaign prior to the UK 2015 General Election. My purpose is to uncover the frequencies of each of the linguistic items selected and the pragmatic functions of those linguistic items in the involvement/detachment dichotomy. The research is structured as follows. Chapter 1 provides an account of previous studies on the three main areas of research: speech event analysis, institutional interaction and the news interview, and the UK 2015 General Election television programmes. Chapter 2 is centred on the involvement/detachment dichotomy: I provide an overview of nonlinguistic and linguistic features of involvement and detachment at all levels of sentence structure. Chapter 3 contains a detailed account of the data collection and data analysis process. Chapter 4 provides an accurate description of results in three steps: quantitative analysis, qualitative analysis and discussion of the pragmatic functions of the selected linguistic features of involvement and detachment. Chapter 5 includes a brief summary of the investigation, reviews the main findings, and indicates limitations of the study and possible inputs for further research. The results of the analysis confirm that, while some of the linguistic items examined point toward involvement, others have a detaching effect. I therefore conclude that deictic shields on the pronoun ‘I’ permit the realisation of the involvement/detachment dichotomy in the speech genre of the news interview.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Includes bibliographical references and indexes.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A collection of miscellaneous pamphlets on the romance languages.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A collection of miscellaneous pamphlets.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present an approach to parsing relative clauses in Arabic in the tradition of the Paninian Grammar Framework [2] which leads to deriving a common logical form for equivalent sentences. Particular attention is paid to the analysis of resumptive pronouns in the retrieval of syntactico-semantic relationships. The analysis arises from the development of a lexicalised dependency grammar for Arabic that has application for machine translation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The present work studies the overall structuring of radio news discourse by investigating three metatextual/interactive functions: (1) Discourse Organizing Elements (DOEs), (2) Attribution and (3) Sentential and Nominal Background Information (SBI & NBI). An extended corpus of about 73,000 words from BBC and Radio Damascus news is used to study DOEs and a restricted corpus of 38,000 words for Attribution and SBI & NBI. A situational approach is adopted to assess the influence of factors such as medium and audience on these functions and their frequency. It is found that: (1) DOEs are organizational and their frequency is determined by length of text; (2) Attribution functions in accordance with the editor's strategy and its frequency is audience sensitive; and (3) BI provides background information and is determined by audience and news topics. Secondly, the salient grammatical elements in DOEs are discourse deictic demonstratives, address pronouns and nouns referring to 'the news'. Attribution is realized in reporting/reported clauses, and BI in a sentence, a clause or a nominal group. Thirdly, DOEs establish a hierarchy of (1) news, (2) summary/expansion and (3) item, including topic introduction and details. While Attribution is generally, and SBI solely, a function of detailing, NBI and proper names are generally a function of summary and topic introduction. Being primarily addressed to the audience and referring metatextually, the functions investigated support Sinclair's interactive and autonomous planes of discourse. They also shed light on the part(s) of the linguistic system which realize the metatextual/interactive function. Strictly, 'discourse structure' inevitably involves a rank-scale; but news discourse also shows a convention of item 'listing'. Hence only within the boundary of variety (ultimately interpreted across language and in its situation) can textual functions and discourse structure be studied.
Finally, interlingual variety study provides invaluable insights into a level of translation that goes beyond matching grammatical systems or situational factors, an interpretive level which has to be described in linguistic analysis of translation data.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Individual cues to deception are subtle and often missed by lay people and law enforcement alike. Linguistic statement analysis remains a potentially useful way of overcoming individual diagnostic limitations (e.g. Criteria-Based Content Analysis; Steller & Köhnken, 1989; Reality Monitoring; Johnson & Raye, 1981; Scientific Content Analysis; Sapir, 1996). Unfortunately, many of these procedures are time-consuming, require in-depth training, and lack empirical support and/or external validity. The current dissertation develops a novel approach to statement veracity analysis that is simple to learn, easy to administer, theoretically sound, and empirically validated. Two strategies were proposed for detecting differences between liars' and truth-tellers' statements. Liars were hypothesized to strategically write statements with the goal of self-exoneration. Liars' statements were predicted to contain more first person pronouns and fewer third person pronouns. Truth-tellers were hypothesized to be motivated toward being informative and thus to produce statements with fewer first person pronouns and more third person pronouns. Three studies were conducted to test this hypothesis. The first study explored the verbal patterns of exoneration-focused and informativeness-focused statements. The second study used a traditional theft paradigm to examine these verbal patterns in guilty liars and innocent truth-tellers. In the third study, to better match the context of a criminal investigation, a cheating paradigm was used in which spontaneous lying was induced and written statements were taken. Support for the first person pronoun hypothesis was found. Limited support was found for the third person pronoun hypothesis. Results, implications, and future directions for the current research are discussed.
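The pronoun measures behind these hypotheses can be approximated with a short sketch. The word lists, the tokenization rule and the function name `pronoun_rates` are assumptions made for illustration; the dissertation's actual coding scheme is not described in the abstract, and the sketch computes rates only, without classifying a statement as truthful or deceptive.

```python
import re
from typing import Dict

# Assumed English word lists; the study's own pronoun inventory may differ.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
THIRD_PERSON = {
    "he", "him", "his", "himself",
    "she", "her", "hers", "herself",
    "they", "them", "their", "theirs", "themselves",
}


def pronoun_rates(statement: str) -> Dict[str, float]:
    """Return first- and third-person pronoun counts divided by total words."""
    tokens = re.findall(r"[a-z']+", statement.lower())
    total = len(tokens)
    if total == 0:
        return {"first_person": 0.0, "third_person": 0.0}
    first = sum(token in FIRST_PERSON for token in tokens)
    third = sum(token in THIRD_PERSON for token in tokens)
    return {"first_person": first / total, "third_person": third / total}


rates = pronoun_rates("I left my bag on the desk and he took it.")
print(rates)
```

Normalizing by statement length makes short and long statements comparable, which matters when the quantity of interest is a relative tendency rather than a raw count.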

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this dissertation, in light of a sociofunctionalist approach (TAVARES, 2003; 2013; GORSKI; TAVARES, 2013), I analyze 378 tokens of the pronouns tu and você extracted from 12 conversations belonging to the Natal Conversational Data Base (CUNHA, 2010). I have the following objectives: (i) mapping linguistic and extralinguistic specialization trends of the second person singular subject pronouns tu and você in the speech of Natal (RN); (ii) assessing the role of the principle of persistence (HOPPER, 1991) as a possible motivating factor of the specialization trends of tu and você; (iii) identifying in which of the six pronoun subsystems proposed by Scherre et al. (2009) the speech community of Natal portrayed in this study is situated. To achieve these objectives, I submitted the data to multivariate statistical analysis, which provided frequencies and relative weights. According to the statistical analysis, the relevant factor groups were the nature of the relationship between the interlocutors (if the relationship is asymmetric, less intimate, and more formal, the use of você is favored; if the relationship is symmetrical, intimate and informal, the use of tu is favored); the degree of formality of the environment in which the conversation takes place (in more informal environments tu was favored; in more formal environments você was favored); and the type of discourse (reported / not reported) (tu was favored in non-reported discourse and você was favored in reported discourse). Based on the results for these factor groups, I organized a panorama of specializations of the pronouns tu and você, noting that tu seems specialized for more informal contexts of use than those for which você seems specialized. The motivation underlying these specialization trends may be the principle of persistence, since throughout its historical development você has carried a trace of greater formality or, at least, less intimacy when contrasted with tu.
Finally, I concluded that, of the six pronoun subsystems proposed by Scherre et al. (2009), the speech community of Natal can be framed in the fifth, characterized by variable use of subject pronouns tu and você, with more frequent use of você than tu, and rare occurrence of agreement of tu with second-person singular verb.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this work we present the description and analysis of clitic placement patterns in prepositional infinitive clauses in Brazilian writing of the 19th and 20th centuries. The corpus comprises letters from newspaper readers and newspaper writers, as well as advertisements, taken from Brazilian newspapers from different regions and states (Rio de Janeiro, Bahia, Ceará and Pernambuco) and written in the 19th and 20th centuries. These texts belong to the common minimum corpus of the Projeto para a História do Português Brasileiro (PHPB, Project for the History of Brazilian Portuguese). The analysis is based on theoretical-methodological postulates of the Theory of Variation and Change (WEINREICH; LABOV; HERZOG, 1968[2006]; LABOV, 1972[2008]), on the Theory of Principles and Parameters (CHOMSKY, 1981, 1986) and on the model of Grammar Competition (KROCH, 1989; 2001). By articulating the presuppositions of both theories, we propose a theoretical interface between Variation Theory and Grammar Theory. As for the empirical results, we found that, in prepositional infinitive clauses, the most significant independent variable for the occurrence of proclisis is the type of preposition that precedes the infinitive verb. In light of this, we found that some prepositions strongly favor proclisis, as is the case of the Portuguese prepositions sem, por, de and para, all of which present relative weights above 0.52. Another important result concerns the data from the state of Rio de Janeiro (RJ). This state is the only one in the sample located in the Southeastern region, and it is also the locality in the sample that most strongly conditions proclisis.
To explain these results, we raise the hypothesis that the implementation of proclisis may be more advanced in Southeastern than in Northeastern Brazil; however, this hypothesis must be confirmed or refuted in future work. We also present a theoretical explanation of clitic placement in prepositional infinitive clauses in Brazilian writing of the 19th and 20th centuries. This explanation combines Magro's (2005) proposal, regarding the existence of prepositions occupying the head of PP and of prepositions that can act as complementizers and occupy the head of CP, with that of Galves (2000; 2001), regarding the relation between clitic placement and the association of phi-features with the functional categories COMP, Tense and Person. Our proposal is that prepositions occupying the head of CP cause changes in the values attributed to the phi-features and to the strong V-features in the functional categories COMP, Tense and Person. Thus, we argue that proclisis in Brazilian Portuguese (BP) derives from movement of the verb to the functional category Tense, in which the features +V and +AGR are associated, which licenses proclisis according to Galves's (2000; 2001) proposal.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Based on the theoretical and methodological presuppositions of the theory of language variation and change (cf. WEINREICH; LABOV; HERZOG, 2006 [1968]), this article describes and analyzes the process of variation/change in second person possessive pronouns in letters from readers of Brazilian newspapers from the 19th and 20th centuries. These letters offer a portrait of the Brazilian press from the South (Santa Catarina), Southeast (Rio de Janeiro) and Northeast (Bahia and Rio Grande do Norte) regions in each century and are part of the printed common minimal corpus of the Project for the History of Brazilian Portuguese (PHPB). The point of departure of this work is the idea that the use of the variant forms of second person possessive pronouns, teu and seu, results from the interaction characterizing the varied social roles performed by the letters' senders. As communicative units that gather elements and features denoting time and space, conditioned and determined by socio-historical and cultural aspects, readers' letters turn out to be a promising field of research. More specifically, in line with results presented in studies of the pronominal system in the diachrony of Brazilian Portuguese (PB) (FARACO, 2002; LORENGIAN-PENKAL, 2007; CALLOU; LOPES, 2003; LOPES; DUARTE, 2003; MENON, 2005; ARDUIN; COELHO, 2006; LOPES, 2009; MARCOTULIO, 2010), the results presented here point to different usages of the possessives and show the coexistence of the forms teu/tua and seu/sua, strongly conditioned by the socio-discursive nature of the readers' letters over the centuries and across regions.