12 resultados para anaphora

em Universidad de Alicante


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents an algorithm for identifying noun-phrase antecedents of pronouns and adjectival anaphors in Spanish dialogues. We believe that anaphora resolution requires numerous sources of information in order to find the correct antecedent of the anaphor. These sources can be of different kinds, e.g., linguistic information, discourse/dialogue structure information, or topic information. For this reason, our algorithm uses various different kinds of information (hybrid information). The algorithm is based on linguistic constraints and preferences and uses an anaphoric accessibility space within which the algorithm finds the noun phrase. We present some experiments related to this algorithm and this space using a corpus of 204 dialogues. The algorithm is implemented in Prolog. According to this study, 95.9% of antecedents were located in the proposed space, a precision of 81.3% was obtained for pronominal anaphora resolution, and 81.5% for adjectival anaphora.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Comunicación presentada en ACIDCA 2000, International Conference on Artificial and Computational Intelligence For Decision, Control and Automation In Engineering and Industrial Applications, Monastir, Tunisia, 22-24 March 2000.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a whole Natural Language Processing (NLP) system for Spanish. The core of this system is the parser, which uses the grammatical formalism Lexical-Functional Grammars (LFG). Another important component of this system is the anaphora resolution module. To solve the anaphora, this module contains a method based on linguistic information (lexical, morphological, syntactic and semantic), structural information (anaphoric accessibility space in which the anaphor obtains the antecedent) and statistical information. This method is based on constraints and preferences and solves pronouns and definite descriptions. Moreover, this system fits dialogue and non-dialogue discourse features. The anaphora resolution module uses several resources, such as a lexical database (Spanish WordNet) to provide semantic information and a POS tagger providing the part of speech for each word and its root to make this resolution process easier.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as the ones related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR 8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained when using different settings of the system, as well as the error analysis performed after the competition, offered us some clear insights on the best combination of techniques, that balance between precision and recall. Contrary to our initial intuitions, we have also seen that the inclusion of specialized Natural Language Processing tools dealing with Temporality or Anaphora Resolution lowers the system performance, while the use of topic detection techniques using faceted search with Wikipedia and Latent Semantic Analysis leads to satisfactory system performance, both for the monolingual setting, as well as in a multilingual one.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper shows an empirical study about the anaphoric accessibility space in Spanish dialogues. According to this study, antecedents of pronominal and adjectival anaphors can almost always (95.9%) be found in the noun phrases set taken from spaces defined using a structure based on adjacency pairs. Furthermore, a proposal of a reliable annotation scheme for Spanish dialogues is presented in order to define this anaphoric accessibility space. Using this annotation scheme, anaphora resolution algorithms can locate the adequate set of anaphor antecedent candidates.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

En este trabajo proponemos un algoritmo para la resolución de las descripciones definidas en español a través de la estructura del diálogo, mediante la definición de un espacio de accesibilidad anafórico. Este algoritmo está basado en la hipótesis de que la resolución de la anáfora está relacionada con la estructura del diálogo. Así, la resolución de la anáfora mejora si se especifica un espacio de accesibilidad para cada tipo descripción definida según la estructura del diálogo. La utilización de este espacio de accesibilidad anafóico reduce tanto el tiempo de procesamiento como la posibilidad de obtener un antecedente erróneo. Además, la definición de este espacio de accesibilidad depende únicamente de la propia estructura textual del diálogo.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper tells about the recognition of temporal expressions and the resolution of their temporal reference. A proposal of the units we have used to face up this tasks over a restricted domain is shown. We work with newspapers' articles in Spanish, that is why every reference we use is in Spanish. For the identification and recognition of temporal expressions we base on a temporal expression grammar and for the resolution on a dictionary, where we have the information necessary to do the date operation based on the recognized expressions. In the evaluation of our proposal we have obtained successful results for the examples studied.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, we present an algorithm for anaphora resolution in Spanish dialogues and an evaluation of the algorithm for pronominal anaphora. The proposed algorithm uses both linguistic information and the structure of the dialogue to find the antecedent of the anaphors. The system has been evaluated on ten dialogues.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Comunicación presentada en Cross-Language Evaluation Forum (CLEF 2008), Aarhus, Denmark, September 17-19, 2008.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we address two issues. The first one analyzes whether the performance of a text summarization method depends on the topic of a document. The second one is concerned with how certain linguistic properties of a text may affect the performance of a number of automatic text summarization methods. For this we consider semantic analysis methods, such as textual entailment and anaphora resolution, and we study how they are related to proper noun, pronoun and noun ratios calculated over original documents that are grouped into related topics. Given the obtained results, we can conclude that although our first hypothesis is not supported, since it has been found no evident relationship between the topic of a document and the performance of the methods employed, adapting summarization systems to the linguistic properties of input documents benefits the process of summarization.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper reports on the further results of the ongoing research analyzing the impact of a range of commonly used statistical and semantic features in the context of extractive text summarization. The features experimented with include word frequency, inverse sentence and term frequencies, stopwords filtering, word senses, resolved anaphora and textual entailment. The obtained results demonstrate the relative importance of each feature and the limitations of the tools available. It has been shown that the inverse sentence frequency combined with the term frequency yields almost the same results as the latter combined with stopwords filtering that in its turn proved to be a highly competitive baseline. To improve the suboptimal results of anaphora resolution, the system was extended with the second anaphora resolution module. The present paper also describes the first attempts of the internal document data representation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Decision support systems (DSS) support business or organizational decision-making activities, which require the access to information that is internally stored in databases or data warehouses, and externally in the Web accessed by Information Retrieval (IR) or Question Answering (QA) systems. Graphical interfaces to query these sources of information ease to constrain dynamically query formulation based on user selections, but they present a lack of flexibility in query formulation, since the expressivity power is reduced to the user interface design. Natural language interfaces (NLI) are expected as the optimal solution. However, especially for non-expert users, a real natural communication is the most difficult to realize effectively. In this paper, we propose an NLI that improves the interaction between the user and the DSS by means of referencing previous questions or their answers (i.e. anaphora such as the pronoun reference in “What traits are affected by them?”), or by eliding parts of the question (i.e. ellipsis such as “And to glume colour?” after the question “Tell me the QTLs related to awn colour in wheat”). Moreover, in order to overcome one of the main problems of NLIs about the difficulty to adapt an NLI to a new domain, our proposal is based on ontologies that are obtained semi-automatically from a framework that allows the integration of internal and external, structured and unstructured information. Therefore, our proposal can interface with databases, data warehouses, QA and IR systems. Because of the high NL ambiguity of the resolution process, our proposal is presented as an authoring tool that helps the user to query efficiently in natural language. Finally, our proposal is tested on a DSS case scenario about Biotechnology and Agriculture, whose knowledge base is the CEREALAB database as internal structured data, and the Web (e.g. PubMed) as external unstructured information.