14 resultados para Syntactic formulator

em Universidad de Alicante


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present an automatic system for the extraction of syntactic semantic patterns applied to the development of multilingual processing tools. In order to achieve optimum methods for the automatic treatment of more than one language, we propose the use of syntactic semantic patterns. These patterns are formed by a verbal head and the main arguments, and they are aligned among languages. In this paper we present an automatic system for the extraction and alignment of syntactic semantic patterns from two manually annotated corpora, and evaluate the main linguistic problems that we must deal with in the alignment process.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the last few years, there has been a wide development in the research on textual information systems. The goal is to improve these systems in order to allow an easy localization, treatment and access to the information stored in digital format (Digital Databases, Documental Databases, and so on). There are lots of applications focused on information access (for example, Web-search systems like Google or Altavista). However, these applications have problems when they must access to cross-language information, or when they need to show information in a language different from the one of the query. This paper explores the use of syntactic-sematic patterns as a method to access to multilingual information, and revise, in the case of Information Retrieval, where it is possible and useful to employ patterns when it comes to the multilingual and interactive aspects. On the one hand, the multilingual aspects that are going to be studied are the ones related to the access to documents in different languages from the one of the query, as well as the automatic translation of the document, i.e. a machine translation system based on patterns. On the other hand, this paper is going to go deep into the interactive aspects related to the reformulation of a query based on the syntactic-semantic pattern of the request.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, the authors extend and generalize the methodology based on the dynamics of systems with the use of differential equations as equations of state, allowing that first order transformed functions not only apply to the primitive or original variables, but also doing so to more complex expressions derived from them, and extending the rules that determine the generation of transformed superior to zero order (variable or primitive). Also, it is demonstrated that for all models of complex reality, there exists a complex model from the syntactic and semantic point of view. The theory is exemplified with a concrete model: MARIOLA model.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we present a whole Natural Language Processing (NLP) system for Spanish. The core of this system is the parser, which uses the grammatical formalism Lexical-Functional Grammars (LFG). Another important component of this system is the anaphora resolution module. To solve the anaphora, this module contains a method based on linguistic information (lexical, morphological, syntactic and semantic), structural information (anaphoric accessibility space in which the anaphor obtains the antecedent) and statistical information. This method is based on constraints and preferences and solves pronouns and definite descriptions. Moreover, this system fits dialogue and non-dialogue discourse features. The anaphora resolution module uses several resources, such as a lexical database (Spanish WordNet) to provide semantic information and a POS tagger providing the part of speech for each word and its root to make this resolution process easier.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, the new features that IR-n system applies on the topic processing for CL-SR are described. This set of features are based on applying logic forms to topics with the aim of incrementing the weight of topic terms according to a set of syntactic rules.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes a CL-SR system that employs two different techniques: the first one is based on NLP rules that consist on applying logic forms to the topic processing while the second one basically consists on applying the IR-n statistical search engine to the spoken document collection. The application of logic forms to the topics allows to increase the weight of topic terms according to a set of syntactic rules. Thus, the weights of the topic terms are used by IR-n system in the information retrieval process.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El espacio doméstico construido es un producto social que a su vez crea sociedad. La casa constituye un escenario privilegiado, un medio de expresión y transmisión de conductas y comportamientos. No obstante, resulta muy difícil comprender el espacio social a través de unas ruinas arqueológicas vacías y carentes de tercera dimensión, y se corre el riesgo de proyectar una imagen historiográfica previamente construida sobre las sociedades estudiadas. La comprensión del espacio social requiere formalizar y discutir los patrones formales de las estructuras domésticas y sus formas de agrupación. Este trabajo aborda el estudio de los espacios domésticos desde una perspectiva lingüística (una gramática de la casa), distinguiendo los elementos en sí y sus combinaciones. Se definen tres niveles distintos de análisis del hecho doméstico: el morfológico, que se ocupa de la forma de las unidades domésticas y de las transformaciones que experimentan; el sintáctico que enfatiza las relaciones entre las estructuras elementales en el marco de una estructura espacial organizada; y el semiótico, que las analiza como expresiones sociales, materialización e instrumento de significados culturales. De acuerdo a esta perspectiva se propone una reflexión metodológica sobre la caracterización de los espacios domésticos medievales e islámicos en la Península Ibérica; se plantean los problemas derivados del uso social del espacio, los modelos domésticos y su diacronía, y se discute acerca de la casa como indicador material de islamización.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

El objetivo del trabajo consiste en reutilizar el Treebank de dependencias EPECDEP (BDT) para construir el gold standard de la sintaxis superficial del euskera. El paso básico consiste en el estudio comparativo de los dos formalismos aplicados sobre el mismo corpus: el formalismo de la Gramática de Restricciones (Constraint Grammar, CG) y la Gramática de Dependencias (Dependency Grammar, DP). Como resultado de dicho estudio hemos establecido los criterios lingüísticos necesarios para derivar la funciones sintácticas en estilo CG. Dichos criterios han sido implementados y evaluados, así en el 75% de los casos se derivan automáticamente las funciones sintácticas para construir el gold standard.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper describes the automatic process of building a dependency annotated corpus based on Ancora constituent structures. The Ancora corpus already has a dependency structure information layer, but the new annotated data applies a purely syntactic orientation and offers in this way a new resource to the linguistic research community. The paper details the process of reannotating the corpus, the linguistic criteria used and the obtained results.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The English language and the Internet, both separately and taken together, are nowadays well-acknowledged as powerful forces which influence and affect the lexico-grammatical characteristics of other languages world-wide. In fact, many authors like Crystal (2004) have pointed out the emergence of the so-called Netspeak, that is, the language used in the Net or World Wide Web; as Crystal himself (2004: 19) puts it, ‘a type of language displaying features that are unique to the Internet […] arising out of its character as a medium which is electronic, global and interactive’. This ‘language’, however, may be differently understood: either as an adaptation of the English language proper to internet requirements and purposes, or as a new and rapidly-changing and developing language as a result of a rapid evolution or adaptation to Internet requirements of almost all world languages, for whom English is a trendsetter. If the second and probably most plausible interpretation is adopted, there are three salient features of ‘Netspeak’: (a) the rapid expansion of all its new linguistic developments thanks to the Internet itself, which may lead to the generalization and widespread acceptance of new words, coinages, or meanings, hundreds of times faster than was the case with the printed media. As said above, (b) the visible influence of English, the most prevalent language on the Internet. Consequently, (c) this new language tends to reduce the ‘distance’ between English and other languages as well as the ignorance of the former by speakers of other languages, since the ‘Netspeak’ version of the latter adopts grammatical, syntactic and lexical features of English. Thus, linguistic differences may even disappear when code-switching and/or borrowing occurs, as whole fragments of English appear in other language contexts. As a consequence of the new situation, an ideal context appears for interlanguage or multilingual word formation to thrive: puns, blends, compounds and word creativity in general find in the web the ideal place to gain rapid acceptance world-wide, as a result of fashion, coincidence, or sheer merit of the new linguistic proposals.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The great amount of text produced every day in the Web turned it as one of the main sources for obtaining linguistic corpora, that are further analyzed with Natural Language Processing techniques. On a global scale, languages such as Portuguese - official in 9 countries - appear on the Web in several varieties, with lexical, morphological and syntactic (among others) differences. Besides, a unified spelling system for Portuguese has been recently approved, and its implementation process has already started in some countries. However, it will last several years, so different varieties and spelling systems coexist. Since PoS-taggers for Portuguese are specifically built for a particular variety, this work analyzes different training corpora and lexica combinations aimed at building a model with high-precision annotation in several varieties and spelling systems of this language. Moreover, this paper presents different dictionaries of the new orthography (Spelling Agreement) as well as a new freely available testing corpus, containing different varieties and textual typologies.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

One of the main challenges to be addressed in text summarization concerns the detection of redundant information. This paper presents a detailed analysis of three methods for achieving such goal. The proposed methods rely on different levels of language analysis: lexical, syntactic and semantic. Moreover, they are also analyzed for detecting relevance in texts. The results show that semantic-based methods are able to detect up to 90% of redundancy, compared to only the 19% of lexical-based ones. This is also reflected in the quality of the generated summaries, obtaining better summaries when employing syntactic- or semantic-based approaches to remove redundancy.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Este artículo analiza el video oficial de la campaña del Presidente Correa denominado “spot bicicleta”, elemento promocional usado en las elecciones seccionales del 2013, toma como referencia de análisis las estructuras fonológicas, gráficas, sintácticas y semánticas. La pieza con una duración de 3:30 minutos, recuerda el proceso de “Revolución Ciudadana”, emprendido por el partido Alianza País (35); y resalta la condición de servicio del Presidente, cuyo eje principal es la Patria. El discurso gira entorno a la pobreza, la esperanza y el deseo del Presidente de continuar en el poder. Toda la pieza audiovisual está matizada con elementos posicionados en la mente de los ecuatorianos, usados como una estrategia de persuasión.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Web 2.0 has resulted in a shift as to how users consume and interact with the information, and has introduced a wide range of new textual genres, such as reviews or microblogs, through which users communicate, exchange, and share opinions. The exploitation of all this user-generated content is of great value both for users and companies, in order to assist them in their decision-making processes. Given this context, the analysis and development of automatic methods that can help manage online information in a quicker manner are needed. Therefore, this article proposes and evaluates a novel concept-level approach for ultra-concise opinion abstractive summarization. Our approach is characterized by the integration of syntactic sentence simplification, sentence regeneration and internal concept representation into the summarization process, thus being able to generate abstractive summaries, which is one the most challenging issues for this task. In order to be able to analyze different settings for our approach, the use of the sentence regeneration module was made optional, leading to two different versions of the system (one with sentence regeneration and one without). For testing them, a corpus of 400 English texts, gathered from reviews and tweets belonging to two different domains, was used. Although both versions were shown to be reliable methods for generating this type of summaries, the results obtained indicate that the version without sentence regeneration yielded to better results, improving the results of a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data.