3 resultados para natural languages

em Universidad de Alicante


Relevância:

60.00% 60.00%

Publicador:

Resumo:

If one has a distribution of words (SLUNs or CLUNS) in a text written in language L(MT), and is adjusted one of the mathematical expressions of distribution that exists in the mathematical literature, some parameter of the elected expression it can be considered as a measure of the diversity. But because the adjustment is not always perfect as usual measure; it is preferable to select an index that doesn't postulate a regularity of distribution expressible for a simple formula. The problem can be approachable statistically, without having special interest for the organization of the text. It can serve as index any monotonous function that has a minimum value when all their elements belong to the same class, that is to say, all the individuals belong to oneself symbol, and a maximum value when each element belongs to a different class, that is to say, each individual is of a different symbol. It should also gather certain conditions like they are: to be not very sensitive to the extension of the text and being invariant to certain number of operations of selection in the text. These operations can be theoretically random. The expressions that offer more advantages are those coming from the theory of the information of Shannon-Weaver. Based on them, the authors develop a theoretical study for indexes of diversity to be applied in texts built in modeling language L(MT), although anything impedes that they can be applied to texts written in natural languages.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we describe Fénix, a data model for exchanging information between Natural Language Processing applications. The format proposed is intended to be flexible enough to cover both current and future data structures employed in the field of Computational Linguistics. The Fénix architecture is divided into four separate layers: conceptual, logical, persistence and physical. This division provides a simple interface to abstract the users from low-level implementation details, such as programming languages and data storage employed, allowing them to focus in the concepts and processes to be modelled. The Fénix architecture is accompanied by a set of programming libraries to facilitate the access and manipulation of the structures created in this framework. We will also show how this architecture has been already successfully applied in different research projects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.