14 results for Semantic metrics
at Universidad de Alicante
Abstract:
In this paper we present the enrichment of the Integration of Semantic Resources based on WordNet (ISR-WN Enriched). This new proposal improves on the previous one, in which several semantic resources such as SUMO, WordNet Domains and WordNet Affects were related, by adding other semantic resources such as Semantic Classes and SentiWordNet. First, the paper describes the architecture of this proposal, explaining the particularities of each integrated resource. We then analyze some problems related to the mappings between different versions and how we solve them. Moreover, we show the advantages that this kind of tool can provide to different applications of Natural Language Processing. In this regard, we demonstrate that the integration of semantic resources makes it possible to acquire a multidimensional view in the analysis of natural language.
Abstract:
In this paper we present an automatic system for the extraction of syntactic semantic patterns applied to the development of multilingual processing tools. In order to achieve optimum methods for the automatic treatment of more than one language, we propose the use of syntactic semantic patterns. These patterns are formed by a verbal head and the main arguments, and they are aligned among languages. In this paper we present an automatic system for the extraction and alignment of syntactic semantic patterns from two manually annotated corpora, and evaluate the main linguistic problems that we must deal with in the alignment process.
Abstract:
In the last few years, there has been extensive development in research on textual information systems. The goal is to improve these systems in order to make it easy to locate, process and access the information stored in digital format (Digital Databases, Documental Databases, and so on). There are many applications focused on information access (for example, Web-search systems like Google or Altavista). However, these applications have problems when they must access cross-language information, or when they need to show information in a language different from that of the query. This paper explores the use of syntactic-semantic patterns as a method for accessing multilingual information, and reviews, in the case of Information Retrieval, where it is possible and useful to employ patterns when it comes to the multilingual and interactive aspects. On the one hand, the multilingual aspects studied are those related to accessing documents in languages different from that of the query, as well as the automatic translation of the document, i.e. a machine translation system based on patterns. On the other hand, this paper examines in depth the interactive aspects related to the reformulation of a query based on the syntactic-semantic pattern of the request.
Abstract:
In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. Thus, we use two different ontologies of semantic classes (WordNet Domains and Basic Level Concepts) in order to re-rank the retrieved documents and obtain better recall and precision. Finally, we implement a new method for weighting the expanded terms that takes into account the weights of the original query terms and their relations in WordNet with respect to the new ones, which has been shown to improve the results. The evaluation of these approaches was carried out in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP employing the WordNet term weighting approach.
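The two evaluation measures named in this abstract, MAP and GMAP, can be illustrated with a minimal sketch. The function names are hypothetical, and the small floor `eps` applied before taking logarithms is an assumption of this sketch, not something the abstract specifies.

```python
import math

def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked result list against a set of relevant ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant hit
    return precision_sum / len(relevant)

def mean_average_precision(ap_values):
    """MAP: arithmetic mean of per-query average precision."""
    return sum(ap_values) / len(ap_values)

def geometric_map(ap_values, eps=1e-5):
    """GMAP: geometric mean of per-query AP; eps (assumed) floors zero scores."""
    logs = [math.log(max(ap, eps)) for ap in ap_values]
    return math.exp(sum(logs) / len(logs))

# Two toy queries with hypothetical document ids.
aps = [
    average_precision(["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    average_precision(["d2", "d1"], {"d1"}),
]
print(mean_average_precision(aps), geometric_map(aps))
```

Because GMAP is a geometric mean, it penalizes queries with very low average precision more heavily than MAP does, which is why robust-retrieval tasks often report both.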
Abstract:
This paper introduces the Sm4RIA Extension for OIDE, which implements the Sm4RIA approach in OIDE (OOH4RIA Integrated Development Environment). The application, based on the Eclipse framework, supports the design of the Sm4RIA models as well as the model-to-model and model-to-text transformation processes that facilitate the generation of Semantic Rich Internet Applications, i.e., RIA applications capable of sharing data as Linked Data and consuming external data from other sources in the same manner. Moreover, the application implements mechanisms for the creation of RIA interfaces from ontologies and the automatic generation of administration interfaces for a previously designed application.
Abstract:
This paper reports further results of ongoing research analyzing the impact of a range of commonly used statistical and semantic features in the context of extractive text summarization. The features experimented with include word frequency, inverse sentence and term frequencies, stopword filtering, word senses, resolved anaphora and textual entailment. The results obtained demonstrate the relative importance of each feature and the limitations of the tools available. It has been shown that the inverse sentence frequency combined with the term frequency yields almost the same results as the latter combined with stopword filtering, which in turn proved to be a highly competitive baseline. To improve the suboptimal results of anaphora resolution, the system was extended with a second anaphora resolution module. The paper also describes the first attempts at an internal document data representation.
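As an illustration of combining term frequency with inverse sentence frequency for extractive summarization, here is a minimal sketch. The tokenizer, the length-normalized scoring formula and all function names are assumptions made for this example and are not taken from the paper.

```python
import math
import re
from collections import Counter

def tokenize(sentence):
    """Lowercase alphabetic tokens only (a deliberately simple assumption)."""
    return re.findall(r"[a-z]+", sentence.lower())

def score_sentences(sentences):
    """Score each sentence by the mean TF * ISF of its tokens."""
    tokenized = [tokenize(s) for s in sentences]
    # Term frequency over the whole document.
    tf = Counter(tok for toks in tokenized for tok in toks)
    # Sentence frequency: number of sentences containing each term.
    sf = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(sentences)
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        total = sum(tf[t] * math.log(n / sf[t]) for t in toks)
        scores.append(total / len(toks))
    return scores

def summarize(sentences, k=1):
    """Return the k best-scoring sentences in their original order."""
    scores = score_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]

doc = ["the cat sat", "the cat ran", "the dog barked loudly"]
print(summarize(doc, k=1))
```

A term that occurs in every sentence gets an inverse sentence frequency of zero, so it contributes nothing to the score; this is one intuition for why the combination can behave similarly to explicit stopword filtering.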
Abstract:
This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.
Abstract:
In this paper, the authors extend and generalize the methodology based on system dynamics that uses differential equations as equations of state, allowing first-order transformed functions to apply not only to the primitive (original) variables but also to more complex expressions derived from them, and extending the rules that determine the generation of transformed functions of order higher than zero (order zero being the variable or primitive). It is also demonstrated that for every model of a complex reality there exists a complex model from the syntactic and semantic point of view. The theory is exemplified with a concrete model: the MARIOLA model.
Abstract:
The Iterative Closest Point algorithm (ICP) is commonly used in engineering applications to solve the rigid registration problem of partially overlapped point sets which are pre-aligned with a coarse estimate of their relative positions. This iterative algorithm is applied in many areas, such as medicine for volumetric reconstruction of tomography data, robotics to reconstruct surfaces or scenes using range sensor information, industrial systems for quality control of manufactured objects, or even biology to study the structure and folding of proteins. One of the algorithm’s main problems is its high computational complexity (quadratic in the number of points with the non-optimized original variant) in a context where high density point sets, acquired by high resolution scanners, are processed. Many variants have been proposed in the literature whose goal is to improve performance by reducing the number of points or the required iterations, or by reducing the complexity of the most expensive phase: the closest neighbor search. Despite decreasing its complexity, some of the variants tend to have a negative impact on the final registration precision or the convergence domain, thus limiting the possible application scenarios. The goal of this work is to reduce the algorithm’s computational cost so that a wider range of the computationally demanding problems described above can be addressed. For that purpose, an experimental and mathematical convergence analysis and validation of point-to-point distance metrics has been performed, taking into account those distances with lower computational cost than the Euclidean one, which is used as the de facto standard for the algorithm’s implementations in the literature.
In that analysis, the functioning of the algorithm in diverse topological spaces, characterized by different metrics, has been studied to check the convergence, efficacy and cost of the method in order to determine the one which offers the best results. Given that the distance calculation represents a significant part of the whole set of computations performed by the algorithm, it is expected that any reduction of that operation affects significantly and positively the overall performance of the method. As a result, a performance improvement has been achieved by the application of those reduced cost metrics whose quality in terms of convergence and error has been analyzed and validated experimentally as comparable with respect to the Euclidean distance using a heterogeneous set of objects, scenarios and initial situations.
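The closest neighbor search with interchangeable distance metrics can be sketched as follows. The abstract does not name the cheaper metrics studied, so the Manhattan and Chebyshev distances used here are assumptions chosen for illustration, as are the function names. This covers only the correspondence step, not a full ICP implementation.

```python
def euclidean_sq(p, q):
    """Squared Euclidean distance (same nearest neighbor as Euclidean, no sqrt)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def manhattan(p, q):
    """L1 distance: only subtractions, absolute values and additions."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    """L-infinity distance: largest per-coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

def closest_points(source, target, metric):
    """Brute-force correspondence step: for each source point, the index of
    its nearest target point under the given metric (quadratic cost)."""
    return [
        min(range(len(target)), key=lambda j: metric(p, target[j]))
        for p in source
    ]

# Hypothetical toy point sets.
source = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
target = [(0.1, 0.0, 0.0), (1.0, 1.0, 0.9), (5.0, 5.0, 5.0)]
for metric in (euclidean_sq, manhattan, chebyshev):
    print(metric.__name__, closest_points(source, target, metric))
```

On well-separated points like these, all three metrics select the same correspondences; they can disagree when candidate neighbors lie at similar distances, which is where an analysis of convergence and error becomes relevant.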
Abstract:
The semantic localization problem in robotics consists of determining the place where a robot is located by means of semantic categories. The problem is usually addressed as a supervised classification process, where input data correspond to robot perceptions and classes to semantic categories, such as kitchen or corridor. In this paper we propose a framework, implemented in the PCL library, which provides a set of valuable tools to easily develop and evaluate semantic localization systems. The implementation includes the generation of 3D global descriptors following a Bag-of-Words approach. This allows the generation of fixed-dimensionality descriptors from any combination of keypoint detector and feature extractor. The framework has been designed, structured and implemented to be easily extended with different keypoint detectors, feature extractors, and classification models. The proposed framework has also been used to evaluate the performance of a set of already implemented descriptors when used as input for a specific semantic localization system. The results obtained are discussed, paying special attention to the internal parameters of the BoW descriptor generation process. Moreover, we also review the combination of some keypoint detectors with different 3D descriptor generation techniques.
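A minimal sketch of how a Bag-of-Words step turns a variable number of local features into a fixed-dimensionality descriptor is given below. It is written in plain Python and does not use the PCL API; the function names, the toy vocabulary and the L1 normalization are assumptions made for this example.

```python
def nearest_word(feature, vocabulary):
    """Index of the vocabulary entry (visual word) closest to a local feature,
    using squared Euclidean distance."""
    return min(
        range(len(vocabulary)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(feature, vocabulary[k])),
    )

def bow_descriptor(features, vocabulary):
    """Histogram of visual-word assignments, normalized to sum to 1.
    Its length equals the vocabulary size, whatever the number of features."""
    hist = [0.0] * len(vocabulary)
    for feature in features:
        hist[nearest_word(feature, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical 2-D local features and a two-word vocabulary.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
features = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (0.5, 0.5)]
print(bow_descriptor(features, vocabulary))
```

In practice the vocabulary would typically be learned by clustering local features from training data, and its size is one of the internal parameters that affects the resulting descriptor.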
Abstract:
The reprise evidential conditional (REC) is not very common in Catalan nowadays: it is restricted to journalistic language and to some very formal genres (such as academic or legal language), and it is not present in spontaneous discourse. On the one hand, it has been described among the rather new modality values of the conditional. On the other, the normative tradition tended to reject it as a gallicism, or to describe it as an unsuitable neologism. Thanks to extraction from text corpora, we surprisingly find this REC in Catalan from the beginning of the fourteenth century to the contemporary age, with semantic and pragmatic nuances and different evidence of grammaticalization. Due to the current interest in evidentiality, the REC has been widely studied in French, Italian and Portuguese, focusing mainly on its contemporary uses and not so intensively on the diachronic process that could explain the origin of this value. In line with this research, which we initiated by studying the epistemic and evidential future in Catalan, our aim is to describe: a) the pragmatic context that could have been the starting point of the REC in the thirteenth century, before we find indisputable attestations of this use; b) the path of semantic change followed by the conditional from a ‘future in the past’ tense to the acquisition of epistemic and evidential values; and c) the role played by invited inferences, subjectification and intersubjectification in this change.
Abstract:
Presentation of the volume.
Abstract:
In this work we present a semantic framework suitable for use as a support tool for recommender systems. Our purpose is to use the semantic information provided by a set of integrated resources to enrich texts by conducting different NLP tasks: WSD, domain classification, semantic similarities and sentiment analysis. After obtaining the textual semantic enrichment, we are able to recommend similar content or even to rate texts according to different dimensions. First of all, we describe the main characteristics of the integrated semantic resources with an exhaustive evaluation. Next, we demonstrate the usefulness of our resource in different NLP tasks and campaigns. Moreover, we present a combination of different NLP approaches that provide enough knowledge to be used as a support tool for recommender systems. Finally, we illustrate a case study with information related to movies and TV series to demonstrate that our framework works properly.
Abstract:
Development of desalination projects requires simple methodologies and tools for cost-effective and environmentally sensitive management. Sentinel taxa and biotic indices are easily interpreted from the perspective of environmental management. Echinoderms are a potential sentinel taxon to gauge the impact produced by brine discharge, and the BOPA index is considered an effective tool for monitoring different types of impact. Salinity increase due to desalination brine discharge was evaluated in terms of these two indicators. They reflected the environmental impact and recovery after implementation of a mitigation measure. Echinoderms disappeared at the station closest to the discharge during the years with highest salinity and then recovered their abundance after installation of a diffuser reduced the salinity increase. In the same period, BOPA responded due to the decrease in sensitive amphipods and the increase in tolerant polychaete families when salinities rose. Although salinity changes explained most of the observed variability in both indicators, other abiotic parameters were also significant in explaining this variability.