11 resultados para Recall-Precision Curves
em Universidad de Alicante
Resumo:
The present is marked by the availability of large volumes of heterogeneous data, whose management is extremely complex. While the treatment of factual data has been widely studied, the processing of subjective information still poses important challenges. This is especially true in tasks that combine Opinion Analysis with other challenges, such as the ones related to Question Answering. In this paper, we describe the different approaches we employed in the NTCIR 8 MOAT monolingual English (opinionatedness, relevance, answerness and polarity) and cross-lingual English-Chinese tasks, implemented in our OpAL system. The results obtained when using different settings of the system, as well as the error analysis performed after the competition, offered us some clear insights on the best combination of techniques, that balance between precision and recall. Contrary to our initial intuitions, we have also seen that the inclusion of specialized Natural Language Processing tools dealing with Temporality or Anaphora Resolution lowers the system performance, while the use of topic detection techniques using faceted search with Wikipedia and Latent Semantic Analysis leads to satisfactory system performance, both for the monolingual setting, as well as in a multilingual one.
Resumo:
This paper presents the automatic extension to other languages of TERSEO, a knowledge-based system for the recognition and normalization of temporal expressions originally developed for Spanish. TERSEO was first extended to English through the automatic translation of the temporal expressions. Then, an improved porting process was applied to Italian, where the automatic translation of the temporal expressions from English and from Spanish was combined with the extraction of new expressions from an Italian annotated corpus. Experimental results demonstrate how, while still adhering to the rule-based paradigm, the development of automatic rule translation procedures allowed us to minimize the effort required for porting to new languages. Relying on such procedures, and without any manual effort or previous knowledge of the target language, TERSEO recognizes and normalizes temporal expressions in Italian with good results (72% precision and 83% recall for recognition).
Resumo:
This paper presents a multi-layered Question Answering (Q.A.) architecture suitable for enhancing current Q.A. capabilities with the possibility of processing complex questions. That is, questions whose answer needs to be gathered from pieces of factual information scattered in different documents. Specifically, we have designed a layer oriented to process the different types of temporal questions. Complex temporal questions are first decomposed into simpler ones, according to the temporal relationships expressed in the original question. In the same way, the answers of each simple question are re-composed, fulfilling the temporal restrictions of the original complex question. Using this architecture, a Temporal Q.A. system has been developed. In this paper, we focus on explaining the first part of the process: the decomposition of the complex questions. Furthermore, it has been evaluated with the TERQAS question corpus of 112 temporal questions. For the task of question splitting our system has performed, in terms of precision and recall, 85% and 71%, respectively.
Resumo:
In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. Thus, we use two different ontologies of semantic classes (WordNet domain and Basic Level Concepts) in order to re-rank the retrieved documents and obtain better recall and precision. Finally, we implement a new method for weighting the expanded terms taking into account the weights of the original query terms and their relations in WordNet with respect to the new ones (which have demonstrated to improve the results). The evaluation of these approaches was carried out in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP employing the WordNet term weighting approach.
Resumo:
In this paper a multilingual method for event ordering based on temporal expression resolution is presented. This method has been implemented through the TERSEO system which consists of three main units: temporal expression recognizing, resolution of the coreference introduced by these expressions, and event ordering. By means of this system, chronological information related to events can be extracted from documental databases. This information is automatically added to the documental database in order to allow its use by question answering systems in those cases referring to temporality. The system has been evaluated obtaining results of 91 % precision and 71 % recall. For this, a blind evaluation process has been developed guaranteing a reliable annotation process that was measured through the kappa factor.
Resumo:
Este artículo presenta un nuevo algoritmo de fusión de clasificadores a partir de su matriz de confusión de la que se extraen los valores de precisión (precision) y cobertura (recall) de cada uno de ellos. Los únicos datos requeridos para poder aplicar este nuevo método de fusión son las clases o etiquetas asignadas por cada uno de los sistemas y las clases de referencia en la parte de desarrollo de la base de datos. Se describe el algoritmo propuesto y se recogen los resultados obtenidos en la combinación de las salidas de dos sistemas participantes en la campaña de evaluación de segmentación de audio Albayzin 2012. Se ha comprobado la robustez del algoritmo, obteniendo una reducción relativa del error de segmentación del 6.28% utilizando para realizar la fusión el sistema con menor y mayor tasa de error de los presentados a la evaluación.
Resumo:
Paper submitted to the XVIII Conference on Design of Circuits and Integrated Systems (DCIS), Ciudad Real, España, 2003.
Resumo:
Paper submitted to 10th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Sharjah, Emiratos Árabes, 2003.
Resumo:
Context. The Gaia-ESO Survey (GES) is a large public spectroscopic survey at the European Southern Observatory Very Large Telescope. Aims. A key aim is to provide precise radial velocities (RVs) and projected equatorial velocities (vsini) for representative samples of Galactic stars, which will complement information obtained by the Gaia astrometry satellite. Methods. We present an analysis to empirically quantify the size and distribution of uncertainties in RV and vsini using spectra from repeated exposures of the same stars. Results. We show that the uncertainties vary as simple scaling functions of signal-to-noise ratio (S/N) and vsini, that the uncertainties become larger with increasing photospheric temperature, but that the dependence on stellar gravity, metallicity and age is weak. The underlying uncertainty distributions have extended tails that are better represented by Student’s t-distributions than by normal distributions. Conclusions. Parametrised results are provided, which enable estimates of the RV precision for almost all GES measurements, and estimates of the vsini precision for stars in young clusters, as a function of S/N, vsini and stellar temperature. The precision of individual high S/N GES RV measurements is 0.22–0.26 km s-1, dependent on instrumental configuration.
Resumo:
A single and very easy to use Graphical User Interface (GUI- MATLAB) based on the topological information contained in the Gibbs energy of mixing function has been developed as a friendly tool to check the coherence of NRTL parameters obtained in a correlation data procedure. Thus, the analysis of the GM/RT surface, the GM/RT for the binaries and the GM/RT in planes containing the tie lines should be necessary to validate the obtained parameters for the different models for correlating phase equlibrium data.
Resumo:
The remediation of paracetamol (PA), an emerging contaminant frequently found in wastewater treatment plants, has been studied in the low concentration range (0.3–10 mg L−1) using as adsorbent a biomass-derived activated carbon. PA uptake of up to 100 mg g−1 over the activated carbon has been obtained, with the adsorption isotherms being fairly explained by the Langmuir model. The application of Reichemberg and the Vermeulen equations to the batch kinetics experiments allowed estimating homogeneous and heterogeneous diffusion coefficients, reflecting the dependence of diffusion with the surface coverage of PA. A series of rapid small-scale column tests were carried out to determine the breakthrough curves under different operational conditions (temperature, PA concentration, flow rate, bed length). The suitability of the proposed adsorbent for the remediation of PA in fixed-bed adsorption was proven by the high PA adsorption capacity along with the fast adsorption and the reduced height of the mass transfer zone of the columns. We have demonstrated that, thanks to the use of the heterogeneous diffusion coefficient, the proposed mathematical approach for the numerical solution to the mass balance of the column provides a reliable description of the breakthrough profiles and the design parameters, being much more accurate than models based in the classical linear driving force.