10 results for classification and equivalence classes

at Universidad de Alicante


Abstract:

In this paper we explore the use of semantic classes in an existing information retrieval system in order to improve its results. We use two different ontologies of semantic classes (WordNet domain and Basic Level Concepts) to re-rank the retrieved documents and obtain better recall and precision. Finally, we implement a new method for weighting the expanded terms that takes into account the weights of the original query terms and their relations in WordNet with respect to the new ones (which have been shown to improve the results). The evaluation of these approaches was carried out in the CLEF Robust-WSD Task, obtaining an improvement of 1.8% in GMAP for the semantic classes approach and 10% in MAP for the WordNet term weighting approach.
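The two metrics reported above, MAP and GMAP, differ only in how per-query average precision is aggregated: MAP takes the arithmetic mean, while GMAP takes the geometric mean and therefore rewards gains on poorly performing queries. The following is a minimal sketch, assuming binary relevance judgments and the common convention of flooring each average precision at a small epsilon before taking logarithms. The function names and the epsilon value are illustrative and are not taken from the paper.

```python
import math


def average_precision(ranked_relevance, num_relevant):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags, one per retrieved document, in rank order.
    num_relevant: total number of relevant documents for the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant document
    return precision_sum / num_relevant if num_relevant else 0.0


def mean_average_precision(ap_values):
    """Arithmetic mean of per-query average precision (MAP)."""
    return sum(ap_values) / len(ap_values)


def geometric_mean_average_precision(ap_values, epsilon=1e-5):
    """Geometric mean of per-query average precision (GMAP).

    Each value is floored at epsilon (an assumed convention) so that a
    query with zero average precision does not make the logarithm undefined.
    """
    log_sum = sum(math.log(max(ap, epsilon)) for ap in ap_values)
    return math.exp(log_sum / len(ap_values))
```

Because the geometric mean is never larger than the arithmetic mean, GMAP is always at most MAP for the same set of queries, and a single very poor query pulls GMAP down much more than MAP.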

Abstract:

The aim of this study was to assess the way volleyball teams score with regard to: whether or not they won the game, whether they were the home or away team, the level of the opposing teams, and the type of confrontation. The sample was composed of 118,083 plays from 794 men’s volleyball matches and 125,751 plays from 719 women’s matches of Spain’s first division clubs (from the 2002-2003 season to the 2006-2007 season). The variables studied were: the way points were obtained in each play, being the home or away team, the level of the teams, the result of the match, and the type of confrontation between the teams with regard to their level. The results demonstrate that for both men’s and women’s teams, the majority of the points were obtained in attack and by opponent errors. Differences were found with regard to the way points were obtained when winning or losing the match was taken into account as well as when considering the level of the teams. This paper discusses the differences found with regard to whether the team is home or visiting and the type of confrontation.

Abstract:

Background: The harmonization of European health systems brings with it a need for tools to allow the standardized collection of information about medical care. A common coding system and standards for the description of services are needed to allow local data to be incorporated into evidence-informed policy, and to permit equity and mobility to be assessed. The aim of this project has been to design such a classification and a related tool for the coding of services for Long Term Care (DESDE-LTC), based on the European Service Mapping Schedule (ESMS). Methods: The development of DESDE-LTC followed an iterative process using nominal groups in 6 European countries. 54 researchers and stakeholders in health and social services contributed to this process. In order to classify services, we use the minimal organization unit or “Basic Stable Input of Care” (BSIC), coded by its principal function or “Main Type of Care” (MTC). The evaluation of the tool included an analysis of feasibility, consistency, ontology, inter-rater reliability, Boolean Factor Analysis, and a preliminary impact analysis (screening, scoping and appraisal). Results: DESDE-LTC includes an alpha-numerical coding system, a glossary and an assessment instrument for mapping and counting LTC. It shows high feasibility, consistency, inter-rater reliability and face, content and construct validity. DESDE-LTC is ontologically consistent. It is regarded by experts as useful and relevant for evidence-informed decision making. Conclusion: DESDE-LTC contributes to establishing a common terminology, taxonomy and coding of LTC services in a European context, and a standard procedure for data collection and international comparison.

Abstract:

The exponential increase of subjective, user-generated content since the birth of the Social Web has created the need for automatic text processing systems able to extract, process and present relevant knowledge. In this paper, we tackle the Opinion Retrieval, Mining and Summarization task by proposing a unified framework composed of three crucial components (information retrieval, opinion mining and text summarization) that allow the retrieval, classification and summarization of subjective information. An extensive analysis is conducted, in which different configurations of the framework are suggested and analyzed in order to determine which is the best one, and under which conditions. The evaluation carried out and the results obtained show the appropriateness of the individual components, as well as of the framework as a whole. By achieving an improvement of over 10% compared to state-of-the-art approaches in the context of blogs, we can conclude that subjective text can be dealt with efficiently by means of our proposed framework.

Abstract:

In this paper, we propose a novel filter for feature selection. This filter relies on the estimation of the mutual information between features and classes. We bypass the estimation of the probability density function with the aid of the entropic-graphs approximation of Rényi entropy, and the subsequent approximation of the Shannon entropy. The complexity of this bypass does not depend on the number of dimensions but on the number of patterns/samples, and thus the curse of dimensionality is circumvented. We show that it is then possible to outperform a greedy algorithm based on the maximal relevance and minimal redundancy criterion. We successfully test our method in the contexts of both image classification and microarray data classification.
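To make the idea of a mutual-information filter concrete, the sketch below ranks discrete features by their estimated mutual information with the class labels. It is only an illustration of the general filter principle: it uses a simple plug-in (frequency count) estimate on individual features, not the entropic-graph estimator of Rényi entropy described in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(rows, labels):
    """Score each feature column by MI with the labels.

    rows: list of equal-length tuples of discrete feature values.
    labels: list of class labels, one per row.
    Returns (feature indices sorted by decreasing score, list of scores).
    """
    num_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(num_features)
    ]
    order = sorted(range(num_features), key=lambda j: scores[j], reverse=True)
    return order, scores
```

A feature that determines the class of a balanced binary problem scores log 2 (about 0.693 nats), while a feature that is independent of the class scores 0. Count-based estimates like this one degrade quickly when features are continuous or evaluated jointly in high dimension, which is the problem the abstract's density-free estimator is meant to address.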

Abstract:

This article presents a new algorithm for fusing classifiers based on their confusion matrices, from which the precision and recall values of each classifier are extracted. The only data required to apply this new fusion method are the classes or labels assigned by each system and the reference classes in the development part of the database. The proposed algorithm is described, together with the results obtained when combining the outputs of two systems that took part in the Albayzin 2012 audio segmentation evaluation campaign. The robustness of the algorithm was verified: a relative reduction of 6.28% in segmentation error was obtained when fusing the systems with the lowest and the highest error rates among those submitted to the evaluation.
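As an illustration of how per-class precision and recall derived from development-set labels can drive a fusion rule, the sketch below weights each system's vote by the precision it achieved for the label it outputs. This is an assumed, deliberately simple rule and not the algorithm proposed in the article, whose details are not given in the abstract. All function names are hypothetical.

```python
from collections import defaultdict


def per_class_precision_recall(predicted, reference):
    """Per-class precision and recall from paired development-set labels."""
    true_pos = defaultdict(int)
    pred_count = defaultdict(int)
    ref_count = defaultdict(int)
    for p, r in zip(predicted, reference):
        pred_count[p] += 1
        ref_count[r] += 1
        if p == r:
            true_pos[p] += 1
    classes = set(pred_count) | set(ref_count)
    precision = {
        c: true_pos[c] / pred_count[c] if pred_count[c] else 0.0 for c in classes
    }
    recall = {
        c: true_pos[c] / ref_count[c] if ref_count[c] else 0.0 for c in classes
    }
    return precision, recall


def fuse(predictions_per_system, precision_per_system):
    """Precision-weighted vote (illustrative rule, not the article's method).

    predictions_per_system: one label sequence per system, all the same length.
    precision_per_system: one {label: precision} dict per system.
    """
    fused = []
    for labels in zip(*predictions_per_system):
        votes = defaultdict(float)
        for label, precision in zip(labels, precision_per_system):
            # A system's vote counts as much as its dev-set precision for that label.
            votes[label] += precision.get(label, 0.0)
        fused.append(max(votes, key=votes.get))
    return fused
```

With this rule, a system that rarely outputs a label but is almost always right when it does can override a system that outputs the same label often but unreliably. Ties are resolved in favour of the label voted for first, which is an arbitrary choice.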

Abstract:

Context. Four clusters of red supergiants have been discovered in a region of the Milky Way close to the base of the Scutum-Crux Arm and the tip of the Long Bar. Population synthesis models indicate that they must be very massive to harbour so many supergiants. If the clusters are physically connected, this Scutum Complex would be the largest and most massive star-forming region ever identified in the Milky Way. Aims. The spatial extent of one of these clusters, RSGC3, has not been investigated. In this paper we explore the possibility that a population of red supergiants could be located in its vicinity. Methods. We utilised 2MASS JHKs photometry to identify candidate obscured luminous red stars in the vicinity of RSGC3. We observed a sample of candidates with the TWIN spectrograph on the 3.5-m telescope at Calar Alto, obtaining intermediate-resolution spectroscopy in the 8000−9000 Å range. We re-evaluated a number of classification criteria proposed in the literature for this spectral range and found that we could use our spectra to derive spectral types and luminosity classes. Results. We measured the radial velocity of five members of RSGC3, finding velocities similar to the average for members of Stephenson 2. Among the candidates observed outside the cluster, our spectra revealed eight M-type supergiants at distances <18′ from the centre of RSGC3, distributed in two clumps. The southern clump is most likely another cluster of red supergiants, with reddening and age identical to RSGC3. From 2MASS photometry, we identified four likely supergiant members of the cluster in addition to the five spectroscopically observed. The northern clump may be a small cluster with similar parameters. Photometric analysis of the area around RSGC3 suggests the presence of a large (>30) population of red supergiants with similar colours. Conclusions. Our data suggest that the massive cluster RSGC3 is surrounded by an extended association, which may be very massive (≳ 10^5 M⊙). We also show that supergiants in the Scutum Complex may be characterised via a combination of 2MASS photometry and intermediate-to-high-resolution spectroscopy in the Z band.

Abstract:

In the chemical textile domain, experts have to analyse chemical components and substances that might be harmful when used in clothing and textiles. Part of this analysis is performed by searching for opinions and reports that people have expressed about these products on the Social Web. However, this type of information is less frequent on the Internet for this domain than for others, so its detection and classification is difficult and time-consuming. Consequently, problems associated with the use of chemical substances in textiles may not be detected early enough, and could lead to health problems such as allergies or burns. In this paper, we propose a framework able to detect, retrieve, and classify subjective sentences related to the chemical textile domain, which could be integrated into a wider health surveillance system. We also describe the creation of several datasets with opinions from this domain, the experiments performed using machine learning techniques and different lexical resources such as WordNet, and the evaluation, which focuses on sentiment classification and complaint detection (i.e., negativity). Despite the challenges involved in this domain, our approach obtains promising results, with an F-score of 65% for polarity classification and 82% for complaint detection.

Abstract:

Aims: To determine the prevalence of endometriosis in epithelial ovarian cancers (EOC) and the association among their histological subtypes and with endometrial carcinoma. Methods: An observational cohort study was performed in 192 patients operated on for EOC, 30 women with atypical endometriosis and 17 with p53-positive endometriosis. Data on associated endometriosis and endometrial carcinomas, histological subtypes, tumor stage, clinical and pathological characteristics and survival were analyzed. Results: Twenty cases of EOC (10.4%) also had endometriosis (12.7% in borderline and 9.3% in invasive cases), which was a synchronous finding in most cases. Endometriosis associated with serous or mucinous EOC was observed in 2.2% and 2.7% of cases, respectively. However, this association was observed in 50% of endometrioid and 23% of clear cell EOC. Age, parity and tumor stage were lower in endometriosis-associated EOC patients; all associated cases were type I (Kurman and Shih's classification) and showed better survival rates. Endometrial carcinoma was more frequently associated with endometrioid EOC (25%). Conclusions: There is a significant association between endometriosis, including atypical forms, and endometrioid and clear cell carcinomas, but not with other EOC histotypes. The presence of endometriosis in EOC suggests a better prognosis and an intermediate stage within the endometriosis-carcinoma progression.

Abstract:

In this article we describe a semantic localization dataset for indoor environments named ViDRILO. The dataset provides five sequences of frames acquired with a mobile robot in two similar office buildings under different lighting conditions. Each frame consists of a point cloud representation of the scene and a perspective image. The frames in the dataset are annotated with the semantic category of the scene and with the presence or absence of each of a list of predefined objects. In addition to the frames and annotations, the dataset is distributed with a set of tools for its use in both place classification and object recognition tasks. The large number of labeled frames, in conjunction with the annotation scheme, makes this dataset different from existing ones. The ViDRILO dataset is released for use as a benchmark for different problems such as multimodal place classification and object recognition, 3D reconstruction or point cloud data compression.