3 results for classification aided by clustering
at Universidad de Alicante
Abstract:
Feature selection is an important and active issue in clustering and classification problems. Choosing an adequate feature subset reduces the dimensionality of the dataset, which lowers the computational complexity of classification and improves classifier performance by avoiding redundant or irrelevant features. Although feature selection can be formally defined as an optimisation problem with only one objective, that is, the classification accuracy obtained by using the selected feature subset, some multi-objective approaches to this problem have been proposed in recent years. These approaches either select features that improve not only the classification accuracy but also the generalisation capability, in the case of supervised classifiers, or counterbalance the bias toward lower or higher numbers of features exhibited by some of the methods used to validate the clustering/classification, in the case of unsupervised classifiers. The main contribution of this paper is a multi-objective approach for feature selection and its application to an unsupervised clustering procedure based on Growing Hierarchical Self-Organising Maps (GHSOMs) that includes a new method for unit labelling and efficient determination of the winning unit. In the network anomaly detection problem considered here, this multi-objective approach makes it possible to differentiate not only between normal and anomalous traffic but also among different anomalies. The efficiency of our proposals has been evaluated using the well-known DARPA/NSL-KDD datasets, which contain extracted features and labelled attacks from around 2 million connections. The selected feature sets computed in our experiments provide detection rates of up to 99.8% for normal traffic and up to 99.6% for anomalous traffic, as well as accuracy values of up to 99.12%.
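The multi-objective idea in this abstract can be illustrated with a minimal sketch of Pareto filtering over candidate feature subsets. This is not the paper's algorithm: the two objectives used here (maximise a quality score, minimise the number of features), the names `dominates` and `pareto_front`, and the candidate values are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Tuple

# A candidate is (feature subset, quality score, number of features).
# ASSUMPTION: the paper's actual objectives are not given in the abstract;
# a score to maximise and a feature count to minimise are stand-ins.
Candidate = Tuple[FrozenSet[int], float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidates; the scores are made up, not results from the paper.
candidates: List[Candidate] = [
    (frozenset({0, 1}), 0.90, 2),
    (frozenset({0, 1, 2}), 0.90, 3),     # same score, more features
    (frozenset({0}), 0.80, 1),
    (frozenset({0, 1, 2, 3}), 0.95, 4),
    (frozenset({1, 2}), 0.70, 2),        # worse score, same feature count
]

front = pareto_front(candidates)
```

With these made-up values, the subsets `{0, 1, 2}` and `{1, 2}` are dropped because another candidate is at least as good in both objectives. The remaining subsets are the trade-offs a multi-objective search would present to the user.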
Abstract:
Fluctuations of trace gas activity as a response to variations in weather and microclimate conditions were monitored over a year in a shallow volcanic cave (Painted Cave, Galdar, Canary Islands, Spain). 222Rn concentration was used due to its greater sensitivity to hygrothermal variations than CO2 concentration. Radon concentration in the cave increases as effective vapour condensation within the porous system of the rock surfaces inside the cave increases due to humidity levels of more than 70%. Condensed water content in pores was assessed and linked to a reduction in the direct passage of trace gases. Fluctuations in radon activity as a response to variations in weather and microclimate conditions were statistically identified by clustering entropy changes on the radon signal and parameterised to predict radon concentration anomalies. This raises important implications for other research fields, including the surveillance of shallow volcanic and seismic activity, preventive conservation of cultural heritage in indoor spaces, indoor air quality control and studies to improve understanding of the role of subterranean terrestrial ecosystems as reservoirs and/or temporary sources of trace gases.
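One way to picture "entropy changes on the radon signal" is a sliding-window Shannon entropy, sketched below. The abstract does not describe the authors' procedure, so the windowing scheme, the fixed histogram binning, the function names and the toy signal are all assumptions rather than the study's method.

```python
import math
from collections import Counter
from typing import List, Sequence


def shannon_entropy(values: Sequence[float], bins: int, lo: float, hi: float) -> float:
    """Shannon entropy (in bits) of values histogrammed into equal-width bins
    over the range [lo, hi]."""
    width = (hi - lo) / bins
    # Clamp the top edge so that v == hi falls into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_entropy(signal: Sequence[float], window: int, bins: int = 4) -> List[float]:
    """Entropy of each length-`window` slice, using bins fixed over the whole signal."""
    lo, hi = min(signal), max(signal)
    n_windows = len(signal) - window + 1
    if hi == lo:
        # A constant signal carries no variation, so every window has zero entropy.
        return [0.0] * n_windows
    return [
        shannon_entropy(signal[i:i + window], bins, lo, hi)
        for i in range(n_windows)
    ]


# Toy signal (NOT measured data): a flat stretch followed by a rise.
toy_signal = [0, 0, 0, 0, 0, 1, 2, 3]
entropies = windowed_entropy(toy_signal, window=4, bins=4)
```

In this toy case the entropy is zero over the flat stretch and rises as the window moves onto the changing part of the signal. Windows could then be grouped by entropy level with any clustering method to flag candidate anomalies.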
Abstract:
The aim of this study was to assess the way volleyball teams score with regard to: whether or not they won the game, whether they were the home or away team, the level of the opposing teams, and the type of confrontation. The sample was composed of 118,083 plays from 794 men’s volleyball matches and 125,751 plays from 719 women’s matches of Spain’s first division clubs (from the 2002-2003 season to the 2006-2007 season). The variables studied were: the way points were obtained in each play, being the home or away team, the level of the teams, the result of the match, and the type of confrontation between the teams with regard to their level. The results demonstrate that for both men’s and women’s teams, the majority of the points were obtained in attack and by opponent errors. Differences were found with regard to the way points were obtained when winning or losing the match was taken into account as well as when considering the level of the teams. This paper discusses the differences found with regard to whether the team is home or visiting and the type of confrontation.