4 results for THRESHOLD SELECTION METHOD
at Universidad de Alicante
Abstract:
In this paper, we propose a novel filter for feature selection. The filter relies on estimating the mutual information between features and classes. We bypass the estimation of the probability density function by using the entropic-graph approximation of Rényi entropy and, from it, an approximation of Shannon entropy. The complexity of this procedure depends on the number of patterns/samples rather than on the number of dimensions, so the curse of dimensionality is circumvented. We show that the method can then outperform a greedy algorithm based on the maximal-relevance, minimal-redundancy criterion. We successfully test our method on both image classification and microarray data classification.
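For reference, the quantities named in this abstract can be written as follows. These are the standard textbook definitions, not expressions taken from the paper; the symbols S (a feature subset), C (the class variable), f (the density of the selected features) and α (the Rényi order) are notation assumed here.

```latex
\begin{align}
I(S;C) &= H(S) - H(S \mid C) = H(S) - \sum_{c} p(c)\, H(S \mid C = c) \\
H_\alpha(f) &= \frac{1}{1-\alpha} \ln \int f^{\alpha}(x)\, dx, \qquad \alpha > 0,\ \alpha \neq 1 \\
\lim_{\alpha \to 1} H_\alpha(f) &= H(f) = -\int f(x) \ln f(x)\, dx
\end{align}
```

Under this reading, the mutual information reduces to entropy estimates on the full sample and on each class, which is where an entropy estimator that avoids density estimation would plug in.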
Abstract:
Feature selection is an important and active research topic in clustering and classification. Choosing an adequate feature subset reduces the dimensionality of the dataset, which lowers the computational cost of classification and improves classifier performance by discarding redundant or irrelevant features. Although feature selection can be formally defined as an optimisation problem with a single objective, namely the classification accuracy obtained with the selected feature subset, several multi-objective approaches have been proposed in recent years. These either select features that improve not only the classification accuracy but also the generalisation capability, in the case of supervised classifiers, or counterbalance the bias toward lower or higher numbers of features exhibited by some methods used to validate the clustering/classification, in the case of unsupervised classifiers. The main contribution of this paper is a multi-objective approach to feature selection and its application to an unsupervised clustering procedure based on Growing Hierarchical Self-Organising Maps (GHSOMs), which includes a new method for unit labelling and efficient determination of the winning unit. In the network anomaly detection problem considered here, this multi-objective approach makes it possible to differentiate not only between normal and anomalous traffic but also among different anomalies. The efficiency of our proposals has been evaluated using the well-known DARPA/NSL-KDD datasets, which contain extracted features and labelled attacks from around 2 million connections. The selected feature sets computed in our experiments provide detection rates of up to 99.8% for normal traffic and up to 99.6% for anomalous traffic, as well as accuracy values of up to 99.12%.
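The idea of multi-objective feature selection can be illustrated with a minimal Pareto-front sketch. The abstract does not state which objectives the paper optimises, so the two used here (a quality score to maximise and the negated number of features) are assumptions, and the feature names and scores are toy values, not results from the paper.

```python
from typing import List, Sequence, Tuple

# A candidate is (feature subset, objective vector); all objectives are maximised.
Candidate = Tuple[Tuple[str, ...], Tuple[float, ...]]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates whose objective vector is not dominated by any other candidate."""
    return [
        c for c in candidates
        if not any(dominates(other[1], c[1]) for other in candidates)
    ]


# Hypothetical subsets: objectives are (quality score, -number of features).
candidates: List[Candidate] = [
    (("f1", "f2"), (0.90, -2)),
    (("f1",), (0.85, -1)),
    (("f1", "f2", "f3"), (0.90, -3)),  # same score as (f1, f2) but more features
    (("f2",), (0.80, -1)),             # same size as (f1,) but lower score
]

front = pareto_front(candidates)
for subset, objectives in front:
    print(subset, objectives)
```

A multi-objective method returns a set of such non-dominated subsets rather than a single best one, leaving the final trade-off to the user.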
Abstract:
This paper presents the results of more than 200 finite element method (FEM) models used to calculate the settlement of a foundation resting on two soils of differing deformability. The analysis considers parameters such as the foundation geometry, the percentage of each soil in contact with the foundation base, and the ratio of the soils’ elastic moduli. It is concluded that the maximum settlement of the foundation, calculated by assuming that the foundation rests entirely on the more deformable soil, can be correlated with the settlement calculated by the FEM models through a correction coefficient named the “settlement reduction factor” (α). Consequently, a novel expression is proposed for calculating the real settlement of a foundation resting on two soils of different deformability, with maximum errors lower than 1.57%, as demonstrated by the statistical analysis carried out. A guide to applying the proposed method is also given in the paper. Finally, the proposed methodology has been validated using settlement data from an instrumented foundation, indicating that it is a simple, reliable and quick method for computing the maximum elastic settlement of a raft foundation, evaluating its suitability and optimising its selection process.
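One plausible reading of the correction described above is a simple scaling of the worst-case settlement. The multiplicative form and the symbols s, s_max, g, p and E_1/E_2 are assumptions made here for illustration; the abstract gives neither the expression for α nor its functional form.

```latex
s \approx \alpha \, s_{\max}, \qquad \alpha = \alpha\!\left(g,\ p,\ E_1/E_2\right)
```

Here s_max denotes the settlement computed as if the foundation rested entirely on the more deformable soil, g the foundation geometry, p the percentage of each soil in contact with the foundation base, and E_1/E_2 the ratio of the soils' elastic moduli.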
Abstract:
Knowledge of the distributional patterns of saproxylic beetles is essential for conservation biology, given the relevance of this fauna to the maintenance of ecological processes and the endangerment of species. The complex community of saproxylic beetles comprises different assemblages, each composed of species linked by the microhabitats they use. We evaluate how much the observed species distribution patterns differ depending on the assemblage analyzed, and to what extent these differences can affect conservation decisions. Beetles were sampled using hollow emergence traps and window traps in three protected areas of the Iberian Peninsula. Species richness, composition, and diversity turnover were analyzed for each sampling method and varied widely depending on the assemblage analyzed. Beta diversity among forests was clearly higher for the assemblage captured using window traps. This method collects flying insects from different tree microhabitats, and its captures are influenced by forest structure. Within forests, the assemblages captured by hollow emergence traps, which collect the fauna linked to tree hollows, showed the largest species turnover, as they are influenced by the characteristics of each cavity. Moreover, which forest showed the highest species richness depended strongly on the assemblage studied. This study demonstrates that differences in the studied assemblages (groups of species co-occurring in the same habitat) can lead to significant differences in the identified patterns of species distribution and diversity turnover. This must be taken into account when making conservation and management decisions.
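Species turnover between samples can be illustrated with a small presence/absence sketch. The abstract does not say which beta-diversity index the study used; Jaccard dissimilarity is assumed here as one common choice, and the site and species labels are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, Set


def jaccard_dissimilarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| for two species lists (0 = identical, 1 = no shared species)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1.0 - len(set_a & set_b) / len(union)


def mean_pairwise_turnover(samples: Dict[str, Set[str]]) -> float:
    """Average the Jaccard dissimilarity over all pairs of samples."""
    pairs = list(combinations(samples.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard_dissimilarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical presence data for three sampling units (e.g. traps or forests).
samples = {
    "site_A": {"sp1", "sp2", "sp3"},
    "site_B": {"sp2", "sp3", "sp4"},
    "site_C": {"sp5"},
}

print(jaccard_dissimilarity(samples["site_A"], samples["site_B"]))
print(mean_pairwise_turnover(samples))
```

Running the same calculation separately on the species caught by each trap type would show how the measured turnover depends on the assemblage sampled.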