973 results for self-organising maps
Abstract:
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) have been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusters the input EEMs and reduces their dimensionality without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
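The idea of correlating SOM component planes can be illustrated with a minimal sketch. This is not the authors' implementation: the SOM below is a plain online Kohonen map written in NumPy, the function names (train_som, component_plane_correlation), the grid size, the learning schedule and the synthetic four-variable data are all assumptions made for illustration. A component plane is taken here to be one input variable's weight across all map units, so correlating planes amounts to correlating the columns of the weight matrix.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM online; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, data.shape[1]))
    # Grid coordinates of every map unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5     # shrinking neighbourhood radius
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))    # Gaussian neighbourhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def component_plane_correlation(weights):
    """Pearson correlation between component planes (columns of the weight matrix)."""
    return np.corrcoef(weights, rowvar=False)

# Synthetic stand-in for unfolded EEM data (assumption): four variables driven
# by two hidden components, all scaled to the interval [0, 1).
rng = np.random.default_rng(42)
latent = rng.random((200, 2))
data = np.column_stack([
    latent[:, 0],
    0.9 * latent[:, 0] + 0.1 * rng.random(200),
    latent[:, 1],
    0.9 * latent[:, 1] + 0.1 * rng.random(200),
])

som_weights = train_som(data)
corr = component_plane_correlation(som_weights)
print(np.round(corr, 2))
```

Groups of variables whose planes correlate strongly could then be read as candidate components; how the groups are delimited (for example, by a correlation threshold) is a design choice not specified in the abstract.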
Abstract:
Solar-powered vehicle activated signs (VAS) are speed warning signs powered by batteries that are recharged by solar panels. These signs are more desirable than other active warning signs due to the low cost of installation and the minimal maintenance requirements. However, one problem that can affect a solar-powered VAS is the limited power capacity available to keep the sign operational. In order to operate the sign more efficiently, it is proposed that the sign be appropriately triggered by taking into account the prevalent conditions. Triggering the sign depends on many factors such as the prevailing speed limit, road geometry, traffic behaviour, the weather and the number of hours of daylight. The main goal of this paper is therefore to develop an intelligent algorithm that would help optimize the trigger point to achieve the best compromise between speed reduction and power consumption. Vehicle speed data were systematically collected whilst varying the value of the trigger speed threshold. A two-stage algorithm is then used to extract the trigger speed value. In the first stage, the algorithm employs a Self-Organising Map (SOM) to visualize and explore the properties of the data, which are then clustered in the second stage using the K-means clustering method. Preliminary results indicate that using a SOM in conjunction with the K-means method performs well compared with direct clustering of the data by K-means alone. Using a SOM in the current case helped the algorithm determine the number of clusters in the data set, which is a frequent problem in data clustering.
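The two-stage scheme described above (a SOM first, then K-means applied to the map) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's algorithm: the SOM and K-means are minimal NumPy versions, the function names, the map size and the synthetic two-dimensional data are hypothetical, and the number of clusters k is simply fixed at 3 rather than being inferred from the map as the abstract suggests.

```python
import numpy as np

def train_som(data, rows=5, cols=5, epochs=15, lr0=0.5, sigma0=2.0, seed=0):
    """Stage 1: fit a small SOM; returns prototype vectors, one per map unit."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows * cols, data.shape[1]))
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.5
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def kmeans(points, k, n_iter=50, seed=0):
    """Stage 2: plain Lloyd's K-means; returns (centres, labels)."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):               # leave an empty cluster's centre unchanged
                centres[j] = members.mean(axis=0)
    d = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return centres, d.argmin(axis=1)

# Synthetic stand-in for normalised measurements (assumption): three groups in [0, 1]^2.
rng = np.random.default_rng(1)
true_centres = np.array([[0.2, 0.2], [0.5, 0.8], [0.8, 0.3]])
data = np.vstack([c + 0.05 * rng.standard_normal((60, 2)) for c in true_centres])
data = np.clip(data, 0.0, 1.0)

prototypes = train_som(data)                       # stage 1: SOM prototypes
centres, proto_labels = kmeans(prototypes, k=3)    # stage 2: cluster the prototypes

# Each sample inherits the cluster of its best-matching SOM unit.
bmus = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
sample_labels = proto_labels[bmus]
print(np.bincount(sample_labels, minlength=3))
```

Clustering the prototypes instead of the raw samples means K-means works on a smaller, smoothed set of points, which is one commonly cited motivation for this kind of two-stage design.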
Abstract:
Feature selection is an important and active issue in clustering and classification problems. Choosing an adequate feature subset reduces the dimensionality of the dataset, which decreases the computational complexity of classification and improves classifier performance by avoiding redundant or irrelevant features. Although feature selection can be formally defined as an optimisation problem with only one objective, that is, the classification accuracy obtained by using the selected feature subset, in recent years some multi-objective approaches to this problem have been proposed. These either select features that improve not only the classification accuracy but also the generalisation capability in the case of supervised classifiers, or counterbalance the bias toward lower or higher numbers of features exhibited by some methods used to validate the clustering/classification in the case of unsupervised classifiers. The main contribution of this paper is a multi-objective approach for feature selection and its application to an unsupervised clustering procedure based on Growing Hierarchical Self-Organising Maps (GHSOMs) that includes a new method for unit labelling and efficient determination of the winning unit. In the network anomaly detection problem considered here, this multi-objective approach makes it possible not only to differentiate between normal and anomalous traffic but also among different anomalies. The efficiency of our proposals has been evaluated by using the well-known DARPA/NSL-KDD datasets that contain extracted features and labelled attacks from around 2 million connections. The selected feature sets computed in our experiments provide detection rates up to 99.8% with normal traffic and up to 99.6% with anomalous traffic, as well as accuracy values up to 99.12%.
Abstract:
Growing models have been widely used for clustering or topology learning. Traditionally these models work in stationary environments, grow incrementally and adapt their nodes to a given distribution based on global parameters. In this paper, we present an enhanced unsupervised self-organising network for the modelling of visual objects. We first develop a framework for building non-rigid shapes using the growth mechanism of self-organising maps, and then we define an optimal number of nodes, without overfitting or underfitting the network, based on knowledge obtained from information-theoretic considerations. We present experimental results for hands and we quantitatively evaluate the matching capabilities of the proposed method with the topographic product.
Abstract:
The Asteraceae, one of the largest families among angiosperms, is chemically characterised by the production of sesquiterpene lactones (SLs). A total of 1,111 SLs, which were extracted from 658 species, 161 genera, 63 subtribes and 15 tribes of Asteraceae, were represented and registered in two dimensions in SISTEMATX, an in-house software system, and were associated with their botanical sources. Eleven blocks of descriptors (Constitutional, Functional groups, BCUT, Atom-centred, 2D autocorrelations, Topological, Geometrical, RDF, 3D-MoRSE, GETAWAY and WHIM) were used as input data to separate the botanical occurrences through self-organising maps. Maps that were generated with each descriptor divided the Asteraceae tribes, with total index values between 66.7% and 83.6%. The analysis of the results shows evident similarities among the Heliantheae, Helenieae and Eupatorieae tribes as well as between the Anthemideae and Inuleae tribes. Those observations are in agreement with systematic classifications that were proposed by Bremer, which use mainly morphological and molecular data; the chemical markers therefore partially corroborate these classifications. The results demonstrate that the atom-centred and RDF descriptors can be used as a tool for taxonomic classification at low hierarchical levels, such as tribes. Descriptors obtained through fragments or by the two-dimensional representation of the SL structures were sufficient to obtain significant results, and better results were not achieved by using descriptors derived from three-dimensional representations of SLs. Such models based on physico-chemical properties can project newly designed SLs, similar structures from the literature or even unreported structures in two-dimensional chemical space. Therefore, the generated SOMs can predict the most probable tribe where a biologically active molecule can be found according to Bremer's classification.
Abstract:
Variations on the standard Kohonen feature map can enable an ordering of the map state space by using only a limited subset of the complete input vector. It is also possible to employ merely a local adaptation procedure to order the map, rather than having to rely on global variables and objectives. Such variations have been included as part of a hybrid learning system (HLS) which has arisen out of a genetic-based classifier system. In the paper, a description is given of the modified feature map, which constitutes the HLS's long-term memory, and results for the control of a simple maze-running task are presented, thereby demonstrating the value of goal-related feedback within the overall network.
Abstract:
Self-organizing neural networks have been implemented in a wide range of application areas such as speech processing, image processing, optimization and robotics. Recent variations to the basic model proposed by the authors enable it to order state space using a subset of the input vector and to apply a local adaptation procedure that does not rely on a predefined test duration limit. Both these variations have been incorporated into a new feature map architecture that forms an integral part of a Hybrid Learning System (HLS) based on a genetic-based classifier system. Problems are represented within HLS as objects characterized by environmental features. Objects controlled by the system have preset targets set against a subset of their features. The system's objective is to achieve these targets by evolving a behavioural repertoire that efficiently explores and exploits the problem environment. Feature maps encode two types of knowledge within HLS: long-term memory traces of useful regularities within the environment, and the classifier performance data calibrated against an object's feature states and targets. Self-organization of these networks constitutes non-genetic-based (experience-driven) learning within HLS. This paper presents a description of the HLS architecture and an analysis of the modified feature map implementing associative memory. Initial results are presented that demonstrate the behaviour of the system on a simple control task.
Abstract:
LEÃO, Adriano de Castro; DÓRIA NETO, Adrião Duarte; SOUSA, Maria Bernardete Cordeiro de. New developmental stages for common marmosets (Callithrix jacchus) using mass and age variables obtained by K-means algorithm and self-organizing maps (SOM). Computers in Biology and Medicine, v. 39, p. 853-859, 2009
Abstract:
A chemotaxonomic analysis is described of a database containing various types of compounds from the Heliantheae tribe (Asteraceae) using Self-Organizing Maps (SOM). The numbers of occurrences of 9 chemical classes in different taxa of the tribe were used as variables. The study shows that SOM applied to chemical data can help differentiate genera, subtribes, and groups of subtribes (subtribe branches), and can contribute to tribal and subtribal classifications of Heliantheae, exhibiting a high hit percentage comparable to expert performance and in agreement with the previous tribe classification proposed by Stuessy.