896 resultados para Recurrent associative self-organizing map
Resumo:
Die zunehmende Vernetzung der Informations- und Kommunikationssysteme führt zu einer weiteren Erhöhung der Komplexität und damit auch zu einer weiteren Zunahme von Sicherheitslücken. Klassische Schutzmechanismen wie Firewall-Systeme und Anti-Malware-Lösungen bieten schon lange keinen Schutz mehr vor Eindringversuchen in IT-Infrastrukturen. Als ein sehr wirkungsvolles Instrument zum Schutz gegenüber Cyber-Attacken haben sich hierbei die Intrusion Detection Systeme (IDS) etabliert. Solche Systeme sammeln und analysieren Informationen von Netzwerkkomponenten und Rechnern, um ungewöhnliches Verhalten und Sicherheitsverletzungen automatisiert festzustellen. Während signatur-basierte Ansätze nur bereits bekannte Angriffsmuster detektieren können, sind anomalie-basierte IDS auch in der Lage, neue bisher unbekannte Angriffe (Zero-Day-Attacks) frühzeitig zu erkennen. Das Kernproblem von Intrusion Detection Systeme besteht jedoch in der optimalen Verarbeitung der gewaltigen Netzdaten und der Entwicklung eines in Echtzeit arbeitenden adaptiven Erkennungsmodells. Um diese Herausforderungen lösen zu können, stellt diese Dissertation ein Framework bereit, das aus zwei Hauptteilen besteht. Der erste Teil, OptiFilter genannt, verwendet ein dynamisches "Queuing Concept", um die zahlreich anfallenden Netzdaten weiter zu verarbeiten, baut fortlaufend Netzverbindungen auf, und exportiert strukturierte Input-Daten für das IDS. Den zweiten Teil stellt ein adaptiver Klassifikator dar, der ein Klassifikator-Modell basierend auf "Enhanced Growing Hierarchical Self Organizing Map" (EGHSOM), ein Modell für Netzwerk Normalzustand (NNB) und ein "Update Model" umfasst. In dem OptiFilter werden Tcpdump und SNMP traps benutzt, um die Netzwerkpakete und Hostereignisse fortlaufend zu aggregieren. Diese aggregierten Netzwerkpackete und Hostereignisse werden weiter analysiert und in Verbindungsvektoren umgewandelt. Zur Verbesserung der Erkennungsrate des adaptiven Klassifikators wird das künstliche neuronale Netz GHSOM intensiv untersucht und wesentlich weiterentwickelt. In dieser Dissertation werden unterschiedliche Ansätze vorgeschlagen und diskutiert. So wird eine classification-confidence margin threshold definiert, um die unbekannten bösartigen Verbindungen aufzudecken, die Stabilität der Wachstumstopologie durch neuartige Ansätze für die Initialisierung der Gewichtvektoren und durch die Stärkung der Winner Neuronen erhöht, und ein selbst-adaptives Verfahren eingeführt, um das Modell ständig aktualisieren zu können. Darüber hinaus besteht die Hauptaufgabe des NNB-Modells in der weiteren Untersuchung der erkannten unbekannten Verbindungen von der EGHSOM und der Überprüfung, ob sie normal sind. Jedoch, ändern sich die Netzverkehrsdaten wegen des Concept drif Phänomens ständig, was in Echtzeit zur Erzeugung nicht stationärer Netzdaten führt. Dieses Phänomen wird von dem Update-Modell besser kontrolliert. Das EGHSOM-Modell kann die neuen Anomalien effektiv erkennen und das NNB-Model passt die Änderungen in Netzdaten optimal an. Bei den experimentellen Untersuchungen hat das Framework erfolgversprechende Ergebnisse gezeigt. Im ersten Experiment wurde das Framework in Offline-Betriebsmodus evaluiert. Der OptiFilter wurde mit offline-, synthetischen- und realistischen Daten ausgewertet. Der adaptive Klassifikator wurde mit dem 10-Fold Cross Validation Verfahren evaluiert, um dessen Genauigkeit abzuschätzen. Im zweiten Experiment wurde das Framework auf einer 1 bis 10 GB Netzwerkstrecke installiert und im Online-Betriebsmodus in Echtzeit ausgewertet. Der OptiFilter hat erfolgreich die gewaltige Menge von Netzdaten in die strukturierten Verbindungsvektoren umgewandelt und der adaptive Klassifikator hat sie präzise klassifiziert. Die Vergleichsstudie zwischen dem entwickelten Framework und anderen bekannten IDS-Ansätzen zeigt, dass der vorgeschlagene IDSFramework alle anderen Ansätze übertrifft. Dies lässt sich auf folgende Kernpunkte zurückführen: Bearbeitung der gesammelten Netzdaten, Erreichung der besten Performanz (wie die Gesamtgenauigkeit), Detektieren unbekannter Verbindungen und Entwicklung des in Echtzeit arbeitenden Erkennungsmodells von Eindringversuchen.
Resumo:
In dieser Arbeit wird ein Verfahren zum Einsatz neuronaler Netzwerke vorgestellt, das auf iterative Weise Klassifikation und Prognoseschritte mit dem Ziel kombiniert, bessere Ergebnisse der Prognose im Vergleich zu einer einmaligen hintereinander Ausführung dieser Schritte zu erreichen. Dieses Verfahren wird am Beispiel der Prognose der Windstromerzeugung abhängig von der Wettersituation erörtert. Eine Verbesserung wird in diesem Rahmen mit einzelnen Ausreißern erreicht. Verschiedene Aspekte werden in drei Kapiteln diskutiert: In Kapitel 1 werden die verwendeten Daten und ihre elektronische Verarbeitung vorgestellt. Die Daten bestehen zum einen aus Windleistungshochrechnungen für die Bundesrepublik Deutschland der Jahre 2011 und 2012, welche als Transparenzanforderung des Erneuerbaren Energiegesetzes durch die Übertragungsnetzbetreiber publiziert werden müssen. Zum anderen werden Wetterprognosen, die der Deutsche Wetterdienst im Rahmen der Grundversorgung kostenlos bereitstellt, verwendet. Kapitel 2 erläutert zwei aus der Literatur bekannte Verfahren - Online- und Batchalgorithmus - zum Training einer selbstorganisierenden Karte. Aus den dargelegten Verfahrenseigenschaften begründet sich die Wahl des Batchverfahrens für die in Kapitel 3 erläuterte Methode. Das in Kapitel 3 vorgestellte Verfahren hat im modellierten operativen Einsatz den gleichen Ablauf, wie eine Klassifikation mit anschließender klassenspezifischer Prognose. Bei dem Training des Verfahrens wird allerdings iterativ vorgegangen, indem im Anschluss an das Training der klassenspezifischen Prognose ermittelt wird, zu welcher Klasse der Klassifikation ein Eingabedatum gehören sollte, um mit den vorliegenden klassenspezifischen Prognosemodellen die höchste Prognosegüte zu erzielen. Die so gewonnene Einteilung der Eingaben kann genutzt werden, um wiederum eine neue Klassifikationsstufe zu trainieren, deren Klassen eine verbesserte klassenspezifisch Prognose ermöglichen.
Resumo:
Facilitating the visual exploration of scientific data has received increasing attention in the past decade or so. Especially in life science related application areas the amount of available data has grown at a breath taking pace. In this paper we describe an approach that allows for visual inspection of large collections of molecular compounds. In contrast to classical visualizations of such spaces we incorporate a specific focus of analysis, for example the outcome of a biological experiment such as high throughout screening results. The presented method uses this experimental data to select molecular fragments of the underlying molecules that have interesting properties and uses the resulting space to generate a two dimensional map based on a singular value decomposition algorithm and a self organizing map. Experiments on real datasets show that the resulting visual landscape groups molecules of similar chemical properties in densely connected regions.
Resumo:
Self-Organizing Map (SOM) algorithm has been extensively used for analysis and classification problems. For this kind of problems, datasets become more and more large and it is necessary to speed up the SOM learning. In this paper we present an application of the Simulated Annealing (SA) procedure to the SOM learning algorithm. The goal of the algorithm is to obtain fast learning and better performance in terms of matching of input data and regularity of the obtained map. An advantage of the proposed technique is that it preserves the simplicity of the basic algorithm. Several tests, carried out on different large datasets, demonstrate the effectiveness of the proposed algorithm in comparison with the original SOM and with some of its modification introduced to speed-up the learning.
Resumo:
We have discovered a novel approach of intrusion detection system using an intelligent data classifier based on a self organizing map (SOM). We have surveyed all other unsupervised intrusion detection methods, different alternative SOM based techniques and KDD winner IDS methods. This paper provides a robust designed and implemented intelligent data classifier technique based on a single large size (30x30) self organizing map (SOM) having the capability to detect all types of attacks given in the DARPA Archive 1999 the lowest false positive rate being 0.04 % and higher detection rate being 99.73% tested using full KDD data sets and 89.54% comparable detection rate and 0.18% lowest false positive rate tested using corrected data sets.
Resumo:
The identification and visualization of clusters formed by motor unit action potentials (MUAPs) is an essential step in investigations seeking to explain the control of the neuromuscular system. This work introduces the generative topographic mapping (GTM), a novel machine learning tool, for clustering of MUAPs, and also it extends the GTM technique to provide a way of visualizing MUAPs. The performance of GTM was compared to that of three other clustering methods: the self-organizing map (SOM), a Gaussian mixture model (GMM), and the neural-gas network (NGN). The results, based on the study of experimental MUAPs, showed that the rate of success of both GTM and SOM outperformed that of GMM and NGN, and also that GTM may in practice be used as a principled alternative to the SOM in the study of MUAPs. A visualization tool, which we called GTM grid, was devised for visualization of MUAPs lying in a high-dimensional space. The visualization provided by the GTM grid was compared to that obtained from principal component analysis (PCA). (c) 2005 Elsevier Ireland Ltd. All rights reserved.
Resumo:
The Self-Organizing Map (SOM) is a popular unsupervised neural network able to provide effective clustering and data visualization for data represented in multidimensional input spaces. In this paper, we describe Fast Learning SOM (FLSOM) which adopts a learning algorithm that improves the performance of the standard SOM with respect to the convergence time in the training phase. We show that FLSOM also improves the quality of the map by providing better clustering quality and topology preservation of multidimensional input data. Several tests have been carried out on different multidimensional datasets, which demonstrate better performances of the algorithm in comparison with the original SOM.
Resumo:
The Self-Organizing Map (SOM) is a popular unsupervised neural network able to provide effective clustering and data visualization for multidimensional input datasets. In this paper, we present an application of the simulated annealing procedure to the SOM learning algorithm with the aim to obtain a fast learning and better performances in terms of quantization error. The proposed learning algorithm is called Fast Learning Self-Organized Map, and it does not affect the easiness of the basic learning algorithm of the standard SOM. The proposed learning algorithm also improves the quality of resulting maps by providing better clustering quality and topology preservation of input multi-dimensional data. Several experiments are used to compare the proposed approach with the original algorithm and some of its modification and speed-up techniques.
Resumo:
Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support a data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.
Resumo:
In this paper artificial neural network (ANN) based on supervised and unsupervised algorithms were investigated for use in the study of rheological parameters of solid pharmaceutical excipients, in order to develop computational tools for manufacturing solid dosage forms. Among four supervised neural networks investigated, the best learning performance was achieved by a feedfoward multilayer perceptron whose architectures was composed by eight neurons in the input layer, sixteen neurons in the hidden layer and one neuron in the output layer. Learning and predictive performance relative to repose angle was poor while to Carr index and Hausner ratio (CI and HR, respectively) showed very good fitting capacity and learning, therefore HR and CI were considered suitable descriptors for the next stage of development of supervised ANNs. Clustering capacity was evaluated for five unsupervised strategies. Network based on purely unsupervised competitive strategies, classic "Winner-Take-All", "Frequency-Sensitive Competitive Learning" and "Rival-Penalize Competitive Learning" (WTA, FSCL and RPCL, respectively) were able to perform clustering from database, however this classification was very poor, showing severe classification errors by grouping data with conflicting properties into the same cluster or even the same neuron. On the other hand it could not be established what was the criteria adopted by the neural network for those clustering. Self-Organizing Maps (SOM) and Neural Gas (NG) networks showed better clustering capacity. Both have recognized the two major groupings of data corresponding to lactose (LAC) and cellulose (CEL). However, SOM showed some errors in classify data from minority excipients, magnesium stearate (EMG) , talc (TLC) and attapulgite (ATP). NG network in turn performed a very consistent classification of data and solve the misclassification of SOM, being the most appropriate network for classifying data of the study. The use of NG network in pharmaceutical technology was still unpublished. NG therefore has great potential for use in the development of software for use in automated classification systems of pharmaceutical powders and as a new tool for mining and clustering data in drug development
Resumo:
We propose a multi-resolution approach for surface reconstruction from clouds of unorganized points representing an object surface in 3D space. The proposed method uses a set of mesh operators and simple rules for selective mesh refinement, with a strategy based on Kohonen s self-organizing map. Basically, a self-adaptive scheme is used for iteratively moving vertices of an initial simple mesh in the direction of the set of points, ideally the object boundary. Successive refinement and motion of vertices are applied leading to a more detailed surface, in a multi-resolution, iterative scheme. Reconstruction was experimented with several point sets, induding different shapes and sizes. Results show generated meshes very dose to object final shapes. We include measures of performance and discuss robustness.
Resumo:
As concessionárias de energia, para garantir que sua rede seja confiável, necessitam realizar um procedimento para estudo e análise baseado em funções de entrega de energia nos pontos de consumo. Este estudo, geralmente chamado de planejamento de sistemas de distribuição de energia elétrica, é essencial para garantir que variações na demanda de energia não afetem o desempenho do sistema, que deverá se manter operando de maneira técnica e economicamente viável. Nestes estudos, geralmente são analisados, demanda, tipologia de curva de carga, fator de carga e outros aspectos das cargas existentes. Considerando então a importância da determinação das tipologias de curvas de cargas para as concessionárias de energia em seu processo de planejamento, a Companhia de Eletricidade do Amapá (CEA) realizou uma campanha de medidas de curvas de carga de transformadores de distribuição para obtenção das tipologias de curvas de carga que caracterizam seus consumidores. Neste trabalho apresentam-se os resultados satisfatórios obtidos a partir da utilização de Mineração de Dados baseada em Inteligência Computacional (Mapas Auto-Organizáveis de Kohonen) para seleção das curvas típicas e determinação das tipologias de curvas de carga de consumidores residenciais e industriais da cidade de Macapá, localizada no estado do Amapá. O mapa auto-organizável de Kohonen é um tipo de Rede Neural Artificial que combina operações de projeção e agrupamento, permitindo a realização de análise exploratória de dados, com o objetivo de produzir descrições sumarizadas de grandes conjuntos de dados.
Resumo:
Apesar das diversas vantagens oferecidas pelas redes neurais artificiais (RNAs), algumas limitações ainda impedem sua larga utilização, principalmente em aplicações que necessitem de tomada de decisões essenciais para garantir a segurança em ambientes como, por exemplo, em Sistemas de Energia. Uma das principais limitações das RNAs diz respeito à incapacidade que estas redes apresentam de explicar como chegam a determinadas decisões; explicação esta que seja humanamente compreensível. Desta forma, este trabalho propõe um método para extração de regras a partir do mapa auto-organizável de Kohonen, projetando um sistema de inferência difusa capaz de explicar as decisões/classificação obtidas através do mapa. A metodologia proposta é aplicada ao problema de diagnóstico de faltas incipientes em transformadores, em que se obtém um sistema classificatório eficiente e com capacidade de explicação em relação aos resultados obtidos, o que gera mais confiança aos especialistas da área na hora de tomar decisões.
Resumo:
BACKGROUND: Pneumococcal meningitis is associated with high mortality (approximately 30%) and morbidity. Up to 50% of survivors are affected by neurological sequelae due to a wide spectrum of brain injury mainly affecting the cortex and hippocampus. Despite this significant disease burden, the genetic program that regulates the host response leading to brain damage as a consequence of bacterial meningitis is largely unknown.We used an infant rat model of pneumococcal meningitis to assess gene expression profiles in cortex and hippocampus at 22 and 44 hours after infection and in controls at 22 h after mock-infection with saline. To analyze the biological significance of the data generated by Affymetrix DNA microarrays, a bioinformatics pipeline was used combining (i) a literature-profiling algorithm to cluster genes based on the vocabulary of abstracts indexed in MEDLINE (NCBI) and (ii) the self-organizing map (SOM), a clustering technique based on covariance in gene expression kinetics. RESULTS: Among 598 genes differentially regulated (change factor > or = 1.5; p < or = 0.05), 77% were automatically assigned to one of 11 functional groups with 94% accuracy. SOM disclosed six patterns of expression kinetics. Genes associated with growth control/neuroplasticity, signal transduction, cell death/survival, cytoskeleton, and immunity were generally upregulated. In contrast, genes related to neurotransmission and lipid metabolism were transiently downregulated on the whole. The majority of the genes associated with ionic homeostasis, neurotransmission, signal transduction and lipid metabolism were differentially regulated specifically in the hippocampus. Of the cell death/survival genes found to be continuously upregulated only in hippocampus, the majority are pro-apoptotic, while those continuously upregulated only in cortex are anti-apoptotic. CONCLUSION: Temporal and spatial analysis of gene expression in experimental pneumococcal meningitis identified potential targets for therapy.
Resumo:
Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions, and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters, and assign the pixels to these clusters using the pheromone gradient. We applied ASCA to detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters needs not to be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.