14 resultados para Web Mining, Data Mining, User Topic Model, Web User Profiles

em Universitätsbibliothek Kassel, Universität Kassel, Germany


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Bei Meßprojekten in denen Gebäude mit installierter Lüftungsanlage untersucht wurden, stellte man immer wieder ein breite Streuung der Meßwerte, als auch eine oftmals deutliche Abweichung vom vorher ermittelten Heizwärmebedarf der Gebäude fest. Es wird vermutet, daß diese Unterschiede systemspezifische Ursachen haben, ein Nachweis kann aufgrund der geringen Anzahl vorhandener Meßpunkte jedoch nicht geführt werden. Um die Sensitivität verschiedener Randbedingungen auf den Energieverbrauch zu ermitteln, wird im vorliegenden Forschungsprojekt ein Simulationsmodell erstellt. Das thermische Verhalten und die Durchströmung des Gebäudes werden durch ein gekoppeltes Modell abgebildet. Unterschiedliche Lüftungsanlagensysteme werden miteinander verglichen. Auf Basis vorhandener Meßdaten wird ein klimaabhängiges Modell zur Fensterlüftung entwickelt, welches in die Modellbildung der Gebäudedurchströmung mit einfließt. Feuchtegeregelte Abluftanlagen sind in der Lage den mittleren Luftwechsel auf ein hygienisch sinnvollen Wert zu begrenzen. Sie erweisen sich im Hinblick auf die Sensitivität verschiedener Randbedingungen als robuste Systeme. Trotz Einsatz von Lüftungsanlagen kann je nach Betriebszustand insbesondere bei Abluftanlagen keine ausreichende Luftqualität sichergestellt werden. Zukünftige Systeme dürfen das "Lüftungssystem" Fenster nicht vernachlässigen, sondern müssen es in das Gesamtkonzept mit einbeziehen.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

With this document, we provide a compilation of in-depth discussions on some of the most current security issues in distributed systems. The six contributions have been collected and presented at the 1st Kassel Student Workshop on Security in Distributed Systems (KaSWoSDS’08). We are pleased to present a collection of papers not only shedding light on the theoretical aspects of their topics, but also being accompanied with elaborate practical examples. In Chapter 1, Stephan Opfer discusses Viruses, one of the oldest threats to system security. For years there has been an arms race between virus producers and anti-virus software providers, with no end in sight. Stefan Triller demonstrates how malicious code can be injected in a target process using a buffer overflow in Chapter 2. Websites usually store their data and user information in data bases. Like buffer overflows, the possibilities of performing SQL injection attacks targeting such data bases are left open by unwary programmers. Stephan Scheuermann gives us a deeper insight into the mechanisms behind such attacks in Chapter 3. Cross-site scripting (XSS) is a method to insert malicious code into websites viewed by other users. Michael Blumenstein explains this issue in Chapter 4. Code can be injected in other websites via XSS attacks in order to spy out data of internet users, spoofing subsumes all methods that directly involve taking on a false identity. In Chapter 5, Till Amma shows us different ways how this can be done and how it is prevented. Last but not least, cryptographic methods are used to encode confidential data in a way that even if it got in the wrong hands, the culprits cannot decode it. Over the centuries, many different ciphers have been developed, applied, and finally broken. Ilhan Glogic sketches this history in Chapter 6.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Numerous studies have proven an effect of a probable climate change on the hydrosphere’s different subsystems. In the 21st century global and regional redistribution of water has to be expected and it is very likely that extreme weather phenomenon will occur more frequently. From a global view the flood situation will exacerbate. In contrast to these discoveries the classical approach of flood frequency analysis provides terms like “mean flood recurrence interval”. But for this analysis to be valid there is a need for the precondition of stationary distribution parameters which implies that the flood frequencies are constant in time. Newer approaches take into account extreme value distributions with time-dependent parameters. But the latter implies a discard of the mentioned old terminology that has been used up-to-date in engineering hydrology. On the regional scale climate change affects the hydrosphere in various ways. So, the question appears to be whether in central Europe the classical approach of flood frequency analysis is not usable anymore and whether the traditional terminology should be renewed. In the present case study hydro-meteorological time series of the Fulda catchment area (6930 km²), upstream of the gauging station Bonaforth, are analyzed for the time period 1960 to 2100. At first a distributed catchment area model (SWAT2005) is build up, calibrated and finally validated. The Edertal reservoir is regulated as well by a feedback control of the catchments output in case of low water. Due to this intricacy a special modeling strategy has been necessary: The study area is divided into three SWAT basin models and an additional physically-based reservoir model is developed. To further improve the streamflow predictions of the SWAT model, a correction by an artificial neural network (ANN) has been tested successfully which opens a new way to improve hydrological models. With this extension the calibration and validation of the SWAT model for the Fulda catchment area is improved significantly. After calibration of the model for the past 20th century observed streamflow, the SWAT model is driven by high resolution climate data of the regional model REMO using the IPCC scenarios A1B, A2, and B1, to generate future runoff time series for the 21th century for the various sub-basins in the study area. In a second step flood time series HQ(a) are derived from the 21st century runoff time series (scenarios A1B, A2, and B1). Then these flood projections are extensively tested with regard to stationarity, homogeneity and statistical independence. All these tests indicate that the SWAT-predicted 21st-century trends in the flood regime are not significant. Within the projected time the members of the flood time series are proven to be stationary and independent events. Hence, the classical stationary approach of flood frequency analysis can still be used within the Fulda catchment area, notwithstanding the fact that some regional climate change has been predicted using the IPCC scenarios. It should be noted, however, that the present results are not transferable to other catchment areas. Finally a new method is presented that enables the calculation of extreme flood statistics, even if the flood time series is non-stationary and also if the latter exhibits short- and longterm persistence. This method, which is called Flood Series Maximum Analysis here, enables the calculation of maximum design floods for a given risk- or safety level and time period.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: an increasing number of researchers is working on improving the results of Web Mining by exploiting semantic structures in the Web, and they make use of Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The Semantic Web is the second-generation WWW, enriched by machine-processable information which supports the user in his tasks. Given the enormous size even of today’s Web, it is impossible to manually enrich all of these resources. Therefore, automated schemes for learning the relevant information are increasingly being used. Web Mining aims at discovering insights about the meaning of Web resources and their usage. Given the primarily syntactical nature of the data being mined, the discovery of meaning is impossible based on these data only. Therefore, formalizations of the semantics of Web sites and navigation behavior are becoming more and more common. Furthermore, mining the Semantic Web itself is another upcoming application. We argue that the two areas Web Mining and Semantic Web need each other to fulfill their goals, but that the full potential of this convergence is not yet realized. This paper gives an overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. The idea is to improve, on the one hand, the results of Web Mining by exploiting the new semantic structures in the Web; and to make use of Web Mining, on overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Relevância:

90.00% 90.00%

Publicador:

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Social bookmark tools are rapidly emerging on the Web. In such systems users are setting up lightweight conceptual structures called folksonomies. These systems provide currently relatively few structure. We discuss in this paper, how association rule mining can be adopted to analyze and structure folksonomies, and how the results can be used for ontology learning and supporting emergent semantics. We demonstrate our approach on a large scale dataset stemming from an online system.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Die zunehmende Vernetzung der Informations- und Kommunikationssysteme führt zu einer weiteren Erhöhung der Komplexität und damit auch zu einer weiteren Zunahme von Sicherheitslücken. Klassische Schutzmechanismen wie Firewall-Systeme und Anti-Malware-Lösungen bieten schon lange keinen Schutz mehr vor Eindringversuchen in IT-Infrastrukturen. Als ein sehr wirkungsvolles Instrument zum Schutz gegenüber Cyber-Attacken haben sich hierbei die Intrusion Detection Systeme (IDS) etabliert. Solche Systeme sammeln und analysieren Informationen von Netzwerkkomponenten und Rechnern, um ungewöhnliches Verhalten und Sicherheitsverletzungen automatisiert festzustellen. Während signatur-basierte Ansätze nur bereits bekannte Angriffsmuster detektieren können, sind anomalie-basierte IDS auch in der Lage, neue bisher unbekannte Angriffe (Zero-Day-Attacks) frühzeitig zu erkennen. Das Kernproblem von Intrusion Detection Systeme besteht jedoch in der optimalen Verarbeitung der gewaltigen Netzdaten und der Entwicklung eines in Echtzeit arbeitenden adaptiven Erkennungsmodells. Um diese Herausforderungen lösen zu können, stellt diese Dissertation ein Framework bereit, das aus zwei Hauptteilen besteht. Der erste Teil, OptiFilter genannt, verwendet ein dynamisches "Queuing Concept", um die zahlreich anfallenden Netzdaten weiter zu verarbeiten, baut fortlaufend Netzverbindungen auf, und exportiert strukturierte Input-Daten für das IDS. Den zweiten Teil stellt ein adaptiver Klassifikator dar, der ein Klassifikator-Modell basierend auf "Enhanced Growing Hierarchical Self Organizing Map" (EGHSOM), ein Modell für Netzwerk Normalzustand (NNB) und ein "Update Model" umfasst. In dem OptiFilter werden Tcpdump und SNMP traps benutzt, um die Netzwerkpakete und Hostereignisse fortlaufend zu aggregieren. Diese aggregierten Netzwerkpackete und Hostereignisse werden weiter analysiert und in Verbindungsvektoren umgewandelt. Zur Verbesserung der Erkennungsrate des adaptiven Klassifikators wird das künstliche neuronale Netz GHSOM intensiv untersucht und wesentlich weiterentwickelt. In dieser Dissertation werden unterschiedliche Ansätze vorgeschlagen und diskutiert. So wird eine classification-confidence margin threshold definiert, um die unbekannten bösartigen Verbindungen aufzudecken, die Stabilität der Wachstumstopologie durch neuartige Ansätze für die Initialisierung der Gewichtvektoren und durch die Stärkung der Winner Neuronen erhöht, und ein selbst-adaptives Verfahren eingeführt, um das Modell ständig aktualisieren zu können. Darüber hinaus besteht die Hauptaufgabe des NNB-Modells in der weiteren Untersuchung der erkannten unbekannten Verbindungen von der EGHSOM und der Überprüfung, ob sie normal sind. Jedoch, ändern sich die Netzverkehrsdaten wegen des Concept drif Phänomens ständig, was in Echtzeit zur Erzeugung nicht stationärer Netzdaten führt. Dieses Phänomen wird von dem Update-Modell besser kontrolliert. Das EGHSOM-Modell kann die neuen Anomalien effektiv erkennen und das NNB-Model passt die Änderungen in Netzdaten optimal an. Bei den experimentellen Untersuchungen hat das Framework erfolgversprechende Ergebnisse gezeigt. Im ersten Experiment wurde das Framework in Offline-Betriebsmodus evaluiert. Der OptiFilter wurde mit offline-, synthetischen- und realistischen Daten ausgewertet. Der adaptive Klassifikator wurde mit dem 10-Fold Cross Validation Verfahren evaluiert, um dessen Genauigkeit abzuschätzen. Im zweiten Experiment wurde das Framework auf einer 1 bis 10 GB Netzwerkstrecke installiert und im Online-Betriebsmodus in Echtzeit ausgewertet. Der OptiFilter hat erfolgreich die gewaltige Menge von Netzdaten in die strukturierten Verbindungsvektoren umgewandelt und der adaptive Klassifikator hat sie präzise klassifiziert. Die Vergleichsstudie zwischen dem entwickelten Framework und anderen bekannten IDS-Ansätzen zeigt, dass der vorgeschlagene IDSFramework alle anderen Ansätze übertrifft. Dies lässt sich auf folgende Kernpunkte zurückführen: Bearbeitung der gesammelten Netzdaten, Erreichung der besten Performanz (wie die Gesamtgenauigkeit), Detektieren unbekannter Verbindungen und Entwicklung des in Echtzeit arbeitenden Erkennungsmodells von Eindringversuchen.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Data mining means to summarize information from large amounts of raw data. It is one of the key technologies in many areas of economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a new algorithm called TITANIC for computing concept lattices. It is based on data mining techniques for computing frequent itemsets. The algorithm is experimentally evaluated and compared with B. Ganter's Next-Closure algorithm.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Web services from different partners can be combined to applications that realize a more complex business goal. Such applications built as Web service compositions define how interactions between Web services take place in order to implement the business logic. Web service compositions not only have to provide the desired functionality but also have to comply with certain Quality of Service (QoS) levels. Maximizing the users' satisfaction, also reflected as Quality of Experience (QoE), is a primary goal to be achieved in a Service-Oriented Architecture (SOA). Unfortunately, in a dynamic environment like SOA unforeseen situations might appear like services not being available or not responding in the desired time frame. In such situations, appropriate actions need to be triggered in order to avoid the violation of QoS and QoE constraints. In this thesis, proper solutions are developed to manage Web services and Web service compositions with regard to QoS and QoE requirements. The Business Process Rules Language (BPRules) was developed to manage Web service compositions when undesired QoS or QoE values are detected. BPRules provides a rich set of management actions that may be triggered for controlling the service composition and for improving its quality behavior. Regarding the quality properties, BPRules allows to distinguish between the QoS values as they are promised by the service providers, QoE values that were assigned by end-users, the monitored QoS as measured by our BPR framework, and the predicted QoS and QoE values. BPRules facilitates the specification of certain user groups characterized by different context properties and allows triggering a personalized, context-aware service selection tailored for the specified user groups. In a service market where a multitude of services with the same functionality and different quality values are available, the right services need to be selected for realizing the service composition. We developed new and efficient heuristic algorithms that are applied to choose high quality services for the composition. BPRules offers the possibility to integrate multiple service selection algorithms. The selection algorithms are applicable also for non-linear objective functions and constraints. The BPR framework includes new approaches for context-aware service selection and quality property predictions. We consider the location information of users and services as context dimension for the prediction of response time and throughput. The BPR framework combines all new features and contributions to a comprehensive management solution. Furthermore, it facilitates flexible monitoring of QoS properties without having to modify the description of the service composition. We show how the different modules of the BPR framework work together in order to execute the management rules. We evaluate how our selection algorithms outperform a genetic algorithm from related research. The evaluation reveals how context data can be used for a personalized prediction of response time and throughput.