38 resultados para architectural design -- data processing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The process of developing software that takes advantage of multiple processors is commonly referred to as parallel programming. For various reasons, this process is much harder than the sequential case. For decades, parallel programming has been a problem for a small niche only: engineers working on parallelizing mostly numerical applications in High Performance Computing. This has changed with the advent of multi-core processors in mainstream computer architectures. Parallel programming in our days becomes a problem for a much larger group of developers. The main objective of this thesis was to find ways to make parallel programming easier for them. Different aims were identified in order to reach the objective: research the state of the art of parallel programming today, improve the education of software developers about the topic, and provide programmers with powerful abstractions to make their work easier. To reach these aims, several key steps were taken. To start with, a survey was conducted among parallel programmers to find out about the state of the art. More than 250 people participated, yielding results about the parallel programming systems and languages in use, as well as about common problems with these systems. Furthermore, a study was conducted in university classes on parallel programming. It resulted in a list of frequently made mistakes that were analyzed and used to create a programmers' checklist to avoid them in the future. For programmers' education, an online resource was setup to collect experiences and knowledge in the field of parallel programming - called the Parawiki. Another key step in this direction was the creation of the Thinking Parallel weblog, where more than 50.000 readers to date have read essays on the topic. For the third aim (powerful abstractions), it was decided to concentrate on one parallel programming system: OpenMP. Its ease of use and high level of abstraction were the most important reasons for this decision. Two different research directions were pursued. The first one resulted in a parallel library called AthenaMP. It contains so-called generic components, derived from design patterns for parallel programming. These include functionality to enhance the locks provided by OpenMP, to perform operations on large amounts of data (data-parallel programming), and to enable the implementation of irregular algorithms using task pools. AthenaMP itself serves a triple role: the components are well-documented and can be used directly in programs, it enables developers to study the source code and learn from it, and it is possible for compiler writers to use it as a testing ground for their OpenMP compilers. The second research direction was targeted at changing the OpenMP specification to make the system more powerful. The main contributions here were a proposal to enable thread-cancellation and a proposal to avoid busy waiting. Both were implemented in a research compiler, shown to be useful in example applications, and proposed to the OpenMP Language Committee.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A conceptual information system consists of a database together with conceptual hierarchies. The management system TOSCANA visualizes arbitrary combinations of conceptual hierarchies by nested line diagrams and allows an on-line interaction with a database to analyze data conceptually. The paper describes the conception of conceptual information systems and discusses the use of their visualization techniques for on-line analytical processing (OLAP).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While most data analysis and decision support tools use numerical aspects of the data, Conceptual Information Systems focus on their conceptual structure. This paper discusses how both approaches can be combined.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a new algorithm called TITANIC for computing concept lattices. It is based on data mining techniques for computing frequent itemsets. The algorithm is experimentally evaluated and compared with B. Ganter's Next-Closure algorithm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we discuss Conceptual Knowledge Discovery in Databases (CKDD) in its connection with Data Analysis. Our approach is based on Formal Concept Analysis, a mathematical theory which has been developed and proven useful during the last 20 years. Formal Concept Analysis has led to a theory of conceptual information systems which has been applied by using the management system TOSCANA in a wide range of domains. In this paper, we use such an application in database marketing to demonstrate how methods and procedures of CKDD can be applied in Data Analysis. In particular, we show the interplay and integration of data mining and data analysis techniques based on Formal Concept Analysis. The main concern of this paper is to explain how the transition from data to knowledge can be supported by a TOSCANA system. To clarify the transition steps we discuss their correspondence to the five levels of knowledge representation established by R. Brachman and to the steps of empirically grounded theory building proposed by A. Strauss and J. Corbin.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Among many other knowledge representations formalisms, Ontologies and Formal Concept Analysis (FCA) aim at modeling ‘concepts’. We discuss how these two formalisms may complement another from an application point of view. In particular, we will see how FCA can be used to support Ontology Engineering, and how ontologies can be exploited in FCA applications. The interplay of FCA and ontologies is studied along the life cycle of an ontology: (i) FCA can support the building of the ontology as a learning technique. (ii) The established ontology can be analyzed and navigated by using techniques of FCA. (iii) Last but not least, the ontology may be used to improve an FCA application.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

About ten years ago, triadic contexts were presented by Lehmann and Wille as an extension of Formal Concept Analysis. However, they have rarely been used up to now, which may be due to the rather complex structure of the resulting diagrams. In this paper, we go one step back and discuss how traditional line diagrams of standard (dyadic) concept lattices can be used for exploring and navigating triadic data. Our approach is inspired by the slice & dice paradigm of On-Line-Analytical Processing (OLAP). We recall the basic ideas of OLAP, and show how they may be transferred to triadic contexts. For modeling the navigation patterns a user might follow, we use the formalisms of finite state machines. In order to present the benefits of our model, we show how it can be used for navigating the IT Baseline Protection Manual of the German Federal Office for Information Security.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. During the recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed in respect of its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the wanted global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how close these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process step by step selects the most promising solution candidates and modifies and combines them with mutation and crossover operators. This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways for representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, the distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and in most cases, was superior to the other representations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Die Bedeutung des Dienstgüte-Managements (SLM) im Bereich von Unternehmensanwendungen steigt mit der zunehmenden Kritikalität von IT-gestützten Prozessen für den Erfolg einzelner Unternehmen. Traditionell werden zur Implementierung eines wirksamen SLMs Monitoringprozesse in hierarchischen Managementumgebungen etabliert, die einen Administrator bei der notwendigen Rekonfiguration von Systemen unterstützen. Auf aktuelle, hochdynamische Softwarearchitekturen sind diese hierarchischen Ansätze jedoch nur sehr eingeschränkt anwendbar. Ein Beispiel dafür sind dienstorientierte Architekturen (SOA), bei denen die Geschäftsfunktionalität durch das Zusammenspiel einzelner, voneinander unabhängiger Dienste auf Basis deskriptiver Workflow-Beschreibungen modelliert wird. Dadurch ergibt sich eine hohe Laufzeitdynamik der gesamten Architektur. Für das SLM ist insbesondere die dezentrale Struktur einer SOA mit unterschiedlichen administrativen Zuständigkeiten für einzelne Teilsysteme problematisch, da regelnde Eingriffe zum einen durch die Kapselung der Implementierung einzelner Dienste und zum anderen durch das Fehlen einer zentralen Kontrollinstanz nur sehr eingeschränkt möglich sind. Die vorliegende Arbeit definiert die Architektur eines SLM-Systems für SOA-Umgebungen, in dem autonome Management-Komponenten kooperieren, um übergeordnete Dienstgüteziele zu erfüllen: Mithilfe von Selbst-Management-Technologien wird zunächst eine Automatisierung des Dienstgüte-Managements auf Ebene einzelner Dienste erreicht. Die autonomen Management-Komponenten dieser Dienste können dann mithilfe von Selbstorganisationsmechanismen übergreifende Ziele zur Optimierung von Dienstgüteverhalten und Ressourcennutzung verfolgen. Für das SLM auf Ebene von SOA Workflows müssen temporär dienstübergreifende Kooperationen zur Erfüllung von Dienstgüteanforderungen etabliert werden, die sich damit auch über mehrere administrative Domänen erstrecken können. Eine solche zeitlich begrenzte Kooperation autonomer Teilsysteme kann sinnvoll nur dezentral erfolgen, da die jeweiligen Kooperationspartner im Vorfeld nicht bekannt sind und – je nach Lebensdauer einzelner Workflows – zur Laufzeit beteiligte Komponenten ausgetauscht werden können. In der Arbeit wird ein Verfahren zur Koordination autonomer Management-Komponenten mit dem Ziel der Optimierung von Antwortzeiten auf Workflow-Ebene entwickelt: Management-Komponenten können durch Übertragung von Antwortzeitanteilen untereinander ihre individuellen Ziele straffen oder lockern, ohne dass das Gesamtantwortzeitziel dadurch verändert wird. Die Übertragung von Antwortzeitanteilen wird mithilfe eines Auktionsverfahrens realisiert. Technische Grundlage der Kooperation bildet ein Gruppenkommunikationsmechanismus. Weiterhin werden in Bezug auf die Nutzung geteilter, virtualisierter Ressourcen konkurrierende Dienste entsprechend geschäftlicher Ziele priorisiert. Im Rahmen der praktischen Umsetzung wird die Realisierung zentraler Architekturelemente und der entwickelten Verfahren zur Selbstorganisation beispielhaft für das SLM konkreter Komponenten vorgestellt. Zur Untersuchung der Management-Kooperation in größeren Szenarien wird ein hybrider Simulationsansatz verwendet. Im Rahmen der Evaluation werden Untersuchungen zur Skalierbarkeit des Ansatzes durchgeführt. Schwerpunkt ist hierbei die Betrachtung eines Systems aus kooperierenden Management-Komponenten, insbesondere im Hinblick auf den Kommunikationsaufwand. Die Evaluation zeigt, dass ein dienstübergreifendes, autonomes Performance-Management in SOA-Umgebungen möglich ist. Die Ergebnisse legen nahe, dass der entwickelte Ansatz auch in großen Umgebungen erfolgreich angewendet werden kann.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we describe an interdisciplinary project in which visualization techniques were developed for and applied to scholarly work from literary studies. The aim was to bring Christof Schöch's electronic edition of Bérardier de Bataut's Essai sur le récit (1776) to the web. This edition is based on the Text Encoding Initiative's XML-based encoding scheme (TEI P5, subset TEI-Lite). This now de facto standard applies to machine-readable texts used chiefly in the humanities and social sciences. The intention of this edition is to make the edited text freely available on the web, to allow for alternative text views (here original and modern/corrected text), to ensure reader-friendly annotation and navigation, to permit on-line collaboration in encoding and annotation as well as user comments, all in an open source, generically usable, lightweight package. These aims were attained by relying on a GPL-based, public domain CMS (Drupal) and combining it with XSL-Stylesheets and Java Script.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Die vorliegende Arbeit entstand während meiner Zeit als wissenschaftlicher Mitarbeiter im Fachgebiet Technische Informatik an der Universität Kassel. Im Rahmen dieser Arbeit werden der Entwurf und die Implementierung eines Cluster-basierten verteilten Szenengraphen gezeigt. Bei der Implementierung des verteilten Szenengraphen wurde von der Entwicklung eines eigenen Szenengraphen abgesehen. Stattdessen wurde ein bereits vorhandener Szenengraph namens OpenSceneGraph als Basis für die Entwicklung des verteilten Szenengraphen verwendet. Im Rahmen dieser Arbeit wurde eine Clusterunterstützung in den vorliegenden OpenSceneGraph integriert. Bei der Erweiterung des OpenSceneGraphs wurde besonders darauf geachtet den vorliegenden Szenengraphen möglichst nicht zu verändern. Zusätzlich wurde nach Möglichkeit auf die Verwendung und Integration externer Clusterbasierten Softwarepakete verzichtet. Für die Verteilung des OpenSceneGraphs wurde auf Basis von Sockets eine eigene Kommunikationsschicht entwickelt und in den OpenSceneGraph integriert. Diese Kommunikationsschicht wurde verwendet um Sort-First- und Sort-Last-basierte Visualisierung dem OpenSceneGraph zur Verfügung zu stellen. Durch die Erweiterung des OpenScenGraphs um die Cluster-Unterstützung wurde eine Ansteuerung beliebiger Projektionssysteme wie z.B. einer CAVE ermöglicht. Für die Ansteuerung einer CAVE wurden mittels VRPN diverse Eingabegeräte sowie das Tracking in den OpenSceneGraph integriert. Durch die Anbindung der Geräte über VRPN können diese Eingabegeräte auch bei den anderen Cluster-Betriebsarten wie z.B. einer segmentierten Anzeige verwendet werden. Die Verteilung der Daten auf den Cluster wurde von dem Kern des OpenSceneGraphs separat gehalten. Damit kann eine beliebige OpenSceneGraph-basierte Anwendung jederzeit und ohne aufwendige Modifikationen auf einem Cluster ausgeführt werden. Dadurch ist der Anwender in seiner Applikationsentwicklung nicht behindert worden und muss nicht zwischen Cluster-basierten und Standalone-Anwendungen unterscheiden.