944 resultados para Data processing methods
Resumo:
While most data analysis and decision support tools use numerical aspects of the data, Conceptual Information Systems focus on their conceptual structure. This paper discusses how both approaches can be combined.
Resumo:
We present a new algorithm called TITANIC for computing concept lattices. It is based on data mining techniques for computing frequent itemsets. The algorithm is experimentally evaluated and compared with B. Ganter's Next-Closure algorithm.
Resumo:
Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.
Resumo:
Among many other knowledge representations formalisms, Ontologies and Formal Concept Analysis (FCA) aim at modeling ‘concepts’. We discuss how these two formalisms may complement another from an application point of view. In particular, we will see how FCA can be used to support Ontology Engineering, and how ontologies can be exploited in FCA applications. The interplay of FCA and ontologies is studied along the life cycle of an ontology: (i) FCA can support the building of the ontology as a learning technique. (ii) The established ontology can be analyzed and navigated by using techniques of FCA. (iii) Last but not least, the ontology may be used to improve an FCA application.
Resumo:
About ten years ago, triadic contexts were presented by Lehmann and Wille as an extension of Formal Concept Analysis. However, they have rarely been used up to now, which may be due to the rather complex structure of the resulting diagrams. In this paper, we go one step back and discuss how traditional line diagrams of standard (dyadic) concept lattices can be used for exploring and navigating triadic data. Our approach is inspired by the slice & dice paradigm of On-Line-Analytical Processing (OLAP). We recall the basic ideas of OLAP, and show how they may be transferred to triadic contexts. For modeling the navigation patterns a user might follow, we use the formalisms of finite state machines. In order to present the benefits of our model, we show how it can be used for navigating the IT Baseline Protection Manual of the German Federal Office for Information Security.
Resumo:
Ontologies have been established for knowledge sharing and are widely used as a means for conceptually structuring domains of interest. With the growing usage of ontologies, the problem of overlapping knowledge in a common domain becomes critical. In this short paper, we address two methods for merging ontologies based on Formal Concept Analysis: FCA-Merge and ONTEX. --- FCA-Merge is a method for merging ontologies following a bottom-up approach which offers a structural description of the merging process. The method is guided by application-specific instances of the given source ontologies. We apply techniques from natural language processing and formal concept analysis to derive a lattice of concepts as a structural result of FCA-Merge. The generated result is then explored and transformed into the merged ontology with human interaction. --- ONTEX is a method for systematically structuring the top-down level of ontologies. It is based on an interactive, top-down- knowledge acquisition process, which assures that the knowledge engineer considers all possible cases while avoiding redundant acquisition. The method is suited especially for creating/merging the top part(s) of the ontologies, where high accuracy is required, and for supporting the merging of two (or more) ontologies on that level.
Resumo:
Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. During the recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed in respect of its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the wanted global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how close these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process step by step selects the most promising solution candidates and modifies and combines them with mutation and crossover operators. This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways for representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, the distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and in most cases, was superior to the other representations.
Resumo:
Kern der vorliegenden Arbeit ist die Erforschung von Methoden, Techniken und Werkzeugen zur Fehlersuche in modellbasierten Softwareentwicklungsprozessen. Hierzu wird zuerst ein von mir mitentwickelter, neuartiger und modellbasierter Softwareentwicklungsprozess, der sogenannte Fujaba Process, vorgestellt. Dieser Prozess wird von Usecase Szenarien getrieben, die durch spezielle Kollaborationsdiagramme formalisiert werden. Auch die weiteren Artefakte des Prozess bishin zur fertigen Applikation werden durch UML Diagrammarten modelliert. Es ist keine Programmierung im Quelltext nötig. Werkzeugunterstützung für den vorgestellte Prozess wird von dem Fujaba CASE Tool bereitgestellt. Große Teile der Werkzeugunterstützung für den Fujaba Process, darunter die Toolunterstützung für das Testen und Debuggen, wurden im Rahmen dieser Arbeit entwickelt. Im ersten Teil der Arbeit wird der Fujaba Process im Detail erklärt und unsere Erfahrungen mit dem Einsatz des Prozesses in Industrieprojekten sowie in der Lehre dargestellt. Der zweite Teil beschreibt die im Rahmen dieser Arbeit entwickelte Testgenerierung, die zu einem wichtigen Teil des Fujaba Process geworden ist. Hierbei werden aus den formalisierten Usecase Szenarien ausführbare Testfälle generiert. Es wird das zugrunde liegende Konzept, die konkrete technische Umsetzung und die Erfahrungen aus der Praxis mit der entwickelten Testgenerierung dargestellt. Der letzte Teil beschäftigt sich mit dem Debuggen im Fujaba Process. Es werden verschiedene im Rahmen dieser Arbeit entwickelte Konzepte und Techniken vorgestellt, die die Fehlersuche während der Applikationsentwicklung vereinfachen. Hierbei wurde darauf geachtet, dass das Debuggen, wie alle anderen Schritte im Fujaba Process, ausschließlich auf Modellebene passiert. Unter anderem werden Techniken zur schrittweisen Ausführung von Modellen, ein Objekt Browser und ein Debugger, der die rückwärtige Ausführung von Programmen erlaubt (back-in-time debugging), vorgestellt. Alle beschriebenen Konzepte wurden in dieser Arbeit als Plugins für die Eclipse Version von Fujaba, Fujaba4Eclipse, implementiert und erprobt. Bei der Implementierung der Plugins wurde auf eine enge Integration mit Fujaba zum einen und mit Eclipse auf der anderen Seite geachtet. Zusammenfassend wird also ein Entwicklungsprozess vorgestellt, die Möglichkeit in diesem mit automatischen Tests Fehler zu identifizieren und diese Fehler dann mittels spezieller Debuggingtechniken im Programm zu lokalisieren und schließlich zu beheben. Dabei läuft der komplette Prozess auf Modellebene ab. Für die Test- und Debuggingtechniken wurden in dieser Arbeit Plugins für Fujaba4Eclipse entwickelt, die den Entwickler bestmöglich bei der zugehörigen Tätigkeit unterstützen.
Resumo:
Das Ziel der Dissertation war die Untersuchung des Computereinsatzes zur Lern- und Betreuungsunterstützung beim selbstgesteuerten Lernen in der Weiterbildung. In einem bisher konventionell durchgeführten Selbstlernkurs eines berufsbegleitenden Studiengangs, der an das Datenmanagement der Bürodatenverarbeitung heranführt, wurden die Kursunterlagen digitalisiert, die Betreuung auf eine online-basierte Lernbegleitung umgestellt und ein auf die neuen Lernmedien abgestimmtes Lernkonzept entwickelt. Dieses neue Lernkonzept wurde hinsichtlich der Motivation und der Akzeptanz von digitalen Lernmedien evaluiert. Die Evaluation bestand aus zwei Teilen: 1. eine formative, den Entwicklungsprozess begleitende Evaluation zur Optimierung der entwickelten Lernsoftware und des eingeführten Lernkonzeptes, 2. eine sowohl qualitative wie quantitative summative Evaluation der Entwicklungen. Ein zentraler Aspekt der Untersuchung war die freie Wahl der Lernmedien (multimediale Lernsoftware oder konventionelles Begleitbuch) und der Kommunikationsmedien (online-basierte Lernplattform oder die bisher genutzten Kommunikationskanäle: E-Mail, Telefon und Präsenztreffen). Diese Zweigleisigkeit erlaubte eine differenzierte Gegenüberstellung von konventionellen und innovativen Lernarrangements. Die Verbindung von qualitativen und quantitativen Vorgehensweisen, auf Grund derer die subjektiven Einstellungen der Probanden in das Zentrum der Betrachtung rückten, ließen einen Blickwinkel auf den Nutzen und die Wirkung der Neuen Medien in Lernprozessen zu, der es erlaubte einige in der Literatur als gängig angesehene Interpretationen in Frage zu stellen und neu zu diskutieren. So konnten durch eine Kategorisierung des Teilnehmerverhaltens nach online-typisch und nicht online-typisch die Ursache-Wirkungs-Beziehungen der in vielen Untersuchungen angeführten Störungen in Online-Seminaren verdeutlicht werden. In den untersuchten Kursen zeigte sich beispielsweise keine Abhängigkeit der Drop-out-Quote von den Lern- und Betreuungsformen und dass diese Quote mit dem neuen Lernkonzept nur geringfügig beeinflusst werden konnte. Die freie Wahl der Lernmedien führte zu einer gezielten Nutzung der multimedialen Lernsoftware, wodurch die Akzeptanz dieses Lernmedium stieg. Dagegen war die Akzeptanz der Lernenden gegenüber der Lernbegleitung mittels einer Online-Lernplattform von hoch bis sehr niedrig breit gestreut. Unabhängig davon reichte in allen Kursdurchgängen die Online-Betreuung nicht aus, so dass Präsenztreffen erbeten wurde. Hinsichtlich der Motivation war die Wirkung der digitalen Medien niedriger als erwartet. Insgesamt bieten die Ergebnisse Empfehlungen für die Planung und Durchführung von computerunterstützten, online-begleiteten Kursen.
Resumo:
Fujaba is an Open Source UML CASE tool project started at the software engineering group of Paderborn University in 1997. In 2002 Fujaba has been redesigned and became the Fujaba Tool Suite with a plug-in architecture allowing developers to add functionality easily while retaining full control over their contributions. Multiple Application Domains Fujaba followed the model-driven development philosophy right from its beginning in 1997. At the early days, Fujaba had a special focus on code generation from UML diagrams resulting in a visual programming language with a special emphasis on object structure manipulating rules. Today, at least six rather independent tool versions are under development in Paderborn, Kassel, and Darmstadt for supporting (1) reengineering, (2) embedded real-time systems, (3) education, (4) specification of distributed control systems, (5) integration with the ECLIPSE platform, and (6) MOF-based integration of system (re-) engineering tools. International Community According to our knowledge, quite a number of research groups have also chosen Fujaba as a platform for UML and MDA related research activities. In addition, quite a number of Fujaba users send requests for more functionality and extensions. Therefore, the 8th International Fujaba Days aimed at bringing together Fujaba develop- ers and Fujaba users from all over the world to present their ideas and projects and to discuss them with each other and with the Fujaba core development team.
Resumo:
The ongoing growth of the World Wide Web, catalyzed by the increasing possibility of ubiquitous access via a variety of devices, continues to strengthen its role as our prevalent information and commmunication medium. However, although tools like search engines facilitate retrieval, the task of finally making sense of Web content is still often left to human interpretation. The vision of supporting both humans and machines in such knowledge-based activities led to the development of different systems which allow to structure Web resources by metadata annotations. Interestingly, two major approaches which gained a considerable amount of attention are addressing the problem from nearly opposite directions: On the one hand, the idea of the Semantic Web suggests to formalize the knowledge within a particular domain by means of the "top-down" approach of defining ontologies. On the other hand, Social Annotation Systems as part of the so-called Web 2.0 movement implement a "bottom-up" style of categorization using arbitrary keywords. Experience as well as research in the characteristics of both systems has shown that their strengths and weaknesses seem to be inverse: While Social Annotation suffers from problems like, e. g., ambiguity or lack or precision, ontologies were especially designed to eliminate those. On the contrary, the latter suffer from a knowledge acquisition bottleneck, which is successfully overcome by the large user populations of Social Annotation Systems. Instead of being regarded as competing paradigms, the obvious potential synergies from a combination of both motivated approaches to "bridge the gap" between them. These were fostered by the evidence of emergent semantics, i. e., the self-organized evolution of implicit conceptual structures, within Social Annotation data. While several techniques to exploit the emergent patterns were proposed, a systematic analysis - especially regarding paradigms from the field of ontology learning - is still largely missing. This also includes a deeper understanding of the circumstances which affect the evolution processes. This work aims to address this gap by providing an in-depth study of methods and influencing factors to capture emergent semantics from Social Annotation Systems. We focus hereby on the acquisition of lexical semantics from the underlying networks of keywords, users and resources. Structured along different ontology learning tasks, we use a methodology of semantic grounding to characterize and evaluate the semantic relations captured by different methods. In all cases, our studies are based on datasets from several Social Annotation Systems. Specifically, we first analyze semantic relatedness among keywords, and identify measures which detect different notions of relatedness. These constitute the input of concept learning algorithms, which focus then on the discovery of synonymous and ambiguous keywords. Hereby, we assess the usefulness of various clustering techniques. As a prerequisite to induce hierarchical relationships, our next step is to study measures which quantify the level of generality of a particular keyword. We find that comparatively simple measures can approximate the generality information encoded in reference taxonomies. These insights are used to inform the final task, namely the creation of concept hierarchies. For this purpose, generality-based algorithms exhibit advantages compared to clustering approaches. In order to complement the identification of suitable methods to capture semantic structures, we analyze as a next step several factors which influence their emergence. Empirical evidence is provided that the amount of available data plays a crucial role for determining keyword meanings. From a different perspective, we examine pragmatic aspects by considering different annotation patterns among users. Based on a broad distinction between "categorizers" and "describers", we find that the latter produce more accurate results. This suggests a causal link between pragmatic and semantic aspects of keyword annotation. As a special kind of usage pattern, we then have a look at system abuse and spam. While observing a mixed picture, we suggest that an individual decision should be taken instead of disregarding spammers as a matter of principle. Finally, we discuss a set of applications which operationalize the results of our studies for enhancing both Social Annotation and semantic systems. These comprise on the one hand tools which foster the emergence of semantics, and on the one hand applications which exploit the socially induced relations to improve, e. g., searching, browsing, or user profiling facilities. In summary, the contributions of this work highlight viable methods and crucial aspects for designing enhanced knowledge-based services of a Social Semantic Web.
Resumo:
Presentation at the 1997 Dagstuhl Seminar "Evaluation of Multimedia Information Retrieval", Norbert Fuhr, Keith van Rijsbergen, Alan F. Smeaton (eds.), Dagstuhl Seminar Report 175, 14.04. - 18.04.97 (9716). - Abstract: This presentation will introduce ESCHER, a database editor which supports visualization in non-standard applications in engineering, science, tourism and the entertainment industry. It was originally based on the extended nested relational data model and is currently extended to include object-relational properties like inheritance, object types, integrity constraints and methods. It serves as a research platform into areas such as multimedia and visual information systems, QBE-like queries, computer-supported concurrent work (CSCW) and novel storage techniques. In its role as a Visual Information System, a database editor must support browsing and navigation. ESCHER provides this access to data by means of so called fingers. They generalize the cursor paradigm in graphical and text editors. On the graphical display, a finger is reflected by a colored area which corresponds to the object a finger is currently pointing at. In a table more than one finger may point to objects, one of which is the active finger and is used for navigating through the table. The talk will mostly concentrate on giving examples for this type of navigation and will discuss some of the architectural needs for fast object traversal and display. ESCHER is available as public domain software from our ftp site in Kassel. The portable C source can be easily compiled for any machine running UNIX and OSF/Motif, in particular our working environments IBM RS/6000 and Intel-based LINUX systems. A porting to Tcl/Tk is under way.
Resumo:
The main instrument used in psychological measurement is the self-report questionnaire. One of its major drawbacks however is its susceptibility to response biases. A known strategy to control these biases has been the use of so-called ipsative items. Ipsative items are items that require the respondent to make between-scale comparisons within each item. The selected option determines to which scale the weight of the answer is attributed. Consequently in questionnaires only consisting of ipsative items every respondent is allotted an equal amount, i.e. the total score, that each can distribute differently over the scales. Therefore this type of response format yields data that can be considered compositional from its inception. Methodological oriented psychologists have heavily criticized this type of item format, since the resulting data is also marked by the associated unfavourable statistical properties. Nevertheless, clinicians have kept using these questionnaires to their satisfaction. This investigation therefore aims to evaluate both positions and addresses the similarities and differences between the two data collection methods. The ultimate objective is to formulate a guideline when to use which type of item format. The comparison is based on data obtained with both an ipsative and normative version of three psychological questionnaires, which were administered to 502 first-year students in psychology according to a balanced within-subjects design. Previous research only compared the direct ipsative scale scores with the derived ipsative scale scores. The use of compositional data analysis techniques also enables one to compare derived normative score ratios with direct normative score ratios. The addition of the second comparison not only offers the advantage of a better-balanced research strategy. In principle it also allows for parametric testing in the evaluation
Resumo:
The system described herein represents the first example of a recommender system in digital ecosystems where agents negotiate services on behalf of small companies. The small companies compete not only with price or quality, but with a wider service-by-service composition by subcontracting with other companies. The final result of these offerings depends on negotiations at the scale of millions of small companies. This scale requires new platforms for supporting digital business ecosystems, as well as related services like open-id, trust management, monitors and recommenders. This is done in the Open Negotiation Environment (ONE), which is an open-source platform that allows agents, on behalf of small companies, to negotiate and use the ecosystem services, and enables the development of new agent technologies. The methods and tools of cyber engineering are necessary to build up Open Negotiation Environments that are stable, a basic condition for predictable business and reliable business environments. Aiming to build stable digital business ecosystems by means of improved collective intelligence, we introduce a model of negotiation style dynamics from the point of view of computational ecology. This model inspires an ecosystem monitor as well as a novel negotiation style recommender. The ecosystem monitor provides hints to the negotiation style recommender to achieve greater stability of an open negotiation environment in a digital business ecosystem. The greater stability provides the small companies with higher predictability, and therefore better business results. The negotiation style recommender is implemented with a simulated annealing algorithm at a constant temperature, and its impact is shown by applying it to a real case of an open negotiation environment populated by Italian companies