960 results for chiroptical switches, data processing, enantiospecificity, photochromism, steric hindrance


Relevance: 100.00%
Publisher:
Abstract:

In this paper, we discuss Conceptual Knowledge Discovery in Databases (CKDD) and its connection with Data Analysis. Our approach is based on Formal Concept Analysis, a mathematical theory that has been developed and proven useful over the last 20 years. Formal Concept Analysis has led to a theory of conceptual information systems, which has been applied with the management system TOSCANA in a wide range of domains. In this paper, we use such an application in database marketing to demonstrate how methods and procedures of CKDD can be applied in Data Analysis. In particular, we show the interplay and integration of data mining and data analysis techniques based on Formal Concept Analysis. The main concern of this paper is to explain how the transition from data to knowledge can be supported by a TOSCANA system. To clarify the transition steps, we discuss their correspondence to the five levels of knowledge representation established by R. Brachman and to the steps of empirically grounded theory building proposed by A. Strauss and J. Corbin.
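
As a minimal illustration of the formal-concept machinery underlying such conceptual information systems, the following Python sketch enumerates the formal concepts of a tiny binary context. It is not part of TOSCANA; the objects, attributes, and incidence relation are invented for this example.

    from itertools import combinations

    # Toy context: which customers (objects) have which properties (attributes).
    objects = ["cust1", "cust2", "cust3"]
    attributes = ["urban", "high_income", "frequent_buyer"]
    incidence = {
        ("cust1", "urban"), ("cust1", "high_income"),
        ("cust2", "urban"), ("cust2", "frequent_buyer"),
        ("cust3", "urban"), ("cust3", "high_income"), ("cust3", "frequent_buyer"),
    }

    def common_attributes(objs):
        """Derivation A': attributes shared by every object in objs."""
        return {m for m in attributes if all((g, m) in incidence for g in objs)}

    def common_objects(attrs):
        """Derivation B': objects that have every attribute in attrs."""
        return {g for g in objects if all((g, m) in incidence for m in attrs)}

    # A formal concept is a pair (extent, intent) closed under both derivations.
    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = common_attributes(set(objs))
            extent = common_objects(intent)
            concepts.add((frozenset(extent), frozenset(intent)))

    for extent, intent in sorted(concepts, key=lambda c: len(c[0])):
        print(sorted(extent), "<->", sorted(intent))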

Relevance: 100.00%
Publisher:
Abstract:

Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular, they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and that they serve as a visualization method for the resulting rules.
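
The following hedged sketch (plain Python, not the TITANIC algorithm itself) illustrates what an iceberg concept lattice condenses: only those concept intents whose support reaches a minimum threshold are kept, i.e. the frequent closed itemsets. The transactions and threshold are invented.

    from itertools import combinations

    transactions = [
        {"bread", "butter"},
        {"bread", "butter", "milk"},
        {"bread", "milk"},
        {"butter", "milk"},
    ]
    items = sorted(set().union(*transactions))
    min_support = 0.5  # relative minimum support

    def support(itemset):
        return sum(itemset <= t for t in transactions) / len(transactions)

    def closure(itemset):
        """Largest itemset contained in exactly the same transactions."""
        covering = [t for t in transactions if itemset <= t]
        return set.intersection(*covering) if covering else set(items)

    # The intents of the iceberg lattice are the frequent closed itemsets.
    iceberg = {}
    for r in range(len(items) + 1):
        for candidate in combinations(items, r):
            c = set(candidate)
            if support(c) >= min_support:
                iceberg[frozenset(closure(c))] = support(c)

    for intent, supp in sorted(iceberg.items(), key=lambda kv: -kv[1]):
        print(sorted(intent), "support = %.2f" % supp)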

Relevance: 100.00%
Publisher:
Abstract:

Among many other knowledge representation formalisms, Ontologies and Formal Concept Analysis (FCA) aim at modeling ‘concepts’. We discuss how these two formalisms may complement one another from an application point of view. In particular, we will see how FCA can be used to support Ontology Engineering, and how ontologies can be exploited in FCA applications. The interplay of FCA and ontologies is studied along the life cycle of an ontology: (i) FCA can support the building of the ontology as a learning technique. (ii) The established ontology can be analyzed and navigated by using techniques of FCA. (iii) Last but not least, the ontology may be used to improve an FCA application.

Relevance: 100.00%
Publisher:
Abstract:

About ten years ago, triadic contexts were presented by Lehmann and Wille as an extension of Formal Concept Analysis. However, they have rarely been used up to now, which may be due to the rather complex structure of the resulting diagrams. In this paper, we go one step back and discuss how traditional line diagrams of standard (dyadic) concept lattices can be used for exploring and navigating triadic data. Our approach is inspired by the slice & dice paradigm of On-Line Analytical Processing (OLAP). We recall the basic ideas of OLAP and show how they may be transferred to triadic contexts. For modeling the navigation patterns a user might follow, we use the formalism of finite state machines. In order to present the benefits of our model, we show how it can be used for navigating the IT Baseline Protection Manual of the German Federal Office for Information Security.
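
A small illustration of the slice step, under the assumption that a triadic context is represented as a set of (object, attribute, condition) triples: fixing one condition projects it onto an ordinary dyadic context that can be shown as a standard line diagram. The safeguard data below is invented, not taken from the IT Baseline Protection Manual.

    # A triadic context as a set of (object, attribute, condition) triples.
    triadic_context = {
        ("server", "backup", "organisational"),
        ("client", "backup", "organisational"),
        ("server", "access_control", "technical"),
        ("client", "virus_scanner", "technical"),
    }

    def slice_by_condition(triples, condition):
        """Project onto the dyadic context obtained by fixing one condition."""
        return {(g, m) for (g, m, b) in triples if b == condition}

    print(slice_by_condition(triadic_context, "technical"))
    # e.g. {('server', 'access_control'), ('client', 'virus_scanner')}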

Relevance: 100.00%
Publisher:
Abstract:

The increasing interconnection of information and communication systems leads to ever greater complexity and thus to an ever larger number of security vulnerabilities. Classical protection mechanisms such as firewall systems and anti-malware solutions have long ceased to offer adequate protection against intrusions into IT infrastructures. Intrusion Detection Systems (IDS) have established themselves as a very effective instrument for protection against cyber attacks. Such systems collect and analyse information from network components and hosts in order to detect unusual behaviour and security violations automatically. While signature-based approaches can only detect already known attack patterns, anomaly-based IDS are also able to recognise new, previously unknown attacks (zero-day attacks) at an early stage. The core problem of intrusion detection systems, however, lies in the efficient processing of the enormous volume of network data and in the development of an adaptive detection model that works in real time. To address these challenges, this dissertation provides a framework consisting of two main parts. The first part, called OptiFilter, uses a dynamic queuing concept to process the continuously arriving network data, continuously assembles network connections, and exports structured input data for the IDS. The second part is an adaptive classifier comprising a classifier model based on an Enhanced Growing Hierarchical Self-Organizing Map (EGHSOM), a model of the normal network state (NNB), and an update model. Within OptiFilter, Tcpdump and SNMP traps are used to continuously aggregate network packets and host events. These aggregated network packets and host events are analysed further and converted into connection vectors. To improve the detection rate of the adaptive classifier, the artificial neural network GHSOM is studied intensively and substantially extended. Several approaches are proposed and discussed in this dissertation: a classification-confidence margin threshold is defined to uncover unknown malicious connections, the stability of the growth topology is increased through novel approaches for initialising the weight vectors and for strengthening the winner neurons, and a self-adaptive procedure is introduced so that the model can be updated continuously. In addition, the main task of the NNB model is to examine further the unknown connections identified by the EGHSOM and to verify whether they are normal. However, network traffic data change constantly because of the concept-drift phenomenon, which leads to non-stationary network data in real time. This phenomenon is handled better by the update model. The EGHSOM model detects new anomalies effectively, and the NNB model adapts optimally to changes in the network data. In the experimental evaluation, the framework showed promising results. In the first experiment, the framework was evaluated in offline mode: OptiFilter was assessed with offline, synthetic, and realistic data, and the adaptive classifier was evaluated with 10-fold cross-validation to estimate its accuracy.
In the second experiment, the framework was installed on a 1 to 10 GB network link and evaluated in real time in online mode. OptiFilter successfully converted the enormous volume of network data into structured connection vectors, and the adaptive classifier classified them precisely. A comparative study between the developed framework and other well-known IDS approaches shows that the proposed IDS framework outperforms all of them. This can be attributed to the following key points: the processing of the collected network data, the best overall performance (e.g., overall accuracy), the detection of unknown connections, and the development of a real-time intrusion detection model.
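
The following Python sketch illustrates only the classification-confidence margin idea described above, not the actual EGHSOM implementation: a connection vector is matched against labelled prototype units, and if the margin between the two best matches is too small the connection is flagged as unknown and handed to the normal-state (NNB) check. Prototypes, feature values, and the threshold are invented.

    import numpy as np

    prototypes = np.array([[0.1, 0.2, 0.0],    # unit 0: normal traffic
                           [0.9, 0.8, 0.7]])   # unit 1: known attack pattern
    labels = ["normal", "attack"]
    margin_threshold = 0.3  # below this margin the connection counts as unknown

    def classify(connection_vector):
        dists = np.linalg.norm(prototypes - connection_vector, axis=1)
        best, second = np.sort(dists)[:2]
        margin = second - best          # confidence margin between the two best units
        if margin < margin_threshold:
            return "unknown"            # hand over to the NNB model for further checks
        return labels[int(np.argmin(dists))]

    print(classify(np.array([0.15, 0.25, 0.05])))  # close to the normal prototype
    print(classify(np.array([0.5, 0.5, 0.4])))     # ambiguous -> unknown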

Relevance: 100.00%
Publisher:
Abstract:

The [2+2+2] cycloaddition reaction involves the formation of three carbon-carbon bonds in a single step using alkynes, alkenes, nitriles, carbonyls and other unsaturated reagents as reactants. It is one of the most elegant methods for the construction of polycyclic aromatic and heteroaromatic compounds, which have important academic and industrial uses. The thesis is divided into ten chapters including six related publications. The first study, based on Wilkinson's catalyst, RhCl(PPh3)3, compares the reaction mechanism of the [2+2+2] cycloaddition of acetylene with that obtained for the model complex RhCl(PH3)3. In an attempt to reduce the computational cost of DFT studies, this research project aimed to replace the PPh3 ligands by PH3, even though the electronic and steric effects produced by PPh3 ligands are significantly different from those of PH3. In this first study, detailed theoretical calculations were performed to determine the reaction mechanism for the two complexes. Although some differences were detected, it was found that modelling PPh3 by PH3 in the catalyst reduces the computational cost significantly while still providing qualitatively acceptable results. Taking into account the results of this earlier study, the model of Wilkinson's catalyst, RhCl(PH3)3, was applied to study different [2+2+2] cycloaddition reactions with unsaturated systems examined in the laboratory. Our research group found that totally closed systems, specifically 15- and 25-membered azamacrocycles, can afford benzenic compounds, except in the case of the 20-membered azamacrocycle (20-MAA), which was inactive with Wilkinson's catalyst. In this study, theoretical calculations allowed us to determine the origin of the different reactivity of the 20-MAA: the activation barrier for the oxidative addition of the two alkynes is higher than those obtained for the 15- and 25-membered macrocycles. This barrier was attributed primarily to the interaction energy, which corresponds to the energy released when the two deformed reagents interact in the transition state. The main factor that helped to explain the different reactivity observed was that the 20-MAA has a more stable and delocalized HOMO in the oxidative addition step. Moreover, we observed that the formation of a strained ten-membered ring during the cycloaddition of the 20-MAA introduces significant steric hindrance. Furthermore, in Chapter 5, an electrochemical study is presented in collaboration with Prof. Anny Jutand from Paris. This work allowed us to study the main steps of the catalytic cycle of the [2+2+2] cycloaddition reaction between diynes and a monoalkyne. The first kinetic data were obtained for the [2+2+2] cycloaddition process catalyzed by Wilkinson's catalyst, and it was observed that the rate-determining step of the reaction can change depending on the structure of the starting reagents. In the case of the [2+2+2] cycloaddition reaction involving two alkynes and one alkene in the same molecule (enediynes), it is well known that the oxidative coupling may occur either between the two alkynes, giving the corresponding metallacyclopentadiene, or between one alkyne and the alkene, affording the metallacyclopentene complex. The Wilkinson's model was used in DFT calculations to analyze the different factors that may influence the reaction mechanism.
Here it was observed that cyclic enediynes always prefer the oxidative coupling between the two alkyne moieties, whereas the acyclic cases have different preferences depending on the linker and the substituents on the alkynes. Moreover, the Wilkinson's model was used to explain the experimental results obtained in Chapter 7, where the [2+2+2] cycloaddition reaction of enediynes is studied varying the position of the double bond in the starting reagent. It was observed that enediynes of the yne-ene-yne type prefer the standard [2+2+2] cycloaddition reaction, while enediynes of the yne-yne-ene type undergo β-hydride elimination followed by reductive elimination at Wilkinson's catalyst, giving cyclohexadiene compounds that are isomers of those obtained through the standard [2+2+2] cycloaddition reaction. Finally, the last chapter of this thesis uses DFT calculations to determine the reaction mechanism when the macrocycles are treated with transition metals that are inactive in the [2+2+2] cycloaddition reaction but are thermally active, leading to new polycyclic compounds. Thus, a domino process combining an ene reaction and a Diels-Alder cycloaddition was described.

Relevance: 100.00%
Publisher:
Abstract:

The long-term stability, high accuracy, all-weather capability, high vertical resolution, and global coverage of Global Navigation Satellite System (GNSS) radio occultation (RO) make it a promising tool for global monitoring of atmospheric temperature change. With the aim of investigating and quantifying how well a GNSS RO observing system is able to detect climate trends, we are currently performing a (climate) observing system simulation experiment over the 25-year period 2001 to 2025, which involves quasi-realistic modeling of the neutral atmosphere and the ionosphere. We carried out two climate simulations with the general circulation model MAECHAM5 (Middle Atmosphere European Centre/Hamburg Model Version 5) of the MPI-M Hamburg, covering the period 2001–2025: one control run with natural variability only and one run that also includes anthropogenic forcings due to greenhouse gases, sulfate aerosols, and tropospheric ozone. On this basis, we perform quasi-realistic simulations of RO observables for a small GNSS receiver constellation (six satellites), state-of-the-art data processing for atmospheric profile retrieval, and a statistical analysis of temperature trends in both the “observed” climatology and the “true” climatology. Here we describe the setup of the experiment and results from a test bed study conducted to obtain a basic set of realistic estimates of observational errors (instrument- and retrieval-processing-related errors) and sampling errors (due to spatial-temporal undersampling). The test bed results, obtained for a typical summer season and compared to the climatic 2001–2025 trends from the MAECHAM5 simulation including anthropogenic forcing, were found encouraging for performing the full 25-year experiment. They indicate that observational and sampling errors (both contributing about 0.2 K) are consistent with recent estimates of these errors from real RO data and that they should be sufficiently small for monitoring the expected temperature trends in the global atmosphere over the next 10 to 20 years in most regions of the upper troposphere and lower stratosphere (UTLS). Inspection of the MAECHAM5 trends in different RO-accessible atmospheric parameters (microwave refractivity and pressure/geopotential height in addition to temperature) indicates complementary climate change sensitivity in different regions of the UTLS, so that optimized climate monitoring should combine information from all climatic key variables retrievable from GNSS RO data.
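
As a hedged illustration of the sampling-error concept (not the MAECHAM5 study itself), the sketch below compares the "true" climatology of a dense synthetic temperature field with the "observed" climatology computed only at the sparse set of points hit by occultation events; all numbers are synthetic.

    import numpy as np

    rng = np.random.default_rng(0)
    # Synthetic stand-in for a dense model temperature field in the UTLS [K].
    true_field = 220.0 + 2.0 * rng.standard_normal((90, 180))

    # Sparse, irregular sampling pattern standing in for a small RO constellation.
    occultation_mask = rng.random(true_field.shape) < 0.02

    true_climatology = true_field.mean()
    observed_climatology = true_field[occultation_mask].mean()
    sampling_error = observed_climatology - true_climatology
    print("sampling error: %+.3f K" % sampling_error)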

Relevance: 100.00%
Publisher:
Abstract:

GODIVA2 is a dynamic website that provides visual access to several terabytes of physically distributed, four-dimensional environmental data. It allows users to explore large datasets interactively without the need to install new software or download and understand complex data. Through the use of open international standards, GODIVA2 maintains a high level of interoperability with third-party systems, allowing diverse datasets to be mutually compared. Scientists can use the system to search for features in large datasets and to diagnose the output from numerical simulations and data processing algorithms. Data providers around Europe have adopted GODIVA2 as an INSPIRE-compliant dynamic quick-view system for providing visual access to their data.
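
GODIVA2-style viewers typically sit on top of an OGC Web Map Service (WMS); assuming such an endpoint, the sketch below builds a standard WMS 1.3.0 GetMap request with the TIME and ELEVATION dimensions used for four-dimensional data. The endpoint URL and layer name are placeholders, not a real GODIVA2 deployment.

    from urllib.parse import urlencode

    base_url = "https://example.org/ncWMS/wms"   # placeholder endpoint
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": "ocean/sea_water_temperature",  # hypothetical layer name
        "STYLES": "",
        "CRS": "CRS:84",
        "BBOX": "-180,-90,180,90",
        "WIDTH": "1024",
        "HEIGHT": "512",
        "FORMAT": "image/png",
        "TIME": "2010-06-15T00:00:00Z",           # optional dimension for 4-D data
        "ELEVATION": "-5.0",
    }
    print(base_url + "?" + urlencode(params))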

Relevance: 100.00%
Publisher:
Abstract:

This chapter introduces the latest practices and technologies in the interactive interpretation of environmental data. With environmental data becoming ever larger, more diverse and more complex, there is a need for a new generation of tools that provides new capabilities over and above those of the standard workhorses of science. These new tools aid the scientist in discovering interesting new features (and also problems) in large datasets by allowing the data to be explored interactively using simple, intuitive graphical tools. In this way, new discoveries are made that are commonly missed by automated batch data processing. This chapter discusses the characteristics of environmental science data, common current practice in data analysis and the supporting tools and infrastructure. New approaches are introduced and illustrated from the points of view of both the end user and the underlying technology. We conclude by speculating as to future developments in the field and what must be achieved to fulfil this vision.

Relevance: 100.00%
Publisher:
Abstract:

In recent years, there has been increasing interest in adopting emerging ubiquitous sensor network (USN) technologies for instrumentation within a variety of sustainability systems. USN is emerging as a sensing paradigm that the sustainability management field has begun to consider as an alternative to traditional tethered monitoring systems. Researchers have been discovering that USN is an exciting technology that should not be viewed simply as a substitute for traditional tethered monitoring systems. In this study, we investigate how a movement monitoring measurement system for a complex building can be developed as a research environment for USN and related decision-supportive technologies. To address the apparent danger of building movement, agent-mediated communication concepts have been designed to autonomously manage the large volumes of exchanged information. We additionally detail the design of the proposed system, including its principles, data processing algorithms, system architecture, and user interface specifics. Results of the test and case study demonstrate the effectiveness of the USN-based data acquisition system for real-time monitoring of movement operations.
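
The study's own algorithms are not reproduced here; the following sketch merely illustrates the agent-mediated idea of keeping the volume of exchanged information manageable, with a software agent that forwards an alert only when a movement reading exceeds a threshold. Sensor names, readings, and the threshold are invented.

    from dataclasses import dataclass

    @dataclass
    class MovementReading:
        sensor_id: str
        displacement_mm: float

    class MonitoringAgent:
        """Mediates between raw sensors and the decision layer."""

        def __init__(self, alert_threshold_mm=5.0):
            self.alert_threshold_mm = alert_threshold_mm

        def process(self, reading):
            """Return an alert message for abnormal movement, else None."""
            if abs(reading.displacement_mm) > self.alert_threshold_mm:
                return "ALERT %s: %.1f mm" % (reading.sensor_id, reading.displacement_mm)
            return None

    agent = MonitoringAgent()
    for r in [MovementReading("N-12", 1.3), MovementReading("N-07", 8.2)]:
        message = agent.process(r)
        if message:
            print(message)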

Relevance: 100.00%
Publisher:
Abstract:

Environment monitoring applications using Wireless Sensor Networks (WSNs) have received considerable attention in recent years. In much of this research, tasks such as sensor data processing, decision making about environment states and events, and the sending of emergency messages are carried out by a remote server. A proposed cross-layer protocol for two different applications, in which the reliability of delivered data, delay, and the lifetime of the network need to be considered, has been simulated, and the results are presented in this paper. A WSN designed for the proposed applications needs efficient MAC and routing protocols to guarantee the reliability of the data delivered from source nodes to the sink. A cross-layer design based on [1] has been extended and simulated for the proposed applications, with new features such as route discovery algorithms added. Simulation results show that the proposed cross-layer protocol can conserve energy for nodes and provide the required performance in terms of network lifetime, delay, and reliability.
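
The cross-layer protocol itself is not specified in the abstract, so the sketch below only illustrates one common ingredient of such designs, an energy- and hop-aware next-hop choice during route discovery; the neighbour table and weighting are invented.

    def choose_next_hop(neighbours):
        """Pick the neighbour balancing residual energy against hops to the sink."""
        def score(entry):
            node_id, residual_energy, hops_to_sink = entry
            return residual_energy - 0.5 * hops_to_sink   # invented weighting
        return max(neighbours, key=score)[0]

    # (node id, residual energy fraction, hop count to the sink)
    neighbours = [("n3", 0.80, 4), ("n7", 0.60, 2), ("n9", 0.95, 6)]
    print(choose_next_hop(neighbours))  # -> 'n7' under this weighting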

Relevance: 100.00%
Publisher:
Abstract:

Since first reported in 2005, mononuclear ruthenium water oxidation catalysts have attracted a great deal of attention due to their catalytic performance and synthetic flexibility. In particular, ligands coordinated to a Ru metal centre play an important role in the catalytic mechanisms, exhibiting significant impact on catalyst efficiency, stability and activity towards water oxidation. This review focuses on finding possible correlations between the ligand effects and the activity of mononuclear Ru aqua and non-aqua complexes as water oxidation catalysts. The ligand effects highlighted in the text include the electronic nature of core ligands and their substituents, the trans–cis effect, steric hindrance and the strain effect, the net charge effect, the geometric arrangement of the aqua ligand, and supramolecular effects, e.g., hydrogen bonding and the influence of a pendant base. At the present level of knowledge, the outcome is not always obvious. A deeper understanding of the ligand effects, based on new input data, is essential for further progress towards the rational development of novel catalysts featuring enhanced activity in water oxidation.

Relevance: 100.00%
Publisher:
Abstract:

Many organizations struggle with the massive amount of data they collect. Today, data do more than serve as the ingredients for churning out statistical reports. They help support efficient operations in many organizations, and to some extent, data provide the competitive intelligence organizations need to survive in today's economy. Data mining can't always deliver timely and relevant results because data are constantly changing. However, stream-data processing might be more effective, judging by the Matrix project.
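
A minimal sketch of the contrast being drawn (not the Matrix project itself): instead of re-running a batch computation over constantly changing data, a stream-style aggregate is updated incrementally as each record arrives. The event stream is invented.

    from collections import Counter

    running_totals = Counter()

    def on_event(key, amount):
        """Incrementally update the aggregate instead of reprocessing all data."""
        running_totals[key] += amount
        return running_totals[key]

    for key, amount in [("store_a", 10.0), ("store_b", 4.5), ("store_a", 2.5)]:
        print(key, on_event(key, amount))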

Relevance: 100.00%
Publisher:
Abstract:

One of the fundamental machine learning tasks is predictive classification. Given that organisations collect an ever increasing amount of data, predictive classification methods must be able to handle large amounts of data effectively and efficiently. However, present requirements push existing algorithms to, and sometimes beyond, their limits, since many classification prediction algorithms were designed when currently common data set sizes were beyond imagination. This has led to a significant amount of research into ways of making classification learning algorithms more effective and efficient. Although substantial progress has been made, a number of key questions have not been answered. This dissertation investigates two of these key questions. The first is whether different types of algorithms from those currently employed are required when using large data sets. This is answered by analysing how the bias plus variance decomposition of predictive classification error changes as training set size is increased. Experiments find that larger training sets require different types of algorithms from those currently used. Some insight into the characteristics of suitable algorithms is provided, which may give direction for the development of future classification prediction algorithms specifically designed for use with large data sets. The second question investigated is the role of sampling in machine learning with large data sets. Sampling has long been used as a means of avoiding the need to scale up algorithms to suit the size of the data set, by scaling down the size of the data set to suit the algorithm. However, the costs of performing sampling have not been widely explored. Two popular sampling methods are compared with learning from all available data in terms of predictive accuracy, model complexity, and execution time. The comparison shows that sub-sampling generally produces models with accuracy close to, and sometimes greater than, that obtainable from learning with all available data. This result suggests that it may be possible to develop algorithms that take advantage of the sub-sampling methodology to reduce the time required to infer a model while sacrificing little, if any, accuracy. Methods of improving effective and efficient learning via sampling are also investigated, and new sampling methodologies are proposed. These methodologies include using a varying proportion of instances to determine the next inference step and using a statistical calculation at each inference step to determine a sufficient sample size. Experiments show that using a statistical calculation of sample size can not only substantially reduce execution time but can do so with only a small loss, and occasional gain, in accuracy. One of the common uses of sampling is in the construction of learning curves. Learning curves are often used to attempt to determine the optimal training size that will maximally reduce execution time without being detrimental to accuracy. An analysis of the performance of methods for detecting convergence of learning curves is performed, with the focus of the analysis on methods that calculate the gradient of the tangent to the curve. Given that such methods can be susceptible to local accuracy plateaus, an investigation into the frequency of local plateaus is also performed.
It is shown that local accuracy plateaus are a common occurrence, and that ensuring a small loss of accuracy often results in greater computational cost than learning from all available data. These results cast doubt on the applicability of gradient-of-tangent methods for detecting convergence, and on the viability of learning curves for reducing execution time in general.
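
The following sketch illustrates the learning-curve procedure discussed above, using scikit-learn as a stand-in rather than the dissertation's own experimental setup: a classifier is trained on progressively larger samples, and sampling stops when the gradient of the tangent to the curve falls below a tolerance. As noted in the text, local accuracy plateaus can fool exactly this kind of convergence test. The data set, classifier, sample sizes, and tolerance are invented.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=20000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    sizes = [250, 500, 1000, 2000, 4000, 8000, len(X_train)]
    tolerance = 0.002   # minimum accuracy gain per additional 1000 instances
    previous = None

    for n in sizes:
        model = DecisionTreeClassifier(random_state=0).fit(X_train[:n], y_train[:n])
        accuracy = model.score(X_test, y_test)
        if previous is None:
            print("n=%5d accuracy=%.3f" % (n, accuracy))
        else:
            slope = (accuracy - previous[1]) / ((n - previous[0]) / 1000.0)
            print("n=%5d accuracy=%.3f slope=%+.4f" % (n, accuracy, slope))
            if slope < tolerance:    # curve looks flat here -> stop sampling
                break
        previous = (n, accuracy)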