967 resultados para satellite data processing
Resumo:
We present a new algorithm called TITANIC for computing concept lattices. It is based on data mining techniques for computing frequent itemsets. The algorithm is experimentally evaluated and compared with B. Ganter's Next-Closure algorithm.
Resumo:
In this paper, we discuss Conceptual Knowledge Discovery in Databases (CKDD) in its connection with Data Analysis. Our approach is based on Formal Concept Analysis, a mathematical theory which has been developed and proven useful during the last 20 years. Formal Concept Analysis has led to a theory of conceptual information systems which has been applied by using the management system TOSCANA in a wide range of domains. In this paper, we use such an application in database marketing to demonstrate how methods and procedures of CKDD can be applied in Data Analysis. In particular, we show the interplay and integration of data mining and data analysis techniques based on Formal Concept Analysis. The main concern of this paper is to explain how the transition from data to knowledge can be supported by a TOSCANA system. To clarify the transition steps we discuss their correspondence to the five levels of knowledge representation established by R. Brachman and to the steps of empirically grounded theory building proposed by A. Strauss and J. Corbin.
Resumo:
Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.
Resumo:
Among many other knowledge representations formalisms, Ontologies and Formal Concept Analysis (FCA) aim at modeling ‘concepts’. We discuss how these two formalisms may complement another from an application point of view. In particular, we will see how FCA can be used to support Ontology Engineering, and how ontologies can be exploited in FCA applications. The interplay of FCA and ontologies is studied along the life cycle of an ontology: (i) FCA can support the building of the ontology as a learning technique. (ii) The established ontology can be analyzed and navigated by using techniques of FCA. (iii) Last but not least, the ontology may be used to improve an FCA application.
Resumo:
About ten years ago, triadic contexts were presented by Lehmann and Wille as an extension of Formal Concept Analysis. However, they have rarely been used up to now, which may be due to the rather complex structure of the resulting diagrams. In this paper, we go one step back and discuss how traditional line diagrams of standard (dyadic) concept lattices can be used for exploring and navigating triadic data. Our approach is inspired by the slice & dice paradigm of On-Line-Analytical Processing (OLAP). We recall the basic ideas of OLAP, and show how they may be transferred to triadic contexts. For modeling the navigation patterns a user might follow, we use the formalisms of finite state machines. In order to present the benefits of our model, we show how it can be used for navigating the IT Baseline Protection Manual of the German Federal Office for Information Security.
Resumo:
Die zunehmende Vernetzung der Informations- und Kommunikationssysteme führt zu einer weiteren Erhöhung der Komplexität und damit auch zu einer weiteren Zunahme von Sicherheitslücken. Klassische Schutzmechanismen wie Firewall-Systeme und Anti-Malware-Lösungen bieten schon lange keinen Schutz mehr vor Eindringversuchen in IT-Infrastrukturen. Als ein sehr wirkungsvolles Instrument zum Schutz gegenüber Cyber-Attacken haben sich hierbei die Intrusion Detection Systeme (IDS) etabliert. Solche Systeme sammeln und analysieren Informationen von Netzwerkkomponenten und Rechnern, um ungewöhnliches Verhalten und Sicherheitsverletzungen automatisiert festzustellen. Während signatur-basierte Ansätze nur bereits bekannte Angriffsmuster detektieren können, sind anomalie-basierte IDS auch in der Lage, neue bisher unbekannte Angriffe (Zero-Day-Attacks) frühzeitig zu erkennen. Das Kernproblem von Intrusion Detection Systeme besteht jedoch in der optimalen Verarbeitung der gewaltigen Netzdaten und der Entwicklung eines in Echtzeit arbeitenden adaptiven Erkennungsmodells. Um diese Herausforderungen lösen zu können, stellt diese Dissertation ein Framework bereit, das aus zwei Hauptteilen besteht. Der erste Teil, OptiFilter genannt, verwendet ein dynamisches "Queuing Concept", um die zahlreich anfallenden Netzdaten weiter zu verarbeiten, baut fortlaufend Netzverbindungen auf, und exportiert strukturierte Input-Daten für das IDS. Den zweiten Teil stellt ein adaptiver Klassifikator dar, der ein Klassifikator-Modell basierend auf "Enhanced Growing Hierarchical Self Organizing Map" (EGHSOM), ein Modell für Netzwerk Normalzustand (NNB) und ein "Update Model" umfasst. In dem OptiFilter werden Tcpdump und SNMP traps benutzt, um die Netzwerkpakete und Hostereignisse fortlaufend zu aggregieren. Diese aggregierten Netzwerkpackete und Hostereignisse werden weiter analysiert und in Verbindungsvektoren umgewandelt. Zur Verbesserung der Erkennungsrate des adaptiven Klassifikators wird das künstliche neuronale Netz GHSOM intensiv untersucht und wesentlich weiterentwickelt. In dieser Dissertation werden unterschiedliche Ansätze vorgeschlagen und diskutiert. So wird eine classification-confidence margin threshold definiert, um die unbekannten bösartigen Verbindungen aufzudecken, die Stabilität der Wachstumstopologie durch neuartige Ansätze für die Initialisierung der Gewichtvektoren und durch die Stärkung der Winner Neuronen erhöht, und ein selbst-adaptives Verfahren eingeführt, um das Modell ständig aktualisieren zu können. Darüber hinaus besteht die Hauptaufgabe des NNB-Modells in der weiteren Untersuchung der erkannten unbekannten Verbindungen von der EGHSOM und der Überprüfung, ob sie normal sind. Jedoch, ändern sich die Netzverkehrsdaten wegen des Concept drif Phänomens ständig, was in Echtzeit zur Erzeugung nicht stationärer Netzdaten führt. Dieses Phänomen wird von dem Update-Modell besser kontrolliert. Das EGHSOM-Modell kann die neuen Anomalien effektiv erkennen und das NNB-Model passt die Änderungen in Netzdaten optimal an. Bei den experimentellen Untersuchungen hat das Framework erfolgversprechende Ergebnisse gezeigt. Im ersten Experiment wurde das Framework in Offline-Betriebsmodus evaluiert. Der OptiFilter wurde mit offline-, synthetischen- und realistischen Daten ausgewertet. Der adaptive Klassifikator wurde mit dem 10-Fold Cross Validation Verfahren evaluiert, um dessen Genauigkeit abzuschätzen. Im zweiten Experiment wurde das Framework auf einer 1 bis 10 GB Netzwerkstrecke installiert und im Online-Betriebsmodus in Echtzeit ausgewertet. Der OptiFilter hat erfolgreich die gewaltige Menge von Netzdaten in die strukturierten Verbindungsvektoren umgewandelt und der adaptive Klassifikator hat sie präzise klassifiziert. Die Vergleichsstudie zwischen dem entwickelten Framework und anderen bekannten IDS-Ansätzen zeigt, dass der vorgeschlagene IDSFramework alle anderen Ansätze übertrifft. Dies lässt sich auf folgende Kernpunkte zurückführen: Bearbeitung der gesammelten Netzdaten, Erreichung der besten Performanz (wie die Gesamtgenauigkeit), Detektieren unbekannter Verbindungen und Entwicklung des in Echtzeit arbeitenden Erkennungsmodells von Eindringversuchen.
Resumo:
This paper reports on a new satellite sensor, the Geostationary Earth Radiation Budget (GERB) experiment. GERB is designed to make the first measurements of the Earth's radiation budget from geostationary orbit. Measurements at high absolute accuracy of the reflected sunlight from the Earth, and the thermal radiation emitted by the Earth are made every 15 min, with a spatial resolution at the subsatellite point of 44.6 km (north–south) by 39.3 km (east–west). With knowledge of the incoming solar constant, this gives the primary forcing and response components of the top-of-atmosphere radiation. The first GERB instrument is an instrument of opportunity on Meteosat-8, a new spin-stabilized spacecraft platform also carrying the Spinning Enhanced Visible and Infrared (SEVIRI) sensor, which is currently positioned over the equator at 3.5°W. This overview of the project includes a description of the instrument design and its preflight and in-flight calibration. An evaluation of the instrument performance after its first year in orbit, including comparisons with data from the Clouds and the Earth's Radiant Energy System (CERES) satellite sensors and with output from numerical models, are also presented. After a brief summary of the data processing system and data products, some of the scientific studies that are being undertaken using these early data are described. This marks the beginning of a decade or more of observations from GERB, as subsequent models will fly on each of the four Meteosat Second Generation satellites.
Resumo:
GODIVA2 is a dynamic website that provides visual access to several terabytes of physically distributed, four-dimensional environmental data. It allows users to explore large datasets interactively without the need to install new software or download and understand complex data. Through the use of open international standards, GODIVA2 maintains a high level of interoperability with third-party systems, allowing diverse datasets to be mutually compared. Scientists can use the system to search for features in large datasets and to diagnose the output from numerical simulations and data processing algorithms. Data providers around Europe have adopted GODIVA2 as an INSPIRE-compliant dynamic quick-view system for providing visual access to their data.
Resumo:
Real-time rainfall monitoring in Africa is of great practical importance for operational applications in hydrology and agriculture. Satellite data have been used in this context for many years because of the lack of surface observations. This paper describes an improved artificial neural network algorithm for operational applications. The algorithm combines numerical weather model information with the satellite data. Using this algorithm, daily rainfall estimates were derived for 4 yr of the Ethiopian and Zambian main rainy seasons and were compared with two other algorithms-a multiple linear regression making use of the same information as that of the neural network and a satellite-only method. All algorithms were validated against rain gauge data. Overall, the neural network performs best, but the extent to which it does so depends on the calibration/validation protocol. The advantages of the neural network are most evident when calibration data are numerous and close in space and time to the validation data. This result emphasizes the importance of a real-time calibration system.
Resumo:
[ 1] The European Centre for Medium-Range Weather Forecasts (ECMWF) 40-year Reanalysis (ERA-40) ozone and water vapor reanalysis fields during the 1990s have been compared with independent satellite data from the Halogen Occultation Experiment (HALOE) and Microwave Limb Sounder (MLS) instruments on board the Upper Atmosphere Research Satellite (UARS). In addition, ERA-40 has been compared with aircraft data from the Measurements of Ozone and Water Vapour by Airbus In-Service Aircraft (MOZAIC) program. Overall, in comparison with the values derived from the independent observations, the upper stratosphere in ERA-40 has about 5 - 10% more ozone and 15 - 20% less water vapor. This dry bias in the reanalysis appears to be global and extends into the middle stratosphere down to 40 hPa. Most of the discrepancies and seasonal variations between ERA-40 and the independent observations occur within the upper troposphere over the tropics and the lower stratosphere over the high latitudes. ERA-40 reproduces a weaker Antarctic ozone hole, and of less vertical extent, than the independent observations; values in the ozone maximum in the tropical stratosphere are lower for the reanalysis. ERA-40 mixing ratios of water vapor are considerably larger than those for MOZAIC, typically by 20% in the tropical upper troposphere, and they may exceed 60% in the lower stratosphere over high latitudes. The results imply that the Brewer-Dobson circulation in the ECMWF reanalysis system is too fast, as is also evidenced by deficiencies in the way ERA-40 reproduces the water vapor "tape recorder'' signal in the tropical stratosphere. Finally, the paper examines the biases and their temporal variation during the 1990s in the way ERA-40 compares to the independent observations. We also discuss how the evaluation results depend on the instrument used, as well as on the version of the data.
Resumo:
There is remarkable agreement in expectations today for vastly improved ocean data management a decade from now -- capabilities that will help to bring significant benefits to ocean research and to society. Advancing data management to such a degree, however, will require cultural and policy changes that are slow to effect. The technological foundations upon which data management systems are built are certain to continue advancing rapidly in parallel. These considerations argue for adopting attitudes of pragmatism and realism when planning data management strategies. In this paper we adopt those attitudes as we outline opportunities for progress in ocean data management. We begin with a synopsis of expectations for integrated ocean data management a decade from now. We discuss factors that should be considered by those evaluating candidate “standards”. We highlight challenges and opportunities in a number of technical areas, including “Web 2.0” applications, data modeling, data discovery and metadata, real-time operational data, archival of data, biological data management and satellite data management. We discuss the importance of investments in the development of software toolkits to accelerate progress. We conclude the paper by recommending a few specific, short term targets for implementation, that we believe to be both significant and achievable, and calling for action by community leadership to effect these advancements.
Resumo:
A system for continuous data assimilation is presented and discussed. To simulate the dynamical development a channel version of a balanced barotropic model is used and geopotential (height) data are assimilated into the models computations as data become available. In the first experiment the updating is performed every 24th, 12th and 6th hours with a given network. The stations are distributed at random in 4 groups in order to simulate 4 areas with different density of stations. Optimum interpolation is performed for the difference between the forecast and the valid observations. The RMS-error of the analyses is reduced in time, and the error being smaller the more frequent the updating is performed. The updating every 6th hour yields an error in the analysis less than the RMS-error of the observation. In a second experiment the updating is performed by data from a moving satellite with a side-scan capability of about 15°. If the satellite data are analysed at every time step before they are introduced into the system the error of the analysis is reduced to a value below the RMS-error of the observation already after 24 hours and yields as a whole a better result than updating from a fixed network. If the satellite data are introduced without any modification the error of the analysis is reduced much slower and it takes about 4 days to reach a comparable result to the one where the data have been analysed.
Resumo:
Considerable progress has taken place in numerical weather prediction over the last decade. It has been possible to extend predictive skills in the extra-tropics of the Northern Hemisphere during the winter from less than five days to seven days. Similar improvements, albeit on a lower level, have taken place in the Southern Hemisphere. Another example of improvement in the forecasts is the prediction of intense synoptic phenomena such as cyclogenesis which on the whole is quite successful with the most advanced operational models (Bengtsson (1989), Gadd and Kruze (1988)). A careful examination shows that there are no single causes for the improvements in predictive skill, but instead they are due to several different factors encompassing the forecasting system as a whole (Bengtsson, 1985). In this paper we will focus our attention on the role of data-assimilation and the effect it may have on reducing the initial error and hence improving the forecast. The first part of the paper contains a theoretical discussion on error growth in simple data assimilation systems, following Leith (1983). In the second part we will apply the result on actual forecast data from ECMWF. The potential for further forecast improvements within the framework of the present observing system in the two hemispheres will be discussed.