2 resultados para Cluster Counting Algorithm
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
This thesis analyses problems related to the applicability, in business environments, of Process Mining tools and techniques. The first contribution is a presentation of the state of the art of Process Mining and a characterization of companies, in terms of their "process awareness". The work continues identifying circumstance where problems can emerge: data preparation; actual mining; and results interpretation. Other problems are the configuration of parameters by not-expert users and computational complexity. We concentrate on two possible scenarios: "batch" and "on-line" Process Mining. Concerning the batch Process Mining, we first investigated the data preparation problem and we proposed a solution for the identification of the "case-ids" whenever this field is not explicitly indicated. After that, we concentrated on problems at mining time and we propose the generalization of a well-known control-flow discovery algorithm in order to exploit non instantaneous events. The usage of interval-based recording leads to an important improvement of performance. Later on, we report our work on the parameters configuration for not-expert users. We present two approaches to select the "best" parameters configuration: one is completely autonomous; the other requires human interaction to navigate a hierarchy of candidate models. Concerning the data interpretation and results evaluation, we propose two metrics: a model-to-model and a model-to-log. Finally, we present an automatic approach for the extension of a control-flow model with social information, in order to simplify the analysis of these perspectives. The second part of this thesis deals with control-flow discovery algorithms in on-line settings. We propose a formal definition of the problem, and two baseline approaches. The actual mining algorithms proposed are two: the first is the adaptation, to the control-flow discovery problem, of a frequency counting algorithm; the second constitutes a framework of models which can be used for different kinds of streams (stationary versus evolving).
Resumo:
The present thesis focuses on the on-fault slip distribution of large earthquakes in the framework of tsunami hazard assessment and tsunami warning improvement. It is widely known that ruptures on seismic faults are strongly heterogeneous. In the case of tsunamigenic earthquakes, the slip heterogeneity strongly influences the spatial distribution of the largest tsunami effects along the nearest coastlines. Unfortunately, after an earthquake occurs, the so-called finite-fault models (FFM) describing the coseismic on-fault slip pattern becomes available over time scales that are incompatible with early tsunami warning purposes, especially in the near field. Our work aims to characterize the slip heterogeneity in a fast, but still suitable way. Using finite-fault models to build a starting dataset of seismic events, the characteristics of the fault planes are studied with respect to the magnitude. The patterns of the slip distribution on the rupture plane, analysed with a cluster identification algorithm, reveal a preferential single-asperity representation that can be approximated by a two-dimensional Gaussian slip distribution (2D GD). The goodness of the 2D GD model is compared to other distributions used in literature and its ability to represent the slip heterogeneity in the form of the main asperity is proven. The magnitude dependence of the 2D GD parameters is investigated and turns out to be of primary importance from an early warning perspective. The Gaussian model is applied to the 16 September 2015 Illapel, Chile, earthquake and used to compute early tsunami predictions that are satisfactorily compared with the available observations. The fast computation of the 2D GD and its suitability in representing the slip complexity of the seismic source make it a useful tool for the tsunami early warning assessments, especially for what concerns the near field.