8 results for Gaussian Probability Distribution
in AMS Tesi di Dottorato - Alm@DL - Universit
Abstract:
The aim of this Thesis is to investigate whether observations of the epoch of reionization can probe not only the evolution of the state of the IGM, but also the cosmological background in which this process occurs. The history of IGM ionization is affected by the evolution of the sources of ionizing photons, which, under a structure formation paradigm determined by the hierarchical growth of matter fluctuations, depends strongly on the characteristics of the background universe. For this purpose, we have analysed the reionization history in innovative cosmological frameworks that are still in agreement with recent observational tests from the SNIa and CMB probes, comparing our results with the reionization scenario predicted by the commonly used LCDM cosmology. In particular, we have considered two alternative universes. The first is a flat universe dominated at late epochs by a dynamic dark energy component, characterized by an equation of state evolving in time. The second is an LCDM universe characterized by a primordial overdensity field with a non-Gaussian probability distribution. The reionization scenario has been investigated through semi-analytic approaches based on the hierarchical growth of matter fluctuations and on suitable assumptions concerning the ionization and recombination of the IGM. We make predictions for the evolution and distribution of the HII regions, and for the global features of reionization, that can be constrained by future observations. Finally, we briefly discuss the possible future prospects of this Thesis work.
Abstract:
The present thesis focuses on the on-fault slip distribution of large earthquakes in the framework of tsunami hazard assessment and tsunami warning improvement. It is widely known that ruptures on seismic faults are strongly heterogeneous. In the case of tsunamigenic earthquakes, the slip heterogeneity strongly influences the spatial distribution of the largest tsunami effects along the nearest coastlines. Unfortunately, after an earthquake occurs, the so-called finite-fault models (FFM) describing the coseismic on-fault slip pattern become available over time scales that are incompatible with early tsunami warning purposes, especially in the near field. Our work aims to characterize the slip heterogeneity in a fast but still suitable way. Using finite-fault models to build a starting dataset of seismic events, the characteristics of the fault planes are studied with respect to the magnitude. The patterns of the slip distribution on the rupture plane, analysed with a cluster identification algorithm, reveal a preferential single-asperity representation that can be approximated by a two-dimensional Gaussian slip distribution (2D GD). The goodness of fit of the 2D GD model is compared with that of other distributions used in the literature, and its ability to represent the slip heterogeneity in the form of the main asperity is demonstrated. The magnitude dependence of the 2D GD parameters is investigated and turns out to be of primary importance from an early warning perspective. The Gaussian model is applied to the 16 September 2015 Illapel, Chile, earthquake and used to compute early tsunami predictions that compare satisfactorily with the available observations. The fast computation of the 2D GD and its suitability for representing the slip complexity of the seismic source make it a useful tool for tsunami early warning assessments, especially in the near field.
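As an illustration of the two-dimensional Gaussian slip distribution mentioned in this abstract, the following minimal sketch evaluates a single Gaussian asperity on a grid of subfaults. The parameterization (centre, two standard deviations, peak slip), the function names and every numerical value are assumptions made for the example; the thesis's own parameterization and its magnitude scaling are not given in the abstract.

```python
import math


def gaussian_slip(x, y, x0, y0, sigma_x, sigma_y, peak_slip):
    """Slip at fault-plane point (x, y) for one 2D Gaussian asperity.

    (x0, y0) is the asperity centre, sigma_x and sigma_y its along-strike
    and along-dip spreads, peak_slip the slip at the centre.
    All of these are hypothetical parameters for this sketch.
    """
    dx = (x - x0) / sigma_x
    dy = (y - y0) / sigma_y
    return peak_slip * math.exp(-0.5 * (dx * dx + dy * dy))


def slip_grid(length_km, width_km, n_strike, n_dip, **params):
    """Evaluate the slip at the centres of an n_strike x n_dip subfault grid."""
    cell_x = length_km / n_strike
    cell_y = width_km / n_dip
    return [
        [
            gaussian_slip((i + 0.5) * cell_x, (j + 0.5) * cell_y, **params)
            for i in range(n_strike)
        ]
        for j in range(n_dip)
    ]


# Hypothetical 200 km x 100 km fault with the asperity at its centre.
grid = slip_grid(
    200.0, 100.0, 20, 10,
    x0=100.0, y0=50.0, sigma_x=40.0, sigma_y=20.0, peak_slip=8.0,
)
```

Because the model has only a handful of parameters, the grid can be filled in a single pass, which is consistent with the abstract's emphasis on fast computation.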
Abstract:
Hydrologic risk (and the closely related hydro-geologic risk) is, and has always been, a very relevant issue, owing to the severe human and economic losses that flooding, or water in general, may cause. Floods are natural phenomena, often catastrophic, and cannot be avoided, but the damage they cause can be reduced if they are predicted sufficiently in advance. For this reason, flood forecasting plays an essential role in hydro-geological and hydrological risk prevention. Thanks to the development of sophisticated meteorological, hydrologic and hydraulic models, flood forecasting has made significant progress in recent decades. Nonetheless, models are imperfect, which means that we are still left with a residual uncertainty about what will actually happen. This type of uncertainty is what is discussed and analyzed in this thesis. In operational problems, the ultimate aim of forecasting systems is not to reproduce the river behavior; that is only a means of reducing the uncertainty about what will happen as a consequence of a precipitation event. In other words, the main objective is to assess whether or not preventive interventions should be adopted and which operational strategy may represent the best option. The main problem for a decision maker is to interpret model results and translate them into an effective intervention strategy. To make this possible, it is necessary to define clearly what is meant by uncertainty, since the literature is often confused on this issue. Therefore, the first objective of this thesis is to clarify this concept, starting with a key question: should the choice of intervention strategy be based on an evaluation of the model prediction, judged by its ability to represent reality, or on an evaluation of what will actually happen, given the information provided by the model forecast?
Once this idea has been made unambiguous, the other main concern of this work is to develop a tool that can provide effective decision support, making objective and realistic risk evaluations possible. In particular, such a tool should be able to provide an uncertainty assessment that is as accurate as possible. This means primarily three things: it must be able to correctly combine all the available deterministic forecasts, it must assess the probability distribution of the predicted quantity, and it must quantify the flooding probability. Furthermore, given that the time available to implement prevention strategies is often limited, the flooding probability has to be linked to the time of occurrence. For this reason, it is necessary to quantify the flooding probability within a time horizon related to the time required to implement the intervention strategy, and it is also necessary to assess the probability distribution of the flooding time.
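To make the notion of a flooding probability within a time horizon concrete, the sketch below assumes that the predictive distribution of the water level at each lead time is Gaussian, with a known mean and standard deviation. This Gaussian assumption, the function names and the numbers are illustrative only and are not taken from the thesis. Since the dependence between lead times is unknown here, the code reports only bounds on the probability of at least one exceedance, which hold for any dependence structure.

```python
import math


def exceedance_probability(threshold, mean, std):
    """P(level > threshold) for an assumed Gaussian predictive distribution."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def horizon_bounds(threshold, forecasts):
    """Bounds on P(at least one exceedance within the horizon).

    forecasts is a list of (mean, std) pairs, one per lead time.
    Lower bound: the largest single-step probability.
    Upper bound: the sum of the single-step probabilities, capped at 1.
    """
    probs = [exceedance_probability(threshold, m, s) for m, s in forecasts]
    return max(probs), min(1.0, sum(probs))


# Hypothetical predictive means and standard deviations (metres) for three
# lead times, and a hypothetical flooding threshold of 5 m.
forecasts = [(3.0, 0.5), (4.2, 0.6), (4.8, 0.8)]
lower, upper = horizon_bounds(5.0, forecasts)
```

Narrowing the gap between the two bounds requires a joint model of the levels across lead times, which is the kind of information a full predictive uncertainty processor would have to supply.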
Abstract:
The literature offers different ways to perform cluster analysis of categorical data, and the choice among them is strongly related to the aim of the researcher, leaving aside time and economic constraints. The main approaches to clustering are usually divided into model-based and distance-based methods: the former assume that objects belonging to the same class are similar in the sense that their observed values come from the same probability distribution, whose parameters are unknown and need to be estimated; the latter evaluate distances among objects with a defined dissimilarity measure and, based on it, allocate units to the closest group. In clustering, one may be interested in classifying similar objects into groups, or in finding observations that come from the same true homogeneous distribution. But do both of these aims lead to the same clustering? And how good are clustering methods designed to fulfil one of these aims in terms of the other? To answer these questions, two approaches, namely a latent class model (a mixture of multinomial distributions) and partitioning around medoids, are evaluated and compared by means of the Adjusted Rand Index, the Average Silhouette Width and the Pearson-Gamma index in a fairly wide simulation study. Simulation outcomes are plotted in two-dimensional graphs via Multidimensional Scaling; the size of each point is proportional to the number of overlapping points, and different colours are used according to cluster membership.
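For readers unfamiliar with the Adjusted Rand Index used in this comparison, here is a minimal sketch of the standard pair-counting formula in plain Python. It illustrates the index itself, not the simulation study; the function name and the toy label vectors are made up for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two partitions of the same objects."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both label sequences must have the same length")
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        raise ValueError("at least two objects are required")

    # Pairs of objects placed together in both partitions, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Toy example: the same grouping with swapped labels, then a crossed grouping.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index does not depend on how clusters are named, so a relabelled but otherwise identical partition scores 1, while agreement no better than chance scores around 0 and can be negative.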
Abstract:
Non-Equilibrium Statistical Mechanics is a broad subject. Roughly speaking, it deals with systems which have not yet relaxed to an equilibrium state, or else with systems which are in a steady non-equilibrium state, or with more general situations. They are characterized by external forcing and internal fluxes, resulting in a net production of entropy which quantifies dissipation and the extent to which, by the Second Law of Thermodynamics, time-reversal invariance is broken. In this thesis we discuss some of the mathematical structures involved in generic discrete-state-space non-equilibrium systems, which we depict as networks entirely analogous to electrical networks. We define suitable observables and derive their linear regime relationships; we discuss a duality between external and internal observables that reverses the roles of the system and of the environment; and we show that network observables serve as constraints for a derivation of the minimum entropy production principle. We dwell on deep combinatorial aspects regarding linear response determinants, which are related to spanning tree polynomials in graph theory, and we give a geometrical interpretation of observables in terms of Wilson loops of a connection and gauge degrees of freedom. We specialize the formalism to continuous-time Markov chains, give a physical interpretation of observables in terms of locally detailed balanced rates, prove many variants of the fluctuation theorem, and show that a well-known expression for the entropy production due to Schnakenberg descends from considerations of gauge invariance, where the gauge symmetry is related to the freedom in the choice of a prior probability distribution.
As an additional topic of geometrical flavor related to continuous-time Markov chains, we discuss the Fisher-Rao geometry of nonequilibrium decay modes, showing that the Fisher matrix contains information about many aspects of non-equilibrium behavior, including non-equilibrium phase transitions and superposition of modes. We establish a sort of statistical equivalence principle and discuss the behavior of the Fisher matrix under time-reversal. To conclude, we propose that geometry and combinatorics might greatly increase our understanding of nonequilibrium phenomena.
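The Schnakenberg expression for the entropy production rate of a continuous-time Markov chain, mentioned in this abstract, can be written as a sum over pairs of states of (forward flux minus backward flux) times the logarithm of their ratio. The sketch below evaluates it for a three-state ring in units where Boltzmann's constant is 1. The ring, its rates and the function names are assumptions chosen for illustration, and the probability vector is supplied by hand instead of being solved for.

```python
import math


def entropy_production_rate(p, rates):
    """Schnakenberg entropy production rate (k_B = 1).

    p[i] is the probability of state i and rates[i][j] the transition
    rate from state i to state j (diagonal entries are ignored).
    """
    n = len(p)
    sigma = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            forward = p[i] * rates[i][j]
            backward = p[j] * rates[j][i]
            if forward == 0.0 and backward == 0.0:
                continue  # states i and j are not connected
            if forward == 0.0 or backward == 0.0:
                raise ValueError("one-way transition: entropy production diverges")
            sigma += (forward - backward) * math.log(forward / backward)
    return sigma


def ring_rates(k_plus, k_minus):
    """Three-state ring 0 -> 1 -> 2 -> 0 with rate k_plus, reverse k_minus."""
    return [
        [0.0, k_plus, k_minus],
        [k_minus, 0.0, k_plus],
        [k_plus, k_minus, 0.0],
    ]


# By symmetry the stationary distribution of this ring is uniform.
uniform = [1.0 / 3.0] * 3
driven = entropy_production_rate(uniform, ring_rates(2.0, 1.0))
balanced = entropy_production_rate(uniform, ring_rates(1.0, 1.0))
```

For this ring the result reduces to (k_plus - k_minus) * ln(k_plus / k_minus), which vanishes exactly when the rates satisfy detailed balance and is positive otherwise.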
Abstract:
This thesis investigates extensively, for the first time, the statistical distributions of atmospheric surface variables and heat fluxes for the Mediterranean Sea. After retrieving a 30-year atmospheric analysis dataset, we have captured the spatial patterns of the probability distribution of the atmospheric variables relevant to the atmospheric forcing of the ocean: wind components (U,V), wind amplitude, air temperature (T2M), dewpoint temperature (D2M) and mean sea-level pressure (MSL-P). The study reveals that a two-parameter PDF is not a good fit for T2M, D2M, MSL-P and the wind components (U,V), and that a three-parameter skew-normal PDF is better suited. This distribution properly captures the asymmetric tails (skewness) of the data. After removing the large seasonal cycle, we show the quality of the fit and the geographic structure of the PDF parameters. The PDF parameters are found to vary between regions; in particular, the shape parameter (connected to the asymmetric tails) and the scale parameter (connected to the spread of the distribution) cluster around two or more values, probably reflecting the different dynamics that produce the surface atmospheric fields in the Mediterranean basin. Moreover, using the atmospheric variables, we have computed the air-sea heat fluxes for a 20-year period and estimated the net heat budget over the Mediterranean Sea. Interestingly, the higher-resolution analysis dataset provides a negative heat budget of -3 W/m², which is within the acceptable range for the Mediterranean Sea heat budget closure. The lower-resolution atmospheric reanalysis dataset (ERA5) does not satisfy the heat budget closure, indicating that a minimum resolution of the atmospheric forcing is crucial for the Mediterranean Sea dynamics. The PDF framework developed in this thesis will be the basis for a future ensemble forecasting system that will use the statistical distributions to create perturbations of the atmospheric forcing of the ocean.
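The three-parameter skew-normal PDF referred to in this abstract has a location, a scale and a shape parameter; with shape 0 it reduces to the ordinary Gaussian. The sketch below implements the standard density and its mean in plain Python. The function names and the parameter values in the example are illustrative assumptions and are not fitted values from the thesis.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, location, scale, shape):
    """Skew-normal density: (2 / scale) * phi(z) * Phi(shape * z)."""
    z = (x - location) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(shape * z)


def skew_normal_mean(location, scale, shape):
    """Mean of the skew-normal distribution."""
    delta = shape / math.sqrt(1.0 + shape * shape)
    return location + scale * delta * math.sqrt(2.0 / math.pi)


# Hypothetical parameters: a positive shape gives a longer right tail,
# so the density one unit above the location exceeds that one unit below.
right = skew_normal_pdf(1.0, 0.0, 1.0, 4.0)
left = skew_normal_pdf(-1.0, 0.0, 1.0, 4.0)
```

The extra shape parameter is what lets the distribution represent asymmetric tails that a two-parameter Gaussian cannot.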
Abstract:
Spatial prediction of hourly rainfall via radar calibration is addressed. The change of support problem (COSP), arising when the spatial supports of different data sources do not coincide, is tackled in a non-Gaussian setting; in fact, hourly rainfall in the Emilia-Romagna region of Italy is characterized by an abundance of zero values and by right-skewness of the distribution of positive amounts. Direct rain gauge measurements at sparsely distributed locations and hourly cumulated radar grids are provided by ARPA-SIMC Emilia-Romagna. We propose a three-stage Bayesian hierarchical model for radar calibration, using rain gauges as the reference measure. Rain probability and amounts are modeled via linear relationships with radar on the log scale; spatially correlated Gaussian effects capture the residual information. We employ a probit link for rainfall probability and a Gamma distribution for positive rainfall amounts; the two steps are joined via a two-part semicontinuous model. Three model specifications that address COSP in different ways are presented; in particular, a stochastic weighting of all radar pixels, driven by a latent Gaussian process defined on the grid, is employed. Estimation is performed via MCMC procedures implemented in C and linked to R. Communication and evaluation of probabilistic, point and interval predictions are investigated. A non-randomized PIT histogram is proposed for correctly assessing calibration and coverage of two-part semicontinuous models. Predictions obtained with the different model specifications are evaluated via graphical tools (Reliability Plot, Sharpness Histogram, PIT Histogram, Brier Score Plot and Quantile Decomposition Plot), proper scoring rules (Brier Score, Continuous Ranked Probability Score) and consistent scoring functions (Root Mean Square Error and Mean Absolute Error, addressing the predictive mean and median, respectively).
Calibration is achieved, and the inclusion of neighbouring information slightly improves predictions. All specifications outperform a benchmark model with uncorrelated effects, confirming the relevance of spatial correlation for modeling rainfall probability and accumulation.
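The two-part semicontinuous structure described in this abstract (a probit model for rain occurrence joined to a Gamma model for positive amounts) can be sketched as follows. This is a deliberately stripped-down illustration: the spatial Gaussian effects, the change-of-support weighting and the MCMC estimation are omitted, the use of log(1 + radar) to handle zero radar values is an assumption of this sketch, and all coefficients and function names are hypothetical.

```python
import math
import random


def rain_probability(radar, a0, a1):
    """Probit link: P(rain > 0) = Phi(a0 + a1 * log(1 + radar))."""
    eta = a0 + a1 * math.log1p(radar)
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def mean_positive_amount(radar, b0, b1):
    """Mean of the Gamma part, log-linear in log(1 + radar)."""
    return math.exp(b0 + b1 * math.log1p(radar))


def expected_rainfall(radar, a0, a1, b0, b1):
    """Mean of the two-part model: P(rain) times the mean positive amount."""
    return rain_probability(radar, a0, a1) * mean_positive_amount(radar, b0, b1)


def sample_rainfall(radar, a0, a1, b0, b1, shape, rng):
    """Draw one value: exactly zero when dry, a Gamma amount when wet."""
    if rng.random() >= rain_probability(radar, a0, a1):
        return 0.0
    scale = mean_positive_amount(radar, b0, b1) / shape
    return rng.gammavariate(shape, scale)


# Hypothetical coefficients and a radar value of 2 mm/h.
rng = random.Random(0)
samples = [sample_rainfall(2.0, -1.0, 1.0, 0.0, 0.8, 2.0, rng) for _ in range(1000)]
```

The resulting predictive distribution has a point mass at zero plus a right-skewed continuous part, which is why the abstract proposes a dedicated non-randomized PIT histogram for assessing calibration.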
Abstract:
The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as a prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given a sample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) above we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need for split-merge reversible jump moves, typically associated with poor performance; we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes; and we study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii) above, we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, and extracting the characteristic traits common to all the data groups, and we propose a novel vector autoregressive model to study the growth curves of Singaporean children.
Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit-circle.
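For measures on the real line, the 2-Wasserstein distance mentioned in this abstract has a simple form: it is the L2 distance between quantile functions. The sketch below shows two standard special cases, equal-size empirical samples (where sorting gives the quantile matching) and a pair of univariate Gaussians (where a closed form exists). It illustrates the metric only, not the projected methods proposed in the thesis; the function names and sample values are made up.

```python
import math


def wasserstein2_empirical(xs, ys):
    """2-Wasserstein distance between two equal-size samples on the real line."""
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    # Sorting pairs up the empirical quantiles of the two samples.
    paired = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((a - b) ** 2 for a, b in paired) / len(xs))


def wasserstein2_gaussian(mean_1, std_1, mean_2, std_2):
    """Closed form for two univariate Gaussian distributions."""
    return math.sqrt((mean_1 - mean_2) ** 2 + (std_1 - std_2) ** 2)


# A sample shifted by 3 units is at distance 3 from the original.
shifted = wasserstein2_empirical([0.0, 1.0, 2.0], [5.0, 3.0, 4.0])
gauss = wasserstein2_gaussian(0.0, 1.0, 3.0, 5.0)
```

In the Gaussian case the distance depends only on the differences in mean and standard deviation, so each Gaussian behaves like a point in a half-plane with the Euclidean metric.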