6 results for Probability Distribution
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The aim of this Thesis is to investigate the possibility that observations of the epoch of reionization can probe not only the evolution of the state of the IGM, but also the cosmological background in which this process occurs. The history of IGM ionization is affected by the evolution of the sources of ionizing photons which, under the assumption of a structure formation paradigm determined by the hierarchical growth of matter fluctuations, depends strongly on the characteristics of the background universe. For the purpose of our investigation, we have analysed the reionization history in innovative cosmological frameworks, still in agreement with recent observational tests from the SNIa and CMB probes, comparing our results with the reionization scenario predicted by the commonly used LCDM cosmology. In particular, in this Thesis we have considered two alternative universes. The first is a flat universe dominated at late epochs by a dynamic dark energy component, characterized by an equation of state evolving in time. The second cosmological framework we have assumed is a LCDM model characterized by a primordial overdensity field with a non-Gaussian probability distribution. The reionization scenario has been investigated through semi-analytic approaches based on the hierarchical growth of matter fluctuations and on suitable assumptions concerning the ionization and recombination of the IGM. We make predictions for the evolution and distribution of the HII regions, and for the global features of reionization, that can be constrained by future observations. Finally, we briefly discuss the possible future prospects of this Thesis work.
Abstract:
Hydrologic risk (and the closely related hydro-geologic risk) is, and has always been, a highly relevant issue because of the severe human and economic losses that flooding, and water in general, can cause. Floods are natural phenomena, often catastrophic, and cannot be avoided, but the damage they cause can be reduced if they are predicted sufficiently in advance. For this reason, flood forecasting plays an essential role in hydro-geological and hydrological risk prevention. Thanks to the development of sophisticated meteorological, hydrologic and hydraulic models, flood forecasting has made significant progress in recent decades. Nonetheless, models are imperfect, which means that we are still left with a residual uncertainty about what will actually happen. This type of uncertainty is what this thesis discusses and analyzes. In operational problems, the ultimate aim of forecasting systems is not to reproduce the river behavior; reproducing it is only a means of reducing the uncertainty about what will happen as a consequence of a precipitation event. In other words, the main objective is to assess whether or not preventive interventions should be adopted and which operational strategy may represent the best option. The main problem for a decision maker is to interpret model results and translate them into an effective intervention strategy. To make this possible, it is necessary to clearly define what is meant by uncertainty, since the literature is often confused on this issue. Therefore, the first objective of this thesis is to clarify this concept, starting with a key question: should the choice of intervention strategy be based on an evaluation of the model prediction, that is, on its ability to represent reality, or on an evaluation of what will actually happen, given the information provided by the model forecast?
Once this point is made unambiguous, the other main concern of this work is to develop a tool that can provide effective decision support, making objective and realistic risk evaluations possible. In particular, such a tool should provide an uncertainty assessment that is as accurate as possible. This means primarily three things: it must correctly combine all the available deterministic forecasts, it must assess the probability distribution of the predicted quantity, and it must quantify the flooding probability. Furthermore, given that the time available to implement prevention strategies is often limited, the flooding probability has to be linked to the time of occurrence. For this reason, it is necessary to quantify the flooding probability within a time horizon related to the time required to implement the intervention strategy, and also to assess the probability distribution of the flooding time.
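As an illustration only, the last two quantities (flooding probability within a time horizon, and the distribution of the flooding time) can be sketched by counting threshold crossings in an ensemble of forecast trajectories. This is a minimal sketch under stated assumptions, not the tool developed in the thesis: the ensemble members are taken to be equally likely, and the names `flooding_summary`, `threshold` and `horizon`, as well as the data, are hypothetical.

```python
def flooding_summary(ensemble, threshold, horizon):
    """Summarise an ensemble of forecast trajectories (assumed equally likely).

    ensemble  : list of trajectories, each a list of water levels per time step
    threshold : level above which flooding is assumed to occur
    horizon   : number of time steps available to implement an intervention
    """
    first_crossings = []
    for trajectory in ensemble:
        # First step within the horizon at which the level exceeds the threshold
        crossing = next(
            (t for t, level in enumerate(trajectory[:horizon]) if level > threshold),
            None,
        )
        first_crossings.append(crossing)

    n = len(ensemble)
    flooded = [t for t in first_crossings if t is not None]
    p_flood = len(flooded) / n

    # Empirical distribution of the flooding time: P(first crossing at step t)
    time_pmf = {t: flooded.count(t) / n for t in sorted(set(flooded))}
    return p_flood, time_pmf


# Hypothetical ensemble: four members, water levels in metres
ensemble = [
    [1.0, 1.8, 2.6, 3.1],
    [1.1, 1.5, 1.9, 2.2],
    [0.9, 2.7, 3.0, 3.3],
    [1.0, 1.4, 2.1, 2.8],
]
p_flood, time_pmf = flooding_summary(ensemble, threshold=2.5, horizon=3)
print(p_flood)   # 0.5
print(time_pmf)  # {1: 0.25, 2: 0.25}
```

With a horizon of three steps, two of the four hypothetical members cross the threshold, one at step 1 and one at step 2; the fourth member crosses only after the horizon and is therefore not counted.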
Abstract:
The literature offers different ways to carry out cluster analysis of categorical data, and the choice among them is strongly related to the aim of the researcher, if we do not take into account time and economic constraints. The main approaches to clustering are usually divided into model-based and distance-based methods: the former assume that objects belonging to the same class are similar in the sense that their observed values come from the same probability distribution, whose parameters are unknown and need to be estimated; the latter evaluate distances among objects by a defined dissimilarity measure and, based on it, allocate units to the closest group. In clustering, one may be interested in classifying similar objects into groups, or in finding observations that come from the same true homogeneous distribution. But do both of these aims lead to the same clustering? And how well do clustering methods designed to fulfil one of these aims perform in terms of the other? To answer these questions, two approaches, namely a latent class model (a mixture of multinomial distributions) and partitioning around medoids, are evaluated and compared using the Adjusted Rand Index, the Average Silhouette Width and the Pearson-Gamma index in a fairly wide simulation study. Simulation outcomes are plotted in two-dimensional graphs via Multidimensional Scaling; the size of each point is proportional to the number of points that overlap, and different colours are used according to cluster membership.
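For readers unfamiliar with the first of these indexes, the following is a minimal sketch of the Adjusted Rand Index in its standard contingency-table form. It is an illustration, not code from the thesis; the function name and the example labelings are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same objects."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to compare

    # Pair counts from the contingency table and from its marginals
    cells = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    maximum = (sum_rows + sum_cols) / 2          # largest attainable agreement
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial and identical
    return (sum_cells - expected) / (maximum - expected)


# The index ignores how clusters are named: a relabeled partition scores 1.0
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
# A partition that splits every true cluster scores below zero
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```

The correction for chance is what makes the index suitable for comparing a recovered clustering with a known simulated truth: random labelings score near zero on average rather than at some size-dependent positive value.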
Abstract:
Non-Equilibrium Statistical Mechanics is a broad subject. Roughly speaking, it deals with systems which have not yet relaxed to an equilibrium state, with systems which are in a steady non-equilibrium state, or with more general situations. Such systems are characterized by external forcing and internal fluxes, resulting in a net production of entropy which quantifies dissipation and the extent to which, by the Second Law of Thermodynamics, time-reversal invariance is broken. In this thesis we discuss some of the mathematical structures involved in generic discrete-state-space non-equilibrium systems, which we depict as networks entirely analogous to electrical networks. We define suitable observables and derive their linear regime relationships; we discuss a duality between external and internal observables that reverses the roles of the system and of the environment; and we show that network observables serve as constraints for a derivation of the minimum entropy production principle. We dwell on deep combinatorial aspects of linear response determinants, which are related to spanning tree polynomials in graph theory, and we give a geometrical interpretation of observables in terms of Wilson loops of a connection and gauge degrees of freedom. We specialize the formalism to continuous-time Markov chains, give a physical interpretation of observables in terms of locally detailed balanced rates, prove many variants of the fluctuation theorem, and show that a well-known expression for the entropy production due to Schnakenberg descends from considerations of gauge invariance, where the gauge symmetry is related to the freedom in the choice of a prior probability distribution.
As an additional topic of geometrical flavor related to continuous-time Markov chains, we discuss the Fisher-Rao geometry of non-equilibrium decay modes, showing that the Fisher matrix contains information about many aspects of non-equilibrium behavior, including non-equilibrium phase transitions and superposition of modes. We establish a form of statistical equivalence principle and discuss the behavior of the Fisher matrix under time reversal. To conclude, we propose that geometry and combinatorics might greatly increase our understanding of non-equilibrium phenomena.
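The Schnakenberg expression mentioned above has a compact standard form for a continuous-time Markov chain: the sum over edges of the probability current times the corresponding affinity. The sketch below evaluates it for a small example. It is a hedged illustration, not the thesis's derivation; the function name, the rate convention (`w[i][j]` is the rate from state `i` to state `j`) and the three-state ring are assumptions introduced here.

```python
from math import log


def entropy_production_rate(p, w):
    """Schnakenberg entropy production rate, in units of k_B per unit time.

    p : list of state probabilities
    w : w[i][j] is the transition rate from state i to state j (i != j);
        every allowed transition is assumed to have a nonzero reverse rate
    """
    sigma = 0.0
    n = len(p)
    for i in range(n):
        for j in range(i + 1, n):
            forward = p[i] * w[i][j]
            backward = p[j] * w[j][i]
            if forward == 0.0 and backward == 0.0:
                continue  # states i and j are not connected
            # Net current along the edge times its thermodynamic affinity
            sigma += (forward - backward) * log(forward / backward)
    return sigma


# Hypothetical three-state ring driven clockwise: rate 2 forward, rate 1 backward.
# By symmetry the stationary distribution is uniform.
p = [1 / 3, 1 / 3, 1 / 3]
driven = [[0, 2, 1], [1, 0, 2], [2, 1, 0]]
print(entropy_production_rate(p, driven))  # approximately log(2) = 0.693...

# With symmetric rates detailed balance holds and nothing is dissipated
balanced = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(entropy_production_rate(p, balanced))  # 0.0
```

Each term is non-negative because the current and the affinity always share the same sign, so the total vanishes only when every edge current vanishes, which is the detailed balance condition.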
Abstract:
This thesis presents the first extensive investigation of the statistical distributions of atmospheric surface variables and heat fluxes for the Mediterranean Sea. After retrieving a 30-year atmospheric analysis dataset, we have captured the spatial patterns of the probability distribution of the atmospheric variables relevant for the atmospheric forcing of the ocean: wind components (U,V), wind amplitude, air temperature (T2M), dewpoint temperature (D2M) and mean sea-level pressure (MSL-P). The study reveals that a two-parameter PDF is not a good fit for T2M, D2M, MSL-P and the wind components (U,V), and that a three-parameter skew-normal PDF is better suited. This distribution properly captures the asymmetric tails (skewness) of the data. After removing the large seasonal cycle, we show the quality of the fit and the geographic structure of the PDF parameters. The PDF parameters are found to vary between regions; in particular, the shape parameter (connected to the asymmetric tails) and the scale parameter (connected to the spread of the distribution) cluster around two or more values, probably connected to the different dynamics that produce the surface atmospheric fields in the Mediterranean basin. Moreover, using the atmospheric variables, we have computed the air-sea heat fluxes for a 20-year period and estimated the net heat budget over the Mediterranean Sea. Interestingly, the higher-resolution analysis dataset provides a negative heat budget of -3 W/m², which is within the acceptable range for the Mediterranean Sea heat budget closure. The lower-resolution atmospheric reanalysis dataset (ERA5) does not satisfy the heat budget closure, indicating that a minimum resolution of the atmospheric forcing is crucial for the Mediterranean Sea dynamics. The PDF framework developed in this thesis will be the basis for a future ensemble forecasting system that will use the statistical distributions to create perturbations of the atmospheric forcing of the ocean.
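To make the three parameters concrete, the sketch below evaluates a skew-normal density with location, scale and shape parameters. It assumes the common Azzalini parametrization, which the abstract does not state explicitly, and the function and argument names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def skew_normal_pdf(x, location, scale, shape):
    """Skew-normal density (Azzalini form, assumed here).

    location shifts the distribution, scale sets its spread, and shape
    controls the asymmetry of the tails; shape = 0 gives the normal density.
    """
    z = (x - location) / scale
    normal_density = exp(-0.5 * z * z) / sqrt(2.0 * pi)
    normal_cdf = 0.5 * (1.0 + erf(shape * z / sqrt(2.0)))
    return 2.0 / scale * normal_density * normal_cdf


# With zero shape the density reduces to the standard normal
print(skew_normal_pdf(0.0, location=0.0, scale=1.0, shape=0.0))

# A positive shape moves probability mass to the right of the location
right = skew_normal_pdf(1.0, location=0.0, scale=1.0, shape=3.0)
left = skew_normal_pdf(-1.0, location=0.0, scale=1.0, shape=3.0)
print(right > left)  # True
```

A two-parameter normal fit corresponds to fixing the shape at zero, which is why it cannot reproduce asymmetric tails.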
Abstract:
The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as a prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given a sample from the law of a random probability measure. The contributions contained in this thesis can be subdivided into three topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need for split-merge reversible jump moves, typically associated with poor performance; we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes; and we study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii), we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, and extracting the characteristic traits common to all the data groups, and we propose a novel vector autoregressive model to study the growth curves of Singaporean children.
Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit circle.
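For measures on the real line, the 2-Wasserstein distance has a closed form in terms of quantile functions, which is part of what makes this setting tractable. The sketch below computes it for two empirical distributions of equal size, where matching quantiles reduces to matching sorted values. It illustrates the distance only, not the projected methods proposed in the thesis; the function name and the data are hypothetical.

```python
from math import sqrt


def wasserstein2_empirical(sample_x, sample_y):
    """2-Wasserstein distance between two equal-size empirical measures on the real line."""
    if len(sample_x) != len(sample_y) or not sample_x:
        raise ValueError("samples must be non-empty and of equal size")
    # On the real line the optimal coupling pairs the k-th smallest points
    xs, ys = sorted(sample_x), sorted(sample_y)
    return sqrt(sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs))


# Shifting every point by 0.5 moves the distribution by exactly 0.5
print(wasserstein2_empirical([3.0, 1.0, 2.0], [2.5, 1.5, 3.5]))  # 0.5
```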