13 results for Cumulative probability distribution functions
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The aim of this Thesis is to investigate the possibility that observations related to the epoch of reionization can probe not only the evolution of the IGM state, but also the cosmological background in which this process occurs. The history of IGM ionization is indeed affected by the evolution of the sources of ionizing photons which, under the assumption of a structure formation paradigm determined by the hierarchic growth of the matter fluctuations, turns out to be strongly dependent on the characteristics of the background universe. For the purpose of our investigation, we have analysed the reionization history in innovative cosmological frameworks, still in agreement with the recent observational tests related to the SNIa and the CMB probes, comparing our results with the reionization scenario predicted by the commonly used LCDM cosmology. In particular, in this Thesis we have considered two different alternative universes. The first one is a flat universe dominated at late epochs by a dynamic dark energy component, characterized by an equation of state evolving in time. The second cosmological framework we have assumed is a LCDM universe characterized by a primordial overdensity field having a non-Gaussian probability distribution. The reionization scenario has been investigated, in this Thesis, through semi-analytic approaches based on the hierarchic growth of the matter fluctuations and on suitable assumptions concerning the ionization and the recombination of the IGM. We make predictions for the evolution and the distribution of the HII regions, and for the global features of reionization, that can be constrained by future observations. Finally, we briefly discuss the possible future prospects of this Thesis work.
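To make the semi-analytic framework concrete, the sketch below integrates a textbook-style equation for the ionized filling factor over redshift; it only illustrates the general kind of calculation referred to above, not the Thesis' own models, and every number in it (emissivity, recombination time, hydrogen density) is a placeholder.

```python
# Minimal sketch (placeholder numbers, not the Thesis' models): integrate a
# textbook-style equation for the ionized filling factor,
#   dQ_HII / dt = (dn_ion/dt) / <n_H>  -  Q_HII / t_rec ,
# over redshift.
import numpy as np

def ionizing_emissivity(z):
    """Hypothetical comoving photon production rate [photons / Mpc^3 / yr]."""
    return 2e58 * np.exp(-(z - 6.0) / 2.0)

def recombination_time(z, clumping=3.0):
    """Rough IGM recombination time [yr] (illustrative scaling with redshift)."""
    return 1.0e10 / (clumping * ((1.0 + z) / 7.0) ** 3)

def dt_dz(z, H0_per_yr=7e-11):
    """Crude |dt/dz| ~ 1 / [(1+z) H(z)] for a matter-dominated universe [yr]."""
    return 1.0 / (H0_per_yr * (1.0 + z) ** 2.5)

n_H = 6e66                                   # mean comoving hydrogen density [Mpc^-3], placeholder
z_grid = np.linspace(20.0, 5.0, 2000)
Q = 0.0
for z_hi, z_lo in zip(z_grid[:-1], z_grid[1:]):
    dt = dt_dz(z_hi) * (z_hi - z_lo)         # years elapsed over this redshift step
    dQ = ionizing_emissivity(z_hi) / n_H * dt - Q / recombination_time(z_hi) * dt
    Q = min(1.0, max(0.0, Q + dQ))           # the filling factor stays in [0, 1]
print(f"Ionized filling factor at z = 5: Q_HII ~ {Q:.2f}")
```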
Abstract:
We propose an extension of the approach provided by Kluppelberg and Kuhn (2009) for inference on second-order structure moments. As in Kluppelberg and Kuhn (2009), we adopt a copula-based approach instead of assuming a normal distribution for the variables, thus relaxing the equality-in-distribution assumption. A new copula-based estimator for structure moments is investigated. The methodology provided by Kluppelberg and Kuhn (2009) is also extended by considering the copulas associated with the family of Eyraud-Farlie-Gumbel-Morgenstern distribution functions (Kotz, Balakrishnan, and Johnson, 2000, Equation 44.73). Finally, a comprehensive simulation study and an application to real financial data are performed in order to compare the different approaches.
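For reference, the Eyraud-Farlie-Gumbel-Morgenstern (EFGM) copula mentioned above is C(u,v) = uv[1 + theta(1-u)(1-v)] with theta in [-1, 1]. The sketch below, which is not the estimator studied in the thesis, samples from this copula via conditional inversion and recovers theta through the standard moment relation rho_S = theta/3 with Spearman's rho.

```python
# Minimal sketch (not the estimator proposed in the thesis): sample from the EFGM
# copula by conditional inversion and recover theta from Spearman's rho.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

def sample_efgm(theta, n):
    """Draw n pairs (U, V) from C(u, v) = u v [1 + theta (1 - u)(1 - v)]."""
    u = rng.uniform(size=n)
    t = rng.uniform(size=n)
    a = theta * (1.0 - 2.0 * u)          # conditional CDF: F(v | u) = (1 + a) v - a v^2
    b = 1.0 + a
    disc = np.sqrt(b * b - 4.0 * a * t)
    small = np.abs(a) < 1e-12            # a ~ 0: the conditional law is uniform
    denom = np.where(small, 1.0, 2.0 * a)
    v = np.where(small, t, (b - disc) / denom)
    return u, v

u, v = sample_efgm(theta=0.7, n=50_000)
rho_s, _ = spearmanr(u, v)
theta_hat = float(np.clip(3.0 * rho_s, -1.0, 1.0))
print(f"Spearman's rho = {rho_s:.3f}, implied theta = {theta_hat:.3f}")
```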
Abstract:
The hydrologic risk (and the closely related hydro-geologic risk) is, and has always been, a very relevant issue, due to the severe consequences that flooding, and water in general, may provoke in terms of human and economic losses. Floods are natural phenomena, often catastrophic, and cannot be avoided, but their damages can be reduced if they are predicted sufficiently in advance. For this reason, flood forecasting plays an essential role in hydro-geological and hydrological risk prevention. Thanks to the development of sophisticated meteorological, hydrologic and hydraulic models, flood forecasting has made significant progress in recent decades; nonetheless, models are imperfect, which means that we are still left with a residual uncertainty about what will actually happen. This type of uncertainty is what will be discussed and analyzed in this thesis. In operational problems, it is possible to affirm that the ultimate aim of forecasting systems is not to reproduce the river behavior: that is only a means for reducing the uncertainty associated with what will happen as a consequence of a precipitation event. In other words, the main objective is to assess whether or not preventive interventions should be adopted and which operational strategy may represent the best option. The main problem for a decision maker is to interpret model results and translate them into an effective intervention strategy. To make this possible, it is necessary to clearly define what is meant by uncertainty, since in the literature there is often confusion on this issue. Therefore, the first objective of this thesis is to clarify this concept, starting from a key question: should the choice of the intervention strategy be based on the evaluation of the model prediction, i.e. on its ability to represent reality, or on the evaluation of what will actually happen on the basis of the information given by the model forecast? Once the previous idea is made unambiguous, the other main concern of this work is to develop a tool that can provide effective decision support, making objective and realistic risk evaluations possible. In particular, such a tool should be able to provide an uncertainty assessment as accurate as possible. This means primarily three things: it must be able to correctly combine all the available deterministic forecasts, it must assess the probability distribution of the predicted quantity, and it must quantify the flooding probability. Furthermore, given that the time to implement prevention strategies is often limited, the flooding probability has to be linked to the time of occurrence. For this reason, it is necessary to quantify the flooding probability within a time horizon related to that required to implement the intervention strategy, and it is also necessary to assess the probability distribution of the flooding time.
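The last requirement can be illustrated with a small sketch: given an ensemble of forecast water-level trajectories, one can estimate the probability of exceeding a dike threshold within the time available to implement an intervention, together with the distribution of the first exceedance time. The names and numbers below are purely illustrative, not the thesis' decision-support tool.

```python
# Minimal sketch (illustrative ensemble, not the thesis' tool): flooding probability
# within an intervention horizon and first exceedance times from ensemble forecasts.
import numpy as np

rng = np.random.default_rng(42)
hours = np.arange(0, 48)                               # forecast lead times [h]
n_members = 500
# Hypothetical ensemble: a rising limb plus member-to-member uncertainty.
levels = (2.0 + 0.05 * hours
          + 0.1 * rng.normal(0.0, 0.4, size=(n_members, hours.size)).cumsum(axis=1))

threshold = 3.5                                        # flooding threshold [m]
horizon = 24                                           # time needed to intervene [h]

within = hours <= horizon
exceed = levels[:, within] >= threshold
flooded = exceed.any(axis=1)
p_flood = flooded.mean()                               # P(flooding within the horizon)

first_idx = np.argmax(exceed, axis=1)                  # first exceedance index per member
flood_times = hours[within][first_idx[flooded]]
print(f"P(flooding within {horizon} h) = {p_flood:.2f}")
if flood_times.size:
    print(f"Median first exceedance time = {np.median(flood_times):.0f} h")
```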
Abstract:
There are different ways to perform cluster analysis of categorical data in the literature, and the choice among them is strongly related to the aim of the researcher, if we do not take into account time and economic constraints. The main approaches to clustering are usually distinguished into model-based and distance-based methods: the former assume that objects belonging to the same class are similar in the sense that their observed values come from the same probability distribution, whose parameters are unknown and need to be estimated; the latter evaluate distances among objects by a defined dissimilarity measure and, based on it, allocate units to the closest group. In clustering, one may be interested in the classification of similar objects into groups, and one may be interested in finding observations that come from the same true homogeneous distribution. But do both of these aims lead to the same clustering? And how good are clustering methods designed to fulfil one of these aims in terms of the other? In order to answer these questions, two approaches, namely a latent class model (mixture of multinomial distributions) and a partition around medoids approach, are evaluated and compared by the Adjusted Rand Index, Average Silhouette Width and Pearson-Gamma indexes in a fairly wide simulation study. Simulation outcomes are plotted in bi-dimensional graphs via Multidimensional Scaling; the size of points is proportional to the number of points that overlap, and different colours are used according to cluster membership.
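The evaluation pipeline described above can be sketched as follows: simulate categorical data from two multinomial components, recover a partition from a simple-matching (Hamming) dissimilarity, and score it against the true membership with the Adjusted Rand Index and the Average Silhouette Width. Neither PAM nor the latent class model is implemented here; average-linkage clustering stands in for the distance-based approach purely to keep the sketch self-contained.

```python
# Minimal sketch (not the thesis simulation study): evaluation of a categorical-data
# partition by Adjusted Rand Index and Average Silhouette Width.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import adjusted_rand_score, silhouette_score

rng = np.random.default_rng(1)
n_per_class, n_vars, n_levels = 100, 8, 3
# Two classes with different category probabilities on every variable.
p1 = rng.dirichlet(np.ones(n_levels) * 0.5, size=n_vars)
p2 = rng.dirichlet(np.ones(n_levels) * 0.5, size=n_vars)
X = np.vstack([
    np.column_stack([rng.choice(n_levels, n_per_class, p=p1[j]) for j in range(n_vars)]),
    np.column_stack([rng.choice(n_levels, n_per_class, p=p2[j]) for j in range(n_vars)]),
])
truth = np.repeat([0, 1], n_per_class)

D = pdist(X, metric="hamming")                   # simple-matching dissimilarity
labels = fcluster(linkage(D, method="average"), t=2, criterion="maxclust")

print("Adjusted Rand Index      :", round(adjusted_rand_score(truth, labels), 3))
print("Average Silhouette Width :",
      round(silhouette_score(squareform(D), labels, metric="precomputed"), 3))
```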
Abstract:
Non-Equilibrium Statistical Mechanics is a broad subject. Roughly speaking, it deals with systems which have not yet relaxed to an equilibrium state, or with systems which are in a steady non-equilibrium state, or with more general situations. They are characterized by external forcing and internal fluxes, resulting in a net production of entropy which quantifies dissipation and the extent to which, by the Second Law of Thermodynamics, time-reversal invariance is broken. In this thesis we discuss some of the mathematical structures involved with generic discrete-state-space non-equilibrium systems, which we depict with networks in all respects analogous to electrical networks. We define suitable observables and derive their linear-regime relationships, we discuss a duality between external and internal observables that reverses the role of the system and of the environment, and we show that network observables serve as constraints for a derivation of the minimum entropy production principle. We dwell on deep combinatorial aspects regarding linear response determinants, which are related to spanning tree polynomials in graph theory, and we give a geometrical interpretation of observables in terms of Wilson loops of a connection and gauge degrees of freedom. We specialize the formalism to continuous-time Markov chains, we give a physical interpretation for observables in terms of locally detailed balanced rates, we prove many variants of the fluctuation theorem, and we show that a well-known expression for the entropy production due to Schnakenberg descends from considerations of gauge invariance, where the gauge symmetry is related to the freedom in the choice of a prior probability distribution. As an additional topic of geometrical flavor related to continuous-time Markov chains, we discuss the Fisher-Rao geometry of nonequilibrium decay modes, showing that the Fisher matrix contains information about many aspects of non-equilibrium behavior, including non-equilibrium phase transitions and superposition of modes. We establish a sort of statistical equivalence principle and discuss the behavior of the Fisher matrix under time-reversal. To conclude, we propose that geometry and combinatorics might greatly increase our understanding of nonequilibrium phenomena.
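For reference, the Schnakenberg expression mentioned above can be written in its standard textbook form, with w_{ij} the transition rate from state i to state j and p_i the stationary probabilities:

```latex
% Schnakenberg entropy production rate for a continuous-time Markov chain:
\dot{S} \;=\; \frac{1}{2} \sum_{i \neq j}
  \left( p_i\, w_{ij} - p_j\, w_{ji} \right)
  \ln \frac{p_i\, w_{ij}}{p_j\, w_{ji}} \;\geq\; 0
% It vanishes if and only if detailed balance, p_i w_{ij} = p_j w_{ji},
% holds for every pair of states.
```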
Abstract:
This doctoral dissertation presents a new method to assess the influence of clearance in the kinematic pairs on the configuration of planar and spatial mechanisms. The subject has been widely investigated in both past and present scientific literature, and is approached in different ways: a static/kinetostatic way, which looks for the clearance take-up due to the external loads on the mechanism; a probabilistic way, which expresses clearance-due displacements using probability density functions; a dynamic way, which evaluates dynamic effects such as the actual forces in the pairs caused by impacts, or the consequent vibrations. This dissertation presents a new method to approach the problem of clearance. The problem is studied from a purely kinematic perspective. With reference to a given mechanism configuration, the pose (position and orientation) error of the mechanism link of interest is expressed as a vector function of the degrees of freedom introduced in each pair by clearance: the presence of clearance in a kinematic pair, in fact, causes the actual pair to have more degrees of freedom than the theoretical clearance-free one. The clearance-due degrees of freedom are bounded by the pair geometry. A proper modelling of clearance-affected pairs allows expressing such bounds through analytical functions. It is then possible to study the problem as a maximization problem, where a continuous function (the pose error of the link of interest) subject to some constraints (the analytical functions bounding the clearance-due degrees of freedom) has to be maximized. Revolute, prismatic, cylindrical, and spherical clearance-affected pairs have been analytically modelled; with reference to mechanisms involving such pairs, the solution to the maximization problem has been obtained in closed form.
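The constrained-maximization formulation can be illustrated with a deliberately simple toy case (not one of the dissertation's closed-form models): two revolute pairs with radial clearances c1 and c2 let their joint centres shift inside circles of radius c_i, and the tip position error along a chosen direction is maximized numerically; for this toy case the optimum is known to be c1 + c2, which serves as a sanity check.

```python
# Minimal sketch (toy example, hypothetical values): the clearance problem cast as a
# constrained maximization, solved numerically with SLSQP.
import numpy as np
from scipy.optimize import minimize

c1, c2 = 0.10, 0.05                                  # radial clearances [mm] (hypothetical)
direction = np.array([np.cos(0.3), np.sin(0.3)])     # direction of interest

def neg_pose_error(x):
    """Negative projection of the tip displacement on `direction` (for minimization)."""
    d1, d2 = x[:2], x[2:]
    return -float(direction @ (d1 + d2))

constraints = [
    {"type": "ineq", "fun": lambda x: c1**2 - x[0]**2 - x[1]**2},   # inside clearance circle 1
    {"type": "ineq", "fun": lambda x: c2**2 - x[2]**2 - x[3]**2},   # inside clearance circle 2
]
res = minimize(neg_pose_error, x0=np.zeros(4), constraints=constraints, method="SLSQP")
print(f"Maximum tip error along direction: {-res.fun:.4f} mm (expected {c1 + c2:.4f} mm)")
```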
Abstract:
This thesis investigates extensively, for the first time, the statistical distributions of atmospheric surface variables and heat fluxes for the Mediterranean Sea. After retrieving a 30-year atmospheric analysis dataset, we have captured the spatial patterns of the probability distribution of the atmospheric variables relevant for ocean atmospheric forcing: wind components (U, V), wind amplitude, air temperature (T2M), dewpoint temperature (D2M) and mean sea-level pressure (MSL-P). The study reveals that a two-parameter PDF is not a good fit for T2M, D2M, MSL-P and the wind components (U, V), and that a three-parameter skew-normal PDF is better suited. Such a distribution properly captures the asymmetric tails (skewness) of the data. After removing the large seasonal cycle, we show the quality of the fit and the geographic structure of the PDF parameters. It is found that the PDF parameters vary between different regions; in particular, the shape (connected to the asymmetric tails) and the scale (connected to the spread of the distribution) parameters cluster around two or more values, probably connected to the different dynamics that produce the surface atmospheric fields in the Mediterranean basin. Moreover, using the atmospheric variables, we have computed the air-sea heat fluxes for a 20-year period and estimated the net heat budget over the Mediterranean Sea. Interestingly, the higher resolution analysis dataset provides a negative heat budget of -3 W/m2, which is within the acceptable range for the Mediterranean Sea heat budget closure. The lower resolution atmospheric reanalysis dataset (ERA5) does not satisfy the heat budget closure, pointing out that a minimal resolution of the atmospheric forcing is crucial for the Mediterranean Sea dynamics. The PDF framework developed in this thesis will be the basis for a future ensemble forecasting system that will use the statistical distributions to create perturbations of the atmospheric ocean forcing.
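A three-parameter skew-normal fit of the kind described above can be sketched in a few lines: fit scipy's skew-normal (shape, location, scale) to a deseasonalized anomaly and compare it with a two-parameter Gaussian fit through the maximized log-likelihood. The data are simulated here only to keep the example self-contained; this is not the thesis analysis.

```python
# Minimal sketch (synthetic stand-in data, not the thesis analysis): skew-normal vs
# Gaussian fit of a deseasonalized surface-variable anomaly.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Stand-in for a deseasonalized T2M anomaly with an asymmetric tail.
anom = stats.skewnorm.rvs(a=-4.0, loc=1.5, scale=2.0, size=20_000, random_state=rng)

a_hat, loc_hat, scale_hat = stats.skewnorm.fit(anom)          # shape, location, scale
mu_hat, sigma_hat = stats.norm.fit(anom)

ll_skew = stats.skewnorm.logpdf(anom, a_hat, loc_hat, scale_hat).sum()
ll_norm = stats.norm.logpdf(anom, mu_hat, sigma_hat).sum()
print(f"skew-normal fit: shape={a_hat:.2f}, loc={loc_hat:.2f}, scale={scale_hat:.2f}")
print(f"log-likelihood gain of skew-normal over Gaussian: {ll_skew - ll_norm:.1f}")
```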
Abstract:
This thesis presents a study of globular clusters (GCs), based on the analysis of Monte Carlo simulations of GCs, with the aim of defining new empirical parameters that are measurable from observations and able to trace the different phases of their dynamical evolution history. During their long-term dynamical evolution, due to mass segregation and dynamical friction, massive stars transfer kinetic energy to lower-mass objects, causing them to sink toward the cluster center. This continuous transfer of kinetic energy from the core to the outskirts triggers the runaway contraction of the core, known as "core collapse" (CC), followed by episodes of expansion and contraction called gravothermal oscillations. Clearly, such an internal dynamical evolution also corresponds to significant variations of the structure of the system. Determining the dynamical age of a cluster can be challenging, as it depends on various internal and external properties. The traditional classification of GCs as CC or post-CC systems relies on detecting a steep power-law cusp in the central density profile, which may not always be reliable due to post-CC oscillations or other processes. In this thesis, the normalized cumulative radial distribution (nCRD) within a fraction of the half-mass radius is analyzed, and three diagnostics (A5, P5, and S2.5) are defined. These diagnostics are sensitive to dynamical evolution and can distinguish pre-CC clusters from post-CC clusters. The analysis, performed using multiple simulations with different initial conditions, including varying binary fractions and the presence of dark remnants, showed that the time variations of the diagnostics follow distinct patterns depending on the binary fraction and the retention or ejection of black holes. This analysis is extended to a larger set of simulations matching the observed properties of Galactic GCs, and the parameters show the potential to distinguish the dynamical stages of the observed clusters as well.
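The abstract does not reproduce the definitions of A5, P5 and S2.5, but the object they are read off from, the normalized cumulative radial distribution within a fraction of the half-mass radius, can be sketched from projected stellar positions as follows; the radii are simulated placeholders and the quantity printed at the end is only a stand-in diagnostic.

```python
# Minimal sketch (placeholder data; A5/P5/S2.5 themselves are not reproduced here):
# build the normalized cumulative radial distribution (nCRD) of stars within a
# fraction of the half-mass radius.
import numpy as np

rng = np.random.default_rng(3)
# Placeholder projected radii [pc]: a centrally concentrated toy distribution.
r = rng.exponential(scale=2.0, size=50_000)

r_half = np.median(r)                      # projected half-number radius
fraction = 0.25                            # analyse stars within 0.25 * r_half (illustrative choice)
inner = np.sort(r[r <= fraction * r_half])

# nCRD: cumulative fraction of the selected stars as a function of projected radius,
# normalized so that it runs from 0 to 1 over the selected region.
ncrd = np.arange(1, inner.size + 1) / inner.size

# Example of a simple quantity read off the curve: the cumulative fraction reached at
# half of the selected radial range (a stand-in, not one of the thesis' diagnostics).
idx = np.searchsorted(inner, 0.5 * fraction * r_half)
print(f"nCRD value at half the selected range: {ncrd[min(idx, ncrd.size - 1)]:.2f}")
```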
Abstract:
In high-energy hadron collisions, the production at parton level of heavy-flavour quarks (charm and bottom) is described by perturbative Quantum Chromodynamics (pQCD) calculations, given the hard scale set by the quark masses. However, in hadron-hadron collisions, predictions for the heavy-flavour hadrons eventually produced entail the knowledge of the parton distribution functions, as well as an accurate description of the hadronisation process. The latter is taken into account via the fragmentation functions measured at e$^+$e$^-$ colliders or in ep collisions, but several observations in LHC Run 1 and Run 2 data challenged this picture. In this dissertation, I studied charm hadronisation in proton-proton collisions at $\sqrt{s}$ = 13 TeV with the ALICE experiment at the LHC, making use of a large-statistics data sample collected during LHC Run 2. The production of heavy flavour in this collision system will be discussed, also describing various hadronisation models implemented in commonly used event generators, which try to reproduce experimental data, taking into account the unexpected results at the LHC regarding the enhanced production of charmed baryons. The role of multiple parton interactions (MPI) will also be presented, and how it affects the total charm production as a function of multiplicity. The ALICE apparatus will be described before moving to the experimental results, which are related to the measurement of the relative production rates of the charm hadrons $\Sigma_c^{0,++}$ and $\Lambda_c^+$, which allow us to study the hadronisation mechanisms of charm quarks and to give constraints to different hadronisation models. Furthermore, the analysis of D mesons ($D^{0}$, $D^{+}$ and $D^{*+}$) as a function of charged-particle multiplicity and spherocity will be shown, investigating the role of multi-parton interactions. This research is relevant per se and for the mission of the ALICE experiment at the LHC, which is devoted to the study of the Quark-Gluon Plasma.
Abstract:
DNA topology is an important modifier of DNA functions. Torsional stress is generated when right-handed DNA is either over- or underwound, producing structural deformations which drive or are driven by processes such as replication, transcription, recombination and repair. DNA topoisomerases are molecular machines that regulate the topological state of the DNA in the cell. These enzymes accomplish this task by either passing one strand of the DNA through a break in the opposing strand or by passing a region of the duplex from the same or a different molecule through a double-stranded cut generated in the DNA. Because of their ability to cut one or two strands of DNA, they are also targets for some of the most successful anticancer drugs used in standard combination therapies of human cancers. An effective anticancer drug is Camptothecin (CPT), which specifically targets DNA topoisomerase 1 (TOP 1). The research project of the present thesis has focused on the role of human TOP 1 during transcription and on the transcriptional consequences associated with TOP 1 inhibition by CPT in human cell lines. Previous findings demonstrate that TOP 1 inhibition by CPT perturbs RNA polymerase II (RNAP II) density at promoters and along transcribed genes, suggesting an involvement of TOP 1 in RNAP II promoter-proximal pausing. Within the transcription cycle, promoter pausing is a fundamental step, whose importance as a means of coupling elongation to RNA maturation has been well established. By measuring nascent RNA transcripts bound to chromatin, we demonstrated that TOP 1 inhibition by CPT can enhance RNAP II escape from the promoter-proximal pausing site of the human Hypoxia Inducible Factor 1 (HIF-1) and c-MYC genes in a dose-dependent manner. This effect depends on Cdk7/Cdk9 activities, since it can be reversed by the kinase inhibitor DRB. Since CPT affects RNAP II by promoting the hyperphosphorylation of its Rpb1 subunit, the findings suggest that TOP 1 inhibition by CPT may increase the activity of Cdks, which in turn phosphorylate the Rpb1 subunit of RNAP II, enhancing its escape from pausing. Interestingly, the transcriptional consequences of CPT-induced topological stress are wider than expected. CPT increased co-transcriptional splicing of exons 1 and 2 and markedly affected alternative splicing at exon 11. Surprisingly, despite its well-established transcription inhibitory activity, CPT can trigger the production of a novel long RNA (5'aHIF-1) antisense to the human HIF-1 mRNA and of a known antisense RNA at the 3' end of the gene, while decreasing mRNA levels. These effects require TOP 1 and are independent of CPT-induced DNA damage. Thus, when the supercoiling imbalance promoted by CPT occurs at promoters, it may trigger deregulation of RNAP II pausing, increased chromatin accessibility and activation/derepression of antisense transcripts in a Cdk-dependent manner. A changed balance of antisense transcripts and mRNAs may regulate the activity of HIF-1 and contribute to the control of tumor progression. After focusing our TOP 1 investigations at the single-gene level, we extended the study to the whole genome by developing the "Topo-Seq" approach, which generates a genome-wide map of the distribution of TOP 1 activity sites in human cells. The preliminary data revealed that TOP 1 preferentially localizes at intragenic regions, and in particular at the 5' and 3' ends of genes. Surprisingly, upon TOP 1 downregulation, which impairs protein expression by 80%, TOP 1 molecules are mostly localized around the 3' ends of genes, suggesting that its activity is essential at these regions and can be compensated at the 5' ends. The developed procedure is a pioneering tool for the detection of TOP 1 cleavage sites across the genome and can open the way to further investigations of the enzyme's roles in different nuclear processes.
Abstract:
The continuous advancements and enhancements of wireless systems are enabling new compelling scenarios where mobile services can adapt to the current execution context, represented by the computational resources available at the local device, the current physical location, the people in physical proximity, and so forth. Such services, called context-aware, require the timely delivery of all relevant information describing the current context, and that introduces several unsolved complexities, spanning from low-level context data transmission up to context data storage and replication in the mobile system. In addition, to ensure correct and scalable context provisioning, it is crucial to integrate and interoperate with different wireless technologies (WiFi, Bluetooth, etc.) and modes (infrastructure-based and ad-hoc), and to use decentralized solutions to store and replicate context data on mobile devices. These challenges call for novel middleware solutions, here called Context Data Distribution Infrastructures (CDDIs), capable of delivering relevant context data to mobile devices, while hiding all the issues introduced by data distribution in heterogeneous and large-scale mobile settings. This dissertation thoroughly analyzes CDDIs for mobile systems, with the main goal of achieving a holistic approach to the design of this type of middleware solution. We discuss the main functions needed by context data distribution in large mobile systems, and we advocate the precise definition and strict observance of quality-based contracts between context consumers and the CDDI to reconfigure the main middleware components at runtime. We present the design and the implementation of our proposals, both in simulation-based and in real-world scenarios, along with an extensive evaluation that confirms the technical soundness of the proposed CDDI solutions. Finally, we consider three highly heterogeneous scenarios, namely disaster areas, smart campuses, and smart cities, to better highlight the wide technical validity of our analysis and solutions under different network deployments and quality constraints.
Abstract:
Spatial prediction of hourly rainfall via radar calibration is addressed. The change of support problem (COSP), arising when the spatial supports of different data sources do not coincide, is faced in a non-Gaussian setting; in fact, hourly rainfall in the Emilia-Romagna region, in Italy, is characterized by an abundance of zero values and right-skewness of the distribution of positive amounts. Rain gauge direct measurements at sparsely distributed locations and hourly cumulated radar grids are provided by ARPA-SIMC Emilia-Romagna. We propose a three-stage Bayesian hierarchical model for radar calibration, exploiting rain gauges as the reference measure. Rain probability and amounts are modeled via linear relationships with radar in the log scale; spatially correlated Gaussian effects capture the residual information. We employ a probit link for rainfall probability and a Gamma distribution for rainfall positive amounts; the two steps are joined via a two-part semicontinuous model. Three model specifications differently addressing the COSP are presented; in particular, a stochastic weighting of all radar pixels, driven by a latent Gaussian process defined on the grid, is employed. Estimation is performed via MCMC procedures implemented in C, linked to the R software. Communication and evaluation of probabilistic, point and interval predictions are investigated. A non-randomized PIT histogram is proposed for correctly assessing calibration and coverage of two-part semicontinuous models. Predictions obtained with the different model specifications are evaluated via graphical tools (Reliability Plot, Sharpness Histogram, PIT Histogram, Brier Score Plot and Quantile Decomposition Plot), proper scoring rules (Brier Score, Continuous Rank Probability Score) and consistent scoring functions (Root Mean Square Error and Mean Absolute Error, addressing the predictive mean and median, respectively). Calibration is reached, and the inclusion of neighbouring information slightly improves predictions. All specifications outperform a benchmark model with uncorrelated effects, confirming the relevance of spatial correlation for modeling rainfall probability and accumulation.
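The two-part semicontinuous structure, a probit model for rain occurrence and a Gamma model with log link for positive amounts, can be sketched as a predictive simulator; the coefficients below are hypothetical and the sketch ignores the hierarchical spatial effects and the COSP treatment of the actual model.

```python
# Minimal sketch (hypothetical coefficients, no spatial effects): the two-part
# semicontinuous predictive distribution of hourly rainfall given a radar value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

def predictive_rain(radar_mm, n_draws=10_000,
                    alpha=(-0.8, 0.9),      # probit coefficients (illustrative)
                    beta=(-0.3, 0.8),       # log-mean coefficients for positive amounts
                    shape=1.4):             # Gamma shape parameter
    """Draw from the semicontinuous predictive distribution of hourly rainfall [mm]."""
    x = np.log1p(radar_mm)
    p_rain = stats.norm.cdf(alpha[0] + alpha[1] * x)         # probit link for occurrence
    mean_pos = np.exp(beta[0] + beta[1] * x)                 # Gamma mean via log link
    wet = rng.uniform(size=n_draws) < p_rain
    draws = np.zeros(n_draws)
    draws[wet] = rng.gamma(shape, mean_pos / shape, size=wet.sum())
    return p_rain, draws

p_rain, draws = predictive_rain(radar_mm=5.0)
print(f"P(rain > 0) = {p_rain:.2f}, predictive mean = {draws.mean():.2f} mm, "
      f"90th percentile = {np.percentile(draws, 90):.2f} mm")
```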
Abstract:
The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as a prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given a sample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) above we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need for the split-merge reversible jump moves typically associated with poor performance; we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes; and we study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii) above, we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, and extracting the characteristic traits common to all the data groups, and we propose a novel vector autoregressive model to study the growth curves of Singaporean children. Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit circle.
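The geometry underlying point (iii) is easy to state for measures on the real line: the 2-Wasserstein distance equals the L2 distance between quantile functions, which is what makes PCA and regression in the 2-Wasserstein space tractable. The sketch below, not the thesis' projected methods, approximates it from two equal-sized samples by matching order statistics and checks it against the closed form for Gaussians.

```python
# Minimal sketch: empirical 2-Wasserstein distance between two measures on the real
# line, computed from sorted samples (quantile matching).
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(loc=0.0, scale=1.0, size=10_000)
y = rng.normal(loc=1.0, scale=2.0, size=10_000)

w2 = np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2))
# For two Gaussians the distance is known exactly: sqrt((m1 - m2)^2 + (s1 - s2)^2).
print(f"empirical W2 = {w2:.3f}, closed form = {np.sqrt(1.0**2 + 1.0**2):.3f}")
```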