14 resultados para Models, statistical
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
The uncertainties in the determination of the stratigraphic profile of natural soils is one of the main problems in geotechnics, in particular for landslide characterization and modeling. The study deals with a new approach in geotechnical modeling which relays on a stochastic generation of different soil layers distributions, following a boolean logic – the method has been thus called BoSG (Boolean Stochastic Generation). In this way, it is possible to randomize the presence of a specific material interdigitated in a uniform matrix. In the building of a geotechnical model it is generally common to discard some stratigraphic data in order to simplify the model itself, assuming that the significance of the results of the modeling procedure would not be affected. With the proposed technique it is possible to quantify the error associated with this simplification. Moreover, it could be used to determine the most significant zones where eventual further investigations and surveys would be more effective to build the geotechnical model of the slope. The commercial software FLAC was used for the 2D and 3D geotechnical model. The distribution of the materials was randomized through a specifically coded MatLab program that automatically generates text files, each of them representing a specific soil configuration. Besides, a routine was designed to automate the computation of FLAC with the different data files in order to maximize the sample number. The methodology is applied with reference to a simplified slope in 2D, a simplified slope in 3D and an actual landslide, namely the Mortisa mudslide (Cortina d’Ampezzo, BL, Italy). However, it could be extended to numerous different cases, especially for hydrogeological analysis and landslide stability assessment, in different geological and geomorphological contexts.
Resumo:
Long-term monitoring of acoustical environments is gaining popularity thanks to the relevant amount of scientific and engineering insights that it provides. The increasing interest is due to the constant growth of storage capacity and computational power to process large amounts of data. In this perspective, machine learning (ML) provides a broad family of data-driven statistical techniques to deal with large databases. Nowadays, the conventional praxis of sound level meter measurements limits the global description of a sound scene to an energetic point of view. The equivalent continuous level Leq represents the main metric to define an acoustic environment, indeed. Finer analyses involve the use of statistical levels. However, acoustic percentiles are based on temporal assumptions, which are not always reliable. A statistical approach, based on the study of the occurrences of sound pressure levels, would bring a different perspective to the analysis of long-term monitoring. Depicting a sound scene through the most probable sound pressure level, rather than portions of energy, brought more specific information about the activity carried out during the measurements. The statistical mode of the occurrences can capture typical behaviors of specific kinds of sound sources. The present work aims to propose an ML-based method to identify, separate and measure coexisting sound sources in real-world scenarios. It is based on long-term monitoring and is addressed to acousticians focused on the analysis of environmental noise in manifold contexts. The presented method is based on clustering analysis. Two algorithms, Gaussian Mixture Model and K-means clustering, represent the main core of a process to investigate different active spaces monitored through sound level meters. The procedure has been applied in two different contexts: university lecture halls and offices. The proposed method shows robust and reliable results in describing the acoustic scenario and it could represent an important analytical tool for acousticians.
Resumo:
Background. The surgical treatment of dysfunctional hips is a severe condition for the patient and a costly therapy for the public health. Hip resurfacing techniques seem to hold the promise of various advantages over traditional THR, with particular attention to young and active patients. Although the lesson provided in the past by many branches of engineering is that success in designing competitive products can be achieved only by predicting the possible scenario of failure, to date the understanding of the implant quality is poorly pre-clinically addressed. Thus revision is the only delayed and reliable end point for assessment. The aim of the present work was to model the musculoskeletal system so as to develop a protocol for predicting failure of hip resurfacing prosthesis. Methods. Preliminary studies validated the technique for the generation of subject specific finite element (FE) models of long bones from Computed Thomography data. The proposed protocol consisted in the numerical analysis of the prosthesis biomechanics by deterministic and statistic studies so as to assess the risk of biomechanical failure on the different operative conditions the implant might face in a population of interest during various activities of daily living. Physiological conditions were defined including the variability of the anatomy, bone densitometry, surgery uncertainties and published boundary conditions at the hip. The protocol was tested by analysing a successful design on the market and a new prototype of a resurfacing prosthesis. Results. The intrinsic accuracy of models on bone stress predictions (RMSE < 10%) was aligned to the current state of the art in this field. The accuracy of prediction on the bone-prosthesis contact mechanics was also excellent (< 0.001 mm). The sensitivity of models prediction to uncertainties on modelling parameter was found below 8.4%. The analysis of the successful design resulted in a very good agreement with published retrospective studies. The geometry optimisation of the new prototype lead to a final design with a low risk of failure. The statistical analysis confirmed the minimal risk of the optimised design over the entire population of interest. The performances of the optimised design showed a significant improvement with respect to the first prototype (+35%). Limitations. On the authors opinion the major limitation of this study is on boundary conditions. The muscular forces and the hip joint reaction were derived from the few data available in the literature, which can be considered significant but hardly representative of the entire variability of boundary conditions the implant might face over the patients population. This moved the focus of the research on modelling the musculoskeletal system; the ongoing activity is to develop subject-specific musculoskeletal models of the lower limb from medical images. Conclusions. The developed protocol was able to accurately predict known clinical outcomes when applied to a well-established device and, to support the design optimisation phase providing important information on critical characteristics of the patients when applied to a new prosthesis. The presented approach does have a relevant generality that would allow the extension of the protocol to a large set of orthopaedic scenarios with minor changes. Hence, a failure mode analysis criterion can be considered a suitable tool in developing new orthopaedic devices.
Resumo:
The motivation for the work presented in this thesis is to retrieve profile information for the atmospheric trace constituents nitrogen dioxide (NO2) and ozone (O3) in the lower troposphere from remote sensing measurements. The remote sensing technique used, referred to as Multiple AXis Differential Optical Absorption Spectroscopy (MAX-DOAS), is a recent technique that represents a significant advance on the well-established DOAS, especially for what it concerns the study of tropospheric trace consituents. NO2 is an important trace gas in the lower troposphere due to the fact that it is involved in the production of tropospheric ozone; ozone and nitrogen dioxide are key factors in determining the quality of air with consequences, for example, on human health and the growth of vegetation. To understand the NO2 and ozone chemistry in more detail not only the concentrations at ground but also the acquisition of the vertical distribution is necessary. In fact, the budget of nitrogen oxides and ozone in the atmosphere is determined both by local emissions and non-local chemical and dynamical processes (i.e. diffusion and transport at various scales) that greatly impact on their vertical and temporal distribution: thus a tool to resolve the vertical profile information is really important. Useful measurement techniques for atmospheric trace species should fulfill at least two main requirements. First, they must be sufficiently sensitive to detect the species under consideration at their ambient concentration levels. Second, they must be specific, which means that the results of the measurement of a particular species must be neither positively nor negatively influenced by any other trace species simultaneously present in the probed volume of air. Air monitoring by spectroscopic techniques has proven to be a very useful tool to fulfill these desirable requirements as well as a number of other important properties. During the last decades, many such instruments have been developed which are based on the absorption properties of the constituents in various regions of the electromagnetic spectrum, ranging from the far infrared to the ultraviolet. Among them, Differential Optical Absorption Spectroscopy (DOAS) has played an important role. DOAS is an established remote sensing technique for atmospheric trace gases probing, which identifies and quantifies the trace gases in the atmosphere taking advantage of their molecular absorption structures in the near UV and visible wavelengths of the electromagnetic spectrum (from 0.25 μm to 0.75 μm). Passive DOAS, in particular, can detect the presence of a trace gas in terms of its integrated concentration over the atmospheric path from the sun to the receiver (the so called slant column density). The receiver can be located at ground, as well as on board an aircraft or a satellite platform. Passive DOAS has, therefore, a flexible measurement configuration that allows multiple applications. The ability to properly interpret passive DOAS measurements of atmospheric constituents depends crucially on how well the optical path of light collected by the system is understood. This is because the final product of DOAS is the concentration of a particular species integrated along the path that radiation covers in the atmosphere. This path is not known a priori and can only be evaluated by Radiative Transfer Models (RTMs). These models are used to calculate the so called vertical column density of a given trace gas, which is obtained by dividing the measured slant column density to the so called air mass factor, which is used to quantify the enhancement of the light path length within the absorber layers. In the case of the standard DOAS set-up, in which radiation is collected along the vertical direction (zenith-sky DOAS), calculations of the air mass factor have been made using “simple” single scattering radiative transfer models. This configuration has its highest sensitivity in the stratosphere, in particular during twilight. This is the result of the large enhancement in stratospheric light path at dawn and dusk combined with a relatively short tropospheric path. In order to increase the sensitivity of the instrument towards tropospheric signals, measurements with the telescope pointing the horizon (offaxis DOAS) have to be performed. In this circumstances, the light path in the lower layers can become very long and necessitate the use of radiative transfer models including multiple scattering, the full treatment of atmospheric sphericity and refraction. In this thesis, a recent development in the well-established DOAS technique is described, referred to as Multiple AXis Differential Optical Absorption Spectroscopy (MAX-DOAS). The MAX-DOAS consists in the simultaneous use of several off-axis directions near the horizon: using this configuration, not only the sensitivity to tropospheric trace gases is greatly improved, but vertical profile information can also be retrieved by combining the simultaneous off-axis measurements with sophisticated RTM calculations and inversion techniques. In particular there is a need for a RTM which is capable of dealing with all the processes intervening along the light path, supporting all DOAS geometries used, and treating multiple scattering events with varying phase functions involved. To achieve these multiple goals a statistical approach based on the Monte Carlo technique should be used. A Monte Carlo RTM generates an ensemble of random photon paths between the light source and the detector, and uses these paths to reconstruct a remote sensing measurement. Within the present study, the Monte Carlo radiative transfer model PROMSAR (PROcessing of Multi-Scattered Atmospheric Radiation) has been developed and used to correctly interpret the slant column densities obtained from MAX-DOAS measurements. In order to derive the vertical concentration profile of a trace gas from its slant column measurement, the AMF is only one part in the quantitative retrieval process. One indispensable requirement is a robust approach to invert the measurements and obtain the unknown concentrations, the air mass factors being known. For this purpose, in the present thesis, we have used the Chahine relaxation method. Ground-based Multiple AXis DOAS, combined with appropriate radiative transfer models and inversion techniques, is a promising tool for atmospheric studies in the lower troposphere and boundary layer, including the retrieval of profile information with a good degree of vertical resolution. This thesis has presented an application of this powerful comprehensive tool for the study of a preserved natural Mediterranean area (the Castel Porziano Estate, located 20 km South-West of Rome) where pollution is transported from remote sources. Application of this tool in densely populated or industrial areas is beginning to look particularly fruitful and represents an important subject for future studies.
Resumo:
Statistical modelling and statistical learning theory are two powerful analytical frameworks for analyzing signals and developing efficient processing and classification algorithms. In this thesis, these frameworks are applied for modelling and processing biomedical signals in two different contexts: ultrasound medical imaging systems and primate neural activity analysis and modelling. In the context of ultrasound medical imaging, two main applications are explored: deconvolution of signals measured from a ultrasonic transducer and automatic image segmentation and classification of prostate ultrasound scans. In the former application a stochastic model of the radio frequency signal measured from a ultrasonic transducer is derived. This model is then employed for developing in a statistical framework a regularized deconvolution procedure, for enhancing signal resolution. In the latter application, different statistical models are used to characterize images of prostate tissues, extracting different features. These features are then uses to segment the images in region of interests by means of an automatic procedure based on a statistical model of the extracted features. Finally, machine learning techniques are used for automatic classification of the different region of interests. In the context of neural activity signals, an example of bio-inspired dynamical network was developed to help in studies of motor-related processes in the brain of primate monkeys. The presented model aims to mimic the abstract functionality of a cell population in 7a parietal region of primate monkeys, during the execution of learned behavioural tasks.
Resumo:
The present work is devoted to the assessment of the energy fluxes physics in the space of scales and physical space of wall-turbulent flows. The generalized Kolmogorov equation will be applied to DNS data of a turbulent channel flow in order to describe the energy fluxes paths from production to dissipation in the augmented space of wall-turbulent flows. This multidimensional description will be shown to be crucial to understand the formation and sustainment of the turbulent fluctuations fed by the energy fluxes coming from the near-wall production region. An unexpected behavior of the energy fluxes comes out from this analysis consisting of spiral-like paths in the combined physical/scale space where the controversial reverse energy cascade plays a central role. The observed behavior conflicts with the classical notion of the Richardson/Kolmogorov energy cascade and may have strong repercussions on both theoretical and modeling approaches to wall-turbulence. To this aim a new relation stating the leading physical processes governing the energy transfer in wall-turbulence is suggested and shown able to capture most of the rich dynamics of the shear dominated region of the flow. Two dynamical processes are identified as driving mechanisms for the fluxes, one in the near wall region and a second one further away from the wall. The former, stronger one is related to the dynamics involved in the near-wall turbulence regeneration cycle. The second suggests an outer self-sustaining mechanism which is asymptotically expected to take place in the log-layer and could explain the debated mixed inner/outer scaling of the near-wall statistics. The same approach is applied for the first time to a filtered velocity field. A generalized Kolmogorov equation specialized for filtered velocity field is derived and discussed. The results will show what effects the subgrid scales have on the resolved motion in both physical and scale space, singling out the prominent role of the filter length compared to the cross-over scale between production dominated scales and inertial range, lc, and the reverse energy cascade region lb. The systematic characterization of the resolved and subgrid physics as function of the filter scale and of the wall-distance will be shown instrumental for a correct use of LES models in the simulation of wall turbulent flows. Taking inspiration from the new relation for the energy transfer in wall turbulence, a new class of LES models will be also proposed. Finally, the generalized Kolmogorov equation specialized for filtered velocity fields will be shown to be an helpful statistical tool for the assessment of LES models and for the development of new ones. As example, some classical purely dissipative eddy viscosity models are analyzed via an a priori procedure.
Resumo:
The goal of this dissertation is to use statistical tools to analyze specific financial risks that have played dominant roles in the US financial crisis of 2008-2009. The first risk relates to the level of aggregate stress in the financial markets. I estimate the impact of financial stress on economic activity and monetary policy using structural VAR analysis. The second set of risks concerns the US housing market. There are in fact two prominent risks associated with a US mortgage, as borrowers can both prepay or default on a mortgage. I test the existence of unobservable heterogeneity in the borrower's decision to default or prepay on his mortgage by estimating a multinomial logit model with borrower-specific random coefficients.
Resumo:
In this thesis, we extend some ideas of statistical physics to describe the properties of human mobility. By using a database containing GPS measures of individual paths (position, velocity and covered space at a spatial scale of 2 Km or a time scale of 30 sec), which includes the 2% of the private vehicles in Italy, we succeed in determining some statistical empirical laws pointing out "universal" characteristics of human mobility. Developing simple stochastic models suggesting possible explanations of the empirical observations, we are able to indicate what are the key quantities and cognitive features that are ruling individuals' mobility. To understand the features of individual dynamics, we have studied different aspects of urban mobility from a physical point of view. We discuss the implications of the Benford's law emerging from the distribution of times elapsed between successive trips. We observe how the daily travel-time budget is related with many aspects of the urban environment, and describe how the daily mobility budget is then spent. We link the scaling properties of individual mobility networks to the inhomogeneous average durations of the activities that are performed, and those of the networks describing people's common use of space with the fractional dimension of the urban territory. We study entropy measures of individual mobility patterns, showing that they carry almost the same information of the related mobility networks, but are also influenced by a hierarchy among the activities performed. We discover that Wardrop's principles are violated as drivers have only incomplete information on traffic state and therefore rely on knowledge on the average travel-times. We propose an assimilation model to solve the intrinsic scattering of GPS data on the street network, permitting the real-time reconstruction of traffic state at a urban scale.
Resumo:
The advances that have been characterizing spatial econometrics in recent years are mostly theoretical and have not found an extensive empirical application yet. In this work we aim at supplying a review of the main tools of spatial econometrics and to show an empirical application for one of the most recently introduced estimators. Despite the numerous alternatives that the econometric theory provides for the treatment of spatial (and spatiotemporal) data, empirical analyses are still limited by the lack of availability of the correspondent routines in statistical and econometric software. Spatiotemporal modeling represents one of the most recent developments in spatial econometric theory and the finite sample properties of the estimators that have been proposed are currently being tested in the literature. We provide a comparison between some estimators (a quasi-maximum likelihood, QML, estimator and some GMM-type estimators) for a fixed effects dynamic panel data model under certain conditions, by means of a Monte Carlo simulation analysis. We focus on different settings, which are characterized either by fully stable or quasi-unit root series. We also investigate the extent of the bias that is caused by a non-spatial estimation of a model when the data are characterized by different degrees of spatial dependence. Finally, we provide an empirical application of a QML estimator for a time-space dynamic model which includes a temporal, a spatial and a spatiotemporal lag of the dependent variable. This is done by choosing a relevant and prolific field of analysis, in which spatial econometrics has only found limited space so far, in order to explore the value-added of considering the spatial dimension of the data. In particular, we study the determinants of cropland value in Midwestern U.S.A. in the years 1971-2009, by taking the present value model (PVM) as the theoretical framework of analysis.
Resumo:
Small-scale dynamic stochastic general equilibrium have been treated as the benchmark of much of the monetary policy literature, given their ability to explain the impact of monetary policy on output, inflation and financial markets. One cause of the empirical failure of New Keynesian models is partially due to the Rational Expectations (RE) paradigm, which entails a tight structure on the dynamics of the system. Under this hypothesis, the agents are assumed to know the data genereting process. In this paper, we propose the econometric analysis of New Keynesian DSGE models under an alternative expectations generating paradigm, which can be regarded as an intermediate position between rational expectations and learning, nameley an adapted version of the "Quasi-Rational" Expectatations (QRE) hypothesis. Given the agents' statistical model, we build a pseudo-structural form from the baseline system of Euler equations, imposing that the length of the reduced form is the same as in the `best' statistical model.
Resumo:
The present work aims to provide a deeper understanding of thermally driven turbulence and to address some modelling aspects related to the physics of the flow. For this purpose, two idealized systems are investigated by Direct Numerical Simulation: the rotating and non-rotating Rayleigh-Bénard convection. The preliminary study of the flow topologies shows how the coherent structures organise into different patterns depending on the rotation rate. From a statistical perspective, the analysis of the turbulent kinetic energy and temperature variance budgets allows to identify the flow regions where the production, the transport, and the dissipation of turbulent fluctuations occur. To provide a multi-scale description of the flows, a theoretical framework based on the Kolmogorov and Yaglom equations is applied for the first time to the Rayleigh-Bénard convection. The analysis shows how the spatial inhomogeneity modulates the dynamics at different scales and wall-distances. Inside the core of the flow, the space of scales can be divided into an inhomogeneity-dominated range at large scales, an inertial-like range at intermediate scales and a dissipative range at small scales. This classic scenario breaks close to the walls, where the inhomogeneous mechanisms and the viscous/diffusive processes are important at every scale and entail more complex dynamics. The same theoretical framework is extended to the filtered velocity and temperature fields of non-rotating Rayleigh-Bénard convection. The analysis of the filtered Kolmogorov and Yaglom equations reveals the influence of the residual scales on the filtered dynamics both in physical and scale space, highlighting the effect of the relative position between the filter length and the crossover that separates the inhomogeneity-dominated range from the quasi-homogeneous range. The assessment of the filtered and residual physics results to be instrumental for the correct use of the existing Large-Eddy Simulation models and for the development of new ones.
Resumo:
The main purpose of this thesis is to go beyond two usual assumptions that accompany theoretical analysis in spin-glasses and inference: the i.i.d. (independently and identically distributed) hypothesis on the noise elements and the finite rank regime. The first one appears since the early birth of spin-glasses. The second one instead concerns the inference viewpoint. Disordered systems and Bayesian inference have a well-established relation, evidenced by their continuous cross-fertilization. The thesis makes use of techniques coming both from the rigorous mathematical machinery of spin-glasses, such as the interpolation scheme, and from Statistical Physics, such as the replica method. The first chapter contains an introduction to the Sherrington-Kirkpatrick and spiked Wigner models. The first is a mean field spin-glass where the couplings are i.i.d. Gaussian random variables. The second instead amounts to establish the information theoretical limits in the reconstruction of a fixed low rank matrix, the “spike”, blurred by additive Gaussian noise. In chapters 2 and 3 the i.i.d. hypothesis on the noise is broken by assuming a noise with inhomogeneous variance profile. In spin-glasses this leads to multi-species models. The inferential counterpart is called spatial coupling. All the previous models are usually studied in the Bayes-optimal setting, where everything is known about the generating process of the data. In chapter 4 instead we study the spiked Wigner model where the prior on the signal to reconstruct is ignored. In chapter 5 we analyze the statistical limits of a spiked Wigner model where the noise is no longer Gaussian, but drawn from a random matrix ensemble, which makes its elements dependent. The thesis ends with chapter 6, where the challenging problem of high-rank probabilistic matrix factorization is tackled. Here we introduce a new procedure called "decimation" and we show that it is theoretically to perform matrix factorization through it.
Resumo:
The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given avsample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) above we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need of split-merge reversible jump moves typically associated with poor performance, we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes, and study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii) above, we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, extracting the characteristic traits common to all the data-groups, and propose a novel vector autoregressive model to study of growth curves of Singaporean kids. Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit-circle.
Resumo:
In this thesis, we investigate the role of applied physics in epidemiological surveillance through the application of mathematical models, network science and machine learning. The spread of a communicable disease depends on many biological, social, and health factors. The large masses of data available make it possible, on the one hand, to monitor the evolution and spread of pathogenic organisms; on the other hand, to study the behavior of people, their opinions and habits. Presented here are three lines of research in which an attempt was made to solve real epidemiological problems through data analysis and the use of statistical and mathematical models. In Chapter 1, we applied language-inspired Deep Learning models to transform influenza protein sequences into vectors encoding their information content. We then attempted to reconstruct the antigenic properties of different viral strains using regression models and to identify the mutations responsible for vaccine escape. In Chapter 2, we constructed a compartmental model to describe the spread of a bacterium within a hospital ward. The model was informed and validated on time series of clinical measurements, and a sensitivity analysis was used to assess the impact of different control measures. Finally (Chapter 3) we reconstructed the network of retweets among COVID-19 themed Twitter users in the early months of the SARS-CoV-2 pandemic. By means of community detection algorithms and centrality measures, we characterized users’ attention shifts in the network, showing that scientific communities, initially the most retweeted, lost influence over time to national political communities. In the Conclusion, we highlighted the importance of the work done in light of the main contemporary challenges for epidemiological surveillance. In particular, we present reflections on the importance of nowcasting and forecasting, the relationship between data and scientific research, and the need to unite the different scales of epidemiological surveillance.