8 resultados para Gaussian Mixture Model
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
Long-term monitoring of acoustical environments is gaining popularity thanks to the relevant amount of scientific and engineering insights that it provides. The increasing interest is due to the constant growth of storage capacity and computational power to process large amounts of data. In this perspective, machine learning (ML) provides a broad family of data-driven statistical techniques to deal with large databases. Nowadays, the conventional praxis of sound level meter measurements limits the global description of a sound scene to an energetic point of view. The equivalent continuous level Leq represents the main metric to define an acoustic environment, indeed. Finer analyses involve the use of statistical levels. However, acoustic percentiles are based on temporal assumptions, which are not always reliable. A statistical approach, based on the study of the occurrences of sound pressure levels, would bring a different perspective to the analysis of long-term monitoring. Depicting a sound scene through the most probable sound pressure level, rather than portions of energy, brought more specific information about the activity carried out during the measurements. The statistical mode of the occurrences can capture typical behaviors of specific kinds of sound sources. The present work aims to propose an ML-based method to identify, separate and measure coexisting sound sources in real-world scenarios. It is based on long-term monitoring and is addressed to acousticians focused on the analysis of environmental noise in manifold contexts. The presented method is based on clustering analysis. Two algorithms, Gaussian Mixture Model and K-means clustering, represent the main core of a process to investigate different active spaces monitored through sound level meters. The procedure has been applied in two different contexts: university lecture halls and offices. The proposed method shows robust and reliable results in describing the acoustic scenario and it could represent an important analytical tool for acousticians.
Resumo:
The recent advent of Next-generation sequencing technologies has revolutionized the way of analyzing the genome. This innovation allows to get deeper information at a lower cost and in less time, and provides data that are discrete measurements. One of the most important applications with these data is the differential analysis, that is investigating if one gene exhibit a different expression level in correspondence of two (or more) biological conditions (such as disease states, treatments received and so on). As for the statistical analysis, the final aim will be statistical testing and for modeling these data the Negative Binomial distribution is considered the most adequate one especially because it allows for "over dispersion". However, the estimation of the dispersion parameter is a very delicate issue because few information are usually available for estimating it. Many strategies have been proposed, but they often result in procedures based on plug-in estimates, and in this thesis we show that this discrepancy between the estimation and the testing framework can lead to uncontrolled first-type errors. We propose a mixture model that allows each gene to share information with other genes that exhibit similar variability. Afterwards, three consistent statistical tests are developed for differential expression analysis. We show that the proposed method improves the sensitivity of detecting differentially expressed genes with respect to the common procedures, since it is the best one in reaching the nominal value for the first-type error, while keeping elevate power. The method is finally illustrated on prostate cancer RNA-seq data.
Resumo:
In this thesis, new classes of models for multivariate linear regression defined by finite mixtures of seemingly unrelated contaminated normal regression models and seemingly unrelated contaminated normal cluster-weighted models are illustrated. The main difference between such families is that the covariates are treated as fixed in the former class of models and as random in the latter. Thus, in cluster-weighted models the assignment of the data points to the unknown groups of observations depends also by the covariates. These classes provide an extension to mixture-based regression analysis for modelling multivariate and correlated responses in the presence of mild outliers that allows to specify a different vector of regressors for the prediction of each response. Expectation-conditional maximisation algorithms for the calculation of the maximum likelihood estimate of the model parameters have been derived. As the number of free parameters incresases quadratically with the number of responses and the covariates, analyses based on the proposed models can become unfeasible in practical applications. These problems have been overcome by introducing constraints on the elements of the covariance matrices according to an approach based on the eigen-decomposition of the covariance matrices. The performances of the new models have been studied by simulations and using real datasets in comparison with other models. In order to gain additional flexibility, mixtures of seemingly unrelated contaminated normal regressions models have also been specified so as to allow mixing proportions to be expressed as functions of concomitant covariates. An illustration of the new models with concomitant variables and a study on housing tension in the municipalities of the Emilia-Romagna region based on different types of multivariate linear regression models have been performed.
Resumo:
Wave breaking is an important coastal process, influencing hydro-morphodynamic processes such as turbulence generation and wave energy dissipation, run-up on the beach and overtopping of coastal defence structures. During breaking, waves are complex mixtures of air and water (“white water”) whose properties affect velocity and pressure fields in the vicinity of the free surface and, depending on the breaker characteristics, different mechanisms for air entrainment are usually observed. Several laboratory experiments have been performed to investigate the role of air bubbles in the wave breaking process (Chanson & Cummings, 1994, among others) and in wave loading on vertical wall (Oumeraci et al., 2001; Peregrine et al., 2006, among others), showing that the air phase is not negligible since the turbulent energy dissipation involves air-water mixture. The recent advancement of numerical models has given valuable insights in the knowledge of wave transformation and interaction with coastal structures. Among these models, some solve the RANS equations coupled with a free-surface tracking algorithm and describe velocity, pressure, turbulence and vorticity fields (Lara et al. 2006 a-b, Clementi et al., 2007). The single-phase numerical model, in which the constitutive equations are solved only for the liquid phase, neglects effects induced by air movement and trapped air bubbles in water. Numerical approximations at the free surface may induce errors in predicting breaking point and wave height and moreover, entrapped air bubbles and water splash in air are not properly represented. The aim of the present thesis is to develop a new two-phase model called COBRAS2 (stands for Cornell Breaking waves And Structures 2 phases), that is the enhancement of the single-phase code COBRAS0, originally developed at Cornell University (Lin & Liu, 1998). In the first part of the work, both fluids are considered as incompressible, while the second part will treat air compressibility modelling. The mathematical formulation and the numerical resolution of the governing equations of COBRAS2 are derived and some model-experiment comparisons are shown. In particular, validation tests are performed in order to prove model stability and accuracy. The simulation of the rising of a large air bubble in an otherwise quiescent water pool reveals the model capability to reproduce the process physics in a realistic way. Analytical solutions for stationary and internal waves are compared with corresponding numerical results, in order to test processes involving wide range of density difference. Waves induced by dam-break in different scenarios (on dry and wet beds, as well as on a ramp) are studied, focusing on the role of air as the medium in which the water wave propagates and on the numerical representation of bubble dynamics. Simulations of solitary and regular waves, characterized by both spilling and plunging breakers, are analyzed with comparisons with experimental data and other numerical model in order to investigate air influence on wave breaking mechanisms and underline model capability and accuracy. Finally, modelling of air compressibility is included in the new developed model and is validated, revealing an accurate reproduction of processes. Some preliminary tests on wave impact on vertical walls are performed: since air flow modelling allows to have a more realistic reproduction of breaking wave propagation, the dependence of wave breaker shapes and aeration characteristics on impact pressure values is studied and, on the basis of a qualitative comparison with experimental observations, the numerical simulations achieve good results.
Resumo:
Spatial prediction of hourly rainfall via radar calibration is addressed. The change of support problem (COSP), arising when the spatial supports of different data sources do not coincide, is faced in a non-Gaussian setting; in fact, hourly rainfall in Emilia-Romagna region, in Italy, is characterized by abundance of zero values and right-skeweness of the distribution of positive amounts. Rain gauge direct measurements on sparsely distributed locations and hourly cumulated radar grids are provided by the ARPA-SIMC Emilia-Romagna. We propose a three-stage Bayesian hierarchical model for radar calibration, exploiting rain gauges as reference measure. Rain probability and amounts are modeled via linear relationships with radar in the log scale; spatial correlated Gaussian effects capture the residual information. We employ a probit link for rainfall probability and Gamma distribution for rainfall positive amounts; the two steps are joined via a two-part semicontinuous model. Three model specifications differently addressing COSP are presented; in particular, a stochastic weighting of all radar pixels, driven by a latent Gaussian process defined on the grid, is employed. Estimation is performed via MCMC procedures implemented in C, linked to R software. Communication and evaluation of probabilistic, point and interval predictions is investigated. A non-randomized PIT histogram is proposed for correctly assessing calibration and coverage of two-part semicontinuous models. Predictions obtained with the different model specifications are evaluated via graphical tools (Reliability Plot, Sharpness Histogram, PIT Histogram, Brier Score Plot and Quantile Decomposition Plot), proper scoring rules (Brier Score, Continuous Rank Probability Score) and consistent scoring functions (Root Mean Square Error and Mean Absolute Error addressing the predictive mean and median, respectively). Calibration is reached and the inclusion of neighbouring information slightly improves predictions. All specifications outperform a benchmark model with incorrelated effects, confirming the relevance of spatial correlation for modeling rainfall probability and accumulation.
Resumo:
Besides increasing the share of electric and hybrid vehicles, in order to comply with more stringent environmental protection limitations, in the mid-term the auto industry must improve the efficiency of the internal combustion engine and the well to wheel efficiency of the employed fuel. To achieve this target, a deeper knowledge of the phenomena that influence the mixture formation and the chemical reactions involving new synthetic fuel components is mandatory, but complex and time intensive to perform purely by experimentation. Therefore, numerical simulations play an important role in this development process, but their use can be effective only if they can be considered accurate enough to capture these variations. The most relevant models necessary for the simulation of the reacting mixture formation and successive chemical reactions have been investigated in the present work, with a critical approach, in order to provide instruments to define the most suitable approaches also in the industrial context, which is limited by time constraints and budget evaluations. To overcome these limitations, new methodologies have been developed to conjugate detailed and simplified modelling techniques for the phenomena involving chemical reactions and mixture formation in non-traditional conditions (e.g. water injection, biofuels etc.). Thanks to the large use of machine learning and deep learning algorithms, several applications have been revised or implemented, with the target of reducing the computing time of some traditional tasks by orders of magnitude. Finally, a complete workflow leveraging these new models has been defined and used for evaluating the effects of different surrogate formulations of the same experimental fuel on a proof-of-concept GDI engine model.
Resumo:
This PhD project focuses on the study of the early stages of bone biomineralization in 2D and 3D cultures of osteoblast-like SaOS-2 osteosarcoma cells, exposed to an osteogenic cocktail. The efficacy of osteogenic treatment was assessed on 2D cell cultures after 7 days. A large calcium minerals production, an overexpression of osteogenic markers and of alkaline phosphatase activity occurred in treated samples. TEM microscopy and cryo-XANES micro-spectroscopy were performed for localizing and characterizing Ca-depositions. These techniques revealed a different localization and chemical composition of Ca-minerals over time and after treatment. Nevertheless, the Mito stress test showed in treated samples a significant increase in maximal respiration levels associated to an upregulation of mitochondrial biogenesis indicative of an ongoing differentiation process. The 3D cell cultures were realized using two different hydrogels: a commercial collagen type I and a mixture of agarose and lactose-modified chitosan (CTL). Both biomaterials showed good biocompatibility with SaOS-2 cells. The gene expression analysis of SaOS-2 cells on collagen scaffolds indicated an osteogenic commitment after treatment. and Alizarin red staining highlighted the presence of Ca-spots in the differentiated samples. In addition, the intracellular magnesium quantification, and the X-ray microscopy on mineral depositions, suggested the incorporation of Mg during the early stages of bone formation process., SaOS-2 cells treated with osteogenic cocktail produced Ca mineral deposits also on CTL/agarose scaffolds, as confirmed by alizarin red staining. Further studies are underway to evaluate the differentiation also at the genetic level. Thanks to the combination of conventional laboratory methods and synchrotron-based techniques, it has been demonstrated that SaOS-2 is a suitable model for the study of biomineralization in vitro. These results have contributed to a deeper knowledge of biomineralization process in osteosarcoma cells and could provide new evidences about a therapeutic strategy acting on the reversibility of tumorigenicity by osteogenic induction.
Resumo:
Comparative studies on constitutional design for divided societies indicate that there is no magic formula to the challenges that these societies pose, as lots of factors influence constitutional design. In the literature on asymmetric federalism, the introduction of constitutional asymmetries is considered a flexible instrument of ethnic conflict resolution, as it provides a mixture of the two main theoretical approaches to constitutional design for divided societies (i.e., integration and accommodation). Indeed, constitutional asymmetries are a complex and multifaceted phenomenon, as their degree of intensity can vary across constitutional systems, and there are both legal and extra-legal factors that may explain such differences. This thesis argues that constitutional asymmetries provide a flexible model of constitutional design and aims to explore the legal factors that are most likely to explain the different degrees of constitutional asymmetry in divided multi-tiered systems. To this end, the research adopts a qualitative methodology, i.e., Qualitative Comparative Analysis (QCA), which allows an understanding of whether a condition or combination of conditions (i.e., the legal factors) determine the outcome (i.e., high, medium, low degree of constitutional asymmetry, or constitutional symmetry). The QCA is conducted on 16 divided multi-tiered systems, and for each case, the degree of constitutional asymmetry was analyzed by employing standardized indexes on subnational autonomy, allowing for a more precise measure of constitutional asymmetry than has previously been provided in the literature. Overall, the research confirms the complex nature of constitutional asymmetries, as the degrees of asymmetries vary substantially not only across systems but also within cases among the dimensions of subnational autonomy. The outcome of the Qualitative Comparative Analysis also confirms a path of complex causality since the different degrees of constitutional asymmetry always depend on several legal factors, that combined produce a low, medium, or high degree of constitutional asymmetry or, conversely, constitutional symmetry.