966 results for Gaussian Mixture Model
Abstract:
A Bayesian procedure for the retrieval of wind vectors over the ocean using satellite-borne scatterometers requires realistic prior near-surface wind field models over the oceans. We have implemented carefully chosen vector Gaussian Process models; however, in some cases these models are too smooth to reproduce real atmospheric features, such as fronts. At the scale of the scatterometer observations, fronts appear as discontinuities in wind direction. Due to the nature of the retrieval problem a simple discontinuity model is not feasible, and hence we have developed a constrained discontinuity vector Gaussian Process model which ensures realistic fronts. We describe the generative model and show how to compute the data likelihood given the model. We show the results of inference using the model with Markov Chain Monte Carlo methods on both synthetic and real data.
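As a toy illustration only (not the authors' constrained model), the Python sketch below draws a smooth vector wind field from independent Gaussian Process priors on a grid and then imposes a front as a pure discontinuity in wind direction across an assumed line; the grid size, squared-exponential kernel, length scale, and rotation angle are all illustrative assumptions.

import numpy as np

# Toy sketch: a smooth vector wind field from independent GP priors,
# with a front imposed as a jump in wind direction across a line.
rng = np.random.default_rng(0)

n = 20                                   # grid size (illustrative)
xs = np.linspace(0.0, 1.0, n)
X = np.array([(x, y) for x in xs for y in xs])   # grid locations

# Squared-exponential covariance (length scale is an assumption).
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * d2 / 0.2 ** 2) + 1e-8 * np.eye(len(X))
L = np.linalg.cholesky(K)

u = L @ rng.standard_normal(len(X))      # zonal component draw
v = L @ rng.standard_normal(len(X))      # meridional component draw

# "Front": rotate the wind by 90 degrees on one side of x + y = 1,
# leaving speed continuous -- a discontinuity in direction only.
front = X.sum(axis=1) > 1.0
theta = np.pi / 2
u2 = np.where(front, np.cos(theta) * u - np.sin(theta) * v, u)
v2 = np.where(front, np.sin(theta) * u + np.cos(theta) * v, v)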
Abstract:
This paper presents a novel approach to water pollution detection from remotely sensed, low-platform-mounted visible-band camera images. We examine the feasibility of unsupervised segmentation for slick (oily spills on the water surface) region labelling. Adaptive and non-adaptive filtering is combined with density modelling of the obtained textural features. A particular effort is concentrated on textural feature extraction from raw intensity images using filter banks, and on adaptive feature extraction from the obtained output coefficients. Segmentation in the extracted feature space is achieved using Gaussian mixture models (GMMs).
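A minimal sketch of such a pipeline, assuming SciPy and scikit-learn; the tiny Gaussian-derivative "filter bank", the random placeholder image, and the two-component GMM are illustrative stand-ins for the paper's adaptive features.

import numpy as np
from scipy import ndimage
from sklearn.mixture import GaussianMixture

# Minimal sketch of unsupervised slick segmentation on a grayscale image.
rng = np.random.default_rng(0)
img = rng.random((128, 128))             # placeholder for a camera image

# Textural features: raw intensity plus smoothed derivatives at two scales.
feats = [img]
for sigma in (1.0, 3.0):
    feats.append(ndimage.gaussian_filter(img, sigma))
    feats.append(ndimage.gaussian_gradient_magnitude(img, sigma))
X = np.stack([f.ravel() for f in feats], axis=1)

# Density modelling of the feature space with a GMM; each pixel is assigned
# to the component of highest responsibility, giving the segmentation.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
labels = gmm.fit_predict(X).reshape(img.shape)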
Abstract:
Investigations into the modelling techniques that depict the transport of discrete phases (gas bubbles or solid particles) and model biochemical reactions in a bubble column reactor are discussed here. The mixture model was used to calculate gas-liquid, solid-liquid and gas-liquid-solid interactions. Multiphase flow is a difficult phenomenon to capture, particularly in bubble columns where the major driving force is caused by the injection of gas bubbles. The gas bubbles cause a large density difference to occur that results in transient multi-dimensional fluid motion. Standard design procedures do not account for the transient motion, due to the simplifying assumptions of steady plug flow. Computational fluid dynamics (CFD) can assist in expanding the understanding of complex flows in bubble columns by characterising the flow phenomena for many geometrical configurations. Therefore, CFD has a role in the education of chemical and biochemical engineers, providing examples of flow phenomena that many engineers may not experience, even through experimentation. The performance of the mixture model was investigated for three domains (plane, rectangular and cylindrical) and three flow models (laminar, k-ε turbulence and the Reynolds stresses). This investigation raised many questions about how gas-liquid interactions are captured numerically. To answer some of these questions, the analogy between thermal convection in a cavity and gas-liquid flow in bubble columns was invoked. This involved modelling the buoyant motion of air in a narrow cavity for a number of turbulence schemes. The difference in density was caused by a temperature gradient that acted across the width of the cavity. Multiple vortices were obtained when the Reynolds stresses were utilised with the addition of a basic flow profile after each time step. To implement the three-phase models, an alternative mixture model was developed and compared against a commercially available mixture model for three turbulence schemes. The scheme in which just the Reynolds stresses model was employed predicted the transient motion of the fluids quite well for both mixture models. Solid-liquid and then alternative formulations of the gas-liquid-solid model were compared against one another. The alternative form of the mixture model was found to perform particularly well for both gas and solid phase transport when calculating two- and three-phase flow. The improvement in the solutions obtained was a result of the inclusion of the Reynolds stresses model and differences in the mixture models employed. The differences between the alternative mixture models were found in the volume fraction equation (flux and deviatoric stress tensor terms) and the viscosity formulation for the mixture phase.
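For reference, the mixture model referred to above is commonly written in the following textbook form (mixture continuity, mixture momentum, and the phase volume-fraction equation whose flux terms are discussed above); this is the standard formulation, not necessarily the exact variant developed in the thesis:

\begin{align}
\frac{\partial \rho_m}{\partial t} + \nabla \cdot (\rho_m \mathbf{u}_m) &= 0, \\
\frac{\partial (\rho_m \mathbf{u}_m)}{\partial t} + \nabla \cdot (\rho_m \mathbf{u}_m \mathbf{u}_m) &= -\nabla p + \nabla \cdot \left( \boldsymbol{\tau}_m + \boldsymbol{\tau}_{Dm} \right) + \rho_m \mathbf{g}, \\
\frac{\partial (\alpha_k \rho_k)}{\partial t} + \nabla \cdot (\alpha_k \rho_k \mathbf{u}_m) &= -\nabla \cdot (\alpha_k \rho_k \mathbf{u}_{Dk}),
\end{align}

where \rho_m and \mathbf{u}_m are the mixture density and velocity, \alpha_k the volume fraction of phase k, \mathbf{u}_{Dk} its drift velocity relative to the mixture, and \boldsymbol{\tau}_{Dm} the diffusion stress arising from the drift velocities.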
Abstract:
The main objective of the project is to enhance the already effective health and usage monitoring system (HUMS) for helicopters by analysing structural vibrations to recognise different flight conditions directly from sensor information. The goal of this paper is to develop a new method to select the sensors and frequency bands that are best for detecting changes in flight conditions. We projected frequency information to a two-dimensional space in order to visualise flight-condition transitions using the Generative Topographic Mapping (GTM) and a variant which supports simultaneous feature selection. We created an objective measure of the separation between different flight conditions in the visualisation space by calculating the Kullback-Leibler (KL) divergence between Gaussian mixture models (GMMs) fitted to each class: the higher the KL-divergence, the better the interclass separation. To find the optimal combination of sensors, they were considered in pairs, triples and groups of four sensors. The sensor triples provided the best result in terms of KL-divergence. We also found that the use of a variational training algorithm for the GMMs gave more reliable results.
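Since the KL-divergence between two GMMs has no closed form, it is typically estimated by Monte Carlo; a minimal sketch with scikit-learn follows, in which the two-dimensional placeholder data, component counts, and sample size are assumptions rather than the paper's settings.

import numpy as np
from sklearn.mixture import GaussianMixture

# Monte Carlo estimate of KL(p || q) between two fitted GMMs: average
# log p(x) - log q(x) over samples drawn from p.
rng = np.random.default_rng(0)
X_a = rng.normal(0.0, 1.0, size=(500, 2))    # placeholder "flight condition A"
X_b = rng.normal(1.5, 1.0, size=(500, 2))    # placeholder "flight condition B"

p = GaussianMixture(n_components=3, random_state=0).fit(X_a)
q = GaussianMixture(n_components=3, random_state=0).fit(X_b)

samples, _ = p.sample(10_000)
kl_pq = np.mean(p.score_samples(samples) - q.score_samples(samples))
print(f"estimated KL(p||q): {kl_pq:.3f}")    # higher = better separated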
Abstract:
Optimal design for parameter estimation in Gaussian process regression models with input-dependent noise is examined. The motivation stems from the area of computer experiments, where computationally demanding simulators are approximated using Gaussian process emulators to act as statistical surrogates. In the case of stochastic simulators, which produce a random output for a given set of model inputs, repeated evaluations are useful, supporting the use of replicate observations in the experimental design. The findings are also applicable to the wider context of experimental design for Gaussian process regression and kriging. Designs are proposed with the aim of minimising the variance of the Gaussian process parameter estimates. A heteroscedastic Gaussian process model is presented which allows for an experimental design technique based on an extension of Fisher information to heteroscedastic models. It is empirically shown that the error of the approximation of the parameter variance by the inverse of the Fisher information is reduced as the number of replicated points is increased. Through a series of simulation experiments on both synthetic data and a systems biology stochastic simulator, optimal designs with replicate observations are shown to outperform space-filling designs both with and without replicate observations. Guidance is provided on best practice for optimal experimental design for stochastic response models.
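A minimal sketch of the Fisher-information computation underlying such designs, assuming a zero-mean GP, for which I_ij = (1/2) tr(K^{-1} ∂K/∂θ_i K^{-1} ∂K/∂θ_j); the squared-exponential kernel with a nugget, the finite-difference derivatives, and the replicated one-dimensional design are illustrative choices, not the paper's heteroscedastic model.

import numpy as np

# Fisher information for the hyperparameters of a zero-mean GP:
#   I_ij = 0.5 * tr(K^{-1} dK/dtheta_i K^{-1} dK/dtheta_j).

def kernel(x, theta):
    sigma2, ell, tau2 = theta            # signal var, length scale, noise var
    d2 = (x[:, None] - x[None, :]) ** 2
    return sigma2 * np.exp(-0.5 * d2 / ell**2) + tau2 * np.eye(len(x))

def fisher(x, theta, eps=1e-6):
    K = kernel(x, theta)
    Kinv = np.linalg.inv(K)
    grads = []
    for i in range(len(theta)):          # central finite-difference dK/dtheta_i
        t_hi = np.array(theta, float); t_hi[i] += eps
        t_lo = np.array(theta, float); t_lo[i] -= eps
        grads.append((kernel(x, t_hi) - kernel(x, t_lo)) / (2 * eps))
    F = np.empty((len(theta), len(theta)))
    for i, Gi in enumerate(grads):
        for j, Gj in enumerate(grads):
            F[i, j] = 0.5 * np.trace(Kinv @ Gi @ Kinv @ Gj)
    return F

design = np.repeat(np.linspace(0, 1, 5), 2)   # 5 inputs, each replicated twice
F = fisher(design, theta=(1.0, 0.3, 0.1))
print(np.sqrt(np.diag(np.linalg.inv(F))))     # approx. parameter std. errors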
Abstract:
A new generation of high-capacity WDM systems with extremely robust performance has been enabled by coherent transmission and digital signal processing. To facilitate widespread deployment of this technology, particularly in the metro space, new photonic components and subsystems are being developed to support cost-effective, compact, and scalable transceivers. We briefly review the recent progress in InP-based photonic components, and report numerical simulation results of an InP-based transceiver comprising a dual-polarization I/Q modulator and a commercial DSP ASIC. Predicted performance penalties due to the nonlinear response, lower bandwidth, and finite extinction ratio of these transceivers are less than 1 and 2 dB for 100-G PM-QPSK and 200-G PM-16QAM, respectively. Using the well-established Gaussian-Noise model, estimated system reach of 100-G PM-QPSK is greater than 600 km for typical ROADM-based metro-regional systems with internode losses up to 20 dB.
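As a hedged back-of-envelope illustration of such a reach estimate, the sketch below applies the incoherent Gaussian-Noise model, in which the per-span SNR is P/(P_ASE + ηP³) and noise powers add across spans; the ASE power, NLI coefficient, SNR threshold, and span length are invented round numbers, not the paper's system parameters.

import numpy as np

# Back-of-envelope reach estimate with the incoherent Gaussian-Noise model:
# per span, SNR = P / (P_ase + eta * P**3); noise powers add across spans.
P_dBm = np.linspace(-10, 5, 151)
P = 10 ** (P_dBm / 10) * 1e-3                 # launch power [W]
P_ase = 4e-6                                  # ASE power per span [W] (assumed)
eta = 1.5e3                                   # NLI coefficient [1/W^2] (assumed)
snr_req = 10 ** (6.5 / 10)                    # required SNR for PM-QPSK + FEC

# N spans are feasible while P / (N * (P_ase + eta * P**3)) >= snr_req.
n_spans = P / (snr_req * (P_ase + eta * P ** 3))
best = int(np.max(np.floor(n_spans)))
print(f"max spans: {best}  (~{best * 80} km at 80 km/span)")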
Abstract:
In this talk we investigate the use of spectrally shaped amplified spontaneous emission (ASE) to emulate highly dispersed wavelength division multiplexed (WDM) signals in an optical transmission system. Such a technique offers various simplifications to large-scale WDM experiments. Not only does it offer a reduction in transmitter complexity, removing the need for multiple source lasers, but it also potentially reduces the test and measurement complexity by requiring only the centre channel of a WDM system to be measured in order to estimate WDM worst-case performance. The use of ASE as a test and measurement tool is well established in optical communication systems, and several measurement techniques will be discussed [1, 2]. One of the most prevalent uses of ASE is in the measurement of receiver sensitivity, where ASE is introduced in order to degrade the optical signal-to-noise ratio (OSNR) and measure the resulting bit error rate (BER) at the receiver. From an analytical point of view, noise has been used to emulate system performance: the Gaussian Noise model is used as an estimate of highly dispersed signals and has attracted considerable interest [3]. The work presented here extends the use of ASE by using it to emulate highly dispersed WDM signals and in the process reduce WDM transmitter complexity and receiver measurement time in a lab environment. Results thus far have indicated [2] that such a transmitter configuration is consistent with an AWGN model for transmission, with modulation format complexity and nonlinearities playing a key role in estimating the performance of systems utilising the ASE channel emulation technique. We conclude this work by investigating techniques capable of characterising the nonlinear and damage limits of optical fibres and the resultant information capacity limits.
References:
1. McCarthy, M. E., N. Mac Suibhne, S. T. Le, P. Harper, and A. D. Ellis, "High spectral efficiency transmission emulation for non-linear transmission performance estimation for high order modulation formats," 2014 European Conference on Optical Communication (ECOC), IEEE, 2014.
2. Ellis, A., N. Mac Suibhne, F. Gunning, and S. Sygletos, "Expressions for the nonlinear transmission performance of multi-mode optical fiber," Opt. Express, Vol. 21, 22834-22846, 2013.
3. Vacondio, F., O. Rival, C. Simonneau, E. Grellier, A. Bononi, L. Lorcy, J. Antona, and S. Bigo, "On nonlinear distortions of highly dispersive optical coherent systems," Opt. Express, Vol. 20, 1022-1032, 2012.
Abstract:
Estimates of abundance or density are essential for wildlife management and conservation. There are few effective density estimates for the Buff-throated Partridge Tetraophasis szechenyii, a rare and elusive high-mountain Galliform species endemic to western China. In this study, we used the temporary emigration N-mixture model to estimate the density of this species, with data acquired from playback point count surveys around a sacred area, based on the indigenous Tibetan culture of protection of wildlife, in Yajiang County, Sichuan, China, during April-June 2009. Within 84 125-m-radius points, we recorded 53 partridge groups during three repeat visits. The best model indicated that detection probability was described by covariates of vegetation cover type, week of visit, time of day, and weather, with weak effects, and that a partridge group was present during a sampling period with a constant probability. The abundance component was accounted for by vegetation association. Abundance was substantially higher in rhododendron shrubs, fir-larch forests, mixed spruce-larch-birch forests, and especially oak thickets than in pine forests. The model predicted a density of 5.14 groups/km², similar to the estimate of 4.7-5.3 groups/km² quantified via an intensive spot-mapping effort. The post-hoc estimate of individual density was 14.44 individuals/km², based on the estimated mean group size of 2.81. We suggest that the method we employed is applicable for estimating densities of Buff-throated Partridges over large areas. Given the importance of a mosaic habitat for this species, local logging should be regulated. Although the sacred conservation area had no detectable effect on the abundance of Buff-throated Partridges, we suggest regulations linking the sacred mountain conservation area with the official conservation system, because of the strong local participation in land conservation facilitated by sacred mountains.
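For readers unfamiliar with this model class, the sketch below fits the basic binomial N-mixture model (Royle 2004) by maximum likelihood; the study above used the temporary-emigration extension, but the core marginalisation over the latent abundance at each point is the same, and the toy counts and truncation bound are assumptions.

import numpy as np
from scipy import stats, optimize

# Basic binomial N-mixture likelihood: at each point, sum over the latent
# abundance N the Poisson prior times the binomial detection likelihood.
def nll(params, y, K_max=50):
    lam, p = np.exp(params[0]), 1 / (1 + np.exp(-params[1]))
    N = np.arange(K_max + 1)
    prior = stats.poisson.pmf(N, lam)                 # P(N) at each point
    ll = 0.0
    for counts in y:                                  # repeat visits per point
        like_N = np.prod(stats.binom.pmf(counts[:, None], N, p), axis=0)
        ll += np.log(like_N @ prior + 1e-300)
    return -ll

y = np.array([[1, 0, 2], [0, 0, 1], [3, 2, 2]])       # toy counts, 3 visits
res = optimize.minimize(nll, x0=[0.0, 0.0], args=(y,))
print("lambda:", np.exp(res.x[0]), "p:", 1 / (1 + np.exp(-res.x[1])))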
Abstract:
Continuous variables are one of the major data types collected by survey organizations. They can be incomplete, so that the data collectors need to fill in the missing values, or they can contain sensitive information that needs protection from re-identification. One approach to protecting continuous microdata is to aggregate the values into cells defined by combinations of features. In this thesis, I present novel methods of multiple imputation (MI) that can be applied to impute missing values and to synthesize confidential values for continuous and magnitude data.
The first method is for limiting the disclosure risk of the continuous microdata whose marginal sums are fixed. The motivation for developing such a method comes from the magnitude tables of non-negative integer values in economic surveys. I present approaches based on a mixture of Poisson distributions to describe the multivariate distribution so that the marginals of the synthetic data are guaranteed to sum to the original totals. At the same time, I present methods for assessing disclosure risks in releasing such synthetic magnitude microdata. The illustration on a survey of manufacturing establishments shows that the disclosure risks are low while the information loss is acceptable.
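A minimal sketch of the device that guarantees the fixed marginals, assuming illustrative (not fitted) mixture weights and rates: independent Poisson cell counts conditioned on their total are multinomial, so synthetic cells can be drawn by multinomial allocation of the original total.

import numpy as np

# Synthesising cell counts by multinomial allocation guarantees the
# synthetic marginal equals the original total.
rng = np.random.default_rng(0)

weights = np.array([0.6, 0.4])                   # mixture weights (assumed)
rates = np.array([[5.0, 1.0, 2.0],               # per-component Poisson rates
                  [1.0, 4.0, 4.0]])              # for 3 cells (assumed)

def synthesize(total):
    z = rng.choice(len(weights), p=weights)      # pick a mixture component
    probs = rates[z] / rates[z].sum()            # conditional on the total,
    return rng.multinomial(total, probs)         # counts are multinomial

row = synthesize(total=120)
print(row, row.sum())                            # always sums to 120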
The second method is for releasing synthetic continuous microdata by a nonstandard MI method. Traditionally, MI fits a model to the confidential values and then generates multiple synthetic datasets from this model. Its disclosure risk tends to be high, especially when the original data contain extreme values. I present a nonstandard MI approach conditioned on protective intervals. Its basic idea is to estimate the model parameters from these intervals rather than from the confidential values. The encouraging results of simple simulation studies suggest the potential of this new approach in limiting the posterior disclosure risk.
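A hedged illustration of estimating parameters from protective intervals rather than from the confidential values, assuming a normal model and intervals of fixed half-width (both illustrative, not the thesis' specification): each observation contributes log(Φ((b−μ)/σ) − Φ((a−μ)/σ)) to the likelihood.

import numpy as np
from scipy import stats, optimize

# Fit a normal model to protective intervals [a_i, b_i] instead of the
# confidential values x_i, via the interval-censored likelihood.
rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=200)              # confidential values
a, b = x - 1.0, x + 1.0                          # protective intervals

def nll(params):
    mu, log_s = params
    s = np.exp(log_s)
    p = stats.norm.cdf(b, mu, s) - stats.norm.cdf(a, mu, s)
    return -np.sum(np.log(p + 1e-300))

res = optimize.minimize(nll, x0=[0.0, 0.0])
print("mu:", res.x[0], "sigma:", np.exp(res.x[1]))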
The third method is for imputing missing values in continuous and categorical variables. It extends a hierarchically coupled mixture model with local dependence, but separates the variables into non-focused (e.g., almost fully observed) and focused (e.g., largely missing) ones. The sub-model structure of the focused variables is more complex than that of the non-focused ones; their cluster indicators are linked together by tensor factorization, and the focused continuous variables depend locally on non-focused values. The model properties suggest that moving strongly associated non-focused variables to the side of the focused ones can help to improve estimation accuracy, which is examined in several simulation studies. Finally, this method is applied to data from the American Community Survey.
Abstract:
Brain injury due to lack of oxygen or impaired blood flow around the time of birth may cause long-term neurological dysfunction or death in severe cases. Treatment needs to be initiated as soon as possible and tailored according to the nature of the injury to achieve the best outcomes. The electroencephalogram (EEG) currently provides the best insight into neurological activities. However, its interpretation presents a formidable challenge for neurophysiologists. Moreover, such expertise is not widely available, particularly around the clock in a typical busy Neonatal Intensive Care Unit (NICU). Therefore, an automated computerized system for detecting and grading the severity of brain injuries could be of great help for medical staff to diagnose and then initiate on-time treatments. In this study, automated systems for detection of neonatal seizures and for grading the severity of Hypoxic-Ischemic Encephalopathy (HIE) using EEG and Heart Rate (HR) signals are presented. It is well known that there is a great deal of contextual and temporal information present in the EEG and HR signals if they are examined at longer time scales. The systems developed in the past exploited this information either at a very early stage of the system, without any intelligent block, or at a very late stage, where the presence of such information is much reduced. This work has particularly focused on the development of a system that can incorporate the contextual information at the middle (classifier) level. This is achieved by using dynamic classifiers that are able to process sequences of feature vectors rather than only one feature vector at a time.
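One common form of such a dynamic classifier is a hidden Markov model per class, scored on whole sequences of feature vectors; the sketch below, assuming the hmmlearn package, uses random placeholder features, state counts, and two classes purely to illustrate the sequence-level classification step, not the thesis' actual features or grades.

import numpy as np
from hmmlearn.hmm import GaussianHMM

# One HMM per class (e.g. per HIE grade); classify a new sequence of
# feature vectors by the higher sequence log-likelihood.
rng = np.random.default_rng(0)
seq_a = rng.normal(0.0, 1.0, size=(300, 8))      # training sequence, class A
seq_b = rng.normal(0.8, 1.2, size=(300, 8))      # training sequence, class B

hmm_a = GaussianHMM(n_components=3, random_state=0).fit(seq_a)
hmm_b = GaussianHMM(n_components=3, random_state=0).fit(seq_b)

test = rng.normal(0.8, 1.2, size=(60, 8))        # unseen epoch sequence
pred = "A" if hmm_a.score(test) > hmm_b.score(test) else "B"
print("predicted class:", pred)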
Abstract:
The Dirichlet distribution is a multivariate generalization of the Beta distribution. It is an important multivariate continuous distribution in probability and statistics. In this report, we review the Dirichlet distribution and study its properties, including statistical and information-theoretic quantities involving this distribution. Relationships between the Dirichlet distribution and other distributions are also discussed. There are several ways to generate random variables with a Dirichlet distribution; the stick-breaking approach and the Pólya urn method are discussed. In Bayesian statistics, the Dirichlet distribution and the generalized Dirichlet distribution can both serve as a conjugate prior for the Multinomial distribution. The Dirichlet distribution has many applications in different fields. We focus on the unsupervised learning of a finite mixture model based on the Dirichlet distribution. The Initialization Algorithm and the Dirichlet Mixture Estimation Algorithm are both reviewed for estimating the parameters of a Dirichlet mixture. Experimental results are shown for three applications: the estimation of artificial histograms, the summarization of image databases, and human skin detection.
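The stick-breaking construction mentioned above can be stated concretely: for Dirichlet(α₁, ..., α_K), break off a Beta(α_j, α_{j+1} + ... + α_K) fraction of the remaining stick at each step. A short self-contained sketch (the α values are arbitrary examples):

import numpy as np

# Stick-breaking draw from Dirichlet(alpha_1, ..., alpha_K).
rng = np.random.default_rng(0)

def dirichlet_stick_breaking(alpha):
    alpha = np.asarray(alpha, float)
    x, stick = np.empty(len(alpha)), 1.0
    for j in range(len(alpha) - 1):
        v = rng.beta(alpha[j], alpha[j + 1:].sum())   # fraction broken off
        x[j] = stick * v
        stick *= 1.0 - v
    x[-1] = stick                                     # whatever remains
    return x

alpha = [2.0, 3.0, 5.0]
draws = np.array([dirichlet_stick_breaking(alpha) for _ in range(20_000)])
print(draws.mean(axis=0))            # ~ alpha / sum(alpha) = [0.2, 0.3, 0.5]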
Abstract:
An RVE-based stochastic numerical model is used to calculate the permeability of randomly generated porous media at different values of the fiber volume fraction for the case of transverse flow in a unidirectional ply. Analysis of the numerical results shows that the permeability is not normally distributed. With the aim of proposing a new understanding of this particular topic, permeability data are fitted using both a mixture model and a unimodal distribution. Our findings suggest that permeability can be fitted well using a mixture model based on the lognormal and power-law distributions. In the case of a unimodal distribution, it is found, using maximum-likelihood estimation (MLE), that the generalized extreme value (GEV) distribution provides the best fit. Finally, an expression for the permeability as a function of the fiber volume fraction based on the GEV distribution is discussed in light of the previous results.
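A minimal sketch of the unimodal GEV fit via MLE, using SciPy's genextreme; the lognormal synthetic data stand in for the RVE permeability values (in arbitrary units).

import numpy as np
from scipy import stats

# MLE fit of a generalized extreme value (GEV) distribution to permeability
# samples, as in the unimodal case above.
rng = np.random.default_rng(0)
perm = stats.lognorm.rvs(0.4, size=500, random_state=rng)  # placeholder data

shape, loc, scale = stats.genextreme.fit(perm)    # MLE by default
print(f"GEV shape={shape:.3f}, loc={loc:.3f}, scale={scale:.3f}")

# Goodness of fit (illustrative): Kolmogorov-Smirnov test against the fit.
ks = stats.kstest(perm, "genextreme", args=(shape, loc, scale))
print(f"KS statistic={ks.statistic:.3f}, p={ks.pvalue:.3f}")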
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
The present work aims to develop a process for the production of biodiesel from high-acidity oils, applying a two-step homogeneous catalysis process. The first step is the ethylic esterification of the free fatty acids, catalysed by H2SO4 and taking place in the triglyceride medium; the second is the transesterification of the remaining triglycerides, taking place in the medium of the alkyl esters from the first step and catalysed by alkali (NaOH) with ethyl or methyl alcohol. The esterification reaction was studied with a model mixture consisting of neutral soybean oil artificially acidified with 15 wt% PA-grade oleic acid. This value was adopted as a reference because certain regional fats (castor oil from family farming, slaughterhouse tallow, rice bran oil, etc.) have free fatty acid contents between 10-20 wt%. In both steps ethanol acts as both reagent and solvent, the mixture:alcohol molar ratio being one of the investigated parameters, at ratios of 1:3, 1:6 and 1:9. The others were the temperature (60 and 80°C) and the catalyst concentration (0.5, 1.0 and 1.5 wt%, relative to the oil mass). The combination of these parameters resulted in 18 reactions. Among the reaction conditions studied, eight reached an acceptable acidity below 1.5 wt%, making it possible to define the conditions for optimal application of the second step. The best condition in this step was obtained when the reaction was conducted at 60°C with 1 wt% H2SO4 and a 1:6 molar ratio. At the end of the first step, relevant treatments such as catalyst removal were carried out and their influence on the final acidity was studied, using washes with and without the addition of hexane, followed by evaporation or the addition of a drying agent. In the second step, oil:alcohol molar ratios of 1:6 and 1:9 were studied with methyl and ethyl alcohol and with 0.5 and 1 wt% NaOH, as well as the treatment of the reaction (washing or neutralization of the catalyst) at 60°C, resulting in 16 experiments. The best condition in this second step was obtained with 0.5 wt% NaOH and an oil:ethanol molar ratio of 1:6, and only the reactions in which washes were applied showed acidity values (<1.0 wt%) consistent with the ANP parameters.