983 results for Mixture-models
Abstract:
Human preimplantation embryos exhibit high levels of apoptotic cells and high rates of developmental arrest during the first week in vitro. The relation between the two is unclear and difficult to determine by conventional experimental approaches, partly because of limited numbers of embryos. We apply a mixture of experiment and mathematical modeling to show that observed levels of cell death can be reconciled with the high levels of embryo arrest seen in the human only if the developmental competence of embryos is already established at the zygote stage, and environmental factors merely modulate this. This suggests that research on improving in vitro fertilization success rates should move from its current concentration on optimizing culture media to focus more on the generation of a healthy zygote and on understanding the mechanisms that cause chromosomal and other abnormalities during early cleavage stages.
Finite mixture regression model with random effects: application to neonatal hospital length of stay
Abstract:
A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of a generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with component densities from the exponential family, leading to the development of the class of finite mixture GLMMs. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.
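The two-component idea can be sketched without the random effects: a minimal EM fit for a mixture of two Gaussian linear regressions. This is a simplification of the model above (no GLMM machinery); the function name and the median-split initialisation are illustrative choices of this sketch, not the authors' code.

```python
import numpy as np
from scipy.stats import norm

def fit_mixreg(x, y, n_iter=100):
    """EM for a two-component mixture of Gaussian linear regressions
    (simplified sketch: fixed effects only, no random effects)."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), x])            # design with intercept
    # crude initialisation: split observations at the median of y
    groups = [y >= np.median(y), y < np.median(y)]
    beta = np.array([np.linalg.lstsq(Xd[g], y[g], rcond=None)[0] for g in groups])
    sigma = np.array([y[g].std() + 1e-3 for g in groups])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibilities of each component
        dens = np.stack([pi[k] * norm.pdf(y, Xd @ beta[k], sigma[k]) for k in range(2)])
        resp = dens / dens.sum(axis=0)
        # M-step: mixing weights, weighted least squares, noise scales
        pi = resp.mean(axis=1)
        for k in range(2):
            W = Xd * resp[k][:, None]
            beta[k] = np.linalg.solve(Xd.T @ W, W.T @ y)
            r = y - Xd @ beta[k]
            sigma[k] = np.sqrt((resp[k] * r**2).sum() / resp[k].sum())
    return pi, beta, sigma
```

The real model additionally places random effects in the linear predictors, which this sketch omits.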
Abstract:
The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, because a certain proportion of patients sustain a longer stay. However, because the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimates. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. (C) 2003 Elsevier Science Ltd. All rights reserved.
Abstract:
In this paper, we investigate the effects of potential models on the description of adsorption equilibria of linear molecules (ethylene and ethane) on graphitized thermal carbon black. GCMC simulation is used as a tool to obtain adsorption isotherms, isosteric heats of adsorption and the microscopic configurations of these molecules. At the heart of the GCMC are the potential models describing the fluid-fluid and solid-fluid interactions. Here we study two potential models recently proposed in the literature, UA-TraPPE and AUA4, and discuss their impact on the description of the adsorption behavior of the pure components. Mixtures of these components with nitrogen and argon are also studied. Nitrogen is modeled as a two-site molecule with discrete charges, while argon is modeled as a spherical particle. GCMC simulation is also used to generate mixture isotherms. It is found that co-operation between species occurs when the surface is fractionally covered, while competition is important when the surface is fully loaded.
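The GCMC machinery can be illustrated, in heavily simplified form, with a non-interacting lattice gas rather than the molecular models studied in the paper. Everything here (function name, unit convention k_B*T = 1, acceptance rules) is an assumption of this sketch; for a non-interacting gas the average coverage should reproduce the Langmuir isotherm, which gives a check on the sampler.

```python
import numpy as np

def gcmc_lattice_gas(mu, eps, n_sites=400, n_steps=200_000, seed=0):
    """Toy grand canonical Monte Carlo for a NON-interacting lattice gas:
    each site binds at most one particle with energy -eps, at chemical
    potential mu, in units with k_B*T = 1.  The average coverage should
    match the Langmuir isotherm theta = x / (1 + x), x = exp(mu + eps)."""
    rng = np.random.default_rng(seed)
    occ = np.zeros(n_sites, dtype=bool)
    n_occ = 0
    cov_sum = 0.0
    for _ in range(n_steps):
        i = rng.integers(n_sites)
        if occ[i]:
            # deletion attempt: accept with prob min(1, exp(-(mu + eps)))
            if rng.random() < np.exp(-(mu + eps)):
                occ[i] = False
                n_occ -= 1
        else:
            # insertion attempt: accept with prob min(1, exp(mu + eps))
            if rng.random() < np.exp(mu + eps):
                occ[i] = True
                n_occ += 1
        cov_sum += n_occ / n_sites
    return cov_sum / n_steps
```

Real GCMC for ethylene or ethane adds fluid-fluid and solid-fluid potentials (UA-TraPPE, AUA4) to the acceptance rules; the sampling skeleton is the same.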
Abstract:
Motivation: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have limitations due to the minimal assumptions made or, under more specific assumptions, are computationally intensive. Results: By converting the value of the test statistic used to test the significance of each gene to a z-score, we propose a simple two-component normal mixture that adequately models the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
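A minimal sketch of the z-score approach: fit a two-component normal mixture to the z-scores by EM and read off the posterior null probability for each gene. It assumes a theoretical N(0,1) null and a single free non-null component; the starting values and function name are illustrative choices, not the paper's exact estimator.

```python
import numpy as np
from scipy.stats import norm

def posterior_null_prob(z, n_iter=200):
    """Fit f(z) = pi0*N(0,1) + (1-pi0)*N(mu1, s1^2) to gene z-scores by EM
    and return the posterior probability that each gene is null."""
    pi0, mu1, s1 = 0.9, 2.0, 1.0             # crude starting values
    p0 = np.ones_like(z)
    for _ in range(n_iter):
        f0 = pi0 * norm.pdf(z)               # theoretical N(0,1) null
        f1 = (1 - pi0) * norm.pdf(z, mu1, s1)
        p0 = f0 / (f0 + f1)                  # E-step: P(gene is null | z)
        w = 1.0 - p0
        pi0 = p0.mean()                      # M-step updates
        mu1 = (w * z).sum() / w.sum()
        s1 = np.sqrt((w * (z - mu1) ** 2).sum() / w.sum())
    return p0, pi0, mu1, s1
```

Genes can then be ranked by `1 - p0`, with small posterior null probability flagging differential expression.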
Abstract:
Quantitatively predicting mass transport rates for chemical mixtures in porous materials is important in applications of materials such as adsorbents, membranes, and catalysts. Because directly assessing mixture transport experimentally is challenging, theoretical models that can predict mixture diffusion coefficients using only single-component information would have many uses. One such model was proposed by Skoulidas, Sholl, and Krishna (Langmuir, 2003, 19, 7977), and applications of this model to a variety of chemical mixtures in nanoporous materials have yielded promising results. In this paper, the accuracy of this model for predicting mixture diffusion coefficients in materials that exhibit a heterogeneous distribution of local binding energies is examined. To examine this issue, single-component and binary mixture diffusion coefficients are computed using kinetic Monte Carlo for a two-dimensional lattice model over a wide range of lattice occupancies and compositions. The approach suggested by Skoulidas, Sholl, and Krishna is found to be accurate in situations where the spatial distribution of binding site energies is relatively homogeneous, but is considerably less accurate for strongly heterogeneous energy distributions.
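A much simpler single-component analogue of this kinetic Monte Carlo setup can be sketched: a periodic 2-D lattice with site exclusion and uniform hop rates (i.e., the homogeneous case, not the heterogeneous-energy model examined in the paper). The function name and parameters are illustrative.

```python
import numpy as np

def kmc_tracer_msd(L=40, n_part=100, attempts_per_particle=500, seed=0):
    """Toy kinetic Monte Carlo on a periodic L x L square lattice with
    site exclusion (at most one particle per site) and uniform hop rates.
    Returns the mean-squared displacement per hop attempt, a crude
    single-component self-diffusion measure."""
    rng = np.random.default_rng(seed)
    flat = rng.choice(L * L, size=n_part, replace=False)
    pos = np.column_stack([flat // L, flat % L])   # wrapped lattice coordinates
    disp = np.zeros((n_part, 2))                   # unwrapped displacements
    occupied = set(map(tuple, pos))
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(attempts_per_particle * n_part):
        i = rng.integers(n_part)
        dx, dy = moves[rng.integers(4)]
        target = ((pos[i, 0] + dx) % L, (pos[i, 1] + dy) % L)
        if target not in occupied:                 # site-exclusion rule
            occupied.remove(tuple(pos[i]))
            occupied.add(target)
            pos[i] = target
            disp[i] += (dx, dy)
    return (disp ** 2).sum(axis=1).mean() / attempts_per_particle
```

The heterogeneous case studied in the paper would replace the unconditional acceptance of hops to empty sites with rates depending on site binding energies.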
Abstract:
This paper investigates the performance of separating mutually independent sources in nonlinear models. Nonlinear mappings in which a linear mixture is followed by an unknown, invertible nonlinear distortion are found in many signal processing applications. In general, blind separation of sources from their nonlinear mixtures is difficult. We propose using a kernel density estimator, incorporated with equivariant gradient analysis, to separate sources subject to nonlinear distortion. The parameters of the kernel density estimator are iteratively updated to minimize a mutual information criterion measuring the dependence of the outputs. The equivariant gradient algorithm has the form of a nonlinear decorrelation, which is used to perform the convergence analysis. Experiments are presented to illustrate these results.
Abstract:
Minimization of a sum-of-squares or cross-entropy error function leads to network outputs which approximate the conditional averages of the target data, conditioned on the input vector. For classification problems, with a suitably chosen target coding scheme, these averages represent the posterior probabilities of class membership, and so can be regarded as optimal. For problems involving the prediction of continuous variables, however, the conditional averages provide only a very limited description of the properties of the target variables. This is particularly true for problems in which the mapping to be learned is multi-valued, as often arises in the solution of inverse problems, since the average of several correct target values is not necessarily itself a correct value. In order to obtain a complete description of the data, for the purposes of predicting the outputs corresponding to new input vectors, we must model the conditional probability distribution of the target data, again conditioned on the input vector. In this paper we introduce a new class of network models obtained by combining a conventional neural network with a mixture density model. The complete system is called a Mixture Density Network, and can in principle represent arbitrary conditional probability distributions in the same way that a conventional neural network can represent arbitrary functions. We demonstrate the effectiveness of Mixture Density Networks using both a toy problem and a problem involving robot inverse kinematics.
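The mixture output layer at the heart of a Mixture Density Network can be sketched in a few lines: the raw activations produced by some network for a given input are mapped to valid mixture parameters, which then define the conditional density of the target. The helper names below are illustrative; any network producing the activation vector would do.

```python
import numpy as np

def mdn_params(a, n_comp):
    """Map raw network outputs `a` (length 3*n_comp) to the parameters of
    a 1-D Gaussian mixture: softmax mixing coefficients, unconstrained
    means, and positive widths via an exponential."""
    alpha = np.exp(a[:n_comp] - a[:n_comp].max())
    alpha /= alpha.sum()                  # softmax -> mixing coefficients
    mu = a[n_comp:2 * n_comp]             # component means
    sigma = np.exp(a[2 * n_comp:])        # exp keeps widths positive
    return alpha, mu, sigma

def mdn_density(t, alpha, mu, sigma):
    """Conditional density p(t | x) represented by the mixture."""
    return np.sum(alpha * np.exp(-0.5 * ((t - mu) / sigma) ** 2)
                  / (np.sqrt(2 * np.pi) * sigma))
```

Training minimises the negative log of this density over the data; a multi-valued inverse mapping then shows up as several mixture components with comparable weight at the same input.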
Abstract:
An interactive hierarchical Generative Topographic Mapping (HGTM) has been developed to visualise complex data sets. In this paper, we build a more general visualisation system by extending the HGTM visualisation system in three directions: (1) We generalize HGTM to noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). (2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode the user interactively selects "regions of interest" as in HGTM, whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of LTMs is employed. (3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualisation plots, since they can highlight the boundaries between data clusters. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a toy example and apply our system to three more complex real data sets.
Abstract:
We have proposed a novel robust inversion-based neurocontroller that searches for the optimal control law by sampling from the estimated Gaussian distribution of the inverse plant model. However, for problems involving the prediction of continuous variables, a Gaussian model provides only a very limited description of the properties of the inverse model. This is usually the case for problems in which the mapping to be learned is multi-valued or involves hysteretic transfer characteristics, as often arises in inverse plant models. In order to obtain a complete description of the inverse model, a more general multicomponent distribution must be modeled. In this paper we test whether our proposed sampling approach can be used with arbitrary conditional probability distributions, which we model by a mixture density network. Importance sampling provides a structured and principled approach to constrain the complexity of the search space for the ideal control law. The effectiveness of importance sampling from an arbitrary conditional probability distribution is demonstrated using a simple single-input single-output static nonlinear system with hysteretic characteristics in the inverse plant model.
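The core ingredient, self-normalised importance sampling, can be sketched generically: draw candidates from a proposal distribution, weight them by the ratio of target to proposal densities, and average. The function signature below is an assumption of this sketch, not the authors' interface; any modeled control distribution (e.g. a mixture density network output) could play the target role.

```python
import numpy as np

def importance_estimate(f, target_logpdf, proposal_sample, proposal_logpdf,
                        n=5000, seed=0):
    """Self-normalised importance sampling: estimate E_target[f(X)] using
    draws from a proposal distribution.  Both log-densities may be
    unnormalised, since the weights are normalised to sum to one."""
    rng = np.random.default_rng(seed)
    x = proposal_sample(rng, n)
    logw = target_logpdf(x) - proposal_logpdf(x)
    w = np.exp(logw - logw.max())         # stabilised importance weights
    w /= w.sum()
    return np.sum(w * f(x))
```

For example, with an unnormalised N(1,1) target and an N(0,2) proposal, the estimate of the mean should be close to 1.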
Abstract:
Damage to insulation materials located near a primary circuit coolant leak may compromise the operation of the emergency core cooling system (ECCS). Insulation material in the form of mineral wool fiber agglomerates (MWFA) may be transported to the containment sump strainers, where they may block or penetrate the strainers. Though the impact of MWFA on the pressure drop across the strainers is minimal, corrosion products formed over time may also accumulate in the fiber cakes on the strainers, which can lead to a significant increase in the strainer pressure drop and result in cavitation in the ECCS. An experimental and theoretical study performed by the Helmholtz-Zentrum Dresden-Rossendorf and the Hochschule Zittau/Görlitz is investigating the phenomena that may be observed in the containment vessel during a primary circuit coolant leak. The study entails the generation of fiber agglomerates, the determination of their transport properties in single and multi-effect experiments and the long-term effect that corrosion and erosion of the containment internals by the coolant has on the strainer pressure drop. The focus of this paper is on the verification and validation of numerical models that can predict the transport of MWFA. A number of pseudo-continuous dispersed phases of spherical wetted agglomerates represent the MWFA. The size, density, the relative viscosity of the fluid-fiber agglomerate mixture and the turbulent dispersion all affect how the fiber agglomerates are transported. In the cases described here, the size is kept constant while the density is modified. This definition affects both the terminal velocity and volume fraction of the dispersed phases. Note that the relative viscosity is only significant at high concentrations.
Three single effect experiments were used to provide validation data on the transport of the fiber agglomerates under conditions of sedimentation in quiescent fluid, sedimentation in a horizontal flow and suspension in a horizontal flow. The experiments were performed in a rectangular column for the quiescent fluid and a racetrack type channel that provided a near uniform horizontal flow. The numerical simulations of sedimentation in the column and the racetrack channel showed sedimentation characteristics consistent with the experiments. For channel suspension, the heavier fibers tend to accumulate at the channel base even at high velocities, while lighter phases are more likely to be transported around the channel.
Abstract:
Mineral wool insulation material applied to the primary cooling circuit of a nuclear reactor may be damaged in the course of a loss of coolant accident (LOCA). The insulation material released by the leak may compromise the operation of the emergency core cooling system (ECCS), as it may be transported together with the coolant in the form of mineral wool fiber agglomerate (MWFA) suspensions to the containment sump strainers, which are mounted at the inlet of the ECCS to keep any debris away from the emergency cooling pumps. In the further course of the LOCA, the MWFA may block or penetrate the strainers. In addition to the impact of MWFA on the pressure drop across the strainers, corrosion products formed over time may also accumulate in the fiber cakes on the strainers, which can lead to a significant increase in the strainer pressure drop and result in cavitation in the ECCS. Therefore, it is essential to understand the transport characteristics of the insulation materials in order to determine the long-term operability of nuclear reactors that undergo a LOCA. An experimental and theoretical study performed by the Helmholtz-Zentrum Dresden-Rossendorf and the Hochschule Zittau/Görlitz is investigating the phenomena that may be observed in the containment vessel during a primary circuit coolant leak. The study entails the generation of fiber agglomerates, the determination of their transport properties in single and multi-effect experiments and the long-term effects that particles formed due to corrosion of metallic containment internals by the coolant medium have on the strainer pressure drop. The focus of this presentation is on the numerical models that are used to predict the transport of MWFA by CFD simulations. A number of pseudo-continuous dispersed phases of spherical wetted agglomerates can represent the MWFA.
The size, density, the relative viscosity of the fluid-fiber agglomerate mixture and the turbulent dispersion all affect how the fiber agglomerates are transported. In the cases described here, the size is kept constant while the density is modified. This definition affects both the terminal velocity and volume fraction of the dispersed phases. Only one of the single effect experimental scenarios used to validate the numerical models is described here. The scenario examines the suspension and horizontal transport of the fiber agglomerates in a racetrack type channel. The corresponding experiments will be described in an accompanying presentation (see abstract of Seeliger et al.).
Abstract:
Projection of a high-dimensional dataset onto a two-dimensional space is a useful tool to visualise structures and relationships in the dataset. However, a single two-dimensional visualisation may not display all the intrinsic structure. Therefore, hierarchical/multi-level visualisation methods have been used to extract more detailed understanding of the data. Here we propose a multi-level Gaussian process latent variable model (MLGPLVM). MLGPLVM works by segmenting data (with e.g. K-means, Gaussian mixture model or interactive clustering) in the visualisation space and then fitting a visualisation model to each subset. To measure the quality of multi-level visualisation (with respect to parent and child models), metrics such as trustworthiness, continuity, mean relative rank errors, visualisation distance distortion and the negative log-likelihood per point are used. We evaluate the MLGPLVM approach on the ‘Oil Flow’ dataset and a dataset of protein electrostatic potentials for the ‘Major Histocompatibility Complex (MHC) class I’ of humans. In both cases, visual observation and the quantitative quality measures have shown better visualisation at lower levels.
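A two-level sketch of the MLGPLVM idea: a parent projection of all the data, K-means segmentation (one of the clustering options named above), and a child projection per cluster. Plain PCA stands in here for the GP-LVM visualisation model, and all function names are illustrative; the point is only the parent/segment/child structure.

```python
import numpy as np

def multilevel_pca(X, n_clusters=3, n_iter=50, seed=0):
    """Two-level visualisation sketch: parent 2-D PCA projection of all
    data, then K-means segmentation and a child PCA per cluster
    (PCA used as a simple stand-in for the GP-LVM)."""
    rng = np.random.default_rng(seed)

    def pca2(Y):
        Yc = Y - Y.mean(axis=0)
        _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
        return Yc @ Vt[:2].T                       # top-2 principal components

    parent = pca2(X)
    # plain Lloyd's K-means in the original data space
    centres = X[rng.choice(len(X), n_clusters, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centres) ** 2).sum(-1), axis=1)
        for k in range(n_clusters):
            if (labels == k).any():                # skip empty clusters
                centres[k] = X[labels == k].mean(axis=0)
    children = {k: pca2(X[labels == k])
                for k in range(n_clusters) if (labels == k).any()}
    return parent, labels, children
```

In the full method each child model is a GP-LVM fitted to its subset, and quality metrics such as trustworthiness and continuity compare parent and child plots.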
Abstract:
Recently, we have developed the hierarchical Generative Topographic Mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. In this paper, we propose a more general visualization system by extending HGTM in three ways, which allows the user to visualize a wider range of data sets and better support the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the Latent Trait Model (LTM). This enables us to visualize data of inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive, or automatic mode. In the interactive mode, the user selects "regions of interest," whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool to improve our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets. © 2005 IEEE.
Abstract:
Models for the conditional joint distribution of the U.S. Dollar/Japanese Yen and Euro/Japanese Yen exchange rates, from November 2001 until June 2007, are evaluated and compared. The conditional dependency is allowed to vary across time, as a function of either historical returns or a combination of past return data and option-implied dependence estimates. Using prices of currency options that are available in the public domain, risk-neutral dependency expectations are extracted through a copula representation of the bivariate risk-neutral density. For this purpose, we employ either the one-parameter "Normal" or a two-parameter "Gumbel Mixture" specification. The latter provides forward-looking information regarding the overall degree of covariation, as well as the level and direction of asymmetric dependence. Specifications that include option-based measures in their information set are found to outperform, in-sample and out-of-sample, models that rely solely on historical returns.
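Sampling from the one-parameter Normal (Gaussian) copula, the simpler of the two specifications above, can be sketched directly: draw correlated bivariate normals and push each margin through the standard normal CDF, leaving uniform margins whose dependence is governed entirely by the correlation parameter. The function name is illustrative and scipy is assumed for the normal CDF.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_sample(rho, n, rng):
    """Draw n pairs of uniforms whose dependence follows a Gaussian
    ('Normal') copula with correlation parameter rho."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    return norm.cdf(z)                     # probability-integral transform
```

Applying the inverse CDFs of the two return margins to these uniforms yields joint exchange-rate return scenarios with the desired dependence; the Gumbel mixture specification would replace the Gaussian dependence structure with one allowing asymmetric tail dependence.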