900 results for Hierarchical Mixtures
Abstract:
We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.
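The architecture described above can be illustrated with a minimal sketch of the forward pass, assuming a two-level tree, softmax gating networks, and linear experts with identity link. The function and parameter names (`hme_predict`, `top_gate`, `sub_gates`, `experts`) are hypothetical and not taken from the paper, and the EM parameter updates are not shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def hme_predict(x, top_gate, sub_gates, experts):
    """Mean prediction of a two-level hierarchical mixture of linear experts.

    top_gate:  one weight vector per branch (top-level gating network)
    sub_gates: sub_gates[i][j] is the gating weight vector for expert j in branch i
    experts:   experts[i][j] is the weight vector of linear expert (i, j)
    All names are illustrative assumptions, not the paper's notation.
    """
    g_top = softmax([dot(v, x) for v in top_gate])
    y = 0.0
    for i, g_i in enumerate(g_top):
        g_sub = softmax([dot(v, x) for v in sub_gates[i]])
        for j, g_ij in enumerate(g_sub):
            # Each expert's output is weighted by the product of gate values on its path.
            y += g_i * g_ij * dot(experts[i][j], x)
    return y

# With all gating weights at zero, every expert receives weight 1/4.
x = [1.0, 5.0]
zero = [0.0, 0.0]
prediction = hme_predict(
    x,
    top_gate=[zero, zero],
    sub_gates=[[zero, zero], [zero, zero]],
    experts=[[[1.0, 0.0], [2.0, 0.0]], [[3.0, 0.0], [4.0, 0.0]]],
)
print(prediction)  # 2.5
```

In this sketch the gating values along each root-to-leaf path multiply, so the leaf weights always sum to one.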
Abstract:
Q. Meng and M.H. Lee, 'Error-driven active learning in growing radial basis function networks for early robot learning', 2006 IEEE International Conference on Robotics and Automation (IEEE ICRA 2006), 2984-90, Orlando, Florida, USA.
Abstract:
The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood is monotonically increasing when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to the IRLS algorithm on some simulated and real data sets.
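The idea of an IRLS update scaled by a learning rate smaller than one can be sketched as follows. This is a simplified illustration only: it assumes binary logistic regression with one feature and a bias rather than the multiclass ME setting of the paper, and the names `damped_irls_step`, `fit`, and `rate` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Bernoulli log likelihood of labels ys under a 1-D logistic model."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll

def damped_irls_step(w, b, xs, ys, rate=0.5):
    """One IRLS (Newton) step for 1-D logistic regression, scaled by `rate`."""
    gw = gb = 0.0            # gradient of the log likelihood
    hww = hwb = hbb = 0.0    # entries of the negative Hessian, X^T R X
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        r = p * (1.0 - p)
        gw += (y - p) * x
        gb += (y - p)
        hww += r * x * x
        hwb += r * x
        hbb += r
    det = hww * hbb - hwb * hwb
    # Solve the 2x2 system H * delta = g by Cramer's rule.
    dw = (gw * hbb - gb * hwb) / det
    db = (hww * gb - hwb * gw) / det
    return w + rate * dw, b + rate * db

def fit(xs, ys, steps=10, rate=0.5):
    """Run damped IRLS from zero and record the log likelihood after each step."""
    w, b = 0.0, 0.0
    history = [log_likelihood(w, b, xs, ys)]
    for _ in range(steps):
        w, b = damped_irls_step(w, b, xs, ys, rate)
        history.append(log_likelihood(w, b, xs, ys))
    return w, b, history

# Hypothetical, non-separable toy data.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
w, b, history = fit(xs, ys)
print(history[0], history[-1])
```

Inspecting `history` shows whether the log likelihood rises from step to step for a given `rate`. The sketch does not reproduce the paper's experiments or its ECM alternative.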
Abstract:
Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach on mixture of experts (ME) system for on-line prediction of LOS. The use of a batch-mode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their patterns change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning.
The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD
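
The two evaluation measures named in this abstract can be computed as in the sketch below. The function names and the sample values are hypothetical illustrations, not data from the study, and Prop(MAD < 1) is read here as the share of predictions whose absolute difference from the actual LOS is below one day.

```python
def mean_absolute_difference(predicted, actual):
    """Mean absolute difference (MAD) between predicted and actual LOS, in days."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def proportion_within(predicted, actual, threshold=1.0):
    """Share of predictions whose absolute error is below `threshold` days."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) < threshold)
    return hits / len(actual)

# Hypothetical LOS values in days, for illustration only.
predicted = [2.0, 3.5, 1.0, 6.0]
actual = [2.5, 3.0, 4.0, 6.5]
print(mean_absolute_difference(predicted, actual))  # 1.125
print(proportion_within(predicted, actual))         # 0.75
```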
Abstract:
To improve the spatial distribution of nanoparticles in a polymeric host and to enhance the interfacial interaction with the host, the use of chain-end grafted nanoparticles has gained popularity in the field of polymeric nanocomposites. Besides changing the material properties of the host, these grafted nanoparticles strongly alter the dynamics of the polymer chain at both local and cooperative length scales (relaxations) by manipulating the enthalpic and entropic interactions. It is difficult to map the distribution of these chain-end grafted nanoparticles in the blend by conventional techniques, and herein, we attempted to characterize it by unique techniques such as peak force quantitative nanomechanical mapping (PFQNM) through AFM (atomic force microscopy) imaging and dielectric relaxation spectroscopy (DRS). Such techniques, besides shedding light on the spatial distribution of the nanoparticles, also give critical information on the changing elasticity at smaller length scales and hierarchical polymer chain dynamics in the vicinity of the nanoparticles. The effect of one-dimensional rodlike multiwall carbon nanotubes (MWNTs), with the characteristic dimension of the order of the radius of gyration of the polymeric chain, on the phase miscibility and chain dynamics in a classical LCST mixture of polystyrene/poly(vinyl methyl ether) (PS/PVME) was examined in detail using the above techniques. In order to tune the localization of the nanotubes, different molecular weights of PS (13, 31, and 46 kDa), synthesized using RAFT (reversible addition fragmentation chain transfer) polymerization, were grafted onto MWNTs in situ. The thermodynamic miscibility in the blends was assessed by low-amplitude isochronal temperature sweeps, the spatial distribution of MWNTs in the blends was evaluated by PFQNM, and the hierarchical polymer chain dynamics was studied by DRS.
It was observed that the miscibility, concentration fluctuation, and cooperative relaxations of the PS/PVME blends are strongly governed by the spatial distribution of MWNTs in the blends. These findings should help guide theories and simulations of hierarchical chain dynamics in LCST mixtures containing rodlike nanoparticles.
Abstract:
Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data.
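As a rough illustration of the basic building block, the sketch below draws weights from a single-level, truncated stick-breaking process. It assumes Beta(1, alpha) break fractions, as in the standard Dirichlet-process construction. The nested construction over trees described in the abstract is not reproduced here, and the names `stick_breaking_weights`, `alpha`, and `num_sticks` are hypothetical.

```python
import random

def stick_breaking_weights(alpha, num_sticks, rng):
    """Draw the first `num_sticks` weights of a stick-breaking process.

    Each break takes a Beta(1, alpha) fraction of whatever stick remains.
    Returns the weights and the length of the unbroken remainder.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining

rng = random.Random(0)
weights, remainder = stick_breaking_weights(alpha=2.0, num_sticks=10, rng=rng)
print(weights, remainder)
```

The weights plus the remainder always sum to one, which is what lets such a process define mixture proportions over an unbounded number of components.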
Abstract:
The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs (1993) recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architecture of Jordan and Jacobs (1992). They showed empirically that the EM algorithm for these architectures yields significantly faster convergence than gradient ascent. In the current paper we provide a theoretical analysis of this algorithm. We show that the algorithm can be regarded as a variable metric algorithm with its searching direction having a positive projection on the gradient of the log likelihood. We also analyze the convergence of the algorithm and provide an explicit expression for the convergence rate. In addition, we describe an acceleration technique that yields a significant speedup in simulation experiments.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for nonlinear regression models under scale mixtures of skew-normal distributions. This novel class of models provides a useful generalization of the symmetrical nonlinear regression models since the error distributions cover both skewness and heavy-tailed distributions such as the skew-t, skew-slash and the skew-contaminated normal distributions. The main advantage of this class of distributions is that it has a nice hierarchical representation that allows the implementation of Markov chain Monte Carlo (MCMC) methods to simulate samples from the joint posterior distribution. In order to examine the robust aspects of this flexible class against outlying and influential observations, we present Bayesian case deletion influence diagnostics based on the Kullback-Leibler divergence. Further, some discussions on the model selection criteria are given. The newly developed procedures are illustrated considering two simulation studies and a real data set previously analyzed under normal and skew-normal nonlinear regression models. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
The scale mixtures of skew-normal (SMSN) distributions form a class of asymmetric thick-tailed distributions that includes the skew-normal (SN) distribution as a special case. The main advantage of this class of distributions is that its members are easy to simulate and have a nice hierarchical representation facilitating easy implementation of the expectation-maximization algorithm for the maximum-likelihood estimation. In this paper, we assume an SMSN distribution for the unobserved value of the covariates and a symmetric scale mixture of normal distributions for the error term of the model. This provides a robust alternative to parameter estimation in multivariate measurement error models. Specific distributions examined include univariate and multivariate versions of the SN, skew-t, skew-slash and skew-contaminated normal distributions. The results and methods are applied to a real data set.
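The ease of simulation mentioned above can be illustrated with a commonly used stochastic representation of the univariate skew-normal and skew-t distributions. The sketch assumes the parameterization with delta in (-1, 1) and a Gamma(nu/2, rate nu/2) mixing variable. Whether the paper uses exactly this parameterization is an assumption, and the function names are hypothetical.

```python
import math
import random

def sample_skew_normal(delta, rng):
    """Standard skew-normal draw via Z = delta*|T0| + sqrt(1 - delta^2)*T1,
    with T0 and T1 independent standard normals."""
    t0 = abs(rng.gauss(0.0, 1.0))
    t1 = rng.gauss(0.0, 1.0)
    return delta * t0 + math.sqrt(1.0 - delta * delta) * t1

def sample_skew_t(mu, sigma, delta, nu, rng):
    """Skew-t draw as a scale mixture: Y = mu + sigma * Z / sqrt(U),
    where U ~ Gamma(shape nu/2, rate nu/2) and Z is standard skew-normal."""
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)  # gammavariate takes (shape, scale)
    return mu + sigma * sample_skew_normal(delta, rng) / math.sqrt(u)

rng = random.Random(1)
sn_draws = [sample_skew_normal(0.8, rng) for _ in range(20000)]
st_draws = [sample_skew_t(0.0, 1.0, 0.8, 5.0, rng) for _ in range(5)]
# The skew-normal sample mean should be close to delta * sqrt(2 / pi).
print(sum(sn_draws) / len(sn_draws), 0.8 * math.sqrt(2.0 / math.pi))
print(st_draws)
```

Conditioning on the latent variables `t0` and `u` is what gives the hierarchical form that EM and MCMC schemes typically exploit.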
Abstract:
With global warming becoming one of the main problems our society is facing nowadays, there is an urgent demand to develop materials suitable for CO2 storage as well as for gas separation. Within this context, hierarchical porous structures are of great interest for in-flow applications because of the desirable combination of an extensive internal reactive surface along narrow nanopores with facile molecular transport through broad “highways” leading to and from these pores. Deep eutectic solvents (DESs) have been recently used in the synthesis of carbon monoliths exhibiting a bicontinuous porous structure composed of continuous macroporous channels and a continuous carbon network that contains a certain microporosity and provides considerable surface area. In this work, we have prepared two DESs for the preparation of two hierarchical carbon monoliths with different compositions (e.g., either nitrogen-doped or not) and structure. It is worth noting that DESs played a capital role in the synthesis of hierarchical carbon monoliths, not only promoting the spinodal decomposition that governs the formation of the bicontinuous porous structure but also providing the precursors required to tailor the composition and the molecular sieve structure of the resulting carbons. We have studied the performance of these two carbons for CO2, N2, and CH4 adsorption in both monolithic and powdered form. We have also studied the selective adsorption of CO2 versus CH4 in equilibrium and dynamic conditions. We found that these materials combined a high CO2-sorption capacity with excellent CO2/N2 and CO2/CH4 selectivity and, interestingly, this performance was preserved when processed in both monolithic and powdered form.
Abstract:
Hierarchical porous carbon materials prepared by the direct carbonization of lignin/zeolite mixtures and the subsequent basic etching of the inorganic template have been electrochemically characterized in acidic media. These lignin-based templated carbons have interesting surface chemistry features, such as a variety of surface oxygen groups and also pyridone and pyridinic groups, which result in a high capacitance enhancement compared to petroleum-pitch-based carbons obtained by the same procedure. Furthermore, they are easily electro-oxidized in a sulfuric acid electrolyte under positive polarization to produce a large amount of surface oxygen groups that boost the pseudocapacitance. The lignin-based templated carbons showed a specific capacitance as high as 250 F g−1 at 50 mA g−1, with a capacitance retention of 50 % and volumetric capacitance of 75 F cm−3 at current densities higher than 20 A g−1 thanks to their suitable porous texture. These results indicate the potential use of inexpensive biomass byproducts, such as lignin, as carbon precursors in the production of hierarchical carbon materials for electrodes in electrochemical capacitors.
Abstract:
Bayesian methods offer a flexible and convenient probabilistic learning framework to extract interpretable knowledge from complex and structured data. Such methods can characterize dependencies among multiple levels of hidden variables and share statistical strength across heterogeneous sources. In the first part of this dissertation, we develop two dependent variational inference methods for full posterior approximation in non-conjugate Bayesian models through hierarchical mixture- and copula-based variational proposals, respectively. The proposed methods move beyond the widely used factorized approximation to the posterior and provide generic applicability to a broad class of probabilistic models with minimal model-specific derivations. In the second part of this dissertation, we design probabilistic graphical models to accommodate multimodal data, describe dynamical behaviors and account for task heterogeneity. In particular, the sparse latent factor model is able to reveal common low-dimensional structures from high-dimensional data. We demonstrate the effectiveness of the proposed statistical learning methods on both synthetic and real-world data.
Abstract:
This paper presents a preliminary study on the dielectric properties and curing of three different types of epoxy resins mixed at various stoichiometric mixtures of hardener, flydust and aluminium powder under microwave energy. In this work, the curing process of thin layers of epoxy resins using microwave radiation was investigated as an alternative technique that can be implemented to develop a new rapid product development technique. In this study it was observed that the curing time and temperature were a function of the percentage of hardener and fillers present in the epoxy resins. Initially, the dielectric properties of epoxy resins with hardener were measured and directly correlated to the curing process in order to understand the properties of the cured specimens. Tensile tests were conducted on the three different types of epoxy resins with hardener and fillers. By modifying the dielectric properties of the mixtures, a significant decrease in curing time was observed. In order to study the microstructural changes of the cured specimens, the morphology of the fracture surface was examined using scanning electron microscopy.