Digital Library

158 results for MIXTURE-MODELS

Fitting mixtures of Kent distributions to aid in joint set identification

Relevance: 60.00%

Abstract:

When examining a rock mass, joint sets and their orientations can play a significant role in how the rock mass will behave. To identify joint sets present in the rock mass, the orientation of individual fracture planes can be measured on exposed rock faces and the resulting data can be examined for heterogeneity. In this article, the expectation-maximization algorithm is used to fit mixtures of Kent component distributions to the fracture data to aid in the identification of joint sets. An additional uniform component is also included in the model to accommodate the noise present in the data.

Cluster analysis of high-dimensional data: A case study

Relevance: 60.00%

Abstract:

Normal mixture models are often used to cluster continuous data. However, conventional approaches for fitting these models will have problems in producing nonsingular estimates of the component-covariance matrices when the dimension of the observations is large relative to the number of observations. In this case, methods such as principal components analysis (PCA) and the mixture of factor analyzers model can be adopted to avoid these estimation problems. We examine these approaches applied to the Cabernet wine data set of Ashenfelter (1999), considering the clustering of both the wines and the judges, and comparing our results with another analysis. The mixture of factor analyzers model proves particularly effective in clustering the wines, accurately classifying many of the wines by location.

Issues of robustness and high dimensionality in cluster analysis

Relevance: 60.00%

Abstract:

Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena. While normal mixture models are often used to cluster data sets of continuous multivariate data, a more robust clustering can be obtained by considering the t mixture model-based approach. Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data where the number of observations n is not very large relative to their dimension p. As the approach using the multivariate normal family of distributions is sensitive to outliers, it is more robust to adopt the multivariate t family for the component error and factor distributions. The computational aspects associated with robustness and high dimensionality in these approaches to cluster analysis are discussed and illustrated.

Heterogeneity in schizophrenia; mixture modelling of age-at-first-admission, gender and diagnosis

Relevance: 30.00%

Modelling the Distribution of Ischaemic Stroke-Specific Survival Time Using an EM-based Mixture Approach with Random Effects Adjustment

Relevance: 30.00%

Abstract:

A two-component survival mixture model is proposed to analyse a set of ischaemic stroke-specific mortality data. The survival experience of stroke patients after index stroke may be described by a subpopulation of patients in the acute condition and another subpopulation of patients in the chronic phase. To adjust for the inherent correlation of observations due to random hospital effects, a mixture model of two survival functions with random effects is formulated. Assuming a Weibull hazard in both components, an EM algorithm is developed for the estimation of fixed effect parameters and variance components. A simulation study is conducted to assess the performance of the two-component survival mixture model estimators. Simulation results confirm the applicability of the proposed model in a small sample setting. Copyright (C) 2004 John Wiley & Sons, Ltd.

Long-term survivor mixture model with random effects: application to a multi-centre clinical trial of carcinoma

Relevance: 30.00%

Abstract:

A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survival data are obtained from a multi-centre clinical trial, it is conceivable that the environmental conditions and facilities shared within a clinic affect the proportion cured as well as the failure risk for the uncured individuals. This necessitates a long-term survivor mixture model with random effects. In this paper, the long-term survivor mixture model is extended for the analysis of multivariate failure time data using the generalized linear mixed model (GLMM) approach. The proposed model is applied to analyse a numerical data set from a multi-centre clinical trial of carcinoma as an illustration. Some simulation experiments are performed to assess the applicability of the model based on the average biases of the estimates formed. Copyright (C) 2001 John Wiley & Sons, Ltd.
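
The long-term survivor idea in this abstract can be sketched as a population survival function that levels off at the cured fraction. The snippet below is only an illustration: the Weibull form for the uncured group, the function names, and the parameter names are assumptions, and the random clinic effects described in the abstract are left out.

```python
import math


def weibull_survival(t, shape, scale):
    """Weibull survival function, used here for uncured individuals (an assumption)."""
    return math.exp(-((t / scale) ** shape))


def cure_model_survival(t, cured_fraction, shape, scale):
    """Population survival: cured individuals never fail, the rest follow the Weibull."""
    return cured_fraction + (1.0 - cured_fraction) * weibull_survival(t, shape, scale)
```

For large t the population survival tends to `cured_fraction` rather than zero, which is the feature that separates this model from an ordinary survival model.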

Models involving two inverse Weibull distributions

Relevance: 30.00%

Abstract:

In this paper, we look at three models (mixture, competing risk and multiplicative) involving two inverse Weibull distributions. We study the shapes of the density and failure-rate functions and discuss graphical methods to determine if a given data set can be modelled by one of these models. (C) 2001 Elsevier Science Ltd. All rights reserved.
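
For orientation, the three constructions named in this abstract are commonly written in terms of two component distribution functions. The block below is a sketch under an assumed parameterization (scale \alpha_i, shape \beta_i, mixing weight p); the notation is not taken from the paper.

```latex
% Inverse Weibull distribution function for component i (assumed parameterization)
F_i(t) = \exp\!\left[-\left(\frac{\alpha_i}{t}\right)^{\beta_i}\right],
\qquad t > 0,\; i = 1, 2.

% Mixture model, with mixing weight 0 < p < 1
G(t) = p\,F_1(t) + (1 - p)\,F_2(t)

% Competing risk model
G(t) = 1 - \bigl[1 - F_1(t)\bigr]\bigl[1 - F_2(t)\bigr]

% Multiplicative model
G(t) = F_1(t)\,F_2(t)
```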

Maximum Likelihood Estimation of Mixture Densities for Binned and Truncated Multivariate Data

Relevance: 30.00%

Abstract:

Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.

Finite mixture regression model with random effects: application to neonatal hospital length of stay

Relevance: 30.00%

Abstract:

A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of the generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMM. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.

Modelling inpatient length of stay by a hierarchical mixture regression via the EM algorithm

Relevance: 30.00%

Abstract:

The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, due to a certain proportion of patients sustaining a longer stay. However, because the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimation. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. (C) 2003 Elsevier Science Ltd. All rights reserved.

Impact of potential models on adsorption of linear molecules on carbon black

Relevance: 30.00%

Abstract:

In this paper, we investigate the effects of potential models on the description of the adsorption equilibria of linear molecules (ethylene and ethane) on graphitized thermal carbon black. Grand canonical Monte Carlo (GCMC) simulation is used as a tool to give adsorption isotherms, isosteric heat of adsorption and the microscopic configurations of these molecules. At the heart of the GCMC are the potential models, describing fluid-fluid interaction and solid-fluid interaction. Here we study two potential models recently proposed in the literature, UA-TraPPE and AUA4. Their impact on the description of the adsorption behavior of pure components is discussed. Mixtures of these components with nitrogen and argon are also studied. Nitrogen is modeled as a two-site molecule plus discrete charges, while argon is modeled as a spherical particle. GCMC simulation is also used for generating mixture isotherms. It is found that co-operation between species occurs when the surface is fractionally covered, while competition is important when the surface is fully loaded.

A simple implementation of a normal mixture approach to differential gene expression in multiclass microarrays

Relevance: 30.00%

Abstract:

Motivation: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have some limitations due to the minimal assumptions made or with more specific assumptions are computationally intensive. Results: By converting to a z-score the value of the test statistic used to test the significance of each gene, we propose a simple two-component normal mixture that models adequately the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
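
The two-component framework described in this abstract can be illustrated with a small EM routine on z-scores. This is a minimal sketch, not the authors' implementation: it assumes a fixed theoretical N(0, 1) null, a single normal non-null component, and hypothetical function names and starting values.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    u = (x - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_null(z, pi0, mu1, sigma1):
    """Posterior probability that each z-score comes from the N(0, 1) null."""
    tau0 = []
    for x in z:
        f0 = pi0 * normal_pdf(x, 0.0, 1.0)
        f1 = (1.0 - pi0) * normal_pdf(x, mu1, sigma1)
        tau0.append(f0 / (f0 + f1))
    return tau0


def fit_null_mixture(z, n_iter=300, pi0=0.9, mu1=1.0, sigma1=1.0):
    """EM for pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2); the null is held fixed."""
    for _ in range(n_iter):
        # E-step: weight of each score in the non-null component.
        w1 = [1.0 - t for t in posterior_null(z, pi0, mu1, sigma1)]
        s1 = sum(w1)
        # M-step: update the null proportion and the non-null mean and spread.
        pi0 = 1.0 - s1 / len(z)
        mu1 = sum(w * x for w, x in zip(w1, z)) / s1
        var1 = sum(w * (x - mu1) ** 2 for w, x in zip(w1, z)) / s1
        sigma1 = max(math.sqrt(var1), 1e-3)  # guard against a collapsing component
    return pi0, mu1, sigma1
```

After fitting, `posterior_null(z, pi0, mu1, sigma1)` gives one value per gene; small values flag genes that are unlikely to be null.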

Testing predictions of macroscopic binary diffusion coefficients using lattice models with site heterogeneity

Relevance: 30.00%

Abstract:

Quantitatively predicting mass transport rates for chemical mixtures in porous materials is important in applications of materials such as adsorbents, membranes, and catalysts. Because directly assessing mixture transport experimentally is challenging, theoretical models that can predict mixture diffusion coefficients using only single-component information would have many uses. One such model was proposed by Skoulidas, Sholl, and Krishna (Langmuir, 2003, 19, 7977), and applications of this model to a variety of chemical mixtures in nanoporous materials have yielded promising results. In this paper, the accuracy of this model for predicting mixture diffusion coefficients in materials that exhibit a heterogeneous distribution of local binding energies is examined. To examine this issue, single-component and binary mixture diffusion coefficients are computed using kinetic Monte Carlo for a two-dimensional lattice model over a wide range of lattice occupancies and compositions. The approach suggested by Skoulidas, Sholl, and Krishna is found to be accurate in situations where the spatial distribution of binding site energies is relatively homogeneous, but is considerably less accurate for strongly heterogeneous energy distributions.

Performance analysis using equivariant kernel density estimator in nonlinear mixture

Relevance: 30.00%

Abstract:

This paper analyses the performance of separating mutually independent sources in nonlinear models. Nonlinear mappings constituted by an unsupervised linear mixture followed by an unknown and invertible nonlinear distortion are found in many signal processing cases. Generally, blind separation of sources from their nonlinear mixtures is rather difficult. We propose using a kernel density estimator combined with equivariant gradient analysis to separate the sources with nonlinear distortion. The parameters of the kernel density estimator are iteratively updated to minimize the output dependence, expressed as a mutual information criterion. The equivariant gradient algorithm has the form of nonlinear decorrelation, which is used to perform the convergence analysis. Experiments are presented to illustrate these results.

Defining a role for the subthalamic nucleus within operative theoretical models of subcortical participation in language

Relevance: 20.00%

Abstract:

Objective: To investigate the effects of bilateral, surgically induced functional inhibition of the subthalamic nucleus (STN) on general language, high level linguistic abilities, and semantic processing skills in a group of patients with Parkinson’s disease. Methods: Comprehensive linguistic profiles were obtained up to one month before and three months after bilateral implantation of electrodes in the STN during active deep brain stimulation (DBS) in five subjects with Parkinson’s disease (mean age, 63.2 years). Equivalent linguistic profiles were generated over a three month period for a non-surgical control cohort of 16 subjects with Parkinson’s disease (NSPD) (mean age, 64.4 years). Education and disease duration were similar in the two groups. Initial assessment and three month follow up performance profiles were compared within subjects by paired t tests. Reliability change indices (RCI), representing clinically significant alterations in performance over time, were calculated for each of the assessment scores achieved by the five STN-DBS cases and the 16 NSPD controls, relative to performance variability within a group of 16 non-neurologically impaired adults (mean age, 61.9 years). Proportions of reliable change were then compared between the STN-DBS and NSPD groups. Results: Paired comparisons within the STN-DBS group showed prolonged postoperative semantic processing reaction times for a range of word types coded for meanings and meaning relatedness. Case by case analyses of reliable change across language assessments and groups revealed differences in proportions of change over time within the STN-DBS and NSPD groups in the domains of high level linguistics and semantic processing. Specifically, when compared with the NSPD group, the STN-DBS group showed a proportionally significant (p
