24 results for BTEX mixture
Finite mixture regression model with random effects: application to neonatal hospital length of stay
Abstract:
A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of a generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with component densities from the exponential family, leading to the class of finite mixture GLMMs. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.
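A minimal sketch of the fixed-effects special case (an EM algorithm for a two-component Gaussian mixture of regressions) may help fix ideas; the random effects, the BLUP-type log-likelihood and the REML refinements described above are omitted, and all names are illustrative rather than taken from the paper. Python with NumPy:

    import numpy as np

    def em_mixture_regression(X, y, n_iter=200, tol=1e-8, seed=0):
        # EM for a two-component Gaussian mixture of regressions (sketch).
        rng = np.random.default_rng(seed)
        n, p = X.shape
        R = rng.uniform(size=(n, 1))
        R = np.hstack([R, 1 - R])              # n x 2 soft component memberships
        beta, sigma2 = np.zeros((2, p)), np.ones(2)
        ll_old = -np.inf
        for _ in range(n_iter):
            # M-step: weighted least squares and variance update per component
            for k in range(2):
                w = R[:, k]
                Xw = X * w[:, None]
                beta[k] = np.linalg.solve(Xw.T @ X, Xw.T @ y)
                resid = y - X @ beta[k]
                sigma2[k] = (w * resid**2).sum() / w.sum()
            mix = R.mean(axis=0)               # mixing proportions
            # E-step: posterior membership probabilities via Bayes' rule
            dens = np.empty((n, 2))
            for k in range(2):
                resid = y - X @ beta[k]
                dens[:, k] = (mix[k] * np.exp(-0.5 * resid**2 / sigma2[k])
                              / np.sqrt(2 * np.pi * sigma2[k]))
            total = dens.sum(axis=1)
            R = dens / total[:, None]
            ll = np.log(total).sum()           # observed-data log-likelihood
            if ll - ll_old < tol:
                break
            ll_old = ll
        return mix, beta, sigma2, R

    # Illustrative usage on simulated data: a 30/70 mix of two regression lines.
    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(500), rng.normal(size=500)])
    in_first = rng.uniform(size=500) < 0.3
    y = np.where(in_first, 5.0 + 2.0 * X[:, 1], 1.0 - X[:, 1]) + rng.normal(scale=0.5, size=500)
    mix, beta, sigma2, R = em_mixture_regression(X, y)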
Abstract:
The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, because a certain proportion of patients sustain a longer stay. However, since the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimates. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. (C) 2003 Elsevier Science Ltd. All rights reserved.
Abstract:
We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question of the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.
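A hedged sketch of this resampling assessment, using scikit-learn's GaussianMixture rather than the authors' implementation (function names are illustrative): fit g- and (g+1)-component mixtures, form the likelihood ratio statistic, then simulate its null distribution by parametric bootstrap from the g-component fit.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def lrt_stat(X, g):
        # -2 log LR comparing g- and (g+1)-component normal mixtures
        ll_g = GaussianMixture(g, n_init=5, random_state=0).fit(X).score(X) * len(X)
        ll_g1 = GaussianMixture(g + 1, n_init=5, random_state=0).fit(X).score(X) * len(X)
        return -2.0 * (ll_g - ll_g1)

    def bootstrap_pvalue(X, g, n_boot=99, seed=1):
        # Null distribution of the statistic, assessed by parametric bootstrap
        null_fit = GaussianMixture(g, n_init=5, random_state=0).fit(X)
        null_fit.random_state = np.random.RandomState(seed)  # stateful RNG so successive .sample() calls differ
        observed = lrt_stat(X, g)
        stats = [lrt_stat(null_fit.sample(len(X))[0], g) for _ in range(n_boot)]
        return (1 + sum(s >= observed for s in stats)) / (n_boot + 1)

The bootstrap reference distribution is used because the usual chi-squared asymptotics for the likelihood ratio statistic break down for tests on the number of mixture components.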
Abstract:
Mixture models implemented via the expectation-maximization (EM) algorithm are being increasingly used in a wide range of problems in pattern recognition such as image segmentation. However, the EM algorithm requires considerable computational time in its application to huge data sets such as a three-dimensional magnetic resonance (MR) image of over 10 million voxels. Recently, it was shown that a sparse, incremental version of the EM algorithm could improve its rate of convergence. In this paper, we show how this modified EM algorithm can be speeded up further by adopting a multiresolution kd-tree structure in performing the E-step. The proposed algorithm outperforms some other variants of the EM algorithm for segmenting MR images of the human brain. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
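As a minimal illustration of the underlying segmentation step (omitting both the sparse incremental E-step and the multiresolution kd-tree acceleration that this paper develops), voxel intensities can be clustered with a plain Gaussian mixture; names are illustrative:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def segment_volume(volume, n_tissues=3):
        # Flatten the 3-D volume to one intensity row per voxel, fit by EM,
        # then return the MAP tissue label per voxel in the original shape.
        v = volume.reshape(-1, 1).astype(float)
        gm = GaussianMixture(n_components=n_tissues, random_state=0).fit(v)
        return gm.predict(v).reshape(volume.shape)

Roughly, the kd-tree variant gains its speed by evaluating the E-step on groups of voxels summarised at tree nodes instead of on every voxel individually.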
Abstract:
Cluster analysis via a finite mixture model approach is considered. With this approach to clustering, the data can be partitioned into a specified number of clusters g by first fitting a mixture model with g components. An outright clustering of the data is then obtained by assigning an observation to the component to which it has the highest estimated posterior probability of belonging; that is, the ith cluster consists of those observations assigned to the ith component (i = 1,..., g). The focus is on the use of mixtures of normal components for the cluster analysis of data that can be regarded as being continuous. But attention is also given to the case of mixed data, where the observations consist of both continuous and discrete variables.
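The assignment rule described here is short in code; the sketch below makes the posterior probability (Bayes' rule) computation explicit, although GaussianMixture's predict and predict_proba return the same quantities directly:

    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.mixture import GaussianMixture

    def map_clustering(X, g):
        # Fit a g-component normal mixture to the n x d data matrix X.
        gm = GaussianMixture(n_components=g, random_state=0).fit(X)
        # tau[i, k] = pi_k * f_k(x_i) / sum_j pi_j * f_j(x_i)
        dens = np.column_stack([
            gm.weights_[k] * multivariate_normal.pdf(X, gm.means_[k], gm.covariances_[k])
            for k in range(g)
        ])
        tau = dens / dens.sum(axis=1, keepdims=True)
        return tau.argmax(axis=1), tau        # hard labels and posterior probabilities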
Abstract:
Adsorption of pure nitrogen, argon, acetone and chloroform, and of an acetone-chloroform mixture, on graphitized thermal carbon black is considered at sub-critical conditions by means of molecular layer structure theory (MLST). In the present version of the MLST an adsorbed fluid is considered as a sequence of 2D molecular layers, whose Helmholtz free energies are obtained directly from the analysis of experimental adsorption isotherms of the pure components. The interaction of the nearest layers is accounted for in the framework of a mean field approximation. This approach allows quantitative correlation of the experimental nitrogen and argon adsorption isotherms both in the monolayer region and in the range of multi-layer coverage up to 10 molecular layers. In the case of acetone and chloroform the approach also leads to excellent quantitative correlation of the adsorption isotherms, while molecular approaches such as the non-local density functional theory (NLDFT) fail to describe those isotherms. We extend our new method to calculate the Helmholtz free energy of an adsorbed mixture using a simple mixing rule, and this allows us to predict mixture adsorption isotherms from pure component adsorption isotherms. The approach, which accounts for the difference in composition in different molecular layers, is tested against the experimental data of acetone-chloroform mixture (a non-ideal mixture) adsorption on graphitized thermal carbon black at 50 °C. (C) 2005 Elsevier Ltd. All rights reserved.
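The abstract does not state the mixing rule. Purely for orientation, and as an assumption rather than the authors' formula, an ideal-solution rule for the Helmholtz free energy of molecular layer j would take the form

    F_j^{\mathrm{mix}} = x_{1,j} F_{1,j} + x_{2,j} F_{2,j} + RT \left( x_{1,j} \ln x_{1,j} + x_{2,j} \ln x_{2,j} \right),

where x_{i,j} is the mole fraction of component i in layer j and F_{i,j} the corresponding pure-component layer free energy; layer-by-layer compositions of this kind are what let the method account for composition differences between layers.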
Abstract:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
Abstract:
An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
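One standard way to implement the bounding step, sketched under the usual identity that the global FDR of a selected set is estimated by the average of its local FDR values (not the paper's code):

    import numpy as np

    def select_genes(local_fdr, alpha=0.05):
        # Take the genes with the smallest local FDR values whose running
        # average (the estimated global FDR of the selection) stays below alpha.
        order = np.argsort(local_fdr)
        running_avg = np.cumsum(local_fdr[order]) / np.arange(1, local_fdr.size + 1)
        n_select = int((running_avg <= alpha).sum())
        selected = np.zeros(local_fdr.size, dtype=bool)
        selected[order[:n_select]] = True
        return selected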
Abstract:
Motivation: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have limitations because of the minimal assumptions made, or, under more specific assumptions, are computationally intensive. Results: By converting the value of the test statistic used to test the significance of each gene to a z-score, we propose a simple two-component normal mixture that adequately models the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
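A sketch of the two-component normal mixture on z-scores (the paper's exact parameterisation may differ; here the fitted component whose mean is nearest zero is taken to be the null):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def posterior_null_prob(z):
        # Fit a two-component normal mixture to the z-scores and return, per gene,
        # the posterior probability of belonging to the null component.
        z = np.asarray(z, dtype=float).reshape(-1, 1)
        gm = GaussianMixture(n_components=2, random_state=0).fit(z)
        null = int(np.argmin(np.abs(gm.means_.ravel())))
        return gm.predict_proba(z)[:, null]

Genes with small posterior null probabilities (equivalently, small local false discovery rates) are flagged as differentially expressed.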
Abstract:
Motivation: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm, for which the E (expectation) and M (maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is considered too.
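A reduced sketch of how covariate information enters the clustering (omitting the random-effects terms that model correlation between profiles, which are the paper's main contribution): each cluster mean is constrained to X @ beta_k, with X a design matrix over the experimental conditions, for example a polynomial in time for a time-course experiment. Names are illustrative.

    import numpy as np

    def cluster_profiles(Y, X, g, n_iter=100, seed=0):
        # Y: genes x conditions profile matrix; X: conditions x q design matrix.
        # Normal mixture with means X @ beta_k and spherical noise, fitted by EM.
        rng = np.random.default_rng(seed)
        n, m = Y.shape
        R = rng.dirichlet(np.ones(g), size=n)            # soft cluster memberships
        beta = np.zeros((g, X.shape[1]))
        sigma2 = np.ones(g)
        proj = np.linalg.solve(X.T @ X, X.T)             # least-squares projector
        for _ in range(n_iter):
            for k in range(g):                           # M-step, in closed form
                w = R[:, k]
                ybar = (w[:, None] * Y).sum(axis=0) / w.sum()
                beta[k] = proj @ ybar
                resid = Y - X @ beta[k]
                sigma2[k] = (w[:, None] * resid**2).sum() / (w.sum() * m)
            mix = R.mean(axis=0)
            logd = np.empty((n, g))                      # E-step, on the log scale for stability
            for k in range(g):
                resid = Y - X @ beta[k]
                logd[:, k] = (np.log(mix[k]) - 0.5 * m * np.log(2 * np.pi * sigma2[k])
                              - 0.5 * (resid**2).sum(axis=1) / sigma2[k])
            logd -= logd.max(axis=1, keepdims=True)
            R = np.exp(logd)
            R /= R.sum(axis=1, keepdims=True)
        return R.argmax(axis=1), beta, mix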
Abstract:
Computer modelling promises to be an important tool for analysing and predicting interactions between trees within mixed species forest plantations. This study explored the use of an individual-based mechanistic model as a predictive tool for designing mixed species plantations of Australian tropical trees. The 'spatially explicit individually based forest simulator' (SeXI-FS) modelling system was used to describe the spatial interaction of individual tree crowns within a binary mixed-species experiment. The three-dimensional model was developed and verified with field data from three forest tree species grown in tropical Australia. The model predicted the interactions within monocultures and binary mixtures of Flindersia brayleyana, Eucalyptus pellita and Elaeocarpus grandis, accounting for an average of 42% of the growth variation exhibited by species in different treatments. The model requires only structural dimensions and shade tolerance as species parameters. By modelling interactions in existing tree mixtures, the model predicted both increases and reductions in the growth of mixtures (up to +/- 50% of stem volume at 7 years) compared to monocultures. This modelling approach may be useful for designing mixed tree plantations. (c) 2006 Published by Elsevier B.V.
Abstract:
Molecular dynamics simulations have been used to study the phase behavior of a dipalmitoylphosphatidylcholine (DPPC)/palmitic acid (PA)/water 1:2:20 mixture in atomic detail. Starting from a random solution of DPPC and PA in water, the system adopts either a gel phase at temperatures below ~330 K or an inverted hexagonal phase above ~330 K in good agreement with experiment. It has also been possible to observe the direct transformation from a gel to an inverted hexagonal phase at elevated temperature (~390 K). During this transformation, a metastable fluid lamellar intermediate is observed. Interlamellar connections or stalks form spontaneously on a nanosecond time scale and subsequently elongate, leading to the formation of an inverted hexagonal phase. This work opens the possibility of studying in detail how the formation of nonlamellar phases is affected by lipid composition and (fusion) peptides and, thus, is an important step toward understanding related biological processes, such as membrane fusion.
Abstract:
Knowledge of the adsorption behavior of coal-bed gases, mainly under supercritical high-pressure conditions, is important for optimum design of production processes to recover coal-bed methane and to sequester CO2 in coal-beds. Here, we compare the two most rigorous adsorption methods based on the statistical mechanics approach, Density Functional Theory (DFT) and Grand Canonical Monte Carlo (GCMC) simulation, for single components and binary mixtures of methane and carbon dioxide in slit-shaped pores ranging from around 0.75 to 7.5 nm in width, for pressures up to 300 bar and temperatures of 308-348 K, as a preliminary study for the CO2 sequestration problem. For single component adsorption, the isotherms generated by DFT, especially for CO2, do not match well with the GCMC calculations, and simulation is subsequently pursued here to investigate binary mixture adsorption. For binary adsorption, upon increase of pressure, the selectivity of carbon dioxide relative to methane in a binary mixture initially increases to a maximum value, and subsequently drops before attaining a constant value at pressures higher than 300 bar. While the selectivity increases with temperature in the initial pressure-sensitive region, the constant high-pressure value is temperature independent. Optimum selectivity at any temperature is attained at a pressure of 90-100 bar at low bulk mole fractions of CO2, decreasing to approximately 35 bar at high bulk mole fractions. (c) 2005 American Institute of Chemical Engineers.
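For reference, the selectivity discussed here is conventionally defined (the abstract itself does not spell the formula out) as

    S_{\mathrm{CO_2/CH_4}} = \frac{x_{\mathrm{CO_2}} / x_{\mathrm{CH_4}}}{y_{\mathrm{CO_2}} / y_{\mathrm{CH_4}}},

where x and y denote adsorbed-phase and bulk-phase mole fractions, respectively, so S > 1 indicates preferential adsorption of CO2.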