44 results for Mixture Distributions

in University of Queensland eSpace - Australia


Relevance:

70.00%

Publisher:

Abstract:

A two-component mixture regression model that allows simultaneously for heterogeneity and dependency among observations is proposed. By specifying random effects explicitly in the linear predictor of the mixture probability and the mixture components, parameter estimation is achieved by maximising the corresponding best linear unbiased prediction type log-likelihood. Approximate residual maximum likelihood estimates are obtained via an EM algorithm in the manner of a generalised linear mixed model (GLMM). The method can be extended to a g-component mixture regression model with the component density from the exponential family, leading to the development of the class of finite mixture GLMMs. For illustration, the method is applied to analyse neonatal length of stay (LOS). It is shown that identification of pertinent factors that influence hospital LOS can provide important information for health care planning and resource allocation. (C) 2002 Elsevier Science B.V. All rights reserved.

Relevance:

70.00%

Publisher:

Abstract:

The modelling of inpatient length of stay (LOS) has important implications in health care studies. Finite mixture distributions are usually used to model the heterogeneous LOS distribution, due to a certain proportion of patients sustaining a longer stay. However, because the morbidity data are collected from hospitals, observations clustered within the same hospital are often correlated. The generalized linear mixed model approach is adopted to accommodate the inherent correlation via unobservable random effects. An EM algorithm is developed to obtain residual maximum quasi-likelihood estimation. The proposed hierarchical mixture regression approach enables the identification and assessment of factors influencing the long-stay proportion and the LOS for the long-stay patient subgroup. A neonatal LOS data set is used for illustration. (C) 2003 Elsevier Science Ltd. All rights reserved.
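As a rough illustration of the finite mixture idea only, the sketch below fits a two-component univariate normal mixture by plain EM. It is a deliberate simplification: it has no covariates, no hospital-level random effects and no quasi-likelihood, so it is not the hierarchical method of the abstract. The function names, the starting values and the toy data are assumptions made for this example.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_two_component_mixture(data, n_iter=200, min_var=1e-6):
    """Plain EM for a two-component normal mixture (illustrative only)."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, min_var)
    # Crude starting values (an assumption): components start at the extremes.
    weight = 0.5
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        resp = []
        for x in data:
            a = weight * normal_pdf(x, means[0], variances[0])
            b = (1.0 - weight) * normal_pdf(x, means[1], variances[1])
            total = a + b
            resp.append(a / total if total > 0.0 else 0.5)
        # M-step: weighted updates of the mixing proportion, means and variances.
        n1 = sum(resp)
        n2 = n - n1
        if n1 <= 0.0 or n2 <= 0.0:
            break  # one component has emptied; stop rather than divide by zero
        weight = n1 / n
        means = [
            sum(r * x for r, x in zip(resp, data)) / n1,
            sum((1.0 - r) * x for r, x in zip(resp, data)) / n2,
        ]
        variances = [
            max(sum(r * (x - means[0]) ** 2 for r, x in zip(resp, data)) / n1, min_var),
            max(sum((1.0 - r) * (x - means[1]) ** 2 for r, x in zip(resp, data)) / n2, min_var),
        ]
    return weight, means, variances

# Toy data: a short-stay group near 1 and a long-stay group near 5 (made up).
toy = [1.0, 1.1, 0.9, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]
mix_weight, mix_means, mix_vars = fit_two_component_mixture(toy)
```

The variance floor `min_var` is a pragmatic guard against a component collapsing onto a single point; a real analysis would treat that case more carefully.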

Relevance:

30.00%

Publisher:

Abstract:

Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
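One way to see why t components are more robust is the weight that each observation receives when the location is updated. The sketch below uses the standard univariate form of that weight, (dof + 1) / (dof + squared standardized distance), and applies it to a single component with the scale and degrees of freedom held fixed. This is not the ECM algorithm of the paper, which also updates the scale, the degrees of freedom and the mixing proportions; the function names and the data are assumptions for illustration.

```python
def t_weight(x, mean, var, dof):
    """Weight given to x under a univariate t component.

    Points far from the mean (in standardized distance) get small weights,
    which is what limits the influence of atypical observations.
    """
    sq_dist = (x - mean) ** 2 / var
    return (dof + 1.0) / (dof + sq_dist)

def robust_location(data, var=1.0, dof=3.0, n_iter=50):
    """Iteratively reweighted mean for one t component (scale and dof fixed)."""
    mean = sum(data) / len(data)  # start from the ordinary sample mean
    for _ in range(n_iter):
        weights = [t_weight(x, mean, var, dof) for x in data]
        mean = sum(w * x for w, x in zip(weights, data)) / sum(weights)
    return mean

# Five points near zero plus one atypical observation (made-up values).
sample = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0]
plain_mean = sum(sample) / len(sample)
t_mean = robust_location(sample)
```

With a normal component every weight would equal 1, so the atypical point would pull the estimate as hard as any other observation; under the t weighting its influence shrinks as it moves further away.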

Relevance:

30.00%

Publisher:

Abstract:

In this paper, we look at three models (mixture, competing risk and multiplicative) involving two inverse Weibull distributions. We study the shapes of the density and failure-rate functions and discuss graphical methods to determine if a given data set can be modelled by one of these models. (C) 2001 Elsevier Science Ltd. All rights reserved.
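The three ways of combining two distributions can be written down directly in terms of their distribution functions. The sketch below assumes the inverse Weibull parametrisation F(t) = exp(-(scale / t) ** shape) for t > 0, which may differ from the one used in the paper, and uses the usual textbook definitions of the mixture, competing risk and multiplicative models. All names and parameter values are illustrative.

```python
import math

def inverse_weibull_cdf(t, shape, scale):
    """CDF under the assumed parametrisation F(t) = exp(-(scale / t) ** shape)."""
    if t <= 0.0:
        return 0.0
    return math.exp(-((scale / t) ** shape))

def mixture_cdf(t, p, comp1, comp2):
    """Mixture model: F = p * F1 + (1 - p) * F2."""
    return p * inverse_weibull_cdf(t, *comp1) + (1.0 - p) * inverse_weibull_cdf(t, *comp2)

def competing_risk_cdf(t, comp1, comp2):
    """Competing risk model: failure at the earlier of two times, F = 1 - (1 - F1)(1 - F2)."""
    f1 = inverse_weibull_cdf(t, *comp1)
    f2 = inverse_weibull_cdf(t, *comp2)
    return 1.0 - (1.0 - f1) * (1.0 - f2)

def multiplicative_cdf(t, comp1, comp2):
    """Multiplicative model: failure at the later of two times, F = F1 * F2."""
    return inverse_weibull_cdf(t, *comp1) * inverse_weibull_cdf(t, *comp2)

# Each component is a (shape, scale) pair; the values are made up.
first = (1.5, 2.0)
second = (3.0, 5.0)
values_at_3 = (
    multiplicative_cdf(3.0, first, second),
    mixture_cdf(3.0, 0.5, first, second),
    competing_risk_cdf(3.0, first, second),
)
```

For any t the three values are ordered, multiplicative at most mixture at most competing risk, because the product of two probabilities is no larger than either one and the competing risk value is no smaller than either one.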

Relevance:

30.00%

Publisher:

Abstract:

When examining a rock mass, joint sets and their orientations can play a significant role with regard to how the rock mass will behave. To identify joint sets present in the rock mass, the orientation of individual fracture planes can be measured on exposed rock faces and the resulting data can be examined for heterogeneity. In this article, the expectation-maximization algorithm is used to fit mixtures of Kent component distributions to the fracture data to aid in the identification of joint sets. An additional uniform component is also included in the model to accommodate the noise present in the data.

Relevance:

30.00%

Publisher:

Abstract:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.

Relevance:

20.00%

Publisher:

Abstract:

A large number of models have been derived from the two-parameter Weibull distribution and are referred to as Weibull models. They exhibit a wide range of shapes for the density and hazard functions, which makes them suitable for modelling complex failure data sets. The WPP and IWPP plots allow one to determine in a systematic manner whether one or more of these models are suitable for modelling a given data set. This paper deals with this topic.
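The idea behind such probability plots is that a suitable transformation turns the distribution function of the reference model into a straight line, so departures from a line suggest a more complex model. The sketch below assumes the common definitions: for the Weibull probability plot (WPP), x = ln t and y = ln(-ln(1 - F(t))); for the inverse Weibull probability plot (IWPP), x = ln t and y = -ln(-ln F(t)). The parametrisations and function names are assumptions and may not match the paper's notation.

```python
import math

def weibull_cdf(t, shape, scale):
    """Two-parameter Weibull CDF, F(t) = 1 - exp(-(t / scale) ** shape)."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def inverse_weibull_cdf(t, shape, scale):
    """Inverse Weibull CDF under the assumed form F(t) = exp(-(scale / t) ** shape)."""
    return math.exp(-((scale / t) ** shape))

def wpp_point(t, cdf_value):
    """WPP coordinates: a Weibull CDF maps to y = shape * (x - ln(scale))."""
    return math.log(t), math.log(-math.log(1.0 - cdf_value))

def iwpp_point(t, cdf_value):
    """IWPP coordinates: an inverse Weibull CDF maps to y = shape * (x - ln(scale))."""
    return math.log(t), -math.log(-math.log(cdf_value))

def slope(p, q):
    """Slope of the line through two plotted points."""
    return (q[1] - p[1]) / (q[0] - p[0])

# With exact CDF values the plotted points are collinear and the slope is the shape.
wpp_slope = slope(wpp_point(1.0, weibull_cdf(1.0, 2.0, 3.0)),
                  wpp_point(4.0, weibull_cdf(4.0, 2.0, 3.0)))
iwpp_slope = slope(iwpp_point(1.0, inverse_weibull_cdf(1.0, 2.0, 3.0)),
                   iwpp_point(4.0, inverse_weibull_cdf(4.0, 2.0, 3.0)))
```

In practice F(t) is unknown and is replaced by an empirical estimate computed from the ordered failure times; the choice of estimator is left open here.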

Relevance:

20.00%

Publisher:

Abstract:

Potential errors in the application of mixture theory to the analysis of multiple-frequency bioelectrical impedance data for the determination of body fluid volumes are assessed. Potential sources of error include conductive length, tissue fluid resistivity, body density, weight, and technical errors of measurement. Inclusion of inaccurate estimates of body density and weight introduces errors of typically less than +/-3%, but incorrect assumptions regarding conductive length or fluid resistivities may each incur errors of up to 20%.

Relevance:

20.00%

Publisher:

Abstract:

The problem of extracting pore size distributions from characterization data is solved here with particular reference to adsorption. The technique developed is based on a finite element collocation discretization of the adsorption integral, with fitting of the isotherm data by least squares using regularization. A rapid and simple technique for ensuring non-negativity of the solutions is also developed which modifies the original solution having some negativity. The technique yields stable and converged solutions, and is implemented in a package RIDFEC. The package is demonstrated to be robust, yielding results which are less sensitive to experimental error than conventional methods, with fitting errors matching the known data error. It is shown that the choice of relative or absolute error norm in the least-squares analysis is best based on the kind of error in the data. (C) 1998 Elsevier Science Ltd. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

Poor root development due to constraining soil conditions could be an important factor influencing the health of urban trees. Therefore, there is a need for efficient techniques to analyze the spatial distribution of tree roots. An analytical procedure for describing tree rooting patterns from X-ray computed tomography (CT) data is described and illustrated. Large irregularly shaped specimens of undisturbed sandy soil were sampled from various positions around the base of trees using field impregnation with epoxy resin to stabilize the cohesionless soil. Cores approximately 200 mm in diameter by 500 mm in height were extracted from these specimens. These large core samples were scanned with a medical X-ray CT device, and contiguous images of soil slices (2 mm thick) were thus produced. X-ray CT images are regarded as regularly spaced sections through the soil, although they are not actual 2D sections but matrices of voxels of approximately 0.5 mm x 0.5 mm x 2 mm. The images were used to generate the equivalent of horizontal root contact maps from which three-dimensional objects, assumed to be roots, were reconstructed. The resulting connected objects were used to derive indices of the spatial organization of roots, namely: root length distribution, root length density, root growth angle distribution, root spatial distribution, and branching intensity. The successive steps of the method, from sampling to generation of indices of tree root organization, are illustrated through a case study examining rooting patterns of valuable urban trees. (C) 1999 Elsevier Science B.V. All rights reserved.

Relevance:

20.00%

Publisher:

Abstract:

A mixture model for long-term survivors has been adopted in various fields such as biostatistics and criminology where some individuals may never experience the type of failure under study. It is directly applicable in situations where the only information available from follow-up on individuals who will never experience this type of failure is in the form of censored observations. In this paper, we consider a modification to the model so that it still applies in the case where during the follow-up period it becomes known that an individual will never experience failure from the cause of interest. Unless a model allows for this additional information, a consistent survival analysis will not be obtained. A partial maximum likelihood (ML) approach is proposed that preserves the simplicity of the long-term survival mixture model and provides consistent estimators of the quantities of interest. Some simulation experiments are performed to assess the efficiency of the partial ML approach relative to the full ML approach for survival in the presence of competing risks.
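For orientation, the standard long-term survivor mixture model writes the population survival function as S(t) = p + (1 - p) * S0(t), where p is the proportion who will never fail and S0 is the survival function of those who will. The sketch below evaluates that model and its usual censored-data log-likelihood with an exponential S0. The exponential choice, the names and the data are assumptions for illustration, and this is the unmodified model, not the partial ML approach proposed in the abstract.

```python
import math

def population_survival(t, never_fail_prob, rate):
    """S(t) = p + (1 - p) * exp(-rate * t) for the long-term survivor mixture."""
    return never_fail_prob + (1.0 - never_fail_prob) * math.exp(-rate * t)

def log_likelihood(times, failed, never_fail_prob, rate):
    """Log-likelihood for right-censored data under the standard model.

    An observed failure contributes (1 - p) * f0(t); a censored observation
    contributes S(t), since that individual may be a long-term survivor or
    simply not yet have failed.
    """
    total = 0.0
    for t, has_failed in zip(times, failed):
        if has_failed:
            density = (1.0 - never_fail_prob) * rate * math.exp(-rate * t)
            total += math.log(density)
        else:
            total += math.log(population_survival(t, never_fail_prob, rate))
    return total

# Made-up follow-up data: True marks an observed failure, False a censored time.
follow_up = [0.5, 1.2, 3.0, 8.0, 8.0]
observed = [True, True, True, False, False]
ll = log_likelihood(follow_up, observed, never_fail_prob=0.3, rate=0.8)
```

As t grows, `population_survival` levels off at `never_fail_prob` rather than falling to zero, which is the defining feature of this class of models.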

Relevance:

20.00%

Publisher:

Abstract:

1. The spatial and temporal distribution of eggs laid by herbivorous insects is a crucial component of herbivore population stability, as it influences overall mortality within the population. Thus an ecologist studying populations of an endangered butterfly can do little to increase its numbers through habitat management without knowledge of its egg-laying patterns across individual host-plants under different habitat management regimes. At the other end of the spectrum, a knowledge of egg-laying behaviour can do much to control pest outbreaks by disrupting egg distributions that lead to rapid population growth.

2. The distribution of egg batches of the processionary caterpillar Ochrogaster lunifer on acacia trees was monitored in 21 habitats during 2 years in coastal Australia. The presence of egg batches on acacias was affected by host-tree 'quality' (tree size and foliar chemistry that led to increased caterpillar survival) and host-tree 'apparency' (the amount of vegetation surrounding host-trees).

3. In open homogeneous habitats, more egg batches were laid on high-quality trees, increasing potential population growth. In diverse mixed-species habitats, more egg batches were laid on low-quality highly apparent trees, reducing population growth and so reducing the potential for unstable population dynamics. The aggregation of batches on small apparent trees in diverse habitats led to outbreaks on these trees year after year, even when population levels were low, while site-wide outbreaks were rare.

4. These results predict that diverse habitats with mixed plant species should increase insect aggregation and increase population stability. In contrast, in open disturbed habitats or in regular plantations, where egg batches are more evenly distributed across high-quality hosts, populations should be more unstable, with site-wide outbreaks and extinctions being more common.

5. Mixed planting should be used on habitat regeneration sites to increase the population stability of immigrating or reintroduced insect species. Mixed planting also increases the diversity of resources, leading to higher herbivore species richness. With regard to the conservation of single species, different practices of habitat management will need to be employed depending on whether a project is concerned with methods of rapidly increasing the abundance of an endangered insect or concerned with the maintenance of a stable, established insect population that is perhaps endemic to an area. Suggestions for habitat management in these different cases are discussed.

6. Finally, intercropping can be highly effective in reducing pest outbreaks, although the economic gains of reduced pest attack may be outweighed by reduced crop yields in mixed-crop systems.