Biblioteca Digital

930 resultados para pattern-mixture model

A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogenous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate- variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.

Mixture Model-based Statistical Pattern Recognition of Clustered or Longitudinal Data

Relevância:

100.00% 100.00%

Publicador:

Speeding up the EM algorithm for mixture model-based segmentation of magnetic resonance images

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Mixture models implemented via the expectation-maximization (EM) algorithm are being increasingly used in a wide range of problems in pattern recognition such as image segmentation. However, the EM algorithm requires considerable computational time in its application to huge data sets such as a three-dimensional magnetic resonance (MR) image of over 10 million voxels. Recently, it was shown that a sparse, incremental version of the EM algorithm could improve its rate of convergence. In this paper, we show how this modified EM algorithm can be speeded up further by adopting a multiresolution kd-tree structure in performing the E-step. The proposed algorithm outperforms some other variants of the EM algorithm for segmenting MR images of the human brain. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

The generalized log-gamma mixture model with covariates: local influence and residual analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In a sample of censored survival times, the presence of an immune proportion of individuals who are not subject to death, failure or relapse, may be indicated by a relatively high number of individuals with large censored survival times. In this paper the generalized log-gamma model is modified for the possibility that long-term survivors may be present in the data. The model attempts to separately estimate the effects of covariates on the surviving fraction, that is, the proportion of the population for which the event never occurs. The logistic function is used for the regression model of the surviving fraction. Inference for the model parameters is considered via maximum likelihood. Some influence methods, such as the local influence and total local influence of an individual are derived, analyzed and discussed. Finally, a data set from the medical area is analyzed under the log-gamma generalized mixture model. A residual analysis is performed in order to select an appropriate model.

Heterogeneity in schizophrenia: A mixture model analysis based on age-of-onset, gender and diagnosis

Relevância:

100.00% 100.00%

Publicador:

On modifications to the long-term survival mixture model in the presence of competing risks

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A mixture model for long-term survivors has been adopted in various fields such as biostatistics and criminology where some individuals may never experience the type of failure under study. It is directly applicable in situations where the only information available from follow-up on individuals who will never experience this type of failure is in the form of censored observations. In this paper, we consider a modification to the model so that it still applies in the case where during the follow-up period it becomes known that an individual will never experience failure from the cause of interest. Unless a model allows for this additional information, a consistent survival analysis will not be obtained. A partial maximum likelihood (ML) approach is proposed that preserves the simplicity of the long-term survival mixture model and provides consistent estimators of the quantities of interest. Some simulation experiments are performed to assess the efficiency of the partial ML approach relative to the full ML approach for survival in the presence of competing risks.

Long-term survivor mixture model with random effects: application to a multi-centre clinical trial of carcinoma

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A mixture model incorporating long-term survivors has been adopted in the field of biostatistics where some individuals may never experience the failure event under study. The surviving fractions may be considered as cured. In most applications, the survival times are assumed to be independent. However, when the survival data are obtained from a multi-centre clinical trial, it is conceived that the environ mental conditions and facilities shared within clinic affects the proportion cured as well as the failure risk for the uncured individuals. It necessitates a long-term survivor mixture model with random effects. In this paper, the long-term survivor mixture model is extended for the analysis of multivariate failure time data using the generalized linear mixed model (GLMM) approach. The proposed model is applied to analyse a numerical data set from a multi-centre clinical trial of carcinoma as an illustration. Some simulation experiments are performed to assess the applicability of the model based on the average biases of the estimates formed. Copyright (C) 2001 John Wiley & Sons, Ltd.

Fitting a mixture model to three-mode three-way data with missing information

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When the data consist of certain attributes measured on the same set of items in different situations, they would be described as a three-mode three-way array. A mixture likelihood approach can be implemented to cluster the items (i.e., one of the modes) on the basis of both of the other modes simultaneously (i.e,, the attributes measured in different situations). In this paper, it is shown that this approach can be extended to handle three-mode three-way arrays where some of the data values are missing at random in the sense of Little and Rubin (1987). The methodology is illustrated by clustering the genotypes in a three-way soybean data set where various attributes were measured on genotypes grown in several environments.

A mixture model-based approach to the clustering of microarray expression data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.

An EM-based Semi-Parametric Mixture Model Approach to the Regression Analysis of Competing-Risks Data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley Sons, Ltd.

Nonlinear mixture model for hyperspectral unmixing

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Proceedings of International Conference - SPIE 7477, Image and Signal Processing for Remote Sensing XV - 28 September 2009

Evaluation of a linear spectral mixture model and vegetation indices (NDVI and EVI) in a study of schistosomiasis mansoni and Biomphalaria glabrata distribution in the state of Minas Gerais, Brazil

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper analyses the associations between Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) on the prevalence of schistosomiasis and the presence of Biomphalaria glabrata in the state of Minas Gerais (MG), Brazil. Additionally, vegetation, soil and shade fraction images were created using a Linear Spectral Mixture Model (LSMM) from the blue, red and infrared channels of the Moderate Resolution Imaging Spectroradiometer spaceborne sensor and the relationship between these images and the prevalence of schistosomiasis and the presence of B. glabrata was analysed. First, we found a high correlation between the vegetation fraction image and EVI and second, a high correlation between soil fraction image and NDVI. The results also indicate that there was a positive correlation between prevalence and the vegetation fraction image (July 2002), a negative correlation between prevalence and the soil fraction image (July 2002) and a positive correlation between B. glabrata and the shade fraction image (July 2002). This paper demonstrates that the LSMM variables can be used as a substitute for the standard vegetation indices (EVI and NDVI) to determine and delimit risk areas for B. glabrata and schistosomiasis in MG, which can be used to improve the allocation of resources for disease control.

Classification of Interest Rates Curves Using Gaussian Mixture Model

Relevância:

100.00% 100.00%

Publicador:

On the residual term in the linear mixture model and its dependence on the point spread function

Relevância:

100.00% 100.00%

Publicador:

On the effect of variable endmember spectra in the linear mixture model

Relevância:

100.00% 100.00%

Publicador:

«
1
2
3
4
5
6
7
8
...
61
62
»