Biblioteca Digital

983 resultados para Mixture-models

Multivariate Density Estimation: An SVM Approach

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We formulate density estimation as an inverse operator problem. We then use convergence results of empirical distribution functions to true distribution functions to develop an algorithm for multivariate density estimation. The algorithm is based upon a Support Vector Machine (SVM) approach to solving inverse operator problems. The algorithm is implemented and tested on simulated data from different distributions and different dimensionalities, gaussians and laplacians in $R^2$ and $R^{12}$. A comparison in performance is made with Gaussian Mixture Models (GMMs). Our algorithm does as well or better than the GMMs for the simulations tested and has the added advantage of being automated with respect to parameters.

Behavioral differences in violence: the case of intra-group differences of paramilitaries and guerrillas in Colombia

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In most studies on civil wars, determinants of conflict have been hitherto explored assuming that actors involved were either unitary or stable. However, if this intra-group homogeneity assumption does not hold, empirical econometric estimates may be biased. We use Fixed Effects Finite Mixture Model (FE-FMM) approach to address this issue that provides a representation of heterogeneity when data originate from different latent classes and the affiliation is unknown. It allows to identify sub-populations within a population as well as the determinants of their behaviors. By combining various data sources for the period 2000-2005, we apply this methodology to the Colombian conflict. Our results highlight a behavioral heterogeneity in guerrilla’s armed groups and their distinct economic correlates. By contrast paramilitaries behave as a rather homogenous group.

Preferred structures in large-scale circulation and the effect of doubling greenhouse gas concentration in HadCM3

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Preferred structures in the surface pressure variability are investigated in and compared between two 100-year simulations of the Hadley Centre climate model HadCM3. In the first (control) simulation, the model is forced with pre-industrial carbon dioxide concentration (1×CO2) and in the second simulation the model is forced with doubled CO2 concentration (2×CO2). Daily winter (December-January-February) surface pressures over the Northern Hemisphere are analysed. The identification of preferred patterns is addressed using multivariate mixture models. For the control simulation, two significant flow regimes are obtained at 5% and 2.5% significance levels within the state space spanned by the leading two principal components. They show a high pressure centre over the North Pacific/Aleutian Islands associated with a low pressure centre over the North Atlantic, and its reverse. For the 2×CO2 simulation, no such behaviour is obtained. At higher-dimensional state space, flow patterns are obtained from both simulations. They are found to be significant at the 1% level for the control simulation and at the 2.5% level for the 2×CO2 simulation. Hence under CO2 doubling, regime behaviour in the large-scale wave dynamics weakens. Doubling greenhouse gas concentration affects both the frequency of occurrence of regimes and also the pattern structures. The less frequent regime becomes amplified and the more frequent regime weakens. The largest change is observed over the Pacific where a significant deepening of the Aleutian low is obtained under CO2 doubling.

Asymptotic normality in mixtures of power series distributions

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution assuming mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood and an estimator for individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit in achieving normality when using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.

Modelling heterotachy in phylogenetic inference by reversible-jump Markov chain Monte Carlo

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.

Infant mortality model for lifetime data

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we introduce a parametric model for handling lifetime data where an early lifetime can be related to the infant-mortality failure or to the wear processes but we do not know which risk is responsible for the failure. The maximum likelihood approach and the sampling-based approach are used to get the inferences of interest. Some special cases of the proposed model are studied via Monte Carlo methods for size and power of hypothesis tests. To illustrate the proposed methodology, we introduce an example consisting of a real data set.

Bayesian density estimation using Skew Student-t-Normal mixtures

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a Bayesian approach for modeling heterogeneous data and estimate multimodal densities using mixtures of Skew Student-t-Normal distributions [Gomez, H.W., Venegas, O., Bolfarine, H., 2007. Skew-symmetric distributions generated by the distribution function of the normal distribution. Environmetrics 18, 395-407]. A stochastic representation that is useful for implementing a MCMC-type algorithm and results about existence of posterior moments are obtained. Marginal likelihood approximations are obtained, in order to compare mixture models with different number of component densities. Data sets concerning the Gross Domestic Product per capita (Human Development Report) and body mass index (National Health and Nutrition Examination Survey), previously studied in the related literature, are analyzed. (c) 2008 Elsevier B.V. All rights reserved.

Nonstationary feature extraction techniques for automatic classification of impact acoustic signals

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Condition monitoring of wooden railway sleepers applications are generallycarried out by visual inspection and if necessary some impact acoustic examination iscarried out intuitively by skilled personnel. In this work, a pattern recognition solutionhas been proposed to automate the process for the achievement of robust results. Thestudy presents a comparison of several pattern recognition techniques together withvarious nonstationary feature extraction techniques for classification of impactacoustic emissions. Pattern classifiers such as multilayer perceptron, learning cectorquantization and gaussian mixture models, are combined with nonstationary featureextraction techniques such as Short Time Fourier Transform, Continuous WaveletTransform, Discrete Wavelet Transform and Wigner-Ville Distribution. Due to thepresence of several different feature extraction and classification technqies, datafusion has been investigated. Data fusion in the current case has mainly beeninvestigated on two levels, feature level and classifier level respectively. Fusion at thefeature level demonstrated best results with an overall accuracy of 82% whencompared to the human operator.

Estimação clássica e Bayesiana em modelos de sobrevida com fração de cura

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In Survival Analysis, long duration models allow for the estimation of the healing fraction, which represents a portion of the population immune to the event of interest. Here we address classical and Bayesian estimation based on mixture models and promotion time models, using different distributions (exponential, Weibull and Pareto) to model failure time. The database used to illustrate the implementations is described in Kersey et al. (1987) and it consists of a group of leukemia patients who underwent a certain type of transplant. The specific implementations used were numeric optimization by BFGS as implemented in R (base::optim), Laplace approximation (own implementation) and Gibbs sampling as implemented in Winbugs. We describe the main features of the models used, the estimation methods and the computational aspects. We also discuss how different prior information can affect the Bayesian estimates

On the determination of epsilon during discriminative GMM training

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Discriminative training of Gaussian Mixture Models (GMMs) for speech or speaker recognition purposes is usually based on the gradient descent method, in which the iteration step-size, ε, uses to be defined experimentally. In this letter, we derive an equation to adaptively determine ε, by showing that the second-order Newton-Raphson iterative method to find roots of equations is equivalent to the gradient descent algorithm. © 2010 IEEE.

Análise isotópica do carbono em néctares de caju

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Discriminação de classes de cobertura vegetal utilizando técnicas de classificação digital de imagens de sensoriamento remoto

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The aim of this work is to discriminate vegetation classes throught remote sensing images from the satellite CBERS-2, related to winter and summer seasons in the Campos Gerais region Paraná State, Brazil. The vegetation cover of the region presents different kinds of vegetations: summer and winter cultures, reforestation areas, natural areas and pasture. Supervised classification techniques like Maximum Likelihood Classifier (MLC) and Decision Tree were evaluated, considering a set of attributes from images, composed by bands of the CCD sensor (1, 2, 3, 4), vegetation indices (CTVI, DVI, GEMI, NDVI, SR, SAVI, TVI), mixture models (soil, shadow, vegetation) and the two first main components. The evaluation of the classifications accuracy was made using the classification error matrix and the kappa coefficient. It was defined a high discriminatory level during the classes definition, in order to allow separation of different kinds of winter and summer crops. The classification accuracy by decision tree was 94.5% and the kappa coefficient was 0.9389 for the scene 157/128. For the scene 158/127, the values were 88% and 0.8667, respectively. The classification accuracy by MLC was 84.86% and the kappa coefficient was 0.8099 for the scene 157/128. For the scene 158/127, the values were 77.90% and 0.7476, respectively. The results showed a better performance of the Decision Tree classifier than MLC, especially to the classes related to cultivated crops, indicating the use of the Decision Tree classifier to the vegetation cover mapping including different kinds of crops.

A multiple time scale survival model with a cure fraction

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Many recent survival studies propose modeling data with a cure fraction, i.e., data in which part of the population is not susceptible to the event of interest. This event may occur more than once for the same individual (recurrent event). We then have a scenario of recurrent event data in the presence of a cure fraction, which may appear in various areas such as oncology, finance, industries, among others. This paper proposes a multiple time scale survival model to analyze recurrent events using a cure fraction. The objective is analyzing the efficiency of certain interventions so that the studied event will not happen again in terms of covariates and censoring. All estimates were obtained using a sampling-based approach, which allows information to be input beforehand with lower computational effort. Simulations were done based on a clinical scenario in order to observe some frequentist properties of the estimation procedure in the presence of small and moderate sample sizes. An application of a well-known set of real mammary tumor data is provided.

A new long-term lifetime distribution induced by a latent complementary risk framework

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we proposed a new three-parameter long-term lifetime distribution induced by a latent complementary risk framework with decreasing, increasing and unimodal hazard function, the long-term complementary exponential geometric distribution. The new distribution arises from latent competing risk scenarios, where the lifetime associated scenario, with a particular risk, is not observable, rather we observe only the maximum lifetime value among all risks, and the presence of long-term survival. The properties of the proposed distribution are discussed, including its probability density function and explicit algebraic formulas for its reliability, hazard and quantile functions and order statistics. The parameter estimation is based on the usual maximum-likelihood approach. A simulation study assesses the performance of the estimation procedure. We compare the new distribution with its particular cases, as well as with the long-term Weibull distribution on three real data sets, observing its potential and competitiveness in comparison with some usual long-term lifetime distributions.

Empirical Null and False Discovery Rate Inference for Exponential Families

Relevância:

60.00% 60.00%

Publicador:

«
1
2
3
4
5
6
7
8
...
65
66
»