842 resultados para Hierarchical Bayes


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The high level of unemployment is one of the major problems in most European countries nowadays. Hence, the demand for small area labor market statistics has rapidly increased over the past few years. The Labour Force Survey (LFS) conducted by the Portuguese Statistical Office is the main source of official statistics on the labour market at the macro level (e.g. NUTS2 and national level). However, the LFS was not designed to produce reliable statistics at the micro level (e.g. NUTS3, municipalities or further disaggregate level) due to small sample sizes. Consequently, traditional design-based estimators are not appropriate. A solution to this problem is to consider model-based estimators that "borrow information" from related areas or past samples by using auxiliary information. This paper reviews, under the model-based approach, Best Linear Unbiased Predictors and an estimator based on the posterior predictive distribution of a Hierarchical Bayesian model. The goal of this paper is to analyze the possibility to produce accurate unemployment rate statistics at micro level from the Portuguese LFS using these kinds of stimators. This paper discusses the advantages of using each approach and the viability of its implementation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we investigate whether consideration of store-level heterogeneity in marketing mix effects improves the accuracy of the marketing mix elasticities, fit, and forecasting accuracy of the widely-applied SCAN*PRO model of store sales. Models with continuous and discrete representations of heterogeneity, estimated using hierarchical Bayes (HB) and finite mixture (FM) techniques, respectively, are empirically compared to the original model, which does not account for store-level heterogeneity in marketing mix effects, and is estimated using ordinary least squares (OLS). The empirical comparisons are conducted in two contexts: Dutch store-level scanner data for the shampoo product category, and an extensive simulation experiment. The simulation investigates how between- and within-segment variance in marketing mix effects, error variance, the number of weeks of data, and the number of stores impact the accuracy of marketing mix elasticities, model fit, and forecasting accuracy. Contrary to expectations, accommodating store-level heterogeneity does not improve the accuracy of marketing mix elasticities relative to the homogeneous SCAN*PRO model, suggesting that little may be lost by employing the original homogeneous SCAN*PRO model estimated using ordinary least squares. Improvements in fit and forecasting accuracy are also fairly modest. We pursue an explanation for this result since research in other contexts has shown clear advantages from assuming some type of heterogeneity in market response models. In an Afterthought section, we comment on the controversial nature of our result, distinguishing factors inherent to household-level data and associated models vs. general store-level data and associated models vs. the unique SCAN*PRO model specification.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study proposes a full Bayes (FB) hierarchical modeling approach in traffic crash hotspot identification. The FB approach is able to account for all uncertainties associated with crash risk and various risk factors by estimating a posterior distribution of the site safety on which various ranking criteria could be based. Moreover, by use of hierarchical model specification, FB approach is able to flexibly take into account various heterogeneities of crash occurrence due to spatiotemporal effects on traffic safety. Using Singapore intersection crash data(1997-2006), an empirical evaluate was conducted to compare the proposed FB approach to the state-of-the-art approaches. Results show that the Bayesian hierarchical models with accommodation for site specific effect and serial correlation have better goodness-of-fit than non hierarchical models. Furthermore, all model-based approaches perform significantly better in safety ranking than the naive approach using raw crash count. The FB hierarchical models were found to significantly outperform the standard EB approach in correctly identifying hotspots.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes a hierarchical probabilistic model for ordinal matrix factorization. Unlike previous approaches, we model the ordinal nature of the data and take a principled approach to incorporating priors for the hidden variables. Two algorithms are presented for inference, one based on Gibbs sampling and one based on variational Bayes. Importantly, these algorithms may be implemented in the factorization of very large matrices with missing entries. The model is evaluated on a collaborative filtering task, where users have rated a collection of movies and the system is asked to predict their ratings for other movies. The Netflix data set is used for evaluation, which consists of around 100 million ratings. Using root mean-squared error (RMSE) as an evaluation metric, results show that the suggested model outperforms alternative factorization techniques. Results also show how Gibbs sampling outperforms variational Bayes on this task, despite the large number of ratings and model parameters. Matlab implementations of the proposed algorithms are available from cogsys.imm.dtu.dk/ordinalmatrixfactorization.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Business Process Management (BPM) has increased in popularity and maturity in recent years. Large enterprises engage use process management approaches to model, manage and refine repositories of process models that detail the whole enterprise. These process models can run to the thousands in number, and may contain large hierarchies of tasks and control structures that become cumbersome to maintain. Tools are therefore needed to effectively traverse this process model space in an efficient manner, otherwise the repositories remain hard to use, and thus are lowered in their effectiveness. In this paper we analyse a range of BPM tools for their effectiveness in handling large process models. We establish that the present set of commercial tools is lacking in key areas regarding visualisation of, and interaction with, large process models. We then present six tool functionalities for the development of advanced business process visualisation and interaction, presenting a design for a tool that will exploit the latest advances in 2D and 3D computer graphics to enable fast and efficient search, traversal and modification of process models.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes the use of the Bayes Factor to replace the Bayesian Information Criterion (BIC) as a criterion for speaker clustering within a speaker diarization system. The BIC is one of the most popular decision criteria used in speaker diarization systems today. However, it will be shown in this paper that the BIC is only an approximation to the Bayes factor of marginal likelihoods of the data given each hypothesis. This paper uses the Bayes factor directly as a decision criterion for speaker clustering, thus removing the error introduced by the BIC approximation. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, leading to a 14.7% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.