947 results for stochastic search variable selection


Relevance:

100.00%

Abstract:

This paper develops methods for Stochastic Search Variable Selection (currently popular with regression and Vector Autoregressive models) for Vector Error Correction models where there are many possible restrictions on the cointegration space. We show how this allows the researcher to begin with a single unrestricted model and either do model selection or model averaging in an automatic and computationally efficient manner. We apply our methods to a large UK macroeconomic model.
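
As a rough illustration of the idea behind Stochastic Search Variable Selection in the standard regression setting (not the Vector Error Correction extension developed in the paper), each coefficient is given a mixture of a narrow "spike" normal prior and a wide "slab" normal prior, and a binary indicator records which component it belongs to. The sketch below computes the conditional probability that the indicator selects the slab given a coefficient value. The function name and the values of `tau`, `c` and `prior_prob` are assumptions chosen for illustration only.

```python
import math


def inclusion_probability(beta, tau=0.1, c=10.0, prior_prob=0.5):
    """Return P(gamma = 1 | beta) under a two-component normal mixture prior.

    Spike (gamma = 0): beta ~ N(0, tau^2).
    Slab  (gamma = 1): beta ~ N(0, (c * tau)^2), with c > 1.
    tau, c and prior_prob are illustrative values, not taken from the paper.
    """
    if not (tau > 0 and c > 1 and 0 < prior_prob < 1):
        raise ValueError("need tau > 0, c > 1 and 0 < prior_prob < 1")
    # Work with log densities so that extreme coefficients do not underflow.
    # The shared constant -0.5 * log(2 * pi) cancels in the ratio.
    log_slab = math.log(prior_prob) - math.log(c * tau) - 0.5 * (beta / (c * tau)) ** 2
    log_spike = math.log(1.0 - prior_prob) - math.log(tau) - 0.5 * (beta / tau) ** 2
    # slab / (slab + spike) rewritten as 1 / (1 + spike / slab).
    return 1.0 / (1.0 + math.exp(log_spike - log_slab))
```

In a Gibbs sampler this probability would be used to draw each indicator in turn; averaging the draws gives posterior inclusion probabilities, which can serve either model selection or model averaging.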

Relevance:

100.00%

Abstract:

This paper develops stochastic search variable selection (SSVS) for zero-inflated count models which are commonly used in health economics. This allows for either model averaging or model selection in situations with many potential regressors. The proposed techniques are applied to a data set from Germany considering the demand for health care. A package for the free statistical software environment R is provided.

Relevance:

100.00%

Abstract:

This paper is motivated by the recent interest in the use of Bayesian VARs for forecasting, even in cases where the number of dependent variables is large. In such cases, factor methods have been traditionally used but recent work using a particular prior suggests that Bayesian VAR methods can forecast better. In this paper, we consider a range of alternative priors which have been used with small VARs, discuss the issues which arise when they are used with medium and large VARs and examine their forecast performance using a US macroeconomic data set containing 168 variables. We find that Bayesian VARs do tend to forecast better than factor methods and provide an extensive comparison of the strengths and weaknesses of various approaches. Our empirical results show the importance of using forecast metrics which use the entire predictive density, instead of using only point forecasts.

Relevance:

100.00%

Abstract:

We develop methods for Bayesian model averaging (BMA) or selection (BMS) in Panel Vector Autoregressions (PVARs). Our approach allows us to select between or average over all possible combinations of restricted PVARs where the restrictions involve interdependencies between and heterogeneities across cross-sectional units. The resulting BMA framework can find a parsimonious PVAR specification, thus dealing with overparameterization concerns. We use these methods in an application involving the euro area sovereign debt crisis and show that our methods perform better than alternatives. Our findings contradict a simple view of the sovereign debt crisis which divides the euro zone into groups of core and peripheral countries and worries about financial contagion within the latter group.

Relevance:

100.00%

Abstract:

Vector Autoregressive Moving Average (VARMA) models have many theoretical properties which should make them popular among empirical macroeconomists. However, they are rarely used in practice due to over-parameterization concerns, difficulties in ensuring identification and computational challenges. With the growing interest in multivariate time series models of high dimension, these problems with VARMAs become even more acute, accounting for the dominance of VARs in this field. In this paper, we develop a Bayesian approach for inference in VARMAs which surmounts these problems. It jointly ensures identification and parsimony in the context of an efficient Markov chain Monte Carlo (MCMC) algorithm. We use this approach in a macroeconomic application involving up to twelve dependent variables. We find our algorithm to work successfully and provide insights beyond those provided by VARs.

Relevance:

100.00%

Abstract:

Complex diseases, such as cancer, are caused by various genetic and environmental factors and their interactions. Joint analysis of these factors and their interactions would increase the power to detect risk factors but is statistically challenging. Bayesian generalized linear modeling with Student-t prior distributions on the coefficients is a novel method for simultaneously analyzing genetic factors, environmental factors, and their interactions. I performed simulation studies using three different disease models and demonstrated that the variable selection performance of Bayesian generalized linear models is comparable to that of Bayesian stochastic search variable selection, itself an improvement over standard variable selection methods. I further evaluated the variable selection performance of Bayesian generalized linear models using different numbers of candidate covariates and different sample sizes, and provided a guideline for the sample size required to achieve high power of variable selection using Bayesian generalized linear models, for different scales of the number of candidate covariates.

Polymorphisms in folate metabolism genes and nutritional factors have been previously associated with lung cancer risk. In this study, I simultaneously analyzed 115 tag SNPs in folate metabolism genes, 14 nutritional factors, and all possible genetic-nutritional interactions from 1239 lung cancer cases and 1692 controls using Bayesian generalized linear models stratified by never, former, and current smoking status. SNPs in MTRR were significantly associated with lung cancer risk across never, former, and current smokers. In never smokers, three SNPs in TYMS and three gene-nutrient interactions, including an interaction between SHMT1 and vitamin B12, an interaction between MTRR and total fat intake, and an interaction between MTR and alcohol use, were also identified as associated with lung cancer risk. These lung cancer risk factors are worthy of further investigation.

Relevance:

100.00%

Abstract:

This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori.

Relevance:

100.00%

Abstract:

Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled with Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a resolving power of ca. 500,000 and a mass accuracy of less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H]⁻ ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g⁻¹. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the number of variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g⁻¹. By reducing the size of the data, it was possible to relate the selected variables with their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
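
The root mean square error of prediction quoted above is the square root of the mean squared difference between reference and predicted values on a held-out set. A minimal sketch follows; the function name is hypothetical and the abstract does not describe how the authors computed or validated the figure.

```python
import math


def rmsep(observed, predicted):
    """Root mean square error of prediction for paired values.

    Both sequences must have the same non-zero length and share one unit,
    so the result is expressed in that unit as well.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equally long")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the error is reported in the unit of the response, it can be compared directly with the range of the measured values.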

Relevance:

100.00%

Abstract:

Causal inference methods - mainly path analysis and structural equation modeling - offer plant physiologists information about cause-and-effect relationships among plant traits. Recently, an unusual approach to causal inference through stepwise variable selection has been proposed and used in various works on plant physiology. The approach should not be considered correct from a biological point of view. Here, it is explained why stepwise variable selection should not be used for causal inference, and shown what strange conclusions can be drawn based upon the former analysis when one aims to interpret cause-and-effect relationships among plant traits.

Relevance:

100.00%

Abstract:

Copyright © 2013 Springer Netherlands.

Relevance:

100.00%

Abstract:

The aim of this paper is to predict time series of SO2 concentrations emitted by coal-fired power stations in order to estimate emission episodes in advance and to analyze the influence of some meteorological variables on the prediction. An emission episode is said to occur when the series of bi-hourly means of SO2 is greater than a specific level. For coal-fired power stations it is essential to predict emission episodes sufficiently in advance so that appropriate preventive measures can be taken. We propose a methodology to predict SO2 emission episodes based on an additive model and an algorithm for variable selection. The methodology was applied to the estimation of SO2 emissions registered in sampling locations near a coal-fired power station located in Northern Spain. The results obtained indicate a good performance of the model considering only two terms of the time series and that the inclusion of the meteorological variables in the model is not significant.
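
The episode definition above (a bi-hourly mean exceeding a given level) can be sketched as a simple threshold scan. Treating consecutive exceedances as one episode is an assumption made here for illustration, as are the function name and the index-based output; the abstract does not say how the authors delimit episodes.

```python
def find_episodes(bihourly_means, level):
    """Return (start, end) index pairs for runs of values above level.

    Consecutive exceedances are merged into a single episode; this grouping
    is an illustrative assumption, not a rule stated in the source.
    """
    episodes = []
    start = None
    for i, value in enumerate(bihourly_means):
        if value > level:
            if start is None:
                start = i  # a new run of exceedances begins here
        elif start is not None:
            episodes.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        # The series ended while still above the level.
        episodes.append((start, len(bihourly_means) - 1))
    return episodes
```

Applied to a predicted series instead of an observed one, the same scan indicates which upcoming periods would count as episodes.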

Relevance:

100.00%

Abstract:

A technique is presented for locating and tracking objects in cluttered environments. Agents are randomly distributed across the image, and subsequently grouped around targets. Each agent uses a weightless neural network and a histogram intersection technique to score its location. The system has been used to locate and track a head in 320x240 resolution video at up to 15 fps.
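
Histogram intersection, mentioned above as part of each agent's scoring, compares two histograms by summing the bin-wise minima. The sketch below normalizes by the model histogram's total count, which is one common convention; how the paper combines this score with the weightless neural network is not described in the abstract, and the function name is hypothetical.

```python
def histogram_intersection(candidate, model):
    """Score how well a candidate histogram matches a model histogram.

    Returns the sum of bin-wise minima divided by the model's total count,
    so identical histograms score 1.0 and disjoint ones score 0.0.
    """
    if len(candidate) != len(model):
        raise ValueError("histograms must have the same number of bins")
    total = sum(model)
    if total == 0:
        raise ValueError("model histogram must not be empty")
    overlap = sum(min(c, m) for c, m in zip(candidate, model))
    return overlap / total
```

An agent could, for example, build a color histogram of the image patch at its location and use this score against a stored target histogram.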