48 results for Bayesian model selection

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance: 100.00%

Abstract:

Alpine tree-line ecotones are characterized by marked changes at small spatial scales that may result in a variety of physiognomies. A set of alternative individual-based models was tested with data from four contrasting Pinus uncinata ecotones in the central Spanish Pyrenees to reveal the minimal subset of processes required for tree-line formation. A Bayesian approach combined with Markov chain Monte Carlo methods was employed to obtain the posterior distribution of model parameters, allowing the use of model selection procedures. The main features of real tree lines emerged only in models considering nonlinear responses in individual rates of growth or mortality with respect to the altitudinal gradient. Variation in tree-line physiognomy reflected mainly changes in the relative importance of these nonlinear responses, while other processes, such as dispersal limitation and facilitation, played a secondary role. Different nonlinear responses also determined the presence or absence of krummholz, in agreement with recent findings highlighting a different response of diffuse and abrupt or krummholz tree lines to climate change. The method presented here can be widely applied in individual-based simulation models and will make model selection and evaluation in this type of model a more transparent, effective, and efficient exercise.

Relevance: 100.00%

Abstract:

In this paper, a novel rank estimation technique for trajectory motion segmentation within the Local Subspace Affinity (LSA) framework is presented. This technique, called Enhanced Model Selection (EMS), is based on the relationship between the estimated rank of the trajectory matrix and the affinity matrix built by LSA. The results on synthetic and real data show that, without any a priori knowledge, EMS automatically provides an accurate and robust rank estimation, improving the accuracy of the final motion segmentation.

Relevance: 100.00%

Abstract:

Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the complexity of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.

Relevance: 100.00%

Abstract:

We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
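The label-flipping equivalence mentioned at the end of this abstract can be checked numerically. The sketch below is only an illustration under stated assumptions, not the authors' procedure: it assumes labels in {-1, +1}, an even sample size split into two equal halves, and a toy class of one-dimensional threshold rules. All function names are hypothetical. With the labels of the first half flipped, the maximal discrepancy equals 1 minus twice the minimal empirical error on the modified sample.

```python
import numpy as np

def stump_predictions(x):
    """All labelings of x produced by 1-D threshold rules of either sign (toy class)."""
    thresholds = np.concatenate(([-np.inf], np.sort(x)))
    preds = []
    for t in thresholds:
        p = np.where(x > t, 1, -1)
        preds.append(p)
        preds.append(-p)
    return np.array(preds)

def maximal_discrepancy_direct(x, y):
    """Max over rules of (error on first half) minus (error on second half)."""
    half = len(x) // 2
    errors = stump_predictions(x) != y  # shape: (number of rules, n)
    first = errors[:, :half].mean(axis=1)
    second = errors[:, half:].mean(axis=1)
    return float(np.max(first - second))

def maximal_discrepancy_via_erm(x, y):
    """Same quantity via empirical risk minimization with first-half labels flipped."""
    half = len(x) // 2
    y_flipped = y.copy()
    y_flipped[:half] = -y_flipped[:half]
    min_error = (stump_predictions(x) != y_flipped).mean(axis=1).min()
    return float(1.0 - 2.0 * min_error)
```

For any even-sized sample, the two functions should return the same value up to floating-point rounding, which is what makes the penalty computable with an ordinary empirical risk minimizer.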

Relevance: 100.00%

Abstract:

The aim of this paper is twofold. First, we study the determinants of economic growth among a wide set of potential variables for the Spanish provinces (NUTS3). Among others, we include various types of private, public and human capital in the group of growth factors. Also, we analyse whether Spanish provinces have converged in economic terms in recent decades. The second objective is to obtain cross-section and panel data parameter estimates that are robust to model specification. For this purpose, we use a Bayesian Model Averaging (BMA) approach. Bayesian methodology constructs parameter estimates as a weighted average of linear regression estimates for every possible combination of included variables. The weight of each regression estimate is given by the posterior probability of each model.
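The averaging step described in this abstract can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a uniform prior over models and approximates each posterior model probability by a weight proportional to exp(-BIC/2), which is a common shortcut and not something the abstract specifies. The function name `bma_coefficients` is hypothetical.

```python
import itertools
import numpy as np

def bma_coefficients(X, y):
    """Average OLS slope estimates over every subset of the columns of X.

    Assumption (not from the abstract): posterior model probabilities are
    approximated by exp(-BIC/2) weights under a uniform model prior.
    Returns the averaged slopes and the model weights.
    """
    n, k = X.shape
    betas, bics = [], []
    for r in range(k + 1):
        for subset in itertools.combinations(range(k), r):
            idx = np.array(subset, dtype=int)
            # Design matrix: intercept plus the selected regressors.
            Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
            resid = y - Z @ coef
            rss = float(resid @ resid)
            bics.append(n * np.log(rss / n) + Z.shape[1] * np.log(n))
            # A regressor excluded from a model contributes a zero coefficient.
            beta = np.zeros(k)
            beta[idx] = coef[1:]
            betas.append(beta)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))  # shift by the minimum for stability
    weights /= weights.sum()
    return weights @ np.array(betas), weights
```

Because the loop enumerates all 2^k subsets, this brute-force form is only practical for a small number of candidate variables. Larger problems typically require sampling over the model space instead.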

Relevance: 90.00%

Abstract:

We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of seminaive Bayesian classifiers. Altogether, 16 model selection and weighing schemes, 58 benchmark data sets, and various statistical tests are employed. This paper's main contributions are threefold. First, it formally presents each scheme's definition, rationale, and time complexity and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme's classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms which have an immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.

Relevance: 80.00%

Abstract:

A novel technique for estimating the rank of the trajectory matrix in the local subspace affinity (LSA) motion segmentation framework is presented. This new rank estimation is based on the relationship between the estimated rank of the trajectory matrix and the affinity matrix built with LSA. The result is an enhanced model selection technique for trajectory matrix rank estimation by which it is possible to automate LSA, without requiring any a priori knowledge, and to improve the final segmentation.

Relevance: 80.00%

Abstract:

We present a new general concentration-of-measure inequality and illustrate its power by applications in random combinatorics. The results find direct applications in some problems of learning theory.

Relevance: 80.00%

Abstract:

Daily precipitation is recorded as the total amount of water collected by a rain gauge in 24 h. Events are modelled as a Poisson process and the 24 h precipitation by a Generalised Pareto Distribution (GPD) of excesses. Hazard assessment is complete when estimates of the Poisson rate and the distribution parameters, together with a measure of their uncertainty, are obtained. The shape parameter of the GPD determines the support of the variable: the Weibull domain of attraction (DA) corresponds to finite-support variables, as should be the case for natural phenomena. However, the Fréchet DA has been reported for daily precipitation, which implies an infinite support and a heavy-tailed distribution. Bayesian techniques are used to estimate the parameters. The approach is illustrated with precipitation data from the eastern coast of the Iberian Peninsula, which is affected by severe convective precipitation. The estimated GPD is mainly in the Fréchet DA, which is incompatible with the common-sense assumption that precipitation is a bounded phenomenon. The bounded character of precipitation is then taken as an a priori hypothesis. Consistency of this hypothesis with the data is checked in two cases: using the raw data (in mm) and using log-transformed data. As expected, Bayesian model checking clearly rejects the model in the raw-data case. However, log-transformed data seem to be consistent with the model. This may be because the log scale is better suited to representing positive measurements, for which relative differences are more meaningful than absolute ones.
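For reference, the link between the GPD shape parameter and the support can be written out explicitly. The symbols below are assumptions introduced here, since the abstract names none: $\xi$ is the shape, $\sigma > 0$ the scale, and $y \ge 0$ the excess over the threshold. In the standard parameterization, the distribution function is

$$
G(y;\xi,\sigma) =
\begin{cases}
1 - \left(1 + \dfrac{\xi y}{\sigma}\right)^{-1/\xi}, & \xi \neq 0,\\[2ex]
1 - \exp\!\left(-\dfrac{y}{\sigma}\right), & \xi = 0,
\end{cases}
\qquad 1 + \frac{\xi y}{\sigma} > 0 .
$$

For $\xi < 0$ (Weibull DA) the excesses are bounded, $0 \le y \le -\sigma/\xi$. For $\xi > 0$ (Fréchet DA) the support is unbounded and the tail decays polynomially. The boundary case $\xi = 0$ is the exponential distribution (Gumbel DA).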

Relevance: 80.00%

Abstract:

Salmonella is distributed worldwide and is a pathogen of economic and public health importance. As a multi-host pathogen with a long environmental persistence, it is a suitable model for the study of wildlife-livestock interactions. In this work, we aim to explore the spill-over of Salmonella between free-ranging wild boar and livestock in a protected natural area in NE Spain and the presence of antimicrobial resistance. Salmonella prevalence, serotypes and diversity were compared between wild boars, sympatric cattle and wild boars from cattle-free areas. The effect of age, sex, cattle presence and cattle herd size on the probability of Salmonella infection in wild boars was explored by means of Generalized Linear Models and model selection based on Akaike's Information Criterion. Prevalence was higher in wild boars co-habiting with cattle (35.67%, CI 95% 28.19–43.70) than in wild boars from cattle-free areas (17.54%, CI 95% 8.74–29.91). The probability of a wild boar being a Salmonella carrier increased with cattle herd size but decreased with host age. Serotypes Meleagridis, Anatum and Othmarschen were isolated concurrently from cattle and sympatric wild boars. Apart from serotypes shared with cattle, wild boars appear to have their own serotypes, which are also found in wild boars from cattle-free areas (Enteritidis, Mikawasima, 4:b:- and 35:r:z35). Serotype richness (diversity) was higher in wild boars co-habiting with cattle, but evenness was not altered by the introduction of serotypes from cattle. The finding of a S. Mbandaka strain resistant to sulfamethoxazole, streptomycin and chloramphenicol and a S. Enteritidis strain resistant to ciprofloxacin and nalidixic acid in wild boars is cause for public health concern.

Relevance: 80.00%

Abstract:

Health and inequalities in health among inhabitants of European cities are of major importance for European public health and there is great interest in how different health care systems in Europe perform in the reduction of health inequalities. However, evidence on the spatial distribution of cause-specific mortality across neighbourhoods of European cities is scarce. This study presents maps of avoidable mortality in European cities and analyses differences in avoidable mortality between neighbourhoods with different levels of deprivation. Methods: We determined the level of mortality from 14 avoidable causes of death for each neighbourhood of 15 large cities in different European regions. To address the problems associated with Standardised Mortality Ratios for small areas, we smoothed them using the Bayesian model proposed by Besag, York and Mollié. Ecological regression analysis was used to assess the association between social deprivation and mortality. Results: Mortality from avoidable causes of death is higher in deprived neighbourhoods and mortality rate ratios between areas with different levels of deprivation differ between genders and cities. In most cases rate ratios are lower among women. While Eastern and Southern European cities show higher levels of avoidable mortality, the association of mortality with social deprivation tends to be higher in Northern and lower in Southern Europe. Conclusions: There are marked differences in the level of avoidable mortality between neighbourhoods of European cities and the level of avoidable mortality is associated with social deprivation. There is no systematic difference in the magnitude of this association between European cities or regions. Spatial patterns of avoidable mortality across small city areas can point to possible local problems and specific strategies to reduce health inequality, which is important for the development of urban areas and the well-being of their inhabitants.

Relevance: 40.00%

Abstract:

This paper explores the earnings return to Catalan knowledge for public and private workers in Catalonia. In doing so, we allow for a double simultaneous selection process. We consider, on the one hand, the non-random allocation of workers into one sector or another, and on the other, the potential self-selection into Catalan proficiency. In addition, when correcting the earnings equations, we take into account the correlation between the two selectivity rules. Our findings suggest that the apparent higher language return for public sector workers is entirely accounted for by selection effects, whereas knowledge of Catalan has a significant positive return in the private sector, which is somewhat higher when the selection processes are taken into account.

Relevance: 40.00%

Abstract:

This paper characterizes the relationship between entrepreneurial wealth and aggregate investment under adverse selection. Its main finding is that such a relationship need not be monotonic. In particular, three results emerge from the analysis: (i) pooling equilibria, in which investment is independent of entrepreneurial wealth, are more likely to arise when entrepreneurial wealth is relatively low; (ii) separating equilibria, in which investment is increasing in entrepreneurial wealth, are most likely to arise when entrepreneurial wealth is relatively high; and (iii) for a given interest rate, an increase in entrepreneurial wealth may generate a discontinuous fall in investment.

Relevance: 40.00%

Abstract:

We provide methods for forecasting variables and predicting turning points in panel Bayesian VARs. We specify a flexible model which accounts for both interdependencies in the cross section and time variations in the parameters. Posterior distributions for the parameters are obtained for a particular type of diffuse, for Minnesota-type and for hierarchical priors. Formulas for multistep, multiunit point and average forecasts are provided. An application to the problem of forecasting the growth rate of output and of predicting turning points in the G-7 illustrates the approach. A comparison with alternative forecasting methods is also provided.