Biblioteca Digital

923 resultados para model selection in binary regression

A general model for host plant selection in phytophagous insects

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We develop a general theoretical framework for exploring the host plant selection behaviour of herbivorous insects. This model can be used to address a number of questions, including the evolution of specialists, generalists, preference hierarchies, and learning. We use our model to: (i) demonstrate the consequences of the extent to which the reproductive success of a foraging female is limited by the rate at which they find host plants (host limitation) or the number of eggs they carry (egg limitation); (ii) emphasize the different consequences of variation in behaviour before and after landing on (locating) a host (termed pre- and post-alighting, respectively); (iii) show that, in contrast to previous predictions, learning can be favoured in post-alighting behaviour-in particular, individuals can be selected to concentrate oviposition on an abundant low-quality host, whilst ignoring a rare higher-quality host; (iv) emphasize the importance of interactions between mechanisms in favouring specialization or learning. (C) 2002 Elsevier Science Ltd.

Data analysis model selection for estimating local population size of the Risso’s dolphin (Grampus griseus) in the Azores

Relevância:

100.00% 100.00%

Publicador:

Resumo:

27th Annual Conference of the European Cetacean Society. Setúbal, Portugal, 8-10 April 2013.

Model Switching and Model Averaging in Time- Varying Parameter Regression Models

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates the usefulness of switching Gaussian state space models as a tool for implementing dynamic model selecting (DMS) or averaging (DMA) in time-varying parameter regression models. DMS methods allow for model switching, where a different model can be chosen at each point in time. Thus, they allow for the explanatory variables in the time-varying parameter regression model to change over time. DMA will carry out model averaging in a time-varying manner. We compare our exact approach to DMA/DMS to a popular existing procedure which relies on the use of forgetting factor approximations. In an application, we use DMS to select different predictors in an in ation forecasting application. We also compare different ways of implementing DMA/DMS and investigate whether they lead to similar results.

Identifying genetic signatures of selection in a non-model species, alpine gentian (Gentiana nivalis L.), using a landscape genetic approach

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It is generally accepted that most plant populations are locally adapted. Yet, understanding how environmental forces give rise to adaptive genetic variation is a challenge in conservation genetics and crucial to the preservation of species under rapidly changing climatic conditions. Environmental variation, phylogeographic history, and population demographic processes all contribute to spatially structured genetic variation, however few current models attempt to separate these confounding effects. To illustrate the benefits of using a spatially-explicit model for identifying potentially adaptive loci, we compared outlier locus detection methods with a recently-developed landscape genetic approach. We analyzed 157 loci from samples of the alpine herb Gentiana nivalis collected across the European Alps. Principle coordinates of neighbor matrices (PCNM), eigenvectors that quantify multi-scale spatial variation present in a data set, were incorporated into a landscape genetic approach relating AFLP frequencies with 23 environmental variables. Four major findings emerged. 1) Fifteen loci were significantly correlated with at least one predictor variable (R (adj) (2) > 0.5). 2) Models including PCNM variables identified eight more potentially adaptive loci than models run without spatial variables. 3) When compared to outlier detection methods, the landscape genetic approach detected four of the same loci plus 11 additional loci. 4) Temperature, precipitation, and solar radiation were the three major environmental factors driving potentially adaptive genetic variation in G. nivalis. Techniques presented in this paper offer an efficient method for identifying potentially adaptive genetic variation and associated environmental forces of selection, providing an important step forward for the conservation of non-model species under global change.

An adult thymic stromal-cell suspension model for in vitro positive selection.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Presented here is a cell-suspension model for positive selection using thymocytes from alphabeta-TCR (H-2Db-restricted) transgenic mice specific to the lymphocytic choriomeningitis virus (LCMV) on a nonselecting MHC background (H-2d or TAP-1 -/-), cocultured with freshly isolated adult thymus stromal cells of the selecting MHC type. The thymic stromal cells alone induced positive selection of functional CD4- CD8+ cells whose kinetics and efficiency were enhanced by nominal peptide. Fibroblasts expressing the selecting MHC alone did not induce positive selection; however, together with nonselecting stroma and nominal peptide, there was inefficient positive. These results suggest multiple signaling in positive selection with selection events able to occur on multiple-cell types. The ease with which this model can be manipulated should greatly facilitate the resolution of the mechanisms of positive selection in normal and pathological states.

Adaptive model selection using empirical complexities

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessingthe {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk.The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover.Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent,and even possess near optimal rates of convergence, when each model class has an infinite VC or pseudo dimension.For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.

Exogeneity, weak identification and instrument selection in econometrics

Relevância:

100.00% 100.00%

Publicador:

Resumo:

La dernière décennie a connu un intérêt croissant pour les problèmes posés par les variables instrumentales faibles dans la littérature économétrique, c’est-à-dire les situations où les variables instrumentales sont faiblement corrélées avec la variable à instrumenter. En effet, il est bien connu que lorsque les instruments sont faibles, les distributions des statistiques de Student, de Wald, du ratio de vraisemblance et du multiplicateur de Lagrange ne sont plus standard et dépendent souvent de paramètres de nuisance. Plusieurs études empiriques portant notamment sur les modèles de rendements à l’éducation [Angrist et Krueger (1991, 1995), Angrist et al. (1999), Bound et al. (1995), Dufour et Taamouti (2007)] et d’évaluation des actifs financiers (C-CAPM) [Hansen et Singleton (1982,1983), Stock et Wright (2000)], où les variables instrumentales sont faiblement corrélées avec la variable à instrumenter, ont montré que l’utilisation de ces statistiques conduit souvent à des résultats peu fiables. Un remède à ce problème est l’utilisation de tests robustes à l’identification [Anderson et Rubin (1949), Moreira (2002), Kleibergen (2003), Dufour et Taamouti (2007)]. Cependant, il n’existe aucune littérature économétrique sur la qualité des procédures robustes à l’identification lorsque les instruments disponibles sont endogènes ou à la fois endogènes et faibles. Cela soulève la question de savoir ce qui arrive aux procédures d’inférence robustes à l’identification lorsque certaines variables instrumentales supposées exogènes ne le sont pas effectivement. Plus précisément, qu’arrive-t-il si une variable instrumentale invalide est ajoutée à un ensemble d’instruments valides? Ces procédures se comportent-elles différemment? Et si l’endogénéité des variables instrumentales pose des difficultés majeures à l’inférence statistique, peut-on proposer des procédures de tests qui sélectionnent les instruments lorsqu’ils sont à la fois forts et valides? Est-il possible de proposer les proédures de sélection d’instruments qui demeurent valides même en présence d’identification faible? Cette thèse se focalise sur les modèles structurels (modèles à équations simultanées) et apporte des réponses à ces questions à travers quatre essais. Le premier essai est publié dans Journal of Statistical Planning and Inference 138 (2008) 2649 – 2661. Dans cet essai, nous analysons les effets de l’endogénéité des instruments sur deux statistiques de test robustes à l’identification: la statistique d’Anderson et Rubin (AR, 1949) et la statistique de Kleibergen (K, 2003), avec ou sans instruments faibles. D’abord, lorsque le paramètre qui contrôle l’endogénéité des instruments est fixe (ne dépend pas de la taille de l’échantillon), nous montrons que toutes ces procédures sont en général convergentes contre la présence d’instruments invalides (c’est-à-dire détectent la présence d’instruments invalides) indépendamment de leur qualité (forts ou faibles). Nous décrivons aussi des cas où cette convergence peut ne pas tenir, mais la distribution asymptotique est modifiée d’une manière qui pourrait conduire à des distorsions de niveau même pour de grands échantillons. Ceci inclut, en particulier, les cas où l’estimateur des double moindres carrés demeure convergent, mais les tests sont asymptotiquement invalides. Ensuite, lorsque les instruments sont localement exogènes (c’est-à-dire le paramètre d’endogénéité converge vers zéro lorsque la taille de l’échantillon augmente), nous montrons que ces tests convergent vers des distributions chi-carré non centrées, que les instruments soient forts ou faibles. Nous caractérisons aussi les situations où le paramètre de non centralité est nul et la distribution asymptotique des statistiques demeure la même que dans le cas des instruments valides (malgré la présence des instruments invalides). Le deuxième essai étudie l’impact des instruments faibles sur les tests de spécification du type Durbin-Wu-Hausman (DWH) ainsi que le test de Revankar et Hartley (1973). Nous proposons une analyse en petit et grand échantillon de la distribution de ces tests sous l’hypothèse nulle (niveau) et l’alternative (puissance), incluant les cas où l’identification est déficiente ou faible (instruments faibles). Notre analyse en petit échantillon founit plusieurs perspectives ainsi que des extensions des précédentes procédures. En effet, la caractérisation de la distribution de ces statistiques en petit échantillon permet la construction des tests de Monte Carlo exacts pour l’exogénéité même avec les erreurs non Gaussiens. Nous montrons que ces tests sont typiquement robustes aux intruments faibles (le niveau est contrôlé). De plus, nous fournissons une caractérisation de la puissance des tests, qui exhibe clairement les facteurs qui déterminent la puissance. Nous montrons que les tests n’ont pas de puissance lorsque tous les instruments sont faibles [similaire à Guggenberger(2008)]. Cependant, la puissance existe tant qu’au moins un seul instruments est fort. La conclusion de Guggenberger (2008) concerne le cas où tous les instruments sont faibles (un cas d’intérêt mineur en pratique). Notre théorie asymptotique sous les hypothèses affaiblies confirme la théorie en échantillon fini. Par ailleurs, nous présentons une analyse de Monte Carlo indiquant que: (1) l’estimateur des moindres carrés ordinaires est plus efficace que celui des doubles moindres carrés lorsque les instruments sont faibles et l’endogenéité modérée [conclusion similaire à celle de Kiviet and Niemczyk (2007)]; (2) les estimateurs pré-test basés sur les tests d’exogenété ont une excellente performance par rapport aux doubles moindres carrés. Ceci suggère que la méthode des variables instrumentales ne devrait être appliquée que si l’on a la certitude d’avoir des instruments forts. Donc, les conclusions de Guggenberger (2008) sont mitigées et pourraient être trompeuses. Nous illustrons nos résultats théoriques à travers des expériences de simulation et deux applications empiriques: la relation entre le taux d’ouverture et la croissance économique et le problème bien connu du rendement à l’éducation. Le troisième essai étend le test d’exogénéité du type Wald proposé par Dufour (1987) aux cas où les erreurs de la régression ont une distribution non-normale. Nous proposons une nouvelle version du précédent test qui est valide même en présence d’erreurs non-Gaussiens. Contrairement aux procédures de test d’exogénéité usuelles (tests de Durbin-Wu-Hausman et de Rvankar- Hartley), le test de Wald permet de résoudre un problème courant dans les travaux empiriques qui consiste à tester l’exogénéité partielle d’un sous ensemble de variables. Nous proposons deux nouveaux estimateurs pré-test basés sur le test de Wald qui performent mieux (en terme d’erreur quadratique moyenne) que l’estimateur IV usuel lorsque les variables instrumentales sont faibles et l’endogénéité modérée. Nous montrons également que ce test peut servir de procédure de sélection de variables instrumentales. Nous illustrons les résultats théoriques par deux applications empiriques: le modèle bien connu d’équation du salaire [Angist et Krueger (1991, 1999)] et les rendements d’échelle [Nerlove (1963)]. Nos résultats suggèrent que l’éducation de la mère expliquerait le décrochage de son fils, que l’output est une variable endogène dans l’estimation du coût de la firme et que le prix du fuel en est un instrument valide pour l’output. Le quatrième essai résout deux problèmes très importants dans la littérature économétrique. D’abord, bien que le test de Wald initial ou étendu permette de construire les régions de confiance et de tester les restrictions linéaires sur les covariances, il suppose que les paramètres du modèle sont identifiés. Lorsque l’identification est faible (instruments faiblement corrélés avec la variable à instrumenter), ce test n’est en général plus valide. Cet essai développe une procédure d’inférence robuste à l’identification (instruments faibles) qui permet de construire des régions de confiance pour la matrices de covariances entre les erreurs de la régression et les variables explicatives (possiblement endogènes). Nous fournissons les expressions analytiques des régions de confiance et caractérisons les conditions nécessaires et suffisantes sous lesquelles ils sont bornés. La procédure proposée demeure valide même pour de petits échantillons et elle est aussi asymptotiquement robuste à l’hétéroscédasticité et l’autocorrélation des erreurs. Ensuite, les résultats sont utilisés pour développer les tests d’exogénéité partielle robustes à l’identification. Les simulations Monte Carlo indiquent que ces tests contrôlent le niveau et ont de la puissance même si les instruments sont faibles. Ceci nous permet de proposer une procédure valide de sélection de variables instrumentales même s’il y a un problème d’identification. La procédure de sélection des instruments est basée sur deux nouveaux estimateurs pré-test qui combinent l’estimateur IV usuel et les estimateurs IV partiels. Nos simulations montrent que: (1) tout comme l’estimateur des moindres carrés ordinaires, les estimateurs IV partiels sont plus efficaces que l’estimateur IV usuel lorsque les instruments sont faibles et l’endogénéité modérée; (2) les estimateurs pré-test ont globalement une excellente performance comparés à l’estimateur IV usuel. Nous illustrons nos résultats théoriques par deux applications empiriques: la relation entre le taux d’ouverture et la croissance économique et le modèle de rendements à l’éducation. Dans la première application, les études antérieures ont conclu que les instruments n’étaient pas trop faibles [Dufour et Taamouti (2007)] alors qu’ils le sont fortement dans la seconde [Bound (1995), Doko et Dufour (2009)]. Conformément à nos résultats théoriques, nous trouvons les régions de confiance non bornées pour la covariance dans le cas où les instruments sont assez faibles.

A decision support model for partner selection in sustainable construction

Relevância:

100.00% 100.00%

Publicador:

Model selection approaches for non-linear system identification: a review

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The identification of non-linear systems using only observed finite datasets has become a mature research area over the last two decades. A class of linear-in-the-parameter models with universal approximation capabilities have been intensively studied and widely used due to the availability of many linear-learning algorithms and their inherent convergence conditions. This article presents a systematic overview of basic research on model selection approaches for linear-in-the-parameter models. One of the fundamental problems in non-linear system identification is to find the minimal model with the best model generalisation performance from observational data only. The important concepts in achieving good model generalisation used in various non-linear system-identification algorithms are first reviewed, including Bayesian parameter regularisation and models selective criteria based on the cross validation and experimental design. A significant advance in machine learning has been the development of the support vector machine as a means for identifying kernel models based on the structural risk minimisation principle. The developments on the convex optimisation-based model construction algorithms including the support vector regression algorithms are outlined. Input selection algorithms and on-line system identification algorithms are also included in this review. Finally, some industrial applications of non-linear models are discussed.

Improved maximum-likelihood estimation in a regression model with general parametrization

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We analyse the finite-sample behaviour of two second-order bias-corrected alternatives to the maximum-likelihood estimator of the parameters in a multivariate normal regression model with general parametrization proposed by Patriota and Lemonte [A. G. Patriota and A. J. Lemonte, Bias correction in a multivariate regression model with genereal parameterization, Stat. Prob. Lett. 79 (2009), pp. 1655-1662]. The two finite-sample corrections we consider are the conventional second-order bias-corrected estimator and the bootstrap bias correction. We present the numerical results comparing the performance of these estimators. Our results reveal that analytical bias correction outperforms numerical bias corrections obtained from bootstrapping schemes.

Model selection, estimation and forecasting in VAR models with short-run and long-run restrictions

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study the joint determination of the lag length, the dimension of the cointegrating space and the rank of the matrix of short-run parameters of a vector autoregressive (VAR) model using model selection criteria. We consider model selection criteria which have data-dependent penalties for a lack of parsimony, as well as the traditional ones. We suggest a new procedure which is a hybrid of traditional criteria and criteria with data-dependant penalties. In order to compute the fit of each model, we propose an iterative procedure to compute the maximum likelihood estimates of parameters of a VAR model with short-run and long-run restrictions. Our Monte Carlo simulations measure the improvements in forecasting accuracy that can arise from the joint determination of lag-length and rank, relative to the commonly used procedure of selecting the lag-length only and then testing for cointegration.

Model selection, estimation and forecasting in VAR models with short-run and long-run restrictions

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study the joint determination of the lag length, the dimension of the cointegrating space and the rank of the matrix of short-run parameters of a vector autoregressive (VAR) model using model selection criteria. We consider model selection criteria which have data-dependent penalties as well as the traditional ones. We suggest a new two-step model selection procedure which is a hybrid of traditional criteria and criteria with data-dependant penalties and we prove its consistency. Our Monte Carlo simulations measure the improvements in forecasting accuracy that can arise from the joint determination of lag-length and rank using our proposed procedure, relative to an unrestricted VAR or a cointegrated VAR estimated by the commonly used procedure of selecting the lag-length only and then testing for cointegration. Two empirical applications forecasting Brazilian inflation and U.S. macroeconomic aggregates growth rates respectively show the usefulness of the model-selection strategy proposed here. The gains in different measures of forecasting accuracy are substantial, especially for short horizons.

Model selection, estimation and forecasting in VAR models with short-run and long-run restrictions

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study the joint determination of the lag length, the dimension of the cointegrating space and the rank of the matrix of short-run parameters of a vector autoregressive (VAR) model using model selection criteria. We consider model selection criteria which have data-dependent penalties as well as the traditional ones. We suggest a new two-step model selection procedure which is a hybrid of traditional criteria and criteria with data-dependant penalties and we prove its consistency. Our Monte Carlo simulations measure the improvements in forecasting accuracy that can arise from the joint determination of lag-length and rank using our proposed procedure, relative to an unrestricted VAR or a cointegrated VAR estimated by the commonly used procedure of selecting the lag-length only and then testing for cointegration. Two empirical applications forecasting Brazilian in ation and U.S. macroeconomic aggregates growth rates respectively show the usefulness of the model-selection strategy proposed here. The gains in di¤erent measures of forecasting accuracy are substantial, especially for short horizons.

Model selection, estimation and forecasting in VAR models with short-run and long-run restrictions

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We study the joint determination of the lag length, the dimension of the cointegrating space and the rank of the matrix of short-run parameters of a vector autoregressive (VAR) model using model selection criteria. We suggest a new two-step model selection procedure which is a hybrid of traditional criteria and criteria with data-dependant penalties and we prove its consistency. A Monte Carlo study explores the finite sample performance of this procedure and evaluates the forecasting accuracy of models selected by this procedure. Two empirical applications confirm the usefulness of the model selection procedure proposed here for forecasting.

A comparison of statistical methods for genomic selection in a mice population

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

«
1
2
3
4
5
6
7
8
...
61
62
»