996 results for model misspecification


Relevance:

100.00%

Publisher:

Abstract:

Birds represent the most diverse extant tetrapod clade, with ca. 10,000 extant species, and the timing of the crown avian radiation remains hotly debated. The fossil record supports a primarily Cenozoic radiation of crown birds, whereas molecular divergence dating analyses generally imply that this radiation was well underway during the Cretaceous. Furthermore, substantial differences have been noted between published divergence estimates. These have been variously attributed to clock model, calibration regime, and gene type. One underappreciated phenomenon is that disparity between fossil ages and molecular dates tends to be proportionally greater for shallower nodes in the avian Tree of Life. Here, we explore potential drivers of disparity in avian divergence dates through a set of analyses applying various calibration strategies and coding methods to a mitochondrial genome dataset and an 18-gene nuclear dataset, both sampled across 72 taxa. Our analyses support the occurrence of two deep divergences (i.e., the Palaeognathae/Neognathae split and the Galloanserae/Neoaves split) well within the Cretaceous, followed by a rapid radiation of Neoaves near the K-Pg boundary. However, 95% highest posterior density intervals for most basal divergences in Neoaves cross the boundary, and we emphasize that, barring unreasonably strict prior distributions, distinguishing between a rapid Early Paleocene radiation and a Late Cretaceous radiation may be beyond the resolving power of currently favored divergence dating methods. In contrast to recent observations for placental mammals, constraining all divergences within Neoaves to occur in the Cenozoic does not result in unreasonably high inferred substitution rates. 
Comparisons of nuclear DNA (nDNA) versus mitochondrial DNA (mtDNA) datasets and NT- versus RY-coded mitochondrial data reveal patterns of disparity that are consistent with substitution model misspecifications that result in tree compression/tree extension artifacts, which may explain some discordance between previous divergence estimates based on different sequence types. Comparisons of fully calibrated and nominally calibrated trees support a correlation between body mass and apparent dating error. Overall, our results are consistent with (but do not require) a Paleogene radiation for most major clades of crown birds.

Relevance:

100.00%

Publisher:

Abstract:

This paper investigates the performance of the tests proposed by Hadri and by Hadri and Larsson for testing for stationarity in heterogeneous panel data under model misspecification. The panel tests are based on the well-known KPSS test (cf. Kwiatkowski et al.), which considers two models: stationarity around a deterministic level and stationarity around a deterministic trend. There is no study, as far as we know, on the statistical properties of the test when the wrong model is used. We also consider the case of the simultaneous presence of the two types of models in a panel. We employ two asymptotics: joint asymptotics, with T, N -> infinity simultaneously, and T fixed with N allowed to grow indefinitely. We use Monte Carlo experiments to investigate the effects of misspecification in sample sizes usually used in practice. The results indicate that assuming T is fixed rather than asymptotic leads to tests with smaller size distortions than the tests derived under the joint asymptotics, particularly for panels with relatively small T and large N (micro-panels). We also find that choosing a deterministic trend when a deterministic level is true does not significantly affect the properties of the test. However, choosing a deterministic level when a deterministic trend is true leads to extreme over-rejections. Therefore, when unsure about which model has generated the data, it is suggested to use the model with a trend. We also propose a new statistic for testing for stationarity in mixed panel data where the mixture is known. The performance of this new test is very good for both cases of T asymptotic and T fixed. The statistic for T asymptotic is slightly undersized when T is very small (

Relevance:

100.00%

Publisher:

Abstract:

Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models, with the higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate that overall misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model. Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
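
The correlation idea in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the "correct" specification is a hypothetical quadratic term on an evenly spaced grid, and the two misspecified forms are a plain linear term and a two-category split at 0.5. The sketch only computes the Pearson correlation between the correct and the misspecified terms; it does not reproduce the NHANES analysis or the power simulations.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical exposure values on an even grid over [0, 1] (assumption).
x = [i / 100 for i in range(101)]

# Assumed "correct" specification: the outcome depends on x squared.
true_term = [v ** 2 for v in x]

# Two misspecified forms of the same predictor.
linear_term = x
categorical_term = [0.0 if v < 0.5 else 1.0 for v in x]

r_linear = pearson(true_term, linear_term)
r_categorical = pearson(true_term, categorical_term)

print(f"correlation with linear term:      {r_linear:.3f}")
print(f"correlation with categorical term: {r_categorical:.3f}")
```

In this toy setting the linear term tracks the quadratic term more closely than the two-category split does. Which form correlates best with the truth depends entirely on the assumed true relationship.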

Relevance:

70.00%

Publisher:

Abstract:

This paper is concerned with using the bootstrap to obtain improved critical values for the error correction model (ECM) cointegration test in dynamic models. In the paper we investigate the effects of dynamic specification on the size and power of the ECM cointegration test with bootstrap critical values. The results from a Monte Carlo study show that the size of the bootstrap ECM cointegration test is close to the nominal significance level. We find that overspecification of the lag length results in a loss of power. Underspecification of the lag length results in size distortion. The performance of the bootstrap ECM cointegration test deteriorates if the correct lag length is not used in the ECM. The bootstrap ECM cointegration test is therefore not robust to model misspecification.

Relevance:

60.00%

Publisher:

Abstract:

For over half a century, it has been known that the rate of morphological evolution appears to vary with the time frame of measurement. Rates of microevolutionary change, measured between successive generations, were found to be far higher than rates of macroevolutionary change inferred from the fossil record. More recently, it has been suggested that rates of molecular evolution are also time dependent, with the estimated rate depending on the timescale of measurement. This followed surprising observations that estimates of mutation rates, obtained in studies of pedigrees and laboratory mutation-accumulation lines, exceeded long-term substitution rates by an order of magnitude or more. Although a range of studies have provided evidence for such a pattern, the hypothesis remains relatively contentious. Furthermore, there is ongoing discussion about the factors that can cause molecular rate estimates to be dependent on time. Here we present an overview of our current understanding of time-dependent rates. We provide a summary of the evidence for time-dependent rates in animals, bacteria and viruses. We review the various biological and methodological factors that can cause rates to be time dependent, including the effects of natural selection, calibration errors, model misspecification and other artefacts. We also describe the challenges in calibrating estimates of molecular rates, particularly on the intermediate timescales that are critical for an accurate characterization of time-dependent rates. This has important consequences for the use of molecular-clock methods to estimate timescales of recent evolutionary events.

Relevance:

60.00%

Publisher:

Abstract:

Despite recent methodological advances in inferring the time-scale of biological evolution from molecular data, the fundamental question of whether our substitution models are sufficiently well specified to accurately estimate branch-lengths has received little attention. I examine this implicit assumption of all molecular dating methods, on a vertebrate mitochondrial protein-coding dataset. Comparison with analyses in which the data are RY-coded (AG → R; CT → Y) suggests that even rates-across-sites maximum likelihood greatly under-compensates for multiple substitutions among the standard (ACGT) NT-coded data, which has been subject to greater phylogenetic signal erosion. Accordingly, the fossil record indicates that branch-lengths inferred from the NT-coded data translate into divergence time overestimates when calibrated from deeper in the tree. Intriguingly, RY-coding led to the opposite result. The underlying NT and RY substitution model misspecifications likely relate respectively to “hidden” rate heterogeneity and changes in substitution processes across the tree, for which I provide simulated examples. Given the magnitude of the inferred molecular dating errors, branch-length estimation biases may partly explain current conflicts with some palaeontological dating estimates.
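
The RY-coding mentioned in this abstract (AG → R; CT → Y) can be sketched as a simple recoding function. The function name `ry_code` and the choice to pass gaps and ambiguity symbols through unchanged are assumptions made for illustration; the abstract does not say how such symbols were handled.

```python
# Purines (A, G) become R; pyrimidines (C, T) become Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def ry_code(sequence: str) -> str:
    """Recode an NT-coded sequence into RY-coding.

    Symbols outside A, C, G, T (gaps, N, other ambiguity codes) are
    left unchanged; this handling is an assumption of the sketch.
    """
    return "".join(RY_MAP.get(base, base) for base in sequence.upper())


example = ry_code("ACGTTGCA-N")
print(example)
```

After recoding, transitions (A↔G, C↔T) no longer appear as differences between sequences, so only transversions contribute to the inferred branch lengths.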

Relevance:

60.00%

Publisher:

Abstract:

Statistical analyses of health program participation seek to address a number of objectives compatible with the evaluation of demand for current resources. In this spirit, a spatial hierarchical model is developed for disentangling patterns in participation at the small area level, as a function of population-based demand and additional variation. For the former, a constrained gravity model is proposed to quantify factors associated with spatial choice and account for competition effects, for programs delivered by multiple clinics. The implications of gravity model misspecification within a mixed effects framework are also explored. The proposed model is applied to participation data from a no-fee mammography program in Brisbane, Australia. Attention is paid to the interpretation of various model outputs and their relevance for public health policy.

Relevance:

60.00%

Publisher:

Abstract:

Recreational fisheries in the waters off the northeast U.S. target a variety of pelagic and demersal fish species, and catch and effort data sampled from recreational fisheries are a critical component of the information used in resource evaluation and management. Standardized indices of stock abundance developed from recreational fishery catch rates are routinely used in stock assessments. The statistical properties of both simulated and empirical recreational fishery catch-rate data such as those collected by the National Marine Fisheries Service (NMFS) Marine Recreational Fishery Statistics Survey (MRFSS) are examined, and the potential effects of different assumptions about the error structure of the catch-rate frequency distributions in computing indices of stock abundance are evaluated. Recreational fishery catch distributions sampled by the MRFSS are highly contagious and overdispersed in relation to the normal distribution and are generally best characterized by the Poisson or negative binomial distributions. The modeling of both the simulated and empirical MRFSS catch rates indicates that one may draw erroneous conclusions about stock trends by assuming the wrong error distribution in procedures used to develop standardized indices of stock abundance. The results demonstrate the importance of considering not only the overall model fit and significance of classification effects, but also the possible effects of model misspecification, when determining the most appropriate model construction.
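
A quick way to see why the choice of error distribution matters for count data like these is to compare the sample variance with the sample mean. The sketch below is an illustration under stated assumptions, not the procedure used in the study: the catch counts are invented, and the negative binomial size parameter is a rough method-of-moments estimate from the relation variance = mean + mean²/k.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Summarise how far count data depart from the Poisson assumption.

    A Poisson variable has variance equal to its mean, so a
    variance-to-mean ratio well above 1 indicates overdispersion.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    ratio = v / m if m > 0 else None
    # Method-of-moments negative binomial size k, defined only when v > m.
    nb_size = m * m / (v - m) if v > m else None
    return {"mean": m, "variance": v, "ratio": ratio, "nb_size": nb_size}


# Hypothetical catch-per-trip counts (invented for illustration):
# many zeros and a few large catches.
catches = [0, 0, 0, 1, 0, 2, 0, 0, 7, 0, 1, 0, 12, 0, 0, 3]
summary = dispersion_summary(catches)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For these invented counts the variance is several times the mean, the pattern under which a Poisson error assumption would understate the variability and a negative binomial form is the more natural candidate.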

Relevance:

60.00%

Publisher:

Abstract:

© 2014, The International Biometric Society. A potential venue to improve healthcare efficiency is to effectively tailor individualized treatment strategies by incorporating patient level predictor information such as environmental exposure, biological, and genetic marker measurements. Many useful statistical methods for deriving individualized treatment rules (ITR) have become available in recent years. Prior to adopting any ITR in clinical practice, it is crucial to evaluate its value in improving patient outcomes. Existing methods for quantifying such values mainly consider either a single marker or semi-parametric methods that are subject to bias under model misspecification. In this article, we consider a general setting with multiple markers and propose a two-step robust method to derive ITRs and evaluate their values. We also propose procedures for comparing different ITRs, which can be used to quantify the incremental value of new markers in improving treatment selection. While working models are used in step I to approximate optimal ITRs, we add a layer of calibration to guard against model misspecification and further assess the value of the ITR non-parametrically, which ensures the validity of the inference. To account for the sampling variability of the estimated rules and their corresponding values, we propose a resampling procedure to provide valid confidence intervals for the value functions as well as for the incremental value of new markers for treatment selection. Our proposals are examined through extensive simulation studies and illustrated with the data from a clinical trial that studies the effects of two drug combinations on HIV-1 infected patients.

Relevance:

60.00%

Publisher:

Abstract:

Tests for business cycle asymmetries are developed for Markov-switching autoregressive models. The tests of deepness, steepness, and sharpness are Wald statistics, which have standard asymptotics. For the standard two-regime model of expansions and contractions, deepness is shown to imply sharpness (and vice versa), whereas the process is always nonsteep. Two- and three-state models of U.S. GNP growth are used to illustrate the approach, along with models of U.S. investment and consumption growth. The robustness of the tests to model misspecification, and the effects of regime-dependent heteroscedasticity, are investigated.

Relevance:

60.00%

Publisher:

Abstract:

The concept of the stochastic discount factor pervades the modern theory of asset pricing. Initially, such an object allows otherwise unconnected pricing models to be discussed in the same terms. However, Hansen and Jagannathan have shown that there is valuable information to be drawn from this powerful concept, which underlies asset pricing models. From security market data sets, one is able to explore the behavior of this random variable and determine a useful variance bound. Furthermore, through that instrument, they explore one pitfall of modern asset pricing: model misspecification. Those major contributions, along with some of their extensions, are thoroughly investigated in this exposition.
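
For orientation, the variance bound referred to in this abstract is usually stated as follows. The notation is an assumption, since the abstract introduces no symbols: $m$ is a stochastic discount factor and $R^{e}$ is any excess return it prices, so that $E[m R^{e}] = 0$.

```latex
\frac{\sigma(m)}{E[m]} \;\ge\; \frac{\left| E[R^{e}] \right|}{\sigma(R^{e})}
```

The right-hand side is the Sharpe ratio of the excess return, so the highest Sharpe ratio attainable in the market data gives a lower limit on the volatility of any admissible discount factor.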

Relevance:

60.00%

Publisher:

Abstract:

The question of the crowding-out of private investment by public expenditure, public investment in particular, in the Brazilian economy has been discussed more in ideological terms than on empirical grounds. The present paper tries to avoid the limitation of previous studies by estimating an equation for private investment which makes it possible to evaluate the effect of economic policies on private investment. The private investment equation was derived by modifying the optimal flexible accelerator model (OFAM) to incorporate some channels through which public expenditure influences private investment. The OFAM consists of adding adjustment costs to the neoclassical theory of investment. The investment function derived is quite general and has the following explanatory variables: relative prices (user cost of capital/input prices ratios), real interest rates, real product, public expenditures and lagged private stock of capital. The model was estimated for private manufacturing industry data. The procedure adopted in estimating the model was to begin with a model as general as possible, apply restrictions to the model's parameters, and test their statistical significance. Complete diagnostic testing was also carried out in order to test the stability of the estimated equations. This procedure avoids the shortcomings of estimating a model with a priori restrictions on its parameters, which may lead to model misspecification. The main findings of the present study were: the increase in public expenditure, at least in the long run, has in general a positive expectation effect on private investment greater than its crowding-out effect on private investment owing to the simultaneous rise in interest rates; a change in economic policy, such as that of the Geisel administration, may have an important effect on private investment; and relative prices are relevant in determining the level of the desired stock of capital and private investment.

Relevance:

60.00%

Publisher:

Abstract:

Contracts paying a guaranteed minimum rate of return and a fraction of a positive excess rate, which is specified relative to a benchmark portfolio, are closely related to unit-linked life-insurance products and can be considered as alternatives to direct investment in the underlying benchmark. They contain an embedded power option, and the key issue is the tractable and realistic hedging of this option, in order to rigorously justify valuation by arbitrage arguments and prevent the guarantees from becoming uncontrollable liabilities to the issuer. We show how to determine the contract parameters conservatively and implement robust risk-management strategies.

Relevance:

60.00%

Publisher:

Abstract:

My dissertation has three chapters which develop and apply microeconometric techniques to empirically relevant problems. All the chapters examine robustness issues (e.g., measurement error and model misspecification) in econometric analysis. The first chapter studies the identifying power of an instrumental variable in the nonparametric heterogeneous treatment effect framework when a binary treatment variable is mismeasured and endogenous. I characterize the sharp identified set for the local average treatment effect under the following two assumptions: (1) the exclusion restriction of an instrument and (2) deterministic monotonicity of the true treatment variable in the instrument. The identification strategy allows for general measurement error. Notably, (i) the measurement error is nonclassical, (ii) it can be endogenous, and (iii) no assumptions are imposed on the marginal distribution of the measurement error, so that I do not need to assume the accuracy of the measurement. Based on the partial identification result, I provide a consistent confidence interval for the local average treatment effect with uniformly valid size control. I also show that the identification strategy can incorporate repeated measurements to narrow the identified set, even if the repeated measurements themselves are endogenous. Using the National Longitudinal Study of the High School Class of 1972, I demonstrate that my new methodology can produce nontrivial bounds for the return to college attendance when attendance is mismeasured and endogenous.

The second chapter, which is part of a coauthored project with Federico Bugni, considers the problem of inference in dynamic discrete choice problems when the structural model is locally misspecified. We consider two popular classes of estimators for dynamic discrete choice models: K-step maximum likelihood estimators (K-ML) and K-step minimum distance estimators (K-MD), where K denotes the number of policy iterations employed in the estimation problem. These estimator classes include popular estimators such as Rust (1987)’s nested fixed point estimator, Hotz and Miller (1993)’s conditional choice probability estimator, Aguirregabiria and Mira (2002)’s nested algorithm estimator, and Pesendorfer and Schmidt-Dengler (2008)’s least squares estimator. We derive and compare the asymptotic distributions of K-ML and K-MD estimators when the model is arbitrarily locally misspecified and we obtain three main results. In the absence of misspecification, Aguirregabiria and Mira (2002) show that all K-ML estimators are asymptotically equivalent regardless of the choice of K. Our first result shows that this finding extends to a locally misspecified model, regardless of the degree of local misspecification. As a second result, we show that an analogous result holds for all K-MD estimators, i.e., all K-MD estimators are asymptotically equivalent regardless of the choice of K. Our third and final result compares K-MD and K-ML estimators in terms of asymptotic mean squared error. Under local misspecification, the optimally weighted K-MD estimator depends on the unknown asymptotic bias and is no longer feasible. In turn, feasible K-MD estimators could have an asymptotic mean squared error that is higher or lower than that of the K-ML estimators. To demonstrate the relevance of our asymptotic analysis, we illustrate our findings in a simulation exercise based on a misspecified version of Rust (1987)’s bus engine problem.

The last chapter investigates the causal effect of the Omnibus Budget Reconciliation Act of 1993, which caused the biggest change to the EITC in its history, on unemployment and labor force participation among single mothers. Unemployment and labor force participation are difficult to define for a few reasons, for example, because of marginally attached workers. Instead of searching for a unique definition for each of these two concepts, this chapter bounds unemployment and labor force participation by observable variables and, as a result, considers various competing definitions of these two concepts simultaneously. This bounding strategy leads to partial identification of the treatment effect. The inference results depend on the construction of the bounds, but they imply a positive effect on labor force participation and a negligible effect on unemployment. The results imply that the difference-in-difference result based on the BLS definition of unemployment can be misleading due to misclassification of unemployment.