992 results for Semiparametric estimation
Abstract:
This paper develops a semiparametric estimation approach for mixed count regression models based on series expansion for the unknown density of the unobserved heterogeneity. We use the generalized Laguerre series expansion around a gamma baseline density to model unobserved heterogeneity in a Poisson mixture model. We establish the consistency of the estimator and present a computational strategy to implement the proposed estimation techniques in the standard count model as well as in truncated, censored, and zero-inflated count regression models. Monte Carlo evidence shows that the finite sample behavior of the estimator is quite good. The paper applies the method to a model of individual shopping behavior. © 1999 Elsevier Science S.A. All rights reserved.
Abstract:
This thesis consists of an introductory chapter and four applications, each constituting its own chapter. The common element underlying the chapters is the econometric methodology: the applications rely mostly on leading econometric techniques for estimating causal effects. The first chapter introduces the econometric techniques employed in the remaining chapters. Chapter 2 studies the effects of shocking news on student performance. It exploits the fact that the school shooting in Kauhajoki in 2008 coincided with the matriculation examination period of that fall. It shows that the performance of men declined due to the news of the school shooting, whereas no similar pattern is observed for women. Chapter 3 studies the effects of the minimum wage on employment by applying the changes-in-changes (CIC) estimator to the original Card and Krueger (1994; CK) and Neumark and Wascher (2000; NW) data. Its main result is that the employment effect of an increase in the minimum wage is positive for small fast-food restaurants and negative for big fast-food restaurants. Thus the controversial positive employment effect reported by CK is overturned for big fast-food restaurants, while the NW data, in contrast to their original results, provide support for the positive employment effect. Chapter 4 uses the state-specific U.S. data on traffic fatalities collected by Cohen and Einav (2003; CE) to re-evaluate the effects of seat belt laws on traffic fatalities with the CIC estimator. It confirms the CE results that, on average, implementing a mandatory seat belt law increases the seat belt usage rate and decreases the total fatality rate. In contrast to CE, it also finds evidence for the compensating-behavior theory, observed especially in states on the U.S. border. Chapter 5 studies life cycle consumption in Finland, with special interest in the baby boomers and older households. It shows that the baby boomers smooth their consumption over the life cycle more than other generations. It also shows that, as a result of the recession in the 1990s, old households smoothed their life cycle consumption more than young households.
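The changes-in-changes estimator named in the thesis abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `cic_att`, the simulated data, and the choice of a continuous outcome are all assumptions made for illustration. The idea is to map each treated unit's pre-period outcome through the control group's pre-period CDF and then through the inverse of the control group's post-period CDF, which gives a counterfactual post-period outcome for the treated group.

```python
import numpy as np

def cic_att(y00, y01, y10, y11):
    """Changes-in-changes estimate of the average effect on the treated.

    y00, y01: control group outcomes before and after the policy change.
    y10, y11: treated group outcomes before and after the policy change.
    """
    y00_sorted = np.sort(y00)
    y01_sorted = np.sort(y01)
    n01 = len(y01_sorted)
    # Empirical CDF of control pre-period outcomes, evaluated at treated pre-period outcomes.
    ranks = np.searchsorted(y00_sorted, y10, side="right") / len(y00_sorted)
    # Generalized inverse of the empirical CDF of control post-period outcomes.
    idx = np.clip(np.ceil(ranks * n01).astype(int) - 1, 0, n01 - 1)
    counterfactual = y01_sorted[idx]
    return y11.mean() - counterfactual.mean()

# Hypothetical data: the outcome is 1 + 2*u in the post period for everyone,
# and treatment adds a constant effect of 0.5 for the treated group.
rng = np.random.default_rng(0)
n = 5000
true_effect = 0.5
y00 = rng.uniform(0, 1, n)                     # control, pre
y01 = 1 + 2 * rng.uniform(0, 1, n)             # control, post
y10 = rng.beta(2, 2, n)                        # treated, pre
y11 = 1 + 2 * rng.beta(2, 2, n) + true_effect  # treated, post

att_hat = cic_att(y00, y01, y10, y11)
print(f"CIC estimate of the effect on the treated: {att_hat:.3f}")
```

The simulated treated group has its support inside that of the control group, which the estimator needs in order to map every treated outcome to a control rank. With real data, that support condition would have to be checked.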
Abstract:
We study semiparametric two-step estimators which have the same structure as parametric doubly robust estimators in their second step. The key difference is that we do not impose any parametric restriction on the nuisance functions that are estimated in a first stage, but retain a fully nonparametric model instead. We call these estimators semiparametric doubly robust estimators (SDREs), and show that they possess superior theoretical and practical properties compared to generic semiparametric two-step estimators. In particular, our estimators have substantially smaller first-order bias, allow for a wider range of nonparametric first-stage estimates, rate-optimal choices of smoothing parameters and data-driven estimates thereof, and their stochastic behavior can be well-approximated by classical first-order asymptotics. SDREs exist for a wide range of parameters of interest, particularly in semiparametric missing data and causal inference models. We illustrate our method with a simulation exercise.
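To make the two-step structure concrete, here is a minimal sketch for one familiar case, estimating a mean when the outcome is missing at random. This case, the Nadaraya-Watson first stage, the bandwidth, and the names `nw_regression` and `dr_mean` are assumptions chosen for illustration, not the estimator or tuning proposed in the paper. The second step averages the doubly robust moment m(X) + D(Y - m(X))/p(X), with both nuisance functions estimated nonparametrically.

```python
import numpy as np

def nw_regression(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson regression with a Gaussian kernel (scalar covariate)."""
    diffs = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    return weights @ y_train / weights.sum(axis=1)

def dr_mean(y, d, x, bandwidth=0.3):
    """Doubly robust estimate of E[Y] when Y is observed only where d == 1."""
    observed = d == 1
    # First stage: nonparametric estimates of the two nuisance functions.
    p_hat = nw_regression(x, d.astype(float), x, bandwidth)          # P(D = 1 | X)
    m_hat = nw_regression(x[observed], y[observed], x, bandwidth)    # E[Y | X, D = 1]
    p_hat = np.clip(p_hat, 0.05, 0.95)  # guard against extreme weights (assumed trimming)
    # Second stage: sample average of the doubly robust moment.
    y_obs = np.where(observed, y, 0.0)  # unobserved outcomes never enter the estimate
    return np.mean(m_hat + d * (y_obs - m_hat) / p_hat)

# Hypothetical data with true E[Y] = 1 and missingness depending on x.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
y = 1.0 + x + rng.normal(0, 0.5, n)
d = rng.binomial(1, 1 / (1 + np.exp(-x)))

est = dr_mean(y, d, x)
print(f"doubly robust estimate of E[Y]: {est:.3f}")
```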
Abstract:
This paper presents calculations of semiparametric efficiency bounds for quantile treatment effects parameters when selection to treatment is based on observable characteristics. The paper also presents three estimation procedures for these parameters, all of which have two steps: a nonparametric estimation and a computation of the difference between the solutions of two distinct minimization problems. Root-N consistency, asymptotic normality, and the achievement of the semiparametric efficiency bound are shown for one of the three estimators. In the final part of the paper, an empirical application to a job training program reveals the importance of heterogeneous treatment effects, showing that for this program the effects are concentrated in the upper quantiles of the earnings distribution.
Abstract:
This paper considers a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function. Special cases in our approach include marginal models for longitudinal/clustered data, conditional logistic regression for matched case-control studies, multivariate measurement error models, generalized linear mixed models with a semiparametric component, and many others. We propose profile-kernel and backfitting estimation methods for these problems, derive their asymptotic distributions, and show that in likelihood problems the methods are semiparametric efficient. While generally not true, with our methods profiling and backfitting are asymptotically equivalent. We also consider pseudolikelihood methods where some nuisance parameters are estimated from a different algorithm. The proposed methods are evaluated using simulation studies and applied to the Kenya hemoglobin data.
Abstract:
In recent years, researchers in the health and social sciences have become increasingly interested in mediation analysis. Specifically, upon establishing a non-null total effect of an exposure, investigators routinely wish to make inferences about the direct (indirect) pathway of the effect of the exposure not through (through) a mediator variable that occurs subsequently to the exposure and prior to the outcome. Natural direct and indirect effects are of particular interest as they generally combine to produce the total effect of the exposure and therefore provide insight on the mechanism by which it operates to produce the outcome. A semiparametric theory has recently been proposed to make inferences about marginal mean natural direct and indirect effects in observational studies (Tchetgen Tchetgen and Shpitser, 2011), which delivers multiply robust locally efficient estimators of the marginal direct and indirect effects, and thus generalizes previous results for total effects to the mediation setting. In this paper we extend the new theory to handle a setting in which a parametric model for the natural direct (indirect) effect within levels of pre-exposure variables is specified and the model for the observed data likelihood is otherwise unrestricted. We show that estimation is generally not feasible in this model because of the curse of dimensionality associated with the required estimation of auxiliary conditional densities or expectations, given high-dimensional covariates. We thus consider multiply robust estimation and propose a more general model which assumes a subset but not all of several working models holds.
Semiparametric estimates of the supply and demand effects of disability on labor force participation
Abstract:
This paper modifies and uses the semiparametric methods of Ichimura and Lee (1991) on standard cross-section data to decompose the effect of disability on labor force participation into a demand and a supply effect. It shows that straightforward use of Ichimura and Lee leads to meaningless results while imposing monotonicity on the unknown function leads to substantial results. The paper finds that supply effects dominate the demand effects of disability.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
The paper addresses the choice of bandwidth in semiparametric estimation of the long memory parameter in a univariate time series process. The focus is on the properties of forecasts from the long memory model. A variety of cross-validation methods based on out-of-sample forecasting properties are proposed. These procedures are used for the choice of bandwidth and subsequent model selection. Simulation evidence is presented that demonstrates the advantage of the proposed new methodology.
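The role of the bandwidth can be seen in a small sketch. The abstract does not say which semiparametric estimator is used, so the log-periodogram (Geweke and Porter-Hudak) regression below is an assumed example, and the names `simulate_fractional_noise` and `gph_estimate` are hypothetical. The bandwidth m is the number of low Fourier frequencies used in the regression, and the estimate of the long memory parameter d changes with it.

```python
import numpy as np

def simulate_fractional_noise(n, d, rng, burn=1000):
    """Approximate ARFIMA(0, d, 0) series via a truncated MA(infinity) filter."""
    k = np.arange(1, n + burn)
    # MA coefficients of (1 - L)^(-d): psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k.
    psi = np.concatenate(([1.0], np.cumprod((k - 1 + d) / k)))
    eps = rng.normal(size=2 * (n + burn))
    x = np.convolve(eps, psi, mode="valid")
    return x[-n:]

def gph_estimate(x, m):
    """Log-periodogram regression estimate of d using the first m Fourier frequencies."""
    n = len(x)
    freqs = 2 * np.pi * np.arange(1, m + 1) / n
    dft = np.fft.fft(x - x.mean())[1 : m + 1]
    periodogram = np.abs(dft) ** 2 / (2 * np.pi * n)
    # Near frequency zero, log f(lambda) = c - 2 d log(2 sin(lambda / 2)).
    regressor = -2 * np.log(2 * np.sin(freqs / 2))
    slope, _intercept = np.polyfit(regressor, np.log(periodogram), 1)
    return slope

rng = np.random.default_rng(0)
true_d = 0.3
series = simulate_fractional_noise(4096, true_d, rng)

# The estimate depends on the bandwidth m, which is the choice the paper studies.
estimates = {m: gph_estimate(series, m) for m in (32, 64, 128, 223)}
for m, d_hat in estimates.items():
    print(f"m = {m:4d}  d_hat = {d_hat:.3f}")
```

A small m uses only the frequencies closest to zero and gives a noisy estimate. A large m lowers the variance but admits short-memory contamination when the process has short-run dynamics.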
Abstract:
In many clinical trials to evaluate treatment efficacy, it is believed that there may exist latent treatment effectiveness lag times after which a medical procedure or chemical compound would be in full effect. In this article, semiparametric regression models are proposed and studied to estimate the treatment effect accounting for such latent lag times. The new models take advantage of the invariance property of the additive hazards model in marginalizing over random effects, so the model parameters are easy to estimate and interpret, while the flexibility of leaving the baseline hazard function unspecified is retained. Monte Carlo simulation studies demonstrate the appropriateness of the proposed semiparametric estimation procedure. Data collected in an actual randomized clinical trial, which evaluates the effectiveness of biodegradable carmustine polymers for treatment of recurrent brain tumors, are analyzed.
Abstract:
We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk of melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov Chain Monte Carlo method for covariate imputation of missing data was used and the actual implementation of the Bayesian model was based on Gibbs sampling using the freeware package BUGS. In addition, we used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.
Abstract:
Given the persistence of regional differences in labor income in Colombia, this article sets out to quantify the portion of this differential attributable to differences in labor market structure, understood as differences in the returns to the characteristics of the labor force. To this end, it uses an Oaxaca-Blinder type decomposition method and compares Bogotá, the city with the highest labor income, with other major cities. The results of the decomposition exercise show that the differences in structure favor Bogotá and explain more than half of the total difference, indicating that reducing labor income disparities between cities requires more than upgrading the skills of the labor force: it is also necessary to investigate why the returns to characteristics differ between cities.
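A minimal sketch of an Oaxaca-Blinder type decomposition follows. It assumes the two-fold version with the comparison city's coefficients as the reference, which the abstract does not specify, and it uses simulated data with hypothetical names (`oaxaca_blinder`, `make_city`). The mean income gap splits into an endowments part, due to differences in average characteristics, and a structure part, due to differences in returns.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients; X must include a column of ones."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def oaxaca_blinder(X_a, y_a, X_b, y_b):
    """Two-fold decomposition of mean(y_a) - mean(y_b), with group b as reference."""
    beta_a, beta_b = ols(X_a, y_a), ols(X_b, y_b)
    xbar_a, xbar_b = X_a.mean(axis=0), X_b.mean(axis=0)
    endowments = (xbar_a - xbar_b) @ beta_b  # differences in characteristics
    structure = xbar_a @ (beta_a - beta_b)   # differences in returns
    return endowments, structure

# Hypothetical log-income data for two cities; group a has higher returns to schooling.
rng = np.random.default_rng(0)

def make_city(n, mean_school, return_school):
    school = rng.normal(mean_school, 2.0, n)
    log_income = 1.0 + return_school * school + rng.normal(0, 0.3, n)
    X = np.column_stack([np.ones(n), school])
    return X, log_income

X_a, y_a = make_city(3000, mean_school=11.0, return_school=0.10)
X_b, y_b = make_city(3000, mean_school=10.0, return_school=0.07)

endowments, structure = oaxaca_blinder(X_a, y_a, X_b, y_b)
gap = y_a.mean() - y_b.mean()
print(f"gap = {gap:.3f}, endowments = {endowments:.3f}, structure = {structure:.3f}")
print(f"share explained by structure = {structure / gap:.2f}")
```

Because each regression includes an intercept, the two components add up exactly to the raw gap in means. The split between them does depend on which group's coefficients serve as the reference.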
Abstract:
We consider estimating the total load from frequent flow data but less frequent concentration data. There are numerous load estimation methods available, some of which are captured in various online tools. However, most estimators are subject to large statistical biases, and their associated uncertainties are often not reported. This makes interpretation difficult and makes it impossible to assess trends or determine optimal sampling regimes. In this paper, we first propose two indices for measuring the extent of sampling bias, and then provide steps for obtaining reliable load estimates that minimize the biases and make use of informative predictive variables. The key step in this approach is the development of an appropriate predictive model for concentration. This is achieved using a generalized rating-curve approach with additional predictors that capture unique features in the flow data, such as the concept of the first flush, the location of the event on the hydrograph (e.g. rise or fall) and the discounted flow. The latter may be thought of as a measure of constituent exhaustion occurring during flood events. Incorporating this additional information can significantly improve the predictability of concentration, and ultimately the precision with which the pollutant load is estimated. We also provide a measure of the standard error of the load estimate which incorporates model, spatial and/or temporal errors. This method also has the capacity to incorporate measurement error incurred through the sampling of flow. We illustrate this approach for two rivers delivering to the Great Barrier Reef, Queensland, Australia. One data set is from the Burdekin River and consists of total suspended sediment (TSS), nitrogen oxide (NO(x)) and gauged flow for 1997. The other data set is from the Tully River, for the period of July 2000 to June 2008. For NO(x) Burdekin, the new estimates are very similar to the ratio estimates even when there is no relationship between the concentration and the flow. However, for the Tully data set, by incorporating the additional predictive variables, namely the discounted flow and flow phases (rising or recessing), we substantially improved the model fit, and thus the certainty with which the load is estimated.
Abstract:
This paper provides a root-n consistent, asymptotically normal weighted least squares estimator of the coefficients in a truncated regression model. The distribution of the errors is unknown and permits general forms of unknown heteroskedasticity. Also provided is an instrumental variables based two-stage least squares estimator for this model, which can be used when some regressors are endogenous, mismeasured, or otherwise correlated with the errors. A simulation study indicates that the new estimators perform well in finite samples. Our limiting distribution theory includes a new asymptotic trimming result addressing the boundary bias in first-stage density estimation without knowledge of the support boundary. © 2007 Cambridge University Press.