Essays on Machine Learning and Hedonic Models


Author(s): Yang, Miaoyu
Contributor(s)

Bajari, Patrick

Date(s)

22/09/2016

01/08/2016

Abstract

Thesis (Ph.D.)--University of Washington, 2016-08

Chapters 1 and 2: We survey and apply several techniques from the statistical and computer science literature to the problem of demand estimation. We derive novel asymptotic properties for several of these models. To improve out-of-sample prediction accuracy and obtain parametric rates of convergence, we propose a method of combining the underlying models via linear regression. We illustrate our method using a standard scanner panel data set to estimate promotional lift and find that our estimates are considerably more accurate in out-of-sample predictions of demand than some commonly used alternatives. While demand estimation is our motivating application, these methods are widely applicable to other microeconometric problems.

Chapter 3: We collect high-dimensional data and extract features from house descriptions and images to use as controls within a hedonic model to estimate the impact of fracking on house prices in Pennsylvania. Supplementing a structured dataset with high-dimensional unstructured data in the form of descriptive words and images of homes can help to close the gap caused by omitted variable bias. We construct curb appeal scores based on aesthetic features of home images. We then compare four models: OLS, LASSO-OLS, random forest and gradient boosting. The ensemble tree models (random forest and gradient boosting) yield 10% improvements in prediction accuracy compared to LASSO and OLS. Our results imply that royalty payments exactly compensate for the negative environmental effects on homes within 1 km of fracking wells but increase the price of houses farther away by up to 5%.
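
The model-combination step mentioned for the first two chapters (combining underlying models via linear regression) can be illustrated with a minimal sketch. This is not the thesis's implementation: the synthetic data, the choice of base learners (OLS and a k-nearest-neighbour regressor), the 50/50 split, and all function names are assumptions made purely for illustration. Only NumPy is used.

```python
import numpy as np


def fit_ols(x, y):
    """Least-squares coefficients for y ~ intercept + x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_ols(coef, x):
    return coef[0] + coef[1] * x


def predict_knn(x_train, y_train, x_new, k=15):
    """k-nearest-neighbour regression on a single regressor (stand-in base learner)."""
    dist = np.abs(x_new[:, None] - x_train[None, :])
    nearest = np.argsort(dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def mse(y, pred):
    return float(np.mean((y - pred) ** 2))


def stack_demo(seed=0, n=600):
    rng = np.random.default_rng(seed)
    # Hypothetical synthetic data: the outcome is nonlinear in the regressor.
    x = rng.uniform(-2.0, 2.0, n)
    y = 1.0 - 0.8 * x + np.sin(2.0 * x) + rng.normal(0.0, 0.3, n)

    fit, hold = slice(0, n // 2), slice(n // 2, n)

    # Step 1: fit each base model on the first half only.
    coef = fit_ols(x[fit], y[fit])

    # Step 2: predict the held-out half with each base model.
    preds = np.column_stack([
        predict_ols(coef, x[hold]),
        predict_knn(x[fit], y[fit], x[hold]),
    ])

    # Step 3: regress held-out outcomes on the base predictions;
    # the coefficients are the combination weights.
    weights, *_ = np.linalg.lstsq(preds, y[hold], rcond=None)

    return {
        "weights": weights,
        "mse_ols": mse(y[hold], preds[:, 0]),
        "mse_knn": mse(y[hold], preds[:, 1]),
        "mse_stacked": mse(y[hold], preds @ weights),
    }


result = stack_demo()
print(result)
```

In this sketch the stacked error is measured on the same fold used to estimate the weights, so it cannot exceed either base model's error on that fold; an honest out-of-sample comparison would need a third, untouched fold.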

Format

application/pdf

Identifier

Yang_washington_0250E_16367.pdf

http://hdl.handle.net/1773/37084

Language(s)

en_US

Keywords

#Applied Microeconometrics #Demand Estimation #Environmental Economics #Hedonic Models #High Dimensional Data #Machine Learning #Economics

Type

Thesis