2 resultados para linear mixed-effects models
em DRUM (Digital Repository at the University of Maryland)
Resumo:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedure to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models explained. Numerical approximation methods based on the Genz algortithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability. A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Eventually, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that takes into account the survey design. All methods are rigorously validated on both real and simulated data.
Resumo:
Causal inference with a continuous treatment is a relatively under-explored problem. In this dissertation, we adopt the potential outcomes framework. Potential outcomes are responses that would be seen for a unit under all possible treatments. In an observational study where the treatment is continuous, the potential outcomes are an uncountably infinite set indexed by treatment dose. We parameterize this unobservable set as a linear combination of a finite number of basis functions whose coefficients vary across units. This leads to new techniques for estimating the population average dose-response function (ADRF). Some techniques require a model for the treatment assignment given covariates, some require a model for predicting the potential outcomes from covariates, and some require both. We develop these techniques using a framework of estimating functions, compare them to existing methods for continuous treatments, and simulate their performance in a population where the ADRF is linear and the models for the treatment and/or outcomes may be misspecified. We also extend the comparisons to a data set of lottery winners in Massachusetts. Next, we describe the methods and functions in the R package causaldrf using data from the National Medical Expenditure Survey (NMES) and Infant Health and Development Program (IHDP) as examples. Additionally, we analyze the National Growth and Health Study (NGHS) data set and deal with the issue of missing data. Lastly, we discuss future research goals and possible extensions.