2 results for Data Analytics
at Université de Montréal, Canada
Abstract:
People make all kinds of decisions throughout their lives, and some of these decisions affect their demand for transportation: for example, where to live and work, how and when to travel, and which route to take. Transport-related choices are typically time dependent and characterized by a large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply for prediction because dynamic programming problems must be solved in order to compute choice probabilities. In this thesis we show that it is possible to exploit the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful for modeling time-dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models are expensive to estimate, and we deal with this challenge by proposing innovative methods that reduce the estimation cost.
For example, we propose a decomposition method that not only opens up the possibility of mixing, but also speeds up the estimation of simple logit models, which also has implications for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into standard optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
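To illustrate why choice probabilities in a dynamic discrete choice model require solving a dynamic programming problem, the following is a minimal sketch, not the thesis's implementation. It assumes logit (i.i.d. extreme value type I) error terms, so that the value function satisfies V(k) = log Σ_a exp(v(a|k) + V(a)) with V = 0 at the destination. The toy network, the node names and the arc utilities are all invented for illustration.

```python
import math

# Hypothetical toy network (not from the thesis):
# ARCS[k] maps each successor node a to the deterministic utility v(a|k).
ARCS = {
    "A": {"B": -1.0, "C": -1.5},
    "B": {"C": -0.5, "D": -2.0},
    "C": {"D": -1.0},
    "D": {},  # destination, treated as an absorbing state
}
DEST = "D"


def value_function(arcs, dest, tol=1e-12, max_iter=1000):
    """Fixed-point iteration on V(k) = log sum_a exp(v(a|k) + V(a)), V(dest) = 0."""
    V = {k: 0.0 for k in arcs}
    for _ in range(max_iter):
        new_V = {}
        for k, succ in arcs.items():
            if k == dest:
                new_V[k] = 0.0
            else:
                new_V[k] = math.log(sum(math.exp(u + V[a]) for a, u in succ.items()))
        if max(abs(new_V[k] - V[k]) for k in arcs) < tol:
            return new_V
        V = new_V
    raise RuntimeError("value iteration did not converge")


def choice_probabilities(arcs, V, dest):
    """Logit probability of moving from node k to successor a."""
    return {
        k: {a: math.exp(u + V[a] - V[k]) for a, u in succ.items()}
        for k, succ in arcs.items()
        if k != dest
    }


V = value_function(ARCS, DEST)
P = choice_probabilities(ARCS, V, DEST)
```

In this sketch, exp(V["A"]) equals the sum of exp(path utility) over the three paths from A to D. The sequence of link-level logit choices therefore reproduces a static logit model over whole paths without enumerating them, which is the kind of equivalence between dynamic and static models that the abstract alludes to.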
Abstract:
Objectives: An email information literacy program has been in place for over a decade at Université de Montréal’s Health Library. Students periodically receive messages highlighting the content of guides on the library’s website. We wish to evaluate, using Google Analytics, the effects of the program on specific webpage statistics. Using the data collected, we may pinpoint popular guides as well as others that need improvement. Methods: In the program, first and second-year medical (MD) or dental (DMD) students receive eight bi-monthly email messages. The DMD mailing list also includes graduate students and professors. Enrollment in the program is optional for MDs, but mandatory for DMDs. Google Analytics (GA) profiles have been configured for the libraries' websites to collect visitor statistics since June 2009. The GA Links Builder was used to design unique links specifically associated with the originating emails. This approach allowed us to gather information on guide usage, such as the visitor’s program of study, duration of page viewing, number of pages viewed per visit, as well as browsing data. We also followed the evolution of clicks on GA unique links over time, as we believed that users may keep the library's emails and refer to them to access specific information. Results: The proportion of students who actually clicked the email links was, on average, less than 5%. MD and DMD students behaved differently regarding guide views, number of pages visited and length of time on the site. The CINAHL guide was the most visited by DMD students, whereas MD students consulted the Pharmaceutical information guide most often. We noted that some students visited referred guides several weeks after receiving messages, thus keeping them for future reference; browsing to additional pages on the library website was also frequent. Conclusion: The mixed success of the program prompted us to directly survey students on the format, frequency and usefulness of messages. 
The information gathered from GA links as well as from the survey will allow us to redesign our web content and modify our email information literacy program so that messages are more attractive, timely and useful for students.
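The unique links described in the Methods can be sketched as campaign-tagged URLs. The example below is an assumption about how such links might be built, not the library's actual setup: it appends the standard Google Analytics campaign parameters (`utm_source`, `utm_medium`, `utm_campaign`) to a guide URL. The function name `tag_link`, the example URL and the parameter values are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_link(url, source, medium, campaign):
    """Append campaign parameters to a URL, keeping any existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),      # e.g. which mailing list sent the message
        ("utm_medium", medium),      # the channel, here email
        ("utm_campaign", campaign),  # e.g. which message in the series
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical guide URL and labels, for illustration only.
link = tag_link("https://example.org/guides/cinahl", "dmd-list", "email", "message-3")
print(link)
```

Giving each mailing list and each message its own parameter values is what would let visits be attributed to a program of study and to a specific email, including visits made weeks after the message was sent.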