956 resultados para Multinomial logit models with random coefficients (RCL)
Resumo:
Com este trabalho objetivou-se determinar parâmetros genéticos para peso corporal de perdizes em cativeiro. Foram utilizados modelos de regressão aleatória na análise dos dados considerando os efeitos genéticos aditivos diretos (AD) e de ambiente permanente de animal (AP) como aleatórios. As variâncias residuais foram modeladas utilizando-se funções de variância de ordem 5. A curva média da população foi ajustada por polinômios ortogonais de Legendre de ordem 6. Os efeitos genéticos aditivos diretos e de ambiente permanente de animal foram modelados utilizando-se polinômios de Legendre de segunda a nona ordem. Os melhores resultados foram obtidos pelos modelos de ordem 6 de ajuste para os efeitos genéticos aditivos diretos e de ordem 3 para os de ambiente permanente pelo Critério de Informação de Akaike e ordem 3 para ambos os efeitos pelos Critério de Informação Bayesiano de Schwartz e Teste de Razão de Verossimilhança. As herdabilidades estimadas variaram de 0,02 a 0,57. O primeiro autovalor respondeu por 94 e 90% da variação decorrente de efeitos aditivos diretos e de ambiente permanente, respectivamente. A seleção de perdizes para peso corporal é mais efetiva a partir de 112 dias de idade.
Resumo:
Random regression models have been widely used to estimate genetic parameters that influence milk production in Bos taurus breeds, and more recently in B. indicus breeds. With the aim of finding appropriate random regression model to analyze milk yield, different parametric functions were compared, applied to 20,524 test-day milk yield records of 2816 first-lactation Guzerat (B. indicus) cows in Brazilian herds. The records were analyzed by random regression models whose random effects were additive genetic, permanent environmental and residual, and whose fixed effects were contemporary group, the covariable cow age at calving (linear and quadratic effects), and the herd lactation curve. The additive genetic and permanent environmental effects were modeled by the Wilmink function, a modified Wilmink function (with the second term divided by 100), a function that combined third-order Legendre polynomials with the last term of the Wilmink function, and the Ali and Schaeffer function. The residual variances were modeled by means of 1, 4, 6, or 10 heterogeneous classes, with the exception of the last term of the Wilmink function, for which there were 1, from 0.20 to 0.33. Genetic correlations between adjacent records were high values (0.83-0.99), but they declined when the interval between the test-day records increased, and were negative between the first and last records. The model employing the Ali and Schaeffer function with six residual variance classes was the most suitable for fitting the data. © FUNPEC-RP.
Resumo:
The objective of this research was to estimate (co) variance functions and genetic parameters for body weight in Colombian buffalo populations using random regression models with Legendre polynomials. Data consisted of 34,738 weight records from birth to 900 days of age from 7815 buffaloes. Fixed effects in the model were contemporary group and parity order of the mother. Random effects were direct and maternal additive genetic, as well as animal and maternal permanent environmental effects. A cubic orthogonal Legendre polynomial was used to model the mean curve of the population. Eleven models with first to sixth order polynomials were used to describe additive genetic direct and maternal effects, and animal and maternal permanent environmental effects. The residual was modeled considering five variance classes. The best model included fourth and sixth order polynomials for direct additive genetic and animal permanent environmental effects, respectively, and third-order polynomials for maternal genetic and maternal permanent environmental effects. The direct heritability increased from birth until 120 days of age (0.32 +/- 0.05), decreasing thereafter until one year of age (0.18 +/- 0.04) and increased again, reaching 0.39 +/- 0.09, at the end of the evaluated period. The highest maternal heritability estimates (0.11 +/- 0.05), were obtained for weights around weaning age (weaning age range is between 8 and 9.5 months). Maternal genetic and maternal permanent environmental variances increased from birth until about one year of age, decreasing at later ages. Direct genetic correlations ranged from moderate (0.60 +/- 0.060) to high (0.99 +/- 0.001), maternal genetic correlations showed a similar range (0.41 +/- 0.401 and 0.99 +/- 0.003), and all of them decreased as time between weighings increased. Direct genetic correlations suggested that selecting buffalos for heavier weights at any age would increase weights from birth through 900 days of age. However, higher heritabilities for direct genetic weights effects after 600 days of age suggested that selection for these effects would be more effective if done during this age period. A greater response to selection for maternal ability would be expected if selection used maternal genetic predictions for weights near weaning. (C) 2013 Elsevier B.V. All rights reserved.
Resumo:
Most superdiffusive Non-Markovian random walk models assume that correlations are maintained at all time scales, e. g., fractional Brownian motion, Levy walks, the Elephant walk and Alzheimer walk models. In the latter two models the random walker can always "remember" the initial times near t = 0. Assuming jump size distributions with finite variance, the question naturally arises: is superdiffusion possible if the walker is unable to recall the initial times? We give a conclusive answer to this general question, by studying a non-Markovian model in which the walker's memory of the past is weighted by a Gaussian centered at time t/2, at which time the walker had one half the present age, and with a standard deviation sigma t which grows linearly as the walker ages. For large widths we find that the model behaves similarly to the Elephant model, but for small widths this Gaussian memory profile model behaves like the Alzheimer walk model. We also report that the phenomenon of amnestically induced persistence, known to occur in the Alzheimer walk model, arises in the Gaussian memory profile model. We conclude that memory of the initial times is not a necessary condition for generating (log-periodic) superdiffusion. We show that the phenomenon of amnestically induced persistence extends to the case of a Gaussian memory profile.
Resumo:
Generalized linear mixed models with semiparametric random effects are useful in a wide variety of Bayesian applications. When the random effects arise from a mixture of Dirichlet process (MDP) model, normal base measures and Gibbs sampling procedures based on the Pólya urn scheme are often used to simulate posterior draws. These algorithms are applicable in the conjugate case when (for a normal base measure) the likelihood is normal. In the non-conjugate case, the algorithms proposed by MacEachern and Müller (1998) and Neal (2000) are often applied to generate posterior samples. Some common problems associated with simulation algorithms for non-conjugate MDP models include convergence and mixing difficulties. This paper proposes an algorithm based on the Pólya urn scheme that extends the Gibbs sampling algorithms to non-conjugate models with normal base measures and exponential family likelihoods. The algorithm proceeds by making Laplace approximations to the likelihood function, thereby reducing the procedure to that of conjugate normal MDP models. To ensure the validity of the stationary distribution in the non-conjugate case, the proposals are accepted or rejected by a Metropolis-Hastings step. In the special case where the data are normally distributed, the algorithm is identical to the Gibbs sampler.
Resumo:
Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments on efficient algorithms to implement these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of "nuisance" confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples and their validity for cases with practical sample sizes is demonstrated via a simulation study.
Resumo:
There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazard model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models where regression coefficients have population average interpretation and the spatial dependence of survival times is conveniently modeled using the transformed variables by flexible normal random fields. We study the relationship of the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibited due to the high dimensional intractable integration of the likelihood function and the infinite dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators, and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Ashma Study and its performance is evaluated using simulations.
Resumo:
Many destination marketing organizations in the United States and elsewhere are facing budget retrenchment for tourism marketing, especially for advertising. This study evaluates a three-stage model using Random Coefficient Logit (RCL) approach which controls for correlations between different non-independent alternatives and considers heterogeneity within individual’s responses to advertising. The results of this study indicate that the proposed RCL model results in a significantly better fit as compared to traditional logit models, and indicates that tourism advertising significantly influences tourist decisions with several variables (age, income, distance and Internet access) moderating these decisions differently depending on decision stage and product type. These findings suggest that this approach provides a better foundation for assessing, and in turn, designing more effective advertising campaigns.
Resumo:
Many variables that are of interest in social science research are nominal variables with two or more categories, such as employment status, occupation, political preference, or self-reported health status. With longitudinal survey data it is possible to analyse the transitions of individuals between different employment states or occupations (for example). In the statistical literature, models for analysing categorical dependent variables with repeated observations belong to the family of models known as generalized linear mixed models (GLMMs). The specific GLMM for a dependent variable with three or more categories is the multinomial logit random effects model. For these models, the marginal distribution of the response does not have a closed form solution and hence numerical integration must be used to obtain maximum likelihood estimates for the model parameters. Techniques for implementing the numerical integration are available but are computationally intensive requiring a large amount of computer processing time that increases with the number of clusters (or individuals) in the data and are not always readily accessible to the practitioner in standard software. For the purposes of analysing categorical response data from a longitudinal social survey, there is clearly a need to evaluate the existing procedures for estimating multinomial logit random effects model in terms of accuracy, efficiency and computing time. The computational time will have significant implications as to the preferred approach by researchers. In this paper we evaluate statistical software procedures that utilise adaptive Gaussian quadrature and MCMC methods, with specific application to modeling employment status of women using a GLMM, over three waves of the HILDA survey.
Resumo:
In order to generate sales promotion response predictions, marketing analysts estimate demand models using either disaggregated (consumer-level) or aggregated (store-level) scanner data. Comparison of predictions from these demand models is complicated by the fact that models may accommodate different forms of consumer heterogeneity depending on the level of data aggregation. This study shows via simulation that demand models with various heterogeneity specifications do not produce more accurate sales response predictions than a homogeneous demand model applied to store-level data, with one major exception: a random coefficients model designed to capture within-store heterogeneity using store-level data produced significantly more accurate sales response predictions (as well as better fit) compared to other model specifications. An empirical application to the paper towel product category adds additional insights. This article has supplementary material online.
Resumo:
This paper compares the UK/US exchange rate forecasting performance of linear and nonlinear models based on monetary fundamentals, to a random walk (RW) model. Structural breaks are identified and taken into account. The exchange rate forecasting framework is also used for assessing the relative merits of the official Simple Sum and the weighted Divisia measures of money. Overall, there are four main findings. First, the majority of the models with fundamentals are able to beat the RW model in forecasting the UK/US exchange rate. Second, the most accurate forecasts of the UK/US exchange rate are obtained with a nonlinear model. Third, taking into account structural breaks reveals that the Divisia aggregate performs better than its Simple Sum counterpart. Finally, Divisia-based models provide more accurate forecasts than Simple Sum-based models provided they are constructed within a nonlinear framework.
Resumo:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size, despite huge increases in the value of n typically seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
Latent class models for the joint distribution of multivariate categorical, such as the PARAFAC decomposition, data play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov Chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov Chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
Resumo:
People go through their life making all kinds of decisions, and some of these decisions affect their demand for transportation, for example, their choices of where to live and where to work, how and when to travel and which route to take. Transport related choices are typically time dependent and characterized by large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to explore the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that allow to reduce the estimation cost. For example, we propose a decomposition method that not only opens up for possibility of mixing, but also speeds up the estimation for simple logit models, which has implications also for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
Resumo:
People go through their life making all kinds of decisions, and some of these decisions affect their demand for transportation, for example, their choices of where to live and where to work, how and when to travel and which route to take. Transport related choices are typically time dependent and characterized by large number of alternatives that can be spatially correlated. This thesis deals with models that can be used to analyze and predict discrete choices in large-scale networks. The proposed models and methods are highly relevant for, but not limited to, transport applications. We model decisions as sequences of choices within the dynamic discrete choice framework, also known as parametric Markov decision processes. Such models are known to be difficult to estimate and to apply to make predictions because dynamic programming problems need to be solved in order to compute choice probabilities. In this thesis we show that it is possible to explore the network structure and the flexibility of dynamic programming so that the dynamic discrete choice modeling approach is not only useful to model time dependent choices, but also makes it easier to model large-scale static choices. The thesis consists of seven articles containing a number of models and methods for estimating, applying and testing large-scale discrete choice models. In the following we group the contributions under three themes: route choice modeling, large-scale multivariate extreme value (MEV) model estimation and nonlinear optimization algorithms. Five articles are related to route choice modeling. We propose different dynamic discrete choice models that allow paths to be correlated based on the MEV and mixed logit models. The resulting route choice models become expensive to estimate and we deal with this challenge by proposing innovative methods that allow to reduce the estimation cost. For example, we propose a decomposition method that not only opens up for possibility of mixing, but also speeds up the estimation for simple logit models, which has implications also for traffic simulation. Moreover, we compare the utility maximization and regret minimization decision rules, and we propose a misspecification test for logit-based route choice models. The second theme is related to the estimation of static discrete choice models with large choice sets. We establish that a class of MEV models can be reformulated as dynamic discrete choice models on the networks of correlation structures. These dynamic models can then be estimated quickly using dynamic programming techniques and an efficient nonlinear optimization algorithm. Finally, the third theme focuses on structured quasi-Newton techniques for estimating discrete choice models by maximum likelihood. We examine and adapt switching methods that can be easily integrated into usual optimization algorithms (line search and trust region) to accelerate the estimation process. The proposed dynamic discrete choice models and estimation methods can be used in various discrete choice applications. In the area of big data analytics, models that can deal with large choice sets and sequential choices are important. Our research can therefore be of interest in various demand analysis applications (predictive analytics) or can be integrated with optimization models (prescriptive analytics). Furthermore, our studies indicate the potential of dynamic programming techniques in this context, even for static models, which opens up a variety of future research directions.
Resumo:
Objectives: The aim of this study was to determine the insulin-delivery system and the attributes of insulin therapy that best meet patients` preferences, and to estimate patients` willingness-to-pay (WTP) for them. Methods: This was a cross-sectional discrete choice experiment (DCE) study involving 378 Canadian patients with type 1 or type 2 diabetes. Patients were asked to choose between two hypothetical insulin treatment options made up of different combinations of the attribute levels. Regression coefficients derived using conditional logit models were used to calculate patients` WTP. Stratification of the sample was performed to evaluate WTP by predefined subgroups. Results: A total of 274 patients successfully completed the survey. Overall, patients were willing to pay the most for better blood glucose control followed by weight gain. Surprisingly, route of insulin administration was the least important attribute overall. Segmented models indicated that insulin naive diabetics were willing to pay significantly more for both oral and inhaled short-acting insulin compared with insulin users. Surprisingly, type 1 diabetics were willing to pay $C11.53 for subcutaneous short-acting insulin, while type 2 diabetics were willing to pay $C47.23 to avoid subcutaneous short-acting insulin (p < .05). These findings support the hypothesis of a psychological barrier to initiating insulin therapy, but once that this barrier has been overcome, they accommodate and accept injectable therapy as a treatment option. Conclusions: By understanding and addressing patients` preferences for insulin therapy, diabetes educators can use this information to find an optimal treatment approach for each individual patient, which may ultimately lead to improved control, through improved compliance, and better diabetes outcomes.