965 results for Latent variable models
Abstract:
The most influential theoretical account in time psychophysics assumes the existence of a unitary internal clock based on neural counting. The distinct timing hypothesis, on the other hand, suggests an automatic timing mechanism for processing of durations in the sub-second range and a cognitively controlled timing mechanism for processing of durations in the range of seconds. Although several psychophysical approaches can be applied for identifying the internal structure of interval timing in the second and sub-second range, the existing data provide a puzzling picture of rather inconsistent results. In the present chapter, we introduce confirmatory factor analysis (CFA) to further elucidate the internal structure of interval timing performance in the sub-second and second range. More specifically, we investigated whether CFA would rather support the notion of a unitary timing mechanism or of distinct timing mechanisms underlying interval timing in the sub-second and second range, respectively. The assumption of two distinct timing mechanisms which are completely independent of each other was not supported by our data. The model assuming a unitary timing mechanism underlying interval timing in both the sub-second and second range fitted the empirical data much better. Eventually, we also tested a third model assuming two distinct, but functionally related mechanisms. The correlation between the two latent variables representing the hypothesized timing mechanisms was rather high and comparison of fit indices indicated that the assumption of two associated timing mechanisms described the observed data better than only one latent variable. Models are discussed in the light of the existing psychophysical and neurophysiological data.
Abstract:
Latent variable models represent the probability density of data in a space of several dimensions in terms of a smaller number of latent, or hidden, variables. A familiar example is factor analysis, which is based on a linear transformation between the latent space and the data space. In this paper we introduce a form of non-linear latent variable model called the Generative Topographic Mapping (GTM), for which the parameters of the model can be determined using the EM algorithm. GTM provides a principled alternative to the widely used Self-Organizing Map (SOM) of Kohonen (1982), and overcomes most of the significant limitations of the SOM. We demonstrate the performance of the GTM algorithm on a toy problem and on simulated data from flow diagnostics for a multi-phase oil pipeline.
Abstract:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
Abstract:
Integrated choice and latent variable (ICLV) models represent a promising new class of models which merge classic choice models with the structural equation approach (SEM) for latent variables. Despite their conceptual appeal, applications of ICLV models in marketing remain rare. We extend previous ICLV applications by first estimating a multinomial choice model and, second, by estimating hierarchical relations between latent variables. An empirical study on travel mode choice clearly demonstrates the value of ICLV models to enhance the understanding of choice processes. In addition to the usually studied directly observable variables such as travel time, we show how abstract motivations such as power and hedonism as well as attitudes such as a desire for flexibility impact on travel mode choice. Furthermore, we show that it is possible to estimate such a complex ICLV model with the widely available structural equation modeling package Mplus. This finding is likely to encourage more widespread application of this appealing model class in the marketing field.
Abstract:
This thesis examines the performance of Canadian fixed-income mutual funds in the context of an unobservable market factor that affects mutual fund returns. We use various selection and timing models augmented with univariate and multivariate regime-switching structures. These models assume a joint distribution of an unobservable latent variable and fund returns. The fund sample comprises six Canadian value-weighted portfolios with different investing objectives from 1980 to 2011. These are the Canadian fixed-income funds, the Canadian inflation protected fixed-income funds, the Canadian long-term fixed-income funds, the Canadian money market funds, the Canadian short-term fixed-income funds and the high yield fixed-income funds. We find strong evidence that more than one state variable is necessary to explain the dynamics of the returns on Canadian fixed-income funds. For instance, Canadian fixed-income funds clearly show that there are two regimes that can be identified with a turning point during the mid-eighties. This structural break corresponds to an increase in the Canadian bond index from its low values in the early 1980s to its current high values. Other fixed-income funds results show latent state variables that mimic the behaviour of the general economic activity. Generally, we report that Canadian bond fund alphas are negative. In other words, fund managers do not add value through their selection abilities. We find evidence that Canadian fixed-income fund portfolio managers are successful market timers who shift portfolio weights between risky and riskless financial assets according to expected market conditions. Conversely, Canadian inflation protected funds, Canadian long-term fixed-income funds and Canadian money market funds have no market timing ability. We conclude that these managers generally do not have positive performance by actively managing their portfolios. 
We also report that the Canadian fixed-income fund portfolios perform asymmetrically under different economic regimes. In particular, these portfolio managers demonstrate poorer selection skills during recessions. Finally, we demonstrate that the multivariate regime-switching model is superior to univariate models given the dynamic market conditions and the correlation between fund portfolios.
Abstract:
My thesis consists of three chapters on the estimation of state-space and stochastic volatility models. In the first article, we develop a computationally efficient state smoothing procedure for a linear Gaussian state-space model. We show how to exploit the particular structure of state-space models to draw the latent states efficiently. We analyze the computational efficiency of methods based on the Kalman filter, of the Cholesky factor algorithm, and of our new method, using operation counts and computational experiments. We show that in many important cases our method is more efficient. The gains are particularly large when the dimension of the observed variables is large or when repeated draws of the states are needed for the same parameter values. As an application, we consider a multivariate Poisson model with time-varying intensities, which is used to analyze transaction count data in financial markets. In the second chapter, we propose a new technique for analyzing multivariate stochastic volatility models. The proposed method is based on efficiently drawing the volatility from its conditional density given the parameters and the data. Our methodology applies to models with several types of cross-sectional dependence. We can model time-varying conditional correlation matrices by incorporating factors in the return equation, where the factors are independent stochastic volatility processes. We can incorporate copulas to allow conditional dependence of the returns given the volatility, which permits different Student's t marginal distributions with return-specific degrees of freedom to capture the heterogeneity of returns.
We draw the volatility as a block in the time dimension and one series at a time in the cross-sectional dimension. We apply the method introduced by McCausland (2012) to obtain a good approximation of the conditional posterior distribution of the volatility of one return given the volatilities of the other returns, the parameters, and the dynamic correlations. The model is evaluated using real data for ten exchange rates. We report results for univariate stochastic volatility models and two multivariate models. In the third chapter, we evaluate the information that variations in realized volatility contribute to the estimation and forecasting of volatility when prices are measured with and without error. We use stochastic volatility models. We take the point of view of an investor for whom volatility is an unknown latent variable and realized volatility is a sample quantity that contains information about it. We employ Bayesian Markov chain Monte Carlo methods to estimate the models, which make it possible to formulate not only posterior densities of the volatility but also predictive densities of future volatility. We compare the volatility forecasts, and the forecast hit rates, of models that do and do not use the information contained in realized volatility. This approach differs from existing ones in the empirical literature in that the latter are most often limited to documenting the ability of realized volatility to forecast itself. We present empirical applications using daily returns on indices and exchange rates. The competing models are applied to the second half of 2008, a key period in the recent financial crisis.
Abstract:
The objective of this study was to evaluate the use of probit and logit link functions for the genetic evaluation of early pregnancy using simulated data. The following simulation/analysis structures were constructed: logit/logit, logit/probit, probit/logit, and probit/probit. The percentages of precocious females were 5, 10, 15, 20, 25 and 30% and were adjusted based on a change in the mean of the latent variable. The parametric heritability (h²) was 0.40. Simulation and genetic evaluation were implemented in the R software. Heritability estimates (ĥ²) were compared with h² using the mean squared error. Pearson correlations between predicted and true breeding values, and the percentage of coincidence between true and predicted rankings among the 10% of bulls with the highest breeding values (TOP10), were calculated. The mean ĥ² values were under- and overestimated for all percentages of precocious females when the logit/probit and probit/logit models were used. In addition, the mean squared errors of these models were high when compared with those obtained with the probit/probit and logit/logit models. Considering ĥ², probit/probit and logit/logit were also superior to logit/probit and probit/logit, providing values close to the parametric heritability. Logit/probit and probit/logit presented low Pearson correlations, whereas the correlations obtained with probit/probit and logit/logit ranged from moderate to high. With respect to the TOP10 bulls, logit/probit and probit/logit presented much lower percentages than probit/probit and logit/logit. The genetic parameter estimates and predictions of breeding values of the animals obtained with the logit/logit and probit/probit models were similar. In contrast, the results obtained with probit/logit and logit/probit were not satisfactory. There is a need to compare the estimation and prediction ability of logit and probit link functions.
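The incidence adjustment described in this abstract can be sketched in a few lines. The abstract states that the study was implemented in R; the Python below is only an illustrative sketch, and the function names `latent_mean_for_incidence` and `simulate_binary_trait` are hypothetical. It leaves out genetic effects and heritability entirely and shows only how shifting the mean of the latent variable sets the percentage of precocious females under each link.

```python
import math
import random
from statistics import NormalDist


def latent_mean_for_incidence(incidence, link):
    """Mean shift mu such that P(mu + noise > 0) equals the target incidence."""
    if link == "probit":
        return NormalDist().inv_cdf(incidence)           # standard normal quantile
    if link == "logit":
        return math.log(incidence / (1.0 - incidence))   # standard logistic quantile
    raise ValueError("link must be 'probit' or 'logit'")


def simulate_binary_trait(n, incidence, link, seed=0):
    """Draw n binary records by thresholding a latent variable at zero.

    Assumption (not from the source): the latent variable is mu plus a single
    residual term; no breeding values or other effects are simulated.
    """
    rng = random.Random(seed)
    mu = latent_mean_for_incidence(incidence, link)
    records = []
    for _ in range(n):
        if link == "probit":
            noise = rng.gauss(0.0, 1.0)
        else:
            # inverse-CDF draw from the standard logistic distribution
            u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
            noise = math.log(u / (1.0 - u))
        records.append(1 if mu + noise > 0.0 else 0)
    return records


if __name__ == "__main__":
    # Observed percentage for each target percentage listed in the abstract.
    for pct in (5, 10, 15, 20, 25, 30):
        y = simulate_binary_trait(20000, pct / 100.0, "logit")
        print(pct, round(100.0 * sum(y) / len(y), 1))
```

Simulating with one link and analysing with the other corresponds to the mismatched logit/probit and probit/logit structures in the abstract.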
Abstract:
The aim of the thesis is to formulate a suitable Item Response Theory (IRT) based model to measure HRQoL (as a latent variable) using a mixed-response questionnaire, relaxing the hypothesis of a normally distributed latent variable. The new model is a combination of two models already presented in the literature, namely a latent trait model for mixed responses and an IRT model for a skew-normal latent variable. It is developed in a Bayesian framework; a Markov chain Monte Carlo procedure is used to generate samples from the posterior distribution of the parameters of interest. The proposed model is tested on a questionnaire composed of five discrete items and one continuous item that measures HRQoL in children, the EQ-5D-Y questionnaire. A large sample of children collected in schools was used. Compared with a model for discrete responses only and a model for mixed responses with a normal latent variable, the new model performs better in terms of deviance information criterion (DIC), chain convergence times, and precision of the estimates.
Abstract:
Clustered data analysis is characterized by the need to describe both systematic variation in a mean model and cluster-dependent random variation in an association model. Marginalized multilevel models embrace the robustness and interpretations of a marginal mean model, while retaining the likelihood inference capabilities and flexible dependence structures of a conditional association model. Although there has been increasing recognition of the attractiveness of marginalized multilevel models, there has been a gap in their practical application arising from a lack of readily available estimation procedures. We extend the marginalized multilevel model to allow for nonlinear functions in both the mean and association aspects. We then formulate marginal models through conditional specifications to facilitate estimation with mixed model computational solutions already in place. We illustrate this approach on a cerebrovascular deficiency crossover trial.
Abstract:
The discrete-time Markov chain is commonly used to describe changes of health states for chronic diseases in a longitudinal study. Statistical inferences on comparing treatment effects or on finding determinants of disease progression usually require estimation of transition probabilities. In many situations, when the outcome data have some missing observations or the variable of interest (called a latent variable) cannot be measured directly, the estimation of transition probabilities becomes more complicated. In the latter case, a surrogate variable that is easier to access and can gauge the characteristics of the latent one is usually used for data analysis. This dissertation research proposes methods to analyze longitudinal data (1) that have a categorical outcome with missing observations or (2) that use complete or incomplete surrogate observations to analyze the categorical latent outcome. For (1), different missing mechanisms were considered for empirical studies using methods that include the EM algorithm, Monte Carlo EM, and a procedure that is not a data augmentation method. For (2), the hidden Markov model with the forward-backward procedure was applied for parameter estimation. This method was also extended to cover the computation of standard errors. The proposed methods were demonstrated by the Schizophrenia example. The relevance to public health, the strengths and limitations, and possible future research were also discussed.
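The forward-backward procedure mentioned in this abstract can be illustrated with a short sketch. This is not the dissertation's implementation: the function name `forward_backward` and the treatment of a missing surrogate observation as `None` (contributing an emission factor of 1, which assumes the missingness is uninformative) are assumptions made here for illustration. The latent health states play the role of hidden states and the surrogate values the role of observations.

```python
import math


def forward_backward(init, trans, emit, obs):
    """Scaled forward-backward pass for a discrete hidden Markov model.

    init[i]     : P(state at time 0 = i)
    trans[i][j] : P(next state = j | current state = i)
    emit[i][k]  : P(surrogate value = k | latent state = i)
    obs         : list of observed surrogate indices, or None where missing
    Returns (gamma, loglik): gamma[t][i] = P(state_t = i | all observations).
    """
    n, T = len(init), len(obs)

    def e(i, o):
        # Assumption: a missing observation carries no information.
        return 1.0 if o is None else emit[i][o]

    # Forward pass, normalising at each step to avoid numerical underflow.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            a = [init[i] * e(i, obs[0]) for i in range(n)]
        else:
            a = [
                sum(alpha[t - 1][i] * trans[i][j] for i in range(n)) * e(j, obs[t])
                for j in range(n)
            ]
        c = sum(a)
        scale.append(c)
        alpha.append([x / c for x in a])

    # Backward pass, reusing the forward scaling constants.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(trans[i][j] * e(j, obs[t + 1]) * beta[t + 1][j] for j in range(n))
            / scale[t + 1]
            for i in range(n)
        ]

    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    loglik = sum(math.log(c) for c in scale)
    return gamma, loglik
```

In an EM-style estimation, posteriors such as `gamma` would feed the update of the transition probabilities; that step, and the standard-error computation the abstract mentions, are not shown.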
Abstract:
The relationship between urban structure and mobility has been studied for more than 70 years. The urban environment comprises multiple dimensions, for example urban structure, land uses, and the distribution of various facilities (shops, schools, restaurant areas, parking, etc.). A review of the existing literature in this context reveals differing analyses, methodologies, geographical scales, and dimensions of both mobility and urban structure. The relationship is thus much studied but very complex, and so far there is no consensus on which dimension of the urban environment influences which dimension of mobility, or on the appropriate way to represent this relationship. To answer these research questions, this thesis has the following general objectives: (1) to contribute to a better understanding of the complex relationship between urban structure and mobility, and (2) to understand the role of latent attributes in the relationship between urban environment and mobility. The specific objective of the thesis is to analyze the influence of the urban environment on two dimensions of mobility: number of trips and type of tour. Given the complexity of the relationship between urban environment and mobility, the thesis aims to contribute to a better understanding of it by using three geographical scales for the variables and by analyzing the influence of unobserved effects on mobility. The analysis uses a database made up of three types of data: (1) A mobility survey carried out in 2006 and 2007. A total of 943 surveys were obtained in three neighbourhoods of Madrid: Chamberí, Pozuelo, and Algete. (2) Municipal information from the Instituto Nacional de Estadística, linked to the origins and destinations of the trips recorded in the survey.
(3) Georeferenced information in Arc-GIS on the households participating in the survey: the database contains information on street structure and the location of schools, parking, medical centres, and restaurants. Between-group and within-group correlation was analyzed, and four cases of attributes were modelled under the ordinal logit structure. Self-selection is then evaluated through the joint estimation of the choices of neighbourhood type and number of trips. The choice of neighbourhood type has three alternatives, CBD, Urban, and Suburban, according to the area of residence recorded in the surveys, while the choice of number of trips has four ordinal categories: 0 trips, 1-2 trips, 3-4 trips, and 5 or more trips. Starting from the best specification of the ordinal logit model, a joint mixed-ordinal model was developed. The results indicate that the exogenous variables require an exhaustive correlation analysis in order to avoid biased results. It was found that it is important to measure the attributes of the BE where the trip takes place, but municipal information is also highly explanatory of individual mobility. The perception of destination areas at the municipal level is therefore considered important. In the context of self-selection, it is important to model the decisions jointly. Self-selection exists, since the jointly estimated parameters are significant. However, only certain attributes of the urban environment are equally important for the choice of residential area and for trip frequency. To analyze the propensity to travel, a hybrid model was developed, consisting of a latent variable, an indicator, and a discrete choice model. The latent variable is called “Propensity to Travel”; its indicator in the measurement equation is the number of trips, and the discrete choice is the type of tour.
The choice model has five alternatives, following the hierarchy of activities established in the thesis: HOME, no trips are made during the study day; HWH, a tour whose main activity is work or study, with no intermediate stops; HWHs, the same tour if the individual makes intermediate stops; HOH, a tour whose main activity is other than work or study, with no intermediate stops; and HOHs, where intermediate stops are made. Arriving at the best model specification required substantial work considering different model structures and three types of estimation. In this way, consistent and efficient parameters were obtained. The results show that modelling tours has an advantage over modelling trips, since it overcomes the limitations of space and time by linking the trips made by the same person on the study day. Propensity to travel (PT) exists and is specific to each type of tour. The parameters estimated in the hybrid model were significant and different for each tour-type alternative. Finally, the thesis verifies that hybrid models are an improvement over traditional discrete choice models, yielding consistent and more robust parameters. As regards transport policy, it is shown that urban environment attributes are more important than LOS (Level of Service) in the generation of multi-stage tours. This thesis is the first empirical analysis of the relationship between tour types and propensity to travel. The concept of Propensity to Travel was developed exclusively for the thesis. Likewise, the development of a joint RC-Number of trips model based on three measurement scales is an innovation with respect to the comparison of geographical scales, which had not been done before in the modelling of self-selection.
The relationship between built environment (BE) and travel behaviour (TB) has been studied in a number of cases, using several methods (aggregate and disaggregate approaches) and different focuses (trip frequency, automobile use, vehicle miles travelled, and so on). Travel is generated by the need to undertake activities and obtain services, and there is a general consensus that urban components affect TB. However, research is still needed to better understand which components of travel behaviour are affected most, and by which urban components. To fill this gap, the present dissertation pursued two main objectives: (1) to contribute to a better understanding of the relationship between travel demand and urban environment, and (2) to develop an econometric model for estimating travel demand with urban environment attributes. To this end, the thesis undertook an exhaustive search for and computation of land-use variables in order to find the best representation of BE for modelling trip frequency. In particular, two empirical analyses are carried out: 1. Estimation of three dimensions of travel demand using dimensions of the urban environment. We compare different travel dimensions and geographical scales, and we measure the contribution of self-selection using joint models. 2. Development of a hybrid model, an integrated latent variable and discrete choice model. The implementation of hybrid models is new in the analysis of land use and travel behaviour. BE and TB interact explicitly, which provides rich information about a specific individual decision process. All empirical analyses use a database from a survey conducted in 2006 and 2007 in Madrid. Spatial attributes describing the neighbourhood environment are derived from different data sources: the National Institute of Statistics, INE (administrative units: municipality and district), and GIS (circular units).
INE provides raw data for spatial units such as municipality and district. The construction of census units is trivial, as the census bureau provides tables that readily define districts and municipalities. The construction of circular units requires us to determine the radius and associate the spatial information with our households. The first empirical part analyzes trip frequency by applying an ordered logit model. This part studies the effect of socio-economic, transport, and land-use characteristics on two travel dimensions: trip frequency and type of tour. In particular, land use is defined in terms of types of neighbourhoods and types of dwellers. Three neighbourhood representations are explored, and three ways of constructing neighbourhood attributes are described. In particular, administrative units and a circular-unit representation are examined to represent the neighbourhood. Ordered logit models are applied; while these models are well known, intensive work was carried out to construct the spatial attributes. The second empirical analysis consists of the development of an innovative econometric model that considers a latent variable called “propensity to travel”, with the choice of type of tour as the choice model. The first two specifications of ordinal models help to estimate this latent variable. The latent variable is unobserved, but its manifestations are called “indicators”; the probability of choosing a tour alternative is then conditional on the latent variable. Since the latent variable is unknown, we integrate over its distribution. Four “sets of best variables” are specified, following the specification obtained from the correlation analysis. The results show that the relative importance of SE variables versus BE variables depends on how the BE variables are measured. We found that each of the three spatial scales has its own intangible qualities and drawbacks.
Spatial scale plays an important role in predicting travel demand because of the variability in measures at trip origins and destinations within the same administrative unit (municipality, district, and so on). Larger units produce less variation in the data, but this does not affect certain variables, such as public transport supply, that are more significant at the municipality level. By contrast, land-use measures are more efficient at the district level. Self-selection in this context is weak; thus, the influence of BE attributes is genuine. The results of the hybrid model show that unobserved factors affect the choice of tour complexity. The latent variable used in this model is propensity to travel, which is explained by socioeconomic aspects and neighbourhood attributes. The results show that neighbourhood attributes do have a significant impact on the choice of tour type, both directly and through the propensity to travel. The propensity to travel has a different impact depending on the structure of each tour and increases the probability of choosing more complex tours, such as tours with many intermediate stops. The integration of choice and latent variable models shows that omitting important perceptions and attitudes leads to inconsistent estimates. The results also indicate that goodness of fit improves when the latent variable is added, in both sequential and simultaneous estimation. There are significant differences in sensitivity to the latent variable across alternatives. In general, as expected, the hybrid models show a major improvement in goodness of fit compared with a classical discrete choice model that does not incorporate latent effects. The integrated model allows a more detailed analysis of the behavioural process. In summary, the effect of built environment characteristics on trip frequency is analyzed in depth.
In particular, we tried to better understand how land-use characteristics can be defined and measured, and which of these measures really have an impact on trip frequency. We also tested the superiority of the hybrid choice model (HCM) in this field. We conclude that HCM shows a major improvement in goodness of fit compared with a classical discrete choice model that does not incorporate latent effects. Consequently, the application of HCM shows the importance of the latent variable (LV) in the decision on tour complexity. People are more elastic to built environment attributes than to level of service. Thus, policy should aim to develop more mixed-use areas, combining workplaces with commercial retail.
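The ordered logit structure for trip frequency, and the idea of integrating over an unobserved latent variable, can be sketched as follows. This is a simplified illustration rather than the thesis model: the function names, the threshold values, and the assumption of a standard normal latent variable entering the utility linearly through a hypothetical `loading` are all chosen here for the example, and the integral is approximated by plain Monte Carlo averaging.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(utility, thresholds):
    """Category probabilities of an ordered logit model.

    With K thresholds there are K + 1 ordered categories, e.g. three
    thresholds for 0, 1-2, 3-4, and 5 or more trips.
    """
    if any(a >= b for a, b in zip(thresholds, thresholds[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Cumulative probabilities P(y <= k), closed with 1.0 for the top category.
    cdf = [logistic(t - utility) for t in thresholds] + [1.0]
    probs, prev = [], 0.0
    for c in cdf:
        probs.append(c - prev)
        prev = c
    return probs


def integrated_probs(utility, loading, thresholds, draws=2000, seed=0):
    """Average the category probabilities over draws of the latent variable.

    Assumption: the latent variable is standard normal and shifts the
    utility by loading * z; the average approximates the integral over z.
    """
    rng = random.Random(seed)
    totals = [0.0] * (len(thresholds) + 1)
    for _ in range(draws):
        z = rng.gauss(0.0, 1.0)
        for k, p in enumerate(ordered_logit_probs(utility + loading * z, thresholds)):
            totals[k] += p
    return [s / draws for s in totals]
```

For example, `ordered_logit_probs(0.0, [-1.0, 0.0, 1.0])` returns four probabilities that sum to one, one for each trip-frequency category.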
Abstract:
Independent Components Analysis (ICA) is a Blind Source Separation method that aims to find the pure source signals mixed together in unknown proportions in the observed signals under study. It does this by searching for factors which are mutually statistically independent. It can thus be classified among the latent-variable based methods. As with other methods based on latent variables, a careful investigation has to be carried out to find out which factors are significant and which are not. It is therefore important to have a validation procedure to decide on the optimal number of independent components (ICs) to include in the final model. This can be complicated by the fact that two consecutive models may differ in the order and signs of similarly-indexed ICs. In addition, the structure of the extracted sources can change as a function of the number of factors calculated. Two methods for determining the optimal number of ICs are proposed in this article and applied to simulated and real datasets to demonstrate their performance.
Abstract:
Several international studies have analyzed the acceptability of road pricing schemes by means of an attitude survey in combination with the results of a stated choice experiment, using both a descriptive analysis and a discrete-choice model with binary choice ("accept" or "not accept" the toll). However, the use of hybrid discrete choice models constitutes an innovative alternative for integrating subjective attitudes and perceptions deriving from the survey of attitudes with the more objective variables from the stated choice experiment. This paper analyzes the results of applying these models to measure the acceptability of interurban road pricing among different groups of stakeholders (road freight and passenger operators, highway concessionaires, and associations of private car users) with qualitatively significant opinions on road pricing measures. Our results show that hybrid models are better suited to explaining the acceptability of a road pricing scheme by different groups of stakeholders than a separate analysis of the survey of attitudes and a discrete-choice model applied to a stated choice experiment. A particular finding was that the strong psycho-social latent variable of the perception of fairness explains the rejection or acceptance of a toll scheme by road stakeholders.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06