86 resultados para Classical measurement error model

em Université de Lausanne, Switzerland


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Zero correlation between measurement error and model error has been assumed in existing panel data models dealing specifically with measurement error. We extend this literature and propose a simple model where one regressor is mismeasured, allowing the measurement error to correlate with model error. Zero correlation between measurement error and model error is a special case in our model where correlated measurement error equals zero. We ask two research questions. First, we wonder if the correlated measurement error can be identified in the context of panel data. Second, we wonder if classical instrumental variables in panel data need to be adjusted when correlation between measurement error and model error cannot be ignored. Under some regularity conditions the answer is yes to both questions. We then propose a two-step estimation corresponding to the two questions. The first step estimates correlated measurement error from a reverse regression; and the second step estimates usual coefficients of interest using adjusted instruments.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Summary points: - The bias introduced by random measurement error will be different depending on whether the error is in an exposure variable (risk factor) or outcome variable (disease) - Random measurement error in an exposure variable will bias the estimates of regression slope coefficients towards the null - Random measurement error in an outcome variable will instead increase the standard error of the estimates and widen the corresponding confidence intervals, making results less likely to be statistically significant - Increasing sample size will help minimise the impact of measurement error in an outcome variable but will only make estimates more precisely wrong when the error is in an exposure variable

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The evolution of continuous traits is the central component of comparative analyses in phylogenetics, and the comparison of alternative models of trait evolution has greatly improved our understanding of the mechanisms driving phenotypic differentiation. Several factors influence the comparison of models, and we explore the effects of random errors in trait measurement on the accuracy of model selection. We simulate trait data under a Brownian motion model (BM) and introduce different magnitudes of random measurement error. We then evaluate the resulting statistical support for this model against two alternative models: Ornstein-Uhlenbeck (OU) and accelerating/decelerating rates (ACDC). Our analyses show that even small measurement errors (10%) consistently bias model selection towards erroneous rejection of BM in favour of more parameter-rich models (most frequently the OU model). Fortunately, methods that explicitly incorporate measurement errors in phylogenetic analyses considerably improve the accuracy of model selection. Our results call for caution in interpreting the results of model selection in comparative analyses, especially when complex models garner only modest additional support. Importantly, as measurement errors occur in most trait data sets, we suggest that estimation of measurement errors should always be performed during comparative analysis to reduce chances of misidentification of evolutionary processes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When researchers introduce a new test they have to demonstrate that it is valid, using unbiased designs and suitable statistical procedures. In this article we use Monte Carlo analyses to highlight how incorrect statistical procedures (i.e., stepwise regression, extreme scores analyses) or ignoring regression assumptions (e.g., heteroscedasticity) contribute to wrong validity estimates. Beyond these demonstrations, and as an example, we re-examined the results reported by Warwick, Nettelbeck, and Ward (2010) concerning the validity of the Ability Emotional Intelligence Measure (AEIM). Warwick et al. used the wrong statistical procedures to conclude that the AEIM was incrementally valid beyond intelligence and personality traits in predicting various outcomes. In our re-analysis, we found that the reliability-corrected multiple correlation of their measures with personality and intelligence was up to .69. Using robust statistical procedures and appropriate controls, we also found that the AEIM did not predict incremental variance in GPA, stress, loneliness, or well-being, demonstrating the importance for testing validity instead of looking for it.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Current measures of ability emotional intelligence (EI)--including the well-known Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT)--suffer from several limitations, including low discriminant validity and questionable construct and incremental validity. We show that the MSCEIT is largely predicted by personality dimensions, general intelligence, and demographics having multiple R's with the MSCEIT branches up to .66; for the general EI factor this relation was even stronger (Multiple R = .76). As concerns the factor structure of the MSCEIT, we found support for four first-order factors, which had differential relations with personality, but no support for a higher-order global EI factor. We discuss implications for employing the MSCEIT, including (a) using the single branches scores rather than the total score, (b) always controlling for personality and general intelligence to ensure unbiased parameter estimates in the EI factors, and (c) correcting for measurement error. Failure to account for these methodological aspects may severely compromise predictive validity testing. We also discuss avenues for the improvement of ability-based tests.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Notre consommation en eau souterraine, en particulier comme eau potable ou pour l'irrigation, a considérablement augmenté au cours des années. De nombreux problèmes font alors leur apparition, allant de la prospection de nouvelles ressources à la remédiation des aquifères pollués. Indépendamment du problème hydrogéologique considéré, le principal défi reste la caractérisation des propriétés du sous-sol. Une approche stochastique est alors nécessaire afin de représenter cette incertitude en considérant de multiples scénarios géologiques et en générant un grand nombre de réalisations géostatistiques. Nous rencontrons alors la principale limitation de ces approches qui est le coût de calcul dû à la simulation des processus d'écoulements complexes pour chacune de ces réalisations. Dans la première partie de la thèse, ce problème est investigué dans le contexte de propagation de l'incertitude, oú un ensemble de réalisations est identifié comme représentant les propriétés du sous-sol. Afin de propager cette incertitude à la quantité d'intérêt tout en limitant le coût de calcul, les méthodes actuelles font appel à des modèles d'écoulement approximés. Cela permet l'identification d'un sous-ensemble de réalisations représentant la variabilité de l'ensemble initial. Le modèle complexe d'écoulement est alors évalué uniquement pour ce sousensemble, et, sur la base de ces réponses complexes, l'inférence est faite. Notre objectif est d'améliorer la performance de cette approche en utilisant toute l'information à disposition. Pour cela, le sous-ensemble de réponses approximées et exactes est utilisé afin de construire un modèle d'erreur, qui sert ensuite à corriger le reste des réponses approximées et prédire la réponse du modèle complexe. Cette méthode permet de maximiser l'utilisation de l'information à disposition sans augmentation perceptible du temps de calcul. La propagation de l'incertitude est alors plus précise et plus robuste. La stratégie explorée dans le premier chapitre consiste à apprendre d'un sous-ensemble de réalisations la relation entre les modèles d'écoulement approximé et complexe. Dans la seconde partie de la thèse, cette méthodologie est formalisée mathématiquement en introduisant un modèle de régression entre les réponses fonctionnelles. Comme ce problème est mal posé, il est nécessaire d'en réduire la dimensionnalité. Dans cette optique, l'innovation du travail présenté provient de l'utilisation de l'analyse en composantes principales fonctionnelles (ACPF), qui non seulement effectue la réduction de dimensionnalités tout en maximisant l'information retenue, mais permet aussi de diagnostiquer la qualité du modèle d'erreur dans cet espace fonctionnel. La méthodologie proposée est appliquée à un problème de pollution par une phase liquide nonaqueuse et les résultats obtenus montrent que le modèle d'erreur permet une forte réduction du temps de calcul tout en estimant correctement l'incertitude. De plus, pour chaque réponse approximée, une prédiction de la réponse complexe est fournie par le modèle d'erreur. Le concept de modèle d'erreur fonctionnel est donc pertinent pour la propagation de l'incertitude, mais aussi pour les problèmes d'inférence bayésienne. Les méthodes de Monte Carlo par chaîne de Markov (MCMC) sont les algorithmes les plus communément utilisés afin de générer des réalisations géostatistiques en accord avec les observations. Cependant, ces méthodes souffrent d'un taux d'acceptation très bas pour les problèmes de grande dimensionnalité, résultant en un grand nombre de simulations d'écoulement gaspillées. Une approche en deux temps, le "MCMC en deux étapes", a été introduite afin d'éviter les simulations du modèle complexe inutiles par une évaluation préliminaire de la réalisation. Dans la troisième partie de la thèse, le modèle d'écoulement approximé couplé à un modèle d'erreur sert d'évaluation préliminaire pour le "MCMC en deux étapes". Nous démontrons une augmentation du taux d'acceptation par un facteur de 1.5 à 3 en comparaison avec une implémentation classique de MCMC. Une question reste sans réponse : comment choisir la taille de l'ensemble d'entrainement et comment identifier les réalisations permettant d'optimiser la construction du modèle d'erreur. Cela requiert une stratégie itérative afin que, à chaque nouvelle simulation d'écoulement, le modèle d'erreur soit amélioré en incorporant les nouvelles informations. Ceci est développé dans la quatrième partie de la thèse, oú cette méthodologie est appliquée à un problème d'intrusion saline dans un aquifère côtier. -- Our consumption of groundwater, in particular as drinking water and for irrigation, has considerably increased over the years and groundwater is becoming an increasingly scarce and endangered resource. Nofadays, we are facing many problems ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knofledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of realizations. The main limitation of this approach is the computational cost associated with performing complex of simulations in each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knofledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the of response of each realization. Due to computational constraints, state-of-the-art methods make use of approximate of simulation, to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy of model is then run for this subset based on which inference is made. Our objective is to increase the performance of this approach by using all of the available information and not solely the subset of exact responses. Two error models are proposed to correct the approximate responses follofing a machine learning approach. For the subset identified by a classical approach (here the distance kernel method) both the approximate and the exact responses are knofn. This information is used to construct an error model and correct the ensemble of approximate responses to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational costs and leads to an increase in accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter consists in learning from a subset of realizations the relationship between proxy and exact curves. In the second part of this thesis, the strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allofs a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem by a non-aqueous phase-liquid. The error model allofs a strong reduction of the computational cost while providing a good estimate of the uncertainty. The individual correction of the proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of functional error model is useful not only in the context of uncertainty propagation, but also, and maybe even more so, to perform Bayesian inference. Monte Carlo Markov Chain (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. Hofever, this approach suffers from lof acceptance rate in high dimensional problems, resulting in a large number of wasted of simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulation of the exact of thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy is coupled to an error model to provide an approximate response for the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor three with respect to one-stage MCMC results. An open question remains: hof do we choose the size of the learning set and identify the realizations to optimize the construction of the error model. This requires devising an iterative strategy to construct the error model, such that, as new of simulations are performed, the error model is iteratively improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply this methodology to a problem of saline intrusion in a coastal aquifer.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Social scientists often estimate models from correlational data, where the independent variable has not been exogenously manipulated; they also make implicit or explicit causal claims based on these models. When can these claims be made? We answer this question by first discussing design and estimation conditions under which model estimates can be interpreted, using the randomized experiment as the gold standard. We show how endogeneity--which includes omitted variables, omitted selection, simultaneity, common methods bias, and measurement error--renders estimates causally uninterpretable. Second, we present methods that allow researchers to test causal claims in situations where randomization is not possible or when causal interpretation is confounded, including fixed-effects panel, sample selection, instrumental variable, regression discontinuity, and difference-in-differences models. Third, we take stock of the methodological rigor with which causal claims are being made in a social sciences discipline by reviewing a representative sample of 110 articles on leadership published in the previous 10 years in top-tier journals. Our key finding is that researchers fail to address at least 66 % and up to 90 % of design and estimation conditions that make causal claims invalid. We conclude by offering 10 suggestions on how to improve non-experimental research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

According to the most widely accepted Cattell-Horn-Carroll (CHC) model of intelligence measurement, each subtest score of the Wechsler Intelligence Scale for Adults (3rd ed.; WAIS-III) should reflect both 1st- and 2nd-order factors (i.e., 4 or 5 broad abilities and 1 general factor). To disentangle the contribution of each factor, we applied a Schmid-Leiman orthogonalization transformation (SLT) to the standardization data published in the French technical manual for the WAIS-III. Results showed that the general factor accounted for 63% of the common variance and that the specific contributions of the 1st-order factors were weak (4.7%-15.9%). We also addressed this issue by using confirmatory factor analysis. Results indicated that the bifactor model (with 1st-order group and general factors) better fit the data than did the traditional higher order structure. Models based on the CHC framework were also tested. Results indicated that a higher order CHC model showed a better fit than did the classical 4-factor model; however, the WAIS bifactor structure was the most adequate. We recommend that users do not discount the Full Scale IQ when interpreting the index scores of the WAIS-III because the general factor accounts for the bulk of the common variance in the French WAIS-III. The 4 index scores cannot be considered to reflect only broad ability because they include a strong contribution of the general factor.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Analyzing the relationship between the baseline value and subsequent change of a continuous variable is a frequent matter of inquiry in cohort studies. These analyses are surprisingly complex, particularly if only two waves of data are available. It is unclear for non-biostatisticians where the complexity of this analysis lies and which statistical method is adequate.With the help of simulated longitudinal data of body mass index in children,we review statistical methods for the analysis of the association between the baseline value and subsequent change, assuming linear growth with time. Key issues in such analyses are mathematical coupling, measurement error, variability of change between individuals, and regression to the mean. Ideally, it is better to rely on multiple repeated measurements at different times and a linear random effects model is a standard approach if more than two waves of data are available. If only two waves of data are available, our simulations show that Blomqvist's method - which consists in adjusting for measurement error variance the estimated regression coefficient of observed change on baseline value - provides accurate estimates. The adequacy of the methods to assess the relationship between the baseline value and subsequent change depends on the number of data waves, the availability of information on measurement error, and the variability of change between individuals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Social scientists often estimate models from correlational data, where the independent variable has not been exogenously manipulated; they also make implicit or explicit causal claims based on these models. When can these claims be made? We answer this question by first discussing design and estimation conditions under which model estimates can be interpreted, using the randomized experiment as the gold standard. We show how endogeneity--which includes omitted variables, omitted selection, simultaneity, common methods bias, and measurement error--renders estimates causally uninterpretable. Second, we present methods that allow researchers to test causal claims in situations where randomization is not possible or when causal interpretation is confounded, including fixed-effects panel, sample selection, instrumental variable, regression discontinuity, and difference-in-differences models. Third, we take stock of the methodological rigor with which causal claims are being made in a social sciences discipline by reviewing a representative sample of 110 articles on leadership published in the previous 10 years in top-tier journals. Our key finding is that researchers fail to address at least 66 % and up to 90 % of design and estimation conditions that make causal claims invalid. We conclude by offering 10 suggestions on how to improve non-experimental research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In many practical applications the state of field soils is monitored by recording the evolution of temperature and soil moisture at discrete depths. We theoretically investigate the systematic errors that arise when mass and energy balances are computed directly from these measurements. We show that, even with no measurement or model errors, large residuals might result when finite difference approximations are used to compute fluxes and storage term. To calculate the limits set by the use of spatially discrete measurements on the accuracy of balance closure, we derive an analytical solution to estimate the residual on the basis of the two key parameters: the penetration depth and the distance between the measurements. When the thickness of the control layer for which the balance is computed is comparable to the penetration depth of the forcing (which depends on the thermal diffusivity and on the forcing period) large residuals arise. The residual is also very sensitive to the distance between the measurements, which requires accurately controlling the position of the sensors in field experiments. We also demonstrate that, for the same experimental setup, mass residuals are sensitively larger than the energy residuals due to the nonlinearity of the moisture transport equation. Our analysis suggests that a careful assessment of the systematic mass error introduced by the use of spatially discrete data is required before using fluxes and residuals computed directly from field measurements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In groundwater applications, Monte Carlo methods are employed to model the uncertainty on geological parameters. However, their brute-force application becomes computationally prohibitive for highly detailed geological descriptions, complex physical processes, and a large number of realizations. The Distance Kernel Method (DKM) overcomes this issue by clustering the realizations in a multidimensional space based on the flow responses obtained by means of an approximate (computationally cheaper) model; then, the uncertainty is estimated from the exact responses that are computed only for one representative realization per cluster (the medoid). Usually, DKM is employed to decrease the size of the sample of realizations that are considered to estimate the uncertainty. We propose to use the information from the approximate responses for uncertainty quantification. The subset of exact solutions provided by DKM is then employed to construct an error model and correct the potential bias of the approximate model. Two error models are devised that both employ the difference between approximate and exact medoid solutions, but differ in the way medoid errors are interpolated to correct the whole set of realizations. The Local Error Model rests upon the clustering defined by DKM and can be seen as a natural way to account for intra-cluster variability; the Global Error Model employs a linear interpolation of all medoid errors regardless of the cluster to which the single realization belongs. These error models are evaluated for an idealized pollution problem in which the uncertainty of the breakthrough curve needs to be estimated. For this numerical test case, we demonstrate that the error models improve the uncertainty quantification provided by the DKM algorithm and are effective in correcting the bias of the estimate computed solely from the MsFV results. The framework presented here is not specific to the methods considered and can be applied to other combinations of approximate models and techniques to select a subset of realizations

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of the present article is to take stock of a recent exchange in Organizational Research Methods between critics (Rönkkö & Evermann, 2013) and proponents (Henseler et al., 2014) of partial least squares path modeling (PLS-PM). The two target articles were centered around six principal issues, namely whether PLS-PM: (1) can be truly characterized as a technique for structural equation modeling (SEM); (2) is able to correct for measurement error; (3) can be used to validate measurement models; (4) accommodates small sample sizes; (5) is able to provide null hypothesis tests for path coefficients; and (6) can be employed in an exploratory, model-building fashion. We summarize and elaborate further on the key arguments underlying the exchange, drawing from the broader methodological and statistical literature in order to offer additional thoughts concerning the utility of PLS-PM and ways in which the technique might be improved. We conclude with recommendations as to whether and how PLS-PM serves as a viable contender to SEM approaches for estimating and evaluating theoretical models.