166 results for Instrumental variable regression

at Université de Lausanne, Switzerland


Relevance:

100.00%

Abstract:

Social scientists often estimate models from correlational data, where the independent variable has not been exogenously manipulated; they also make implicit or explicit causal claims based on these models. When can these claims be made? We answer this question by first discussing design and estimation conditions under which model estimates can be interpreted, using the randomized experiment as the gold standard. We show how endogeneity (which includes omitted variables, omitted selection, simultaneity, common methods bias, and measurement error) renders estimates causally uninterpretable. Second, we present methods that allow researchers to test causal claims in situations where randomization is not possible or when causal interpretation is confounded, including fixed-effects panel, sample selection, instrumental variable, regression discontinuity, and difference-in-differences models. Third, we take stock of the methodological rigor with which causal claims are being made in a social sciences discipline by reviewing a representative sample of 110 articles on leadership published in the previous 10 years in top-tier journals. Our key finding is that researchers fail to address at least 66% and up to 90% of the design and estimation conditions that make causal claims invalid. We conclude by offering 10 suggestions on how to improve non-experimental research.
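
As background for the instrumental variable approach named in this abstract, here is a minimal two-stage least squares (2SLS) sketch on simulated data. All names, effect sizes, and the statsmodels-based implementation are illustrative assumptions, not the authors' code.

```python
# Minimal 2SLS sketch (illustrative, simulated data): an unobserved
# confounder u makes the regressor x endogenous, so OLS is biased;
# instrumenting x with the exogenous z recovers the true effect.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5_000
z = rng.normal(size=n)                   # instrument, exogenous by construction
u = rng.normal(size=n)                   # unobserved confounder (endogeneity)
x = 0.8 * z + u + rng.normal(size=n)     # endogenous regressor
y = 1.0 * x + u + rng.normal(size=n)     # true causal effect of x on y is 1.0

ols = sm.OLS(y, sm.add_constant(x)).fit()                   # biased upward
x_hat = sm.OLS(x, sm.add_constant(z)).fit().fittedvalues    # stage 1
iv = sm.OLS(y, sm.add_constant(x_hat)).fit()                # stage 2
# (Note: dedicated 2SLS software also corrects the stage-2 standard errors.)
print(f"OLS: {ols.params[1]:.2f}, 2SLS: {iv.params[1]:.2f} (true 1.0)")
```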

Relevance:

90.00%

Abstract:

Existing panel data models that deal with measurement error assume zero correlation between measurement error and model error. We extend this literature and propose a simple model in which one regressor is mismeasured and the measurement error is allowed to correlate with the model error; the zero-correlation assumption is the special case in which this correlated measurement error equals zero. We ask two research questions. First, can the correlated measurement error be identified in a panel data context? Second, do classical instrumental variables in panel data need to be adjusted when the correlation between measurement error and model error cannot be ignored? Under some regularity conditions, the answer to both questions is yes. We then propose a two-step estimation procedure corresponding to the two questions: the first step estimates the correlated measurement error from a reverse regression, and the second step estimates the usual coefficients of interest using adjusted instruments.
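
The paper's two-step estimator is not reproduced here; as background for the special case it generalizes (measurement error uncorrelated with model error), the sketch below shows classical attenuation and the textbook fix of instrumenting a mismeasured regressor with a second, independent measurement. All names and values are illustrative.

```python
# Classical errors-in-variables sketch (the zero-correlation special case):
# OLS on a mismeasured regressor is attenuated; a second, independent
# measurement of the same regressor works as an instrument.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
x_true = rng.normal(size=n)
y = 2.0 * x_true + rng.normal(size=n)     # true slope is 2.0
x_obs = x_true + rng.normal(size=n)       # noisy measurement (reliability 0.5)
w = x_true + rng.normal(size=n)           # repeat measurement, independent error

c = np.cov(x_obs, y)
beta_ols = c[0, 1] / c[0, 0]                             # ~1.0, attenuated
beta_iv = np.cov(w, y)[0, 1] / np.cov(w, x_obs)[0, 1]    # ~2.0, consistent
print(f"attenuated OLS: {beta_ols:.2f}, IV via repeat measure: {beta_iv:.2f}")
```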

Relevance:

90.00%

Abstract:

BACKGROUND: Obesity has been shown to be associated with depression, and it has been suggested that higher body mass index (BMI) increases the risk of depression and other common mental disorders. However, the causal relationship remains unclear, and Mendelian randomisation, a form of instrumental variable analysis, has recently been employed to attempt to resolve this issue. AIMS: To investigate whether higher BMI increases the risk of major depression. METHOD: Two instrumental variable analyses were conducted to test the causal relationship between obesity and major depression in RADIANT, a large case-control study of major depression. We used a single nucleotide polymorphism (SNP) in FTO and a genetic risk score (GRS) based on 32 SNPs with well-established associations with BMI. RESULTS: Linear regression analysis, as expected, showed that individuals carrying more risk alleles of FTO or having a higher GRS had a higher BMI. Probit regression suggested that higher BMI is associated with increased risk of major depression. However, our two instrumental variable analyses did not support a causal relationship between higher BMI and major depression (FTO genotype: coefficient -0.03, 95% CI -0.18 to 0.13, P = 0.73; GRS: coefficient -0.02, 95% CI -0.11 to 0.07, P = 0.62). CONCLUSIONS: Our instrumental variable analyses did not support a causal relationship between higher BMI and major depression. The positive associations of higher BMI with major depression in the probit regression analyses might be explained by reverse causality and/or residual confounding.
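
A hedged sketch of the study's general design: a genetic risk score instruments BMI, in a simulated scenario where the BMI-depression association is driven entirely by confounding. A linear probability model stands in for the study's probit; all names, effect sizes, and the allele-count construction are assumptions.

```python
# Mendelian-randomisation-style sketch (simulated): the GRS instruments
# BMI; depression risk here depends only on a confounder, so the causal
# effect of BMI is zero by construction.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 10_000
grs = rng.binomial(64, 0.4, size=n).astype(float)   # risk-allele count, 32 SNPs
confounder = rng.normal(size=n)
bmi = 25 + 0.1 * grs + confounder + rng.normal(size=n)
p_dep = 1 / (1 + np.exp(-(-2 + 0.8 * confounder)))  # risk via confounder only
depression = rng.binomial(1, p_dep).astype(float)

naive = sm.OLS(depression, sm.add_constant(bmi)).fit()           # confounded
bmi_hat = sm.OLS(bmi, sm.add_constant(grs)).fit().fittedvalues   # stage 1
iv = sm.OLS(depression, sm.add_constant(bmi_hat)).fit()          # stage 2
print(f"naive: {naive.params[1]:.3f}, IV: {iv.params[1]:.3f} (true 0)")
```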

Relevance:

80.00%

Abstract:

Cognitive impairment has emerged as a major driver of disability in old age, with profound effects on individual well-being and decision making at older ages. In the light of policies aimed at postponing retirement ages, an important question is whether continued labour supply helps to maintain high levels of cognition at older ages. We use data on older men from the US Health and Retirement Study to estimate the effect of continued labour market participation at older ages on later-life cognition. As retirement itself is likely to depend on cognitive functioning and may thus be endogenous, we use offers of early retirement windows as instruments for retirement in econometric models of later-life cognitive functioning. These offers of early retirement are legally required to be nondiscriminatory and thus, inter alia, unrelated to cognitive functioning. At the same time, they are significant predictors of retirement. Although simple ordinary least squares estimates show a negative relationship between retirement duration and various measures of cognitive functioning, instrumental variable estimates suggest that these associations may not be causal effects. Specifically, we find no clear relationship between retirement duration and later-life cognition for white-collar workers and, if anything, a positive relationship for blue-collar workers.
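
With a binary instrument such as an early-retirement window offer, the IV estimate reduces to the Wald ratio: the offer's effect on the outcome divided by its effect on retirement. A hedged simulation sketch with illustrative names, in a scenario where OLS is confounded by health but the true effect of retirement is zero:

```python
# Wald-ratio sketch (simulated): the window 'offer' shifts retirement but
# is unrelated to the confounder, so the IV (Wald) estimate is near the
# true effect of zero while OLS is confounded.
import numpy as np

rng = np.random.default_rng(3)
n = 20_000
offer = rng.binomial(1, 0.5, size=n)          # window offered or not
health = rng.normal(size=n)                   # confounds retirement & cognition
retired = (0.5 * offer - 0.3 * health + rng.normal(size=n) > 0.5).astype(float)
cognition = health + rng.normal(size=n)       # retirement has no true effect

wald = (cognition[offer == 1].mean() - cognition[offer == 0].mean()) / (
    retired[offer == 1].mean() - retired[offer == 0].mean()
)
c = np.cov(retired, cognition)
print(f"naive OLS: {c[0, 1] / c[0, 0]:.2f} (confounded), Wald IV: {wald:.2f}")
```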

Relevance:

80.00%

Abstract:

The instrumental variable method (referred to as Mendelian randomization when the instrument is a genetic variant) was initially developed to infer the causal effect of a risk factor on some outcome of interest in a linear model. Adapting this method to nonlinear models, however, is known to be problematic. In this paper, we consider the simple case where the genetic instrument, the risk factor, and the outcome are all binary. We compare via simulations the usual two-stage estimate of a causal odds-ratio and its adjusted version with a recently proposed estimate developed in the context of a clinical trial with noncompliance. In contrast to the former two, we confirm that the latter is (under some conditions) a valid estimate of a causal odds-ratio defined in the subpopulation of compliers, and we propose its use in the context of Mendelian randomization. By analogy with a clinical trial with noncompliance, compliers are those individuals for whom the presence/absence of the risk factor X is determined by the presence/absence of the genetic variant Z (i.e., for whom we would observe X = Z whatever the alleles randomly received at conception). We also recall and illustrate the huge variability of instrumental variable estimates when the instrument is weak (i.e., with a low percentage of compliers, as is typically the case with genetic instruments, for which this proportion is frequently smaller than 10%): the inter-quartile range of our simulated estimates was up to 18 times higher than with a conventional (e.g., intention-to-treat) approach. We thus conclude that the need to find stronger instruments is probably as important as the need to develop methodology for consistently estimating a causal odds-ratio.
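
The weak-instrument point lends itself to a short simulation: with roughly 10% compliers, the Wald/IV estimate is far more variable than the intention-to-treat contrast. This sketch is illustrative and is not the paper's causal odds-ratio estimator; all parameters are assumptions.

```python
# Weak-instrument sketch (simulated, not the paper's estimator): with
# ~10% compliers, the Wald/IV estimate is hugely variable relative to
# the intention-to-treat (ITT) contrast.
import numpy as np

rng = np.random.default_rng(4)

def one_replicate(n=2_000, p_comply=0.10):
    z = rng.binomial(1, 0.5, size=n)              # binary genetic instrument
    complier = rng.binomial(1, p_comply, size=n)  # X follows Z only for compliers
    x = np.where(complier == 1, z, rng.binomial(1, 0.5, size=n))
    y = rng.binomial(1, 0.2 + 0.2 * x)            # true risk difference 0.2
    itt = y[z == 1].mean() - y[z == 0].mean()
    wald = itt / (x[z == 1].mean() - x[z == 0].mean())
    return itt, wald

est = np.array([one_replicate() for _ in range(2_000)])
q25, q75 = np.percentile(est, [25, 75], axis=0)
print(f"IQR ITT: {q75[0] - q25[0]:.3f}, IQR Wald IV: {q75[1] - q25[1]:.3f}")
```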

Relevance:

80.00%

Abstract:

A mediator is a dependent variable, m (e.g., charisma), that is thought to channel the effect of an independent variable, x (e.g., receiving training or not), on another dependent variable, y (e.g., subordinate satisfaction). In experimental settings x is manipulated (subjects are randomized to treatment) to isolate the causal effect of x on other variables. If m is not or cannot be manipulated, which is often the case, its causal effect on other variables cannot be determined; thus, standard mediation tests cannot inform policy or practice. I will show how an econometric procedure, called instrumental-variable estimation, can examine mediation in such cases.
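
A hedged sketch of this idea on simulated data: the randomized x serves as an instrument for the unmanipulated mediator m, so the m-to-y path is estimated by 2SLS rather than by a standard (confounded) mediation regression. Names follow the abstract; effect sizes and the statsmodels implementation are illustrative.

```python
# IV mediation sketch (simulated): randomized x instruments the
# unmanipulated mediator m; 2SLS recovers the m -> y effect that the
# naive mediation regression overstates due to the confounder u.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 5_000
x = rng.binomial(1, 0.5, size=n).astype(float)  # randomized treatment
u = rng.normal(size=n)                          # confounds m and y
m = 0.7 * x + u + rng.normal(size=n)            # mediator (e.g., charisma)
y = 0.5 * m + u + rng.normal(size=n)            # true m -> y effect is 0.5

naive = sm.OLS(y, sm.add_constant(m)).fit()
m_hat = sm.OLS(m, sm.add_constant(x)).fit().fittedvalues   # stage 1
iv = sm.OLS(y, sm.add_constant(m_hat)).fit()               # stage 2
print(f"naive m->y: {naive.params[1]:.2f}, 2SLS m->y: {iv.params[1]:.2f}")
```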

Relevance:

30.00%

Abstract:

Background: Individual signs and symptoms are of limited value for the diagnosis of influenza. Objective: To develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis. Methods: Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70%) and a validation set (30%). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set. Results: Model 1 has seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The areas under the receiver operating characteristic curves (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2, and 0.76 and 0.77 for Model 3. Model 2 classified 67% of patients in the validation group into a high- or low-risk group, compared with only 38% for Model 1 and 54% for Model 3. Conclusions: A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.
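
For readers who want to reproduce the shape of this analysis, here is a minimal CART sketch in the spirit of Model 3: a two-split tree on dichotomized fever (≥38°C) and myalgia, fitted on a 70/30 development/validation split and evaluated by AUROC. The data are simulated placeholders, not the study's cohorts.

```python
# CART sketch in the spirit of Model 3 (simulated placeholders): a
# depth-2 tree on dichotomized fever and myalgia, AUROC on a 30%
# validation split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(6)
n = 3_000
fever = rng.binomial(1, 0.4, size=n)      # temperature >= 38 C
myalgia = rng.binomial(1, 0.5, size=n)
flu = rng.binomial(1, 0.10 + 0.35 * fever + 0.15 * myalgia)

X = np.column_stack([fever, myalgia])
X_dev, X_val, y_dev, y_val = train_test_split(X, flu, test_size=0.3,
                                              random_state=0)
tree = DecisionTreeClassifier(max_depth=2, min_samples_leaf=50)
tree.fit(X_dev, y_dev)
print(f"validation AUROC: "
      f"{roc_auc_score(y_val, tree.predict_proba(X_val)[:, 1]):.2f}")
```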

Relevance:

30.00%

Abstract:

Analyzing the relationship between the baseline value and subsequent change of a continuous variable is a frequent matter of inquiry in cohort studies. These analyses are surprisingly complex, particularly if only two waves of data are available. It is unclear to non-biostatisticians where the complexity of this analysis lies and which statistical method is adequate. With the help of simulated longitudinal data on body mass index in children, we review statistical methods for the analysis of the association between the baseline value and subsequent change, assuming linear growth with time. Key issues in such analyses are mathematical coupling, measurement error, variability of change between individuals, and regression to the mean. Ideally, one relies on multiple repeated measurements at different times, and a linear random effects model is a standard approach if more than two waves of data are available. If only two waves of data are available, our simulations show that Blomqvist's method, which consists of adjusting the estimated regression coefficient of observed change on baseline value for the measurement error variance, provides accurate estimates. The adequacy of the methods to assess the relationship between the baseline value and subsequent change depends on the number of data waves, the availability of information on measurement error, and the variability of change between individuals.
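
A hedged sketch of the correction described above, assuming the measurement error variance is known (e.g., from a reliability study): regress observed change on observed baseline, then adjust the slope as b_adj = (b_obs + k) / (1 - k), where k is the ratio of error variance to observed baseline variance. Simulated data with a known true slope:

```python
# Blomqvist-style adjustment sketch (simulated; error variance assumed
# known): regress observed change on observed baseline, then correct
# the slope for the share of baseline variance due to measurement error.
import numpy as np

rng = np.random.default_rng(7)
n = 50_000
beta = -0.3                                    # true change-on-baseline slope
t1 = rng.normal(size=n)                        # true baseline
t2 = (1 + beta) * t1 + rng.normal(scale=0.5, size=n)   # true follow-up
var_e = 0.25                                   # known measurement error variance
x1 = t1 + rng.normal(scale=np.sqrt(var_e), size=n)
x2 = t2 + rng.normal(scale=np.sqrt(var_e), size=n)

d = x2 - x1
c = np.cov(x1, d)
b_obs = c[0, 1] / c[0, 0]                      # naive slope, biased downward
k = var_e / np.var(x1, ddof=1)                 # error share of baseline variance
b_adj = (b_obs + k) / (1 - k)                  # adjusted estimate
print(f"naive: {b_obs:.2f}, adjusted: {b_adj:.2f}, true: {beta}")
```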

Relevance:

30.00%

Abstract:

Leaders must scan the internal and external environment, chart strategic and task objectives, and provide performance feedback. These instrumental leadership (IL) functions go beyond the motivational and quid pro quo leader behaviors that comprise the full-range (transformational, transactional, and laissez-faire) leadership model. In four studies we examined the construct validity of IL. We found evidence for a four-factor IL model that was highly prototypical of good leadership. IL predicted top-level leader emergence controlling for the full-range factors, initiating structure, and consideration. It also explained unique variance in outcomes beyond the full-range factors; the effects of transformational leadership were vastly overstated when IL was omitted from the model. We discuss the importance of a "fuller full-range" leadership theory for research and practice. We also showcase our methodological contributions regarding corrections for common method variance (i.e., endogeneity) bias using two-stage least squares (2SLS) regression and Monte Carlo split-sample designs.

Relevance:

30.00%

Abstract:

PURPOSE: The aim of this work is to investigate the characteristics of eyes failing to maintain visual acuity (VA) while receiving variable-dosing ranibizumab for neovascular age-related macular degeneration (nAMD) after three initial loading doses. METHODS: We analyzed a consecutive series of patients with nAMD who, after three loading doses of intravitreal ranibizumab (0.5 mg each), were re-treated for fluid seen on optical coherence tomography. After exclusion of eyes with previous treatment, follow-up of less than 12 months, or missed visits, 99 patients were included in the analysis. The influence of baseline characteristics, initial VA response, and central retinal thickness (CRT) fluctuations on VA stability from month 3 to month 24 was analyzed using subgroup and multiple regression analyses. RESULTS: Mean follow-up duration was 21.3 months (range 12-40 months; 32 patients were followed up for ≥24 months). Secondary loss of VA (loss of five letters or more) after month 3 was seen in 30 patients (mean VA improvement from baseline +5.8 letters at month 3, mean loss from baseline -5.3 letters at month 12 and -9.7 at the final visit up to month 24), while 69 patients maintained vision (mean gain +8.9 letters at month 3, +10.4 letters at month 12, and +12.8 letters at the final visit up to month 24). Secondary loss of VA was associated with the presence of pigment epithelial detachment (PED) at baseline (p = 0.01), but not with baseline fibrosis/atrophy/hemorrhage, CRT fluctuations, or initial VA response. Chart analysis revealed additional individual explanations for the secondary loss of VA, including retinal pigment epithelial tears, progressive fibrosis, and atrophy. CONCLUSIONS: Tissue damage due to degeneration of PED, retinal pigment epithelial tears, progressive fibrosis, progressive atrophy, or massive hemorrhage appears to be relevant in causing secondary loss of VA despite vascular endothelial growth factor suppression. PED at baseline may represent a risk factor.

Relevance:

30.00%

Abstract:

Aim: This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location: State of Vaud, western Switzerland. Methods: Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp). Results: Model selection based on cross-validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that performed better than models fitted with the original sample prevalence; this appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on the main environmental gradients changed the set of selected predictors, and potentially their response curve shapes. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly, and even better models were obtained when including local spatial autocorrelation. A novel approach to including interactions proved to be an efficient way to account for interactions between all predictors at once. Main conclusions: Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross-validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of modelling strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions, in the final assessment.
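
One item from this list lends itself to a short sketch: weighting absences so that the weighted prevalence equals 0.5. A scikit-learn logistic regression stands in for the GAM/grasp workflow of the paper; the predictor, prevalence, and coefficients are illustrative assumptions.

```python
# Absence-weighting sketch (simulated): give absences a common weight so
# their total weight equals that of the presences (weighted prevalence 0.5).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
n = 5_000
elevation = rng.uniform(300, 2_500, size=n)        # illustrative predictor
p = 1 / (1 + np.exp(-(-9 + 0.004 * elevation)))    # rare community
present = rng.binomial(1, p)

n_pres = present.sum()
w = np.where(present == 1, 1.0, n_pres / (n - n_pres))  # presences keep weight 1

X = elevation.reshape(-1, 1)
model = LogisticRegression(C=1e6)                  # effectively unpenalized
model.fit(X, present, sample_weight=w)
print(f"slope: {model.coef_[0, 0]:.4f}, intercept: {model.intercept_[0]:.2f}")
```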

Relevance:

30.00%

Abstract:

Summary points:

- The bias introduced by random measurement error will be different depending on whether the error is in an exposure variable (risk factor) or an outcome variable (disease)
- Random measurement error in an exposure variable will bias the estimates of regression slope coefficients towards the null
- Random measurement error in an outcome variable will instead increase the standard error of the estimates and widen the corresponding confidence intervals, making results less likely to be statistically significant
- Increasing the sample size will help minimise the impact of measurement error in an outcome variable, but will only make estimates more precisely wrong when the error is in an exposure variable
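
A quick simulation makes these contrasting points concrete: noise added to the exposure attenuates the slope toward the null, while the same noise added to the outcome leaves the slope unbiased but inflates its standard error. All values are illustrative.

```python
# Simulation of the two contrasting effects of random measurement error:
# error in the exposure attenuates the slope; error in the outcome keeps
# it unbiased but widens the confidence interval.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 2_000
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)          # true slope is 1.0
noise = rng.normal(size=n)                # random measurement error

clean = sm.OLS(y, sm.add_constant(x)).fit()
err_x = sm.OLS(y, sm.add_constant(x + noise)).fit()   # error in exposure
err_y = sm.OLS(y + noise, sm.add_constant(x)).fit()   # error in outcome

print(f"clean: slope {clean.params[1]:.2f}, SE {clean.bse[1]:.3f}")
print(f"exposure error: slope {err_x.params[1]:.2f} (toward null)")
print(f"outcome error:  slope {err_y.params[1]:.2f}, SE {err_y.bse[1]:.3f}")
```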

Relevance:

30.00%

Abstract:

This paper presents general regression neural networks (GRNN) as a nonlinear regression method for the interpolation of monthly wind speeds in complex Alpine orography. The GRNN is trained on data from Swiss meteorological networks to learn the statistical relationship between topographic features and wind speed. Terrain convexity, slope, and exposure are considered by extracting features from the digital elevation model at different spatial scales using specialised convolution filters. A database of gridded monthly wind speeds is then constructed by applying the GRNN in prediction mode over the period 1968-2008. This study demonstrates that using topographic features as inputs to the GRNN significantly reduces cross-validation errors with respect to low-dimensional models integrating only geographical coordinates and terrain height for the interpolation of wind speed. The spatial predictability of wind speed is found to be lower in summer than in winter due to more complex and weaker wind-topography relationships. The relevance of these relationships is studied using an adaptive version of the GRNN algorithm, which selects the useful terrain features by eliminating the noisy ones. This research provides a framework for extending low-dimensional interpolation models to high-dimensional spaces by integrating additional features accounting for topographic conditions at multiple spatial scales.
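
GRNN is, in essence, Nadaraya-Watson kernel regression: a prediction is a Gaussian-kernel weighted average of the training targets. A minimal NumPy sketch with an illustrative bandwidth and toy data follows; the paper's terrain feature extraction and adaptive feature selection are not reproduced.

```python
# Nadaraya-Watson / GRNN sketch: predictions are Gaussian-kernel
# weighted averages of training targets. Bandwidth and data are toy values.
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=1.0):
    """Gaussian-kernel weighted average of training targets."""
    # Squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ y_train) / w.sum(axis=1)

rng = np.random.default_rng(10)
X_train = rng.uniform(0, 10, size=(200, 1))
y_train = np.sin(X_train[:, 0]) + rng.normal(scale=0.1, size=200)
X_query = np.linspace(0, 10, 5)[:, None]
print(grnn_predict(X_train, y_train, X_query, sigma=0.5))
```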

Relevance:

30.00%

Abstract:

PURPOSE: An estimated 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics, and to develop mapping and predictive tools in order to improve local radon prediction. METHOD: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between the IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). RESULTS: The automated classification groups lithological units well in terms of their IRC characteristics. In particular, IRC differences in metamorphic rocks such as gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional differences in IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variation in IRC data with random forests. Additionally, variable importance measures from random forests show that building characteristics are less important predictors of IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. CONCLUSION: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of the radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider further variables, such as soil gas radon measurements, as well as more detailed geological information.
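
A hedged sketch of the ensemble-tree step: a random forest regressing (log) IRC on illustrative geological and building predictors, with the impurity-based variable importances used to compare predictor groups. The data, predictor names, and coefficients are simulated assumptions, not the study's.

```python
# Random-forest sketch (simulated): regress log IRC on illustrative
# geological and building predictors and inspect variable importances.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(11)
n = 5_000
lithology = rng.integers(0, 5, size=n).astype(float)  # coded lithological class
altitude = rng.uniform(200, 2_000, size=n)
floor = rng.integers(0, 4, size=n).astype(float)      # building characteristic
log_irc = (0.5 * lithology - 0.0005 * altitude - 0.2 * floor
           + rng.normal(scale=0.8, size=n))

X = np.column_stack([lithology, altitude, floor])
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, log_irc)
for name, imp in zip(["lithology", "altitude", "floor"],
                     rf.feature_importances_):
    print(f"{name}: {imp:.2f}")
```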