967 results for Instrumental variable regression


Relevance: 30.00%

Abstract:

In this paper, we introduce the method of leaps and bounds regression, which can be used to select variables quickly and obtain the best regression models containing one variable, two variables, three variables and so on. The results obtained with leaps and bounds regression were compared with those from stepwise regression, leading to the conclusion that leaps and bounds regression is an effective method.
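To make "the best regression model of each size" concrete, here is a minimal sketch that finds it by brute-force enumeration on synthetic data. This is not the paper's algorithm: leaps and bounds is usually described as a branch-and-bound search that reaches the same best subsets without fitting every candidate. The function names `rss` and `best_subsets` and the data are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit of y on an intercept plus X[:, cols]."""
    A = np.column_stack([np.ones(len(y)), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def best_subsets(X, y):
    """For each subset size k, return the column subset with the smallest RSS.

    Exhaustive search: fits all 2^p - 1 candidate models, so it is only
    practical for small p.
    """
    p = X.shape[1]
    return {
        k: min(combinations(range(p), k), key=lambda cols: rss(X, y, cols))
        for k in range(1, p + 1)
    }


# Synthetic data (illustrative only): y depends on columns 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)

best = best_subsets(X, y)
for k, cols in best.items():
    print(k, cols)
```

Stepwise regression, by contrast, adds or drops one variable at a time and can miss the best subset of a given size, which is the comparison the abstract describes.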

Relevance: 30.00%

Abstract:

This paper provides a root-n consistent, asymptotically normal weighted least squares estimator of the coefficients in a truncated regression model. The distribution of the errors is unknown and permits general forms of unknown heteroskedasticity. Also provided is an instrumental variables based two-stage least squares estimator for this model, which can be used when some regressors are endogenous, mismeasured, or otherwise correlated with the errors. A simulation study indicates that the new estimators perform well in finite samples. Our limiting distribution theory includes a new asymptotic trimming result addressing the boundary bias in first-stage density estimation without knowledge of the support boundary. © 2007 Cambridge University Press.
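As background for the instrumental variables idea mentioned above, the following is a minimal sketch of ordinary two-stage least squares for a plain linear model on synthetic data. It is not the paper's estimator, which handles truncation and unknown heteroskedasticity; the function name `two_stage_least_squares` and the simulated variables are assumptions made for illustration.

```python
import numpy as np


def two_stage_least_squares(y, X, Z):
    """Plain linear 2SLS.

    Stage 1 projects the regressors X onto the instruments Z.
    Stage 2 regresses y on the fitted values from stage 1.
    """
    first_stage_coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage_coef
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta


# Synthetic data (illustrative only): x is endogenous because the
# unobserved u enters both x and the error term of y.
rng = np.random.default_rng(0)
n = 20_000
z = rng.normal(size=n)                      # instrument: affects x, not the error
u = rng.normal(size=n)                      # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)
y = 1.0 + 2.0 * x + 1.5 * u + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_2sls = two_stage_least_squares(y, X, Z)

print("OLS slope: ", beta_ols[1])    # biased upward by the confounder
print("2SLS slope:", beta_2sls[1])   # close to the true slope of 2.0
```

The point estimates from this two-step computation are the 2SLS estimates, but standard errors taken naively from the second-stage regression would be wrong; dedicated routines correct for that.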

Relevance: 30.00%

Abstract:

We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) the dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences. © 2010 American Statistical Association.

Relevance: 30.00%

Abstract:

This paper studies the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression. Our first goal is to clarify when, and how, multiplicity correction happens automatically in Bayesian analysis, and to distinguish this correction from the Bayesian Ockham's-razor effect. Our second goal is to contrast empirical-Bayes and fully Bayesian approaches to variable selection through examples, theoretical results and simulations. Considerable differences between the two approaches are found. In particular, we prove a theorem that characterizes a surprising asymptotic discrepancy between fully Bayes and empirical Bayes. This discrepancy arises from a different source than the failure to account for hyperparameter uncertainty in the empirical-Bayes estimate. Indeed, even at the extreme, when the empirical-Bayes estimate converges asymptotically to the true variable-inclusion probability, the potential for a serious difference remains. © Institute of Mathematical Statistics, 2010.

Relevance: 30.00%

Abstract:

C17 polyacetylenes are a group of bioactive compounds present in carrots which have recently gained scientific attention due to their cytotoxicity against cancer cells. In common with many bioactive compounds, their levels may be influenced by thermal processes, such as boiling or water immersion. This study investigated the effect of a number of water immersion time/temperature combinations on concentrations of these compounds and attempted to model the changes. Carrot samples were thermally treated by heating in water at temperatures from 50–100 °C and holding times of 2–60 min. Following heating, levels of falcarinol (FaOH), falcarindiol (FaDOH), falcarindiol-3-acetate (FaDOAc) and Hunter colour parameters (L*a*b*) were determined. FaOH, FaDOH and FaDOAc levels were significantly reduced at lower temperatures (50–60 °C). In contrast, samples heated at temperatures from 70–100 °C exhibited higher levels of polyacetylenes (p < 0.05) than did raw unprocessed samples. Regression modelling was used to model the effects of temperature and holding time on the levels of the variables measured. Temperature treatment and holding time were found to significantly affect the polyacetylene content of carrot disks. Predicted models were found to be significant (p < 0.05) with high coefficients of determination (R²).

Relevance: 30.00%

Abstract:

Nitrogen dioxide (NO2) is known to act as an environmental trigger for many respiratory illnesses. As a pollutant it is difficult to map accurately, as concentrations can vary greatly over small distances. In this study three geostatistical techniques were compared, producing maps of NO2 concentrations in the United Kingdom (UK). The primary data source for each technique was NO2 point data, generated from background automatic monitoring and background diffusion tubes, which are analysed by different laboratories on behalf of local councils and authorities in the UK. The techniques used were simple kriging (SK), ordinary kriging (OK) and simple kriging with a locally varying mean (SKlm). SK and OK make use of the primary variable only. SKlm differs in that it utilises additional data to inform prediction, and hence potentially reduces uncertainty. The secondary data source was oxides of nitrogen (NOx) derived from dispersion modelling outputs, at 1 km × 1 km resolution for the UK. These data were used to define the locally varying mean in SKlm, using two regression approaches: (i) global regression (GR) and (ii) geographically weighted regression (GWR). Based upon summary statistics and cross-validation prediction errors, SKlm using GWR-derived local means produced the most accurate predictions. Therefore, using GWR to inform SKlm was beneficial in this study.

Relevance: 30.00%

Abstract:

BACKGROUND: PET/CT scanning can determine suitability for curative therapy and inform decision making when considering radical therapy in patients with non-small cell lung cancer (NSCLC). Metastases to central mediastinal lymph nodes (N2) may alter such management decisions. We report a 2-year retrospective series assessing N2 lymph node staging accuracy with PET/CT compared to pathological analysis at surgery.

METHODS: Patients with NSCLC attending our centre (excluding those who had induction chemotherapy) who had staging PET/CT scans and pathological nodal sampling between June 2006 and June 2008 were analysed. For each lymph node assessed pathologically, the corresponding PET/CT status was determined. 64 patients with 200 N2 lymph nodes were analysed.

RESULTS: Sensitivity of PET/CT scans for identifying involved N2 lymph nodes was 39%, specificity 96% and overall accuracy 90%. For individual lymph node analysis, logistic regression demonstrated a significant linear association between PET/CT sensitivity and time from scanning to surgery (p=0.031) but not for specificity and accuracy. Those scanned <9 weeks before pathological sampling were significantly more sensitive (64% <9 weeks, 0% ≥9 weeks, p=0.013) and more accurate (94% <9 weeks, 81% ≥9 weeks, p=0.007). Differences in specificity were not seen (97% <9 weeks, 91% ≥9 weeks, p=0.228). No significant difference in specificity was found at any time point.

CONCLUSIONS: We recommend that if a PET/CT scan is older than 9 weeks, and management would be altered by the presence of N2 nodes, re-staging of the mediastinum should be undertaken.
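For readers unfamiliar with the three accuracy measures reported in this abstract, a minimal sketch of how they are computed from node-level counts follows. The counts used here are hypothetical and are not taken from the study; the function name `diagnostic_metrics` is likewise an assumption for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp: involved nodes called positive on the scan
    fn: involved nodes called negative on the scan
    tn: uninvolved nodes called negative on the scan
    fp: uninvolved nodes called positive on the scan
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only to illustrate the arithmetic.
sens, spec, acc = diagnostic_metrics(tp=8, fn=2, tn=45, fp=5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Because uninvolved nodes usually far outnumber involved ones, overall accuracy can stay high even when sensitivity is low, which is the pattern the results above report.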

Relevance: 30.00%

Abstract:

This paper studies tests of joint hypotheses in time series regression with a unit root in which weakly dependent and heterogeneously distributed innovations are allowed. We consider two types of regression: one with a constant and lagged dependent variable, and the other with a trend added. The statistics studied are the regression "F-tests" originally analysed by Dickey and Fuller (1981) in a less general framework. The limiting distributions are found using functional central limit theory. New test statistics are proposed which require only already tabulated critical values but which are valid in a quite general framework (including finite-order ARMA models generated by Gaussian errors). This study extends the results on single coefficients derived in Phillips (1986a) and Phillips and Perron (1986).

Relevance: 30.00%

Abstract:

In CoDaWork’05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This newly proposed methodology provides an alternative, more acceptable approach for geologists, as their focus is on mapping the unknown tephra to relevant eruptive events rather than estimating the age of unknown tephra. Keywords: Tephrochronology; Segmented regression

Relevance: 30.00%

Abstract:

Multiple regression analysis is a statistical technique for predicting a dependent variable from more than one independent variable and for determining which independent variables are influential. In this study, multiple regression analysis is applied to experimental data to predict the room mean velocity and to determine the parameters that most influence it. More than 120 experiments for four different heat source locations were carried out in a test chamber with a high-level wall-mounted air supply terminal at air change rates of 3-6 ach. The influence of environmental parameters such as supply air momentum, room heat load, Archimedes number and local temperature ratio was examined by two methods: a simple regression analysis incorporated into scatter matrix plots, and multiple stepwise regression analysis. It is concluded that, when a heat source is located along the jet centre line, the supply momentum mainly influences the room mean velocity regardless of the plume strength. However, when the heat source is located outside the jet region, the local temperature ratio (the inverse of the local heat removal effectiveness) is a major influencing parameter.
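One common way to compare the influence of predictors measured on different scales is to fit the regression on standardized variables, so that coefficients are in units of standard deviations. The sketch below shows this on synthetic data; it is not the study's stepwise procedure, and the predictor names `momentum` and `heat_load`, the function `standardized_coefficients` and all numbers are assumptions for illustration.

```python
import numpy as np


def standardized_coefficients(X, y):
    """OLS coefficients after z-scoring every predictor and the response.

    With all variables on a unit-variance scale, coefficient magnitudes
    give a rough ranking of predictor influence.
    """
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    coef, *_ = np.linalg.lstsq(Xs, ys, rcond=None)  # no intercept needed after centring
    return coef


# Synthetic data (illustrative only): the response depends strongly on
# the first predictor and weakly on the second.
rng = np.random.default_rng(0)
n = 500
momentum = rng.normal(loc=5.0, scale=1.0, size=n)        # hypothetical predictor
heat_load = rng.normal(loc=300.0, scale=50.0, size=n)    # hypothetical predictor
velocity = 2.0 * momentum + 0.006 * heat_load + rng.normal(scale=0.5, size=n)

X = np.column_stack([momentum, heat_load])
std_coef = standardized_coefficients(X, velocity)
print(dict(zip(["momentum", "heat_load"], std_coef.round(3))))
```

The raw coefficients (2.0 and 0.006) are not directly comparable because the predictors have different units; the standardized ones are.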

Relevance: 30.00%

Abstract:

This paper derives some exact power properties of tests for spatial autocorrelation in the context of a linear regression model. In particular, we characterize the circumstances in which the power vanishes as the autocorrelation increases, thus extending the work of Krämer (2005). More generally, the analysis in the paper sheds new light on how the power of tests for spatial autocorrelation is affected by the matrix of regressors and by the spatial structure. We mainly focus on the problem of residual spatial autocorrelation, in which case it is appropriate to restrict attention to the class of invariant tests, but we also consider the case when the autocorrelation is due to the presence of a spatially lagged dependent variable among the regressors. A numerical study aimed at assessing the practical relevance of the theoretical results is included.

Relevance: 30.00%

Abstract:

The objective of this study was to determine the potential of mid-infrared spectroscopy coupled with multidimensional statistical analysis for the prediction of processed cheese instrumental texture and meltability attributes. Processed cheeses (n = 32) of varying composition were manufactured in a pilot plant. Following two and four weeks' storage at 4 °C, samples were analysed using texture profile analysis, two meltability tests (computer vision, Olson and Price) and mid-infrared spectroscopy (4000–640 cm⁻¹). Partial least squares regression was used to develop predictive models for all measured attributes. Five attributes were successfully modelled with varying degrees of accuracy. The computer vision meltability model allowed for discrimination between high and low melt values (R² = 0.64). The hardness and springiness models gave approximate quantitative results (R² = 0.77) and the cohesiveness (R² = 0.81) and Olson and Price meltability (R² = 0.88) models gave good prediction results. © 2006 Elsevier Ltd. All rights reserved.

Relevance: 30.00%

Abstract:

The question of what explains variation in expenditures on Active Labour Market Programs (ALMPs) has attracted significant scholarship in recent years. Significant insights have been gained with respect to the role of employers, unions and dual labour markets, openness, and partisanship. However, there remain significant disagreements with respect to key explanatory variables such as the role of unions or the impact of partisanship. Qualitative studies have shown that there are both good conceptual reasons and historical evidence that different ALMPs are driven by different dynamics. There is little reason to believe that vastly different programs such as training and employment subsidies are driven by similar structural, interest group or indeed partisan dynamics. The question is therefore whether different ALMPs have the same correlation with the key explanatory variables identified in the literature. Using regression analysis, this paper shows that the explanatory variables identified by the literature relate differently to distinct ALMPs. This refinement adds significant analytical value and shows that disagreements are at least partly due to a dependent variable problem of ‘over-aggregation’.