907 results for Weighted regression
Abstract:
The paper develops a method to solve higher-dimensional stochastic control problems in continuous time. A finite difference type approximation scheme is used on a coarse grid of low discrepancy points, while the value function at intermediate points is obtained by regression. The stability properties of the method are discussed, and applications are given to test problems of up to 10 dimensions. Accurate solutions to these problems can be obtained on a personal computer.
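The coarse-grid-plus-regression idea can be illustrated with a minimal numpy sketch. This is not the paper's scheme: the Halton point set, the cubic polynomial basis, and the toy "value function" below are all illustrative assumptions.

```python
import numpy as np

def halton(n, base):
    """First n points of the 1-D Halton (radical inverse) sequence."""
    out = np.zeros(n)
    for i in range(n):
        f, r, k = 1.0, 0.0, i + 1
        while k > 0:
            f /= base
            r += f * (k % base)
            k //= base
        out[i] = r
    return out

# Coarse grid of low-discrepancy points in the unit square
pts = np.column_stack([halton(200, 2), halton(200, 3)])
v = np.sin(np.pi * pts[:, 0]) * pts[:, 1]      # toy "value function" samples

# Value at intermediate points obtained by regression on a cubic basis
def basis(x):
    a, b = x[:, 0], x[:, 1]
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2,
                            a**2 * b, a * b**2, a**3, b**3])

coef = np.linalg.lstsq(basis(pts), v, rcond=None)[0]
query = np.array([[0.4, 0.7]])
approx = basis(query) @ coef
print(approx[0], np.sin(np.pi * 0.4) * 0.7)    # regression estimate vs truth
```

The regression step is what lets the scheme evaluate the value function anywhere without refining the grid, which is the key to keeping the cost manageable in higher dimensions.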
Abstract:
In this paper we examine the determinants of wages and decompose the observed differences across genders into the "explained by different characteristics" and "explained by different returns" components using a sample of Spanish workers. Apart from the conditional expectation of wages, we estimate the conditional quantile functions for men and women and find that both the absolute wage gap and the part attributed to different returns at each of the quantiles, far from being well represented by their counterparts at the mean, are greater as we move up in the wage range.
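The decomposition at the mean can be sketched in a few lines of numpy. The synthetic "wage" data, the coefficients, and the choice of female returns to weight the characteristics component are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical data: intercept plus one characteristic (years of education)
n = 5000
ed_m = rng.normal(12, 3, n)
ed_f = rng.normal(11, 3, n)
X_m = np.column_stack([np.ones(n), ed_m])
X_f = np.column_stack([np.ones(n), ed_f])
y_m = X_m @ np.array([1.0, 0.10]) + rng.normal(0, 0.3, n)  # higher return
y_f = X_f @ np.array([1.0, 0.07]) + rng.normal(0, 0.3, n)

b_m, b_f = ols(X_m, y_m), ols(X_f, y_f)
xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)

gap = y_m.mean() - y_f.mean()
explained = (xbar_m - xbar_f) @ b_f   # explained by different characteristics
returns = xbar_m @ (b_m - b_f)        # explained by different returns
print(gap, explained + returns)       # the two components sum to the gap
```

The identity holds exactly because OLS with an intercept fits the group means; the quantile version in the paper replaces these mean regressions with quantile regressions at each quantile of interest.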
Abstract:
This paper establishes a general framework for metric scaling of any distance measure between individuals based on a rectangular individuals-by-variables data matrix. The method allows visualization of both individuals and variables as well as preserving all the good properties of principal axis methods such as principal components and correspondence analysis, based on the singular-value decomposition, including the decomposition of variance into components along principal axes which provide the numerical diagnostics known as contributions. The idea is inspired by the chi-square distance in correspondence analysis, which weights each coordinate by an amount calculated from the margins of the data table. In weighted metric multidimensional scaling (WMDS) we allow these weights to be unknown parameters which are estimated from the data to maximize the fit to the original distances. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing a matrix and displaying its rows and columns in biplots.
Abstract:
OBJECTIVE: To detect anatomical differences in areas related to motor processing between patients with motor conversion disorder (CD) and controls. METHODS: T1-weighted 3T brain MRI data of 15 patients suffering from motor CD (nine with hemiparesis and six with paraparesis) and 25 age- and gender-matched healthy volunteers were compared using voxel-based morphometry (VBM) and voxel-based cortical thickness (VBCT) analysis. RESULTS: We report significant cortical thickness (VBCT) increases in the bilateral premotor cortex of hemiparetic patients relative to controls and a trend towards increased grey matter volume (VBM) in the same region. Regression analyses showed a non-significant positive correlation between cortical thickness changes and symptom severity as well as illness duration in CD patients. CONCLUSIONS: Cortical thickness increases in premotor cortical areas of patients with hemiparetic CD provide evidence for altered brain structure in a condition with presumed normal brain anatomy. These may either represent premorbid vulnerability or a plasticity phenomenon related to the disease, with the trends towards correlations with clinical variables supporting the latter.
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared with error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validation methods did not reveal statistically significant differences between LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
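A minimal scikit-learn sketch of this kind of comparison: synthetic data stand in for the clinical/CT variables, and the model settings are illustrative, not those of the study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary problem with the study's sample size (167 patients)
X, y = make_classification(n_samples=167, n_features=8, random_state=0)

lr = LogisticRegression(max_iter=1000)
nn = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)

# Three-fold cross-validation, scored by area under the ROC curve (Az)
az_lr = cross_val_score(lr, X, y, cv=3, scoring="roc_auc").mean()
az_nn = cross_val_score(nn, X, y, cv=3, scoring="roc_auc").mean()
print(az_lr, az_nn)
```

Swapping `cv=3` for `LeaveOneOut()` or a bootstrap resampler reproduces the design dimension the paper varies; with leave-one-out, per-fold Az is undefined (one test case per fold), which is one reason resampling choice can change the apparent ranking of classifiers.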
Abstract:
This paper proposes a common and tractable framework for analyzing different definitions of fixed and random effects in a constant-slope variable-intercept model. It is shown that, regardless of whether effects (i) are treated as parameters or as an error term, (ii) are estimated in different stages of a hierarchical model, or whether (iii) correlation between effects and regressors is allowed, when the same information on effects is introduced into all estimation methods, the resulting slope estimator is also the same across methods. If different methods produce different results, it is ultimately because different information is being used for each method.
Abstract:
This paper shows how recently developed regression-based methods for the decomposition of health inequality can be extended to incorporate individual heterogeneity in the responses of health to the explanatory variables. We illustrate our method with an application to the Canadian NPHS of 1994. Our strategy for the estimation of heterogeneous responses is based on the quantile regression model. The results suggest that there is an important degree of heterogeneity in the association of health to explanatory variables which, in turn, accounts for a substantial percentage of inequality in observed health. A particularly interesting finding is that the marginal response of health to income is zero for healthy individuals but positive and significant for unhealthy individuals. The heterogeneity in the income response reduces both overall health inequality and income related health inequality.
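The kind of heterogeneity described can be reproduced in a small simulation. The data-generating process and the pinball-loss fit below are illustrative (a scipy-based sketch, not the paper's estimator):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 2000
income = rng.uniform(0, 2, n)
# Heteroskedastic toy model: noise shrinks with income, so the income
# effect is large at low health quantiles and near zero at high ones.
health = 0.4 * income + (1 - 0.3 * income) * rng.normal(size=n)

def qreg_slope(x, y, tau):
    """Slope of a quantile regression at quantile tau (pinball loss)."""
    X = np.column_stack([np.ones_like(x), x])
    def pinball(beta):
        r = y - X @ beta
        return np.mean(np.maximum(tau * r, (tau - 1) * r))
    return minimize(pinball, np.zeros(2), method="Nelder-Mead").x[1]

slope_low = qreg_slope(income, health, 0.10)    # "unhealthy" end
slope_high = qreg_slope(income, health, 0.90)   # "healthy" end
print(slope_low, slope_high)
```

A single mean regression would report only the average of these two responses, which is exactly the information the decomposition loses when heterogeneity is ignored.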
Abstract:
Summary points:
- The bias introduced by random measurement error will be different depending on whether the error is in an exposure variable (risk factor) or outcome variable (disease)
- Random measurement error in an exposure variable will bias the estimates of regression slope coefficients towards the null
- Random measurement error in an outcome variable will instead increase the standard error of the estimates and widen the corresponding confidence intervals, making results less likely to be statistically significant
- Increasing sample size will help minimise the impact of measurement error in an outcome variable but will only make estimates more precisely wrong when the error is in an exposure variable
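These summary points are easy to verify by simulation. A numpy sketch with an assumed true slope of 1 and classical (variance-1) measurement error:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)          # true slope = 1

def ols_slope(x, y):
    xc = x - x.mean()
    return (xc @ (y - y.mean())) / (xc @ xc)

err = rng.normal(size=n)                  # classical measurement error

# Error in the exposure: slope attenuated towards the null by the factor
# var(x) / (var(x) + var(err)) = 0.5 here.
slope_err_in_x = ols_slope(x + err, y)

# Error in the outcome: slope stays unbiased; only the standard error grows.
slope_err_in_y = ols_slope(x, y + err)

print(slope_err_in_x)   # ≈ 0.5
print(slope_err_in_y)   # ≈ 1.0
```

Increasing n tightens both estimates around these values, which is the "more precisely wrong" point: the exposure-error estimate converges to 0.5, not to the true slope of 1.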
Abstract:
We construct a weighted Euclidean distance that approximates any distance or dissimilarity measure between individuals that is based on a rectangular cases-by-variables data matrix. In contrast to regular multidimensional scaling methods for dissimilarity data, the method leads to biplots of individuals and variables while preserving all the good properties of dimension-reduction methods that are based on the singular-value decomposition. The main benefits are the decomposition of variance into components along principal axes, which provide the numerical diagnostics known as contributions, and the estimation of nonnegative weights for each variable. The idea is inspired by the distance functions used in correspondence analysis and in principal component analysis of standardized data, where the normalizations inherent in the distances can be considered as differential weighting of the variables. In weighted Euclidean biplots we allow these weights to be unknown parameters, which are estimated from the data to maximize the fit to the chosen distances or dissimilarities. These weights are estimated using a majorization algorithm. Once this extra weight-estimation step is accomplished, the procedure follows the classical path in decomposing the matrix and displaying its rows and columns in biplots.
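The core identity — a weighted Euclidean distance is the ordinary distance on a rescaled matrix, after which the classical SVD/biplot path applies — can be sketched in numpy. Here the weights are fixed by hand rather than estimated by the majorization algorithm, and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))            # cases-by-variables matrix
w = np.array([2.0, 1.0, 0.5, 0.25])     # nonnegative variable weights

# d_ij^2 = sum_k w_k (x_ik - x_jk)^2 equals the ordinary squared
# Euclidean distance on the rescaled matrix X * sqrt(w).
Xw = X * np.sqrt(w)

# Classical path: SVD of the column-centred rescaled matrix ...
Xc = Xw - Xw.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# ... giving biplot coordinates for rows (cases) and columns (variables)
rows = U * s        # principal coordinates of cases
cols = Vt.T         # directions of the weighted variables

# Variance decomposed along principal axes ("contributions")
explained = s**2 / (s**2).sum()
print(explained)
```

The extra step in the paper is to treat `w` as unknown and choose it to best fit a given dissimilarity matrix; once `w` is fixed, everything above is standard SVD machinery.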
Abstract:
This work aimed to measure and analyze total rainfall (P), rainfall intensity and five-day antecedent rainfall effects on runoff (R); to compare measured and simulated R values using the Soil Conservation Service Curve Number method (CN) for each rainfall event; and to establish average R/P ratios for observed R values. A one-year (07/01/96 to 06/30/97) rainfall-runoff data study was carried out in the Capetinga watershed (962.4 ha), located at the Federal District of Brazil, 47° 52' longitude West and 15° 52' latitude South. Soils of the watershed were predominantly covered by natural vegetation. Total rainfall and runoff for the period were 1,744 and 52.5 mm, respectively, providing R/P of 3% and suggesting that watershed physical characteristics favored water infiltration into the soil. A multivariate regression analysis for 31 main rainfall-runoff events totaling 781.9 and 51.0 mm, respectively, indicated that the amount of runoff was only dependent upon rainfall volume. Simulated values of total runoff were underestimated by about 15% when using the CN method and an area-weighted average of the CN based on published values. On the other hand, when average values of CN were calculated for the watershed, total runoff was overestimated by about 39%, suggesting that the CN method should be used with care in areas under natural vegetation.
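For reference, the CN method's per-event runoff equation (in mm, with the standard initial abstraction ratio of 0.2) can be written as a small function; the event values below are illustrative, not data from the study:

```python
def scs_runoff(p_mm: float, cn: float) -> float:
    """SCS Curve Number direct runoff (mm) for one rainfall event.

    Standard formulation: S = 25400/CN - 254 (mm), Ia = 0.2 * S,
    Q = (P - Ia)^2 / (P - Ia + S) for P > Ia, else 0.
    """
    s = 25400.0 / cn - 254.0      # potential maximum retention (mm)
    ia = 0.2 * s                  # initial abstraction
    if p_mm <= ia:
        return 0.0
    return (p_mm - ia) ** 2 / (p_mm - ia + s)

# Hypothetical event: 50 mm of rain on a watershed with CN = 70
print(round(scs_runoff(50.0, 70), 1))   # 5.8 mm
```

Because runoff depends on CN nonlinearly through S, area-weighting the CNs and averaging them give different simulated totals, which is consistent with the under- and overestimation the study reports.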
Abstract:
BACKGROUND: Three-dimensional (3D) navigator-gated and prospectively corrected free-breathing coronary magnetic resonance angiography (MRA) allows for submillimeter image resolution but suffers from poor contrast between coronary blood and myocardium. Data collected over >100 ms/heartbeat are also susceptible to bulk cardiac and respiratory motion. To address these problems, we examined the effect of a T2 preparation prepulse (T2prep) for myocardial suppression and a shortened acquisition window on coronary definition. METHODS AND RESULTS: Eight healthy adult subjects and 5 patients with confirmed coronary artery disease (CAD) underwent free-breathing 3D MRA with and without T2prep and with 120- and 60-ms data-acquisition windows. The T2prep resulted in a 123% (P<0.001) increase in contrast-to-noise ratio (CNR). Coronary edge definition was improved by 33% (P<0.001). Acquisition window shortening from 120 to 60 ms resulted in better vessel definition (11%; P<0.001). Among patients with CAD, there was a good correspondence with disease. CONCLUSIONS: Free-breathing, T2prep, 3D coronary MRA with a shorter acquisition window resulted in improved CNR and better coronary artery definition, allowing the assessment of coronary disease. This approach offers the potential for free-breathing, noninvasive assessment of the major coronary arteries.
Abstract:
The predictive potential of six selected factors was assessed in 72 patients with primary myelodysplastic syndrome using univariate and multivariate logistic regression analysis of survival at 18 months. Factors were age (above median of 69 years), dysplastic features in the three myeloid bone marrow cell lineages, presence of chromosome defects, all metaphases abnormal, double or complex chromosome defects (C23), and a Bournemouth score of 2, 3, or 4 (B234). In the multivariate approach, B234 and C23 proved to be significantly associated with a reduction in the survival probability. The similarity of the regression coefficients associated with these two factors means that they have about the same weight. Consequently, the model was simplified by counting the number of factors (0, 1, or 2) present in each patient, thus generating a scoring system called the Lausanne-Bournemouth score (LB score). The LB score combines the well-recognized and easy-to-use Bournemouth score (B score) with the chromosome defect complexity, C23 constituting an additional indicator of patient outcome. The predicted risk of death within 18 months calculated from the model is as follows: 7.1% (confidence interval: 1.7-24.8) for patients with an LB score of 0, 60.1% (44.7-73.8) for an LB score of 1, and 96.8% (84.5-99.4) for an LB score of 2. The scoring system presented here has several interesting features. The LB score may improve the predictive value of the B score, as it is able to recognize two prognostic groups in the intermediate risk category of patients with B scores of 2 or 3. It also has the ability to identify two distinct prognostic subclasses among RAEB and possibly CMML patients. In addition to its above-described usefulness in the prognostic evaluation, the LB score may bring new insights into the understanding of evolution patterns in MDS.
We used the combination of the B score and chromosome complexity to define four classes which may be considered four possible states of myelodysplasia and which describe two distinct evolutional pathways.
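The reported risks are consistent with a logistic model that is linear in the LB score. The intercept and slope below are back-calculated here from the three quoted percentages; they are not coefficients published by the authors:

```python
import math

def lb_risk(score: int, a: float = -2.57, b: float = 2.99) -> float:
    """Predicted 18-month risk of death as a logistic function of the LB
    score; a and b back-solved from the abstract's 7.1% / 60.1% / 96.8%."""
    return 1.0 / (1.0 + math.exp(-(a + b * score)))

for s in (0, 1, 2):
    print(s, round(100 * lb_risk(s), 1))   # close to 7.1, 60.1, 96.8
```

The roughly equal logit increments (about 3 per point of score) are the numerical face of the abstract's remark that the two factors carry about the same weight.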
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. The paper considers a data-driven approach to modelling uncertainty in spatial predictions. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and describe stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. Semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependencies from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.
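The final step — summarising many fitted models as probabilistic envelopes — can be illustrated generically. The decline-curve data and bootstrap refits below are stand-ins for the paper's semi-supervised SVR models, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical 1-D "production history": noisy observations of a decline curve
t = np.linspace(0, 10, 40)
obs = 100 * np.exp(-0.3 * t) * (1 + 0.1 * rng.normal(size=t.size))

# Multiple models fitted to the dynamic observations: here, bootstrap refits
# of a log-linear decline model stand in for the stochastic SVR models.
t_future = np.linspace(0, 15, 60)
forecasts = []
for _ in range(500):
    idx = rng.integers(0, t.size, t.size)        # resample the history
    slope, intercept = np.polyfit(t[idx], np.log(obs[idx]), 1)
    forecasts.append(np.exp(intercept + slope * t_future))
forecasts = np.array(forecasts)

# Probabilistic forecast summarised by an uncertainty envelope (P10-P50-P90)
p10, p50, p90 = np.percentile(forecasts, [10, 50, 90], axis=0)
print(p50[-1])   # median forecast at t = 15
```

In the paper the ensemble members are weighted within a Bayesian framework by their fit to the dynamic data, rather than resampled uniformly as in this sketch.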