914 resultados para Random regression


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Confidence in decision making is an important dimension of managerialbehavior. However, what is the relation between confidence, on the onehand, and the fact of receiving or expecting to receive feedback ondecisions taken, on the other hand? To explore this and related issuesin the context of everyday decision making, use was made of the ESM(Experience Sampling Method) to sample decisions taken by undergraduatesand business executives. For several days, participants received 4 or 5SMS messages daily (on their mobile telephones) at random moments at whichpoint they completed brief questionnaires about their current decisionmaking activities. Issues considered here include differences between thetypes of decisions faced by the two groups, their structure, feedback(received and expected), and confidence in decisions taken as well as inthe validity of feedback. No relation was found between confidence indecisions and whether participants received or expected to receivefeedback on those decisions. In addition, although participants areclearly aware that feedback can provide both confirming and disconfirming evidence, their ability to specify appropriatefeedback is imperfect. Finally, difficulties experienced inusing the ESM are discussed as are possibilities for further researchusing this methodology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The paper develops a method to solve higher-dimensional stochasticcontrol problems in continuous time. A finite difference typeapproximation scheme is used on a coarse grid of low discrepancypoints, while the value function at intermediate points is obtainedby regression. The stability properties of the method are discussed,and applications are given to test problems of up to 10 dimensions.Accurate solutions to these problems can be obtained on a personalcomputer.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the fixed design regression model, additional weights areconsidered for the Nadaraya--Watson and Gasser--M\"uller kernel estimators.We study their asymptotic behavior and the relationships between new andclassical estimators. For a simple family of weights, and considering theIMSE as global loss criterion, we show some possible theoretical advantages.An empirical study illustrates the performance of the weighted estimatorsin finite samples.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Most methods for small-area estimation are based on composite estimators derived from design- or model-based methods. A composite estimator is a linear combination of a direct and an indirect estimator with weights that usually depend on unknown parameters which need to be estimated. Although model-based small-area estimators are usually based on random-effects models, the assumption of fixed effects is at face value more appropriate.Model-based estimators are justified by the assumption of random (interchangeable) area effects; in practice, however, areas are not interchangeable. In the present paper we empirically assess the quality of several small-area estimators in the setting in which the area effects are treated as fixed. We consider two settings: one that draws samples from a theoretical population, and another that draws samples from an empirical population of a labor force register maintained by the National Institute of Social Security (NISS) of Catalonia. We distinguish two types of composite estimators: a) those that use weights that involve area specific estimates of bias and variance; and, b) those that use weights that involve a common variance and a common squared bias estimate for all the areas. We assess their precision and discuss alternatives to optimizing composite estimation in applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we examine the determinants of wages and decompose theobserved differences across genders into the "explained by differentcharacteristics" and "explained by different returns components"using a sample of Spanish workers. Apart from the conditionalexpectation of wages, we estimate the conditional quantile functionsfor men and women and find that both the absolute wage gap and thepart attributed to different returns at each of the quantiles, farfrom being well represented by their counterparts at the mean, aregreater as we move up in the wage range.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present an exact test for whether two random variables that have known bounds on their support are negatively correlated. The alternative hypothesis is that they are not negatively correlated. No assumptions are made on the underlying distributions. We show by example that the Spearman rank correlation test as the competing exact test of correlation in nonparametric settings rests on an additional assumption on the data generating process without which it is not valid as a test for correlation.We then show how to test for the significance of the slope in a linear regression analysis that invovles a single independent variable and where outcomes of the dependent variable belong to a known bounded set.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we analyse the observed systematic differences incosts for teaching hospitals (THhenceforth) in Spain. Concernhas been voiced regarding the existence of a bias in thefinancing of TH s has been raised once prospective budgets arein the arena for hospital finance, and claims for adjusting totake into account the legitimate extra costs of teaching onhospital expenditure are well grounded. We focus on theestimation of the impact of teaching status on average cost. Weused a version of a multiproduct hospital cost function takinginto account some relevant factors from which to derive theobserved differences. We assume that the relationship betweenthe explanatory and the dependent variables follows a flexibleform for each of the explanatory variables. We also model theunderlying covariance structure of the data. We assumed twoqualitatively different sources of variation: random effects andserial correlation. Random variation refers to both general levelvariation (through the random intercept) and the variationspecifically related to teaching status. We postulate that theimpact of the random effects is predominant over the impact ofthe serial correlation effects. The model is estimated byrestricted maximum likelihood. Our results show that costs are 9%higher (15% in the case of median costs) in teaching than innon-teaching hospitals. That is, teaching status legitimatelyexplains no more than half of the observed difference in actualcosts. The impact on costs of the teaching factor depends on thenumber of residents, with an increase of 51.11% per resident forhospitals with fewer than 204 residents (third quartile of thenumber of residents) and 41.84% for hospitals with more than 204residents. In addition, the estimated dispersion is higher amongteaching hospitals. As a result, due to the considerable observedheterogeneity, results should be interpreted with caution. From apolicy making point of view, we conclude that since a higherrelative burden for medical training is under public hospitalcommand, an explicit adjustment to the extra costs that theteaching factor imposes on hospital finance is needed, beforehospital competition for inpatient services takes place.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The objective of this paper is to compare the performance of twopredictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared with error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validation methods did not reveal statistically significant differences between LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper shows how recently developed regression-based methods for thedecomposition of health inequality can be extended to incorporateindividual heterogeneity in the responses of health to the explanatoryvariables. We illustrate our method with an application to the CanadianNPHS of 1994. Our strategy for the estimation of heterogeneous responsesis based on the quantile regression model. The results suggest that thereis an important degree of heterogeneity in the association of health toexplanatory variables which, in turn, accounts for a substantial percentageof inequality in observed health. A particularly interesting finding isthat the marginal response of health to income is zero for healthyindividuals but positive and significant for unhealthy individuals. Theheterogeneity in the income response reduces both overall health inequalityand income related health inequality.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

OBJECTIVES: To examine trends in the prevalence of congenital heart defects (CHDs) in Europe and to compare these trends with the recent decrease in the prevalence of CHDs in Canada (Quebec) that was attributed to the policy of mandatory folic acid fortification. STUDY DESIGN: We used data for the period 1990-2007 for 47 508 cases of CHD not associated with a chromosomal anomaly from 29 population-based European Surveillance of Congenital Anomalies registries in 16 countries covering 7.3 million births. We estimated trends for all CHDs combined and separately for 3 severity groups using random-effects Poisson regression models with splines. RESULTS: We found that the total prevalence of CHDs increased during the 1990s and the early 2000s until 2004 and decreased thereafter. We found essentially no trend in total prevalence of the most severe group (group I), whereas the prevalence of severity group II increased until about 2000 and decreased thereafter. Trends for severity group III (the most prevalent group) paralleled those for all CHDs combined. CONCLUSIONS: The prevalence of CHDs decreased in recent years in Europe in the absence of a policy for mandatory folic acid fortification. One possible explanation for this decrease may be an as-yet-undocumented increase in folic acid intake of women in Europe following recommendations for folic acid supplementation and/or voluntary fortification. However, alternative hypotheses, including reductions in risk factors of CHDs (eg, maternal smoking) and improved management of maternal chronic health conditions (eg, diabetes), must also be considered for explaining the observed decrease in the prevalence of CHDs in Europe or elsewhere.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recently, several anonymization algorithms have appeared for privacy preservation on graphs. Some of them are based on random-ization techniques and on k-anonymity concepts. We can use both of them to obtain an anonymized graph with a given k-anonymity value. In this paper we compare algorithms based on both techniques in orderto obtain an anonymized graph with a desired k-anonymity value. We want to analyze the complexity of these methods to generate anonymized graphs and the quality of the resulting graphs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: The prevalence of hyperuricemia has rarely been investigated in developing countries. The purpose of the present study was to investigate the prevalence of hyperuricemia and the association between uric acid levels and the various cardiovascular risk factors in a developing country with high average blood pressures (the Seychelles, Indian Ocean, population mainly of African origin). METHODS: This cross-sectional health examination survey was based on a population random sample from the Seychelles. It included 1011 subjects aged 25 to 64 years. Blood pressure (BP), body mass index (BMI), waist circumference, waist-to-hip ratio, total and HDL cholesterol, serum triglycerides and serum uric acid were measured. Data were analyzed using scatterplot smoothing techniques and gender-specific linear regression models. RESULTS: The prevalence of a serum uric acid level >420 micromol/L in men was 35.2% and the prevalence of a serum uric acid level >360 micromol/L was 8.7% in women. Serum uric acid was strongly related to serum triglycerides in men as well as in women (r = 0.73 in men and r = 0.59 in women, p < 0.001). Uric acid levels were also significantly associated but to a lesser degree with age, BMI, blood pressure, alcohol and the use of antihypertensive therapy. In a regression model, triglycerides, age, BMI, antihypertensive therapy and alcohol consumption accounted for about 50% (R2) of the serum uric acid variations in men as well as in women. CONCLUSIONS: This study shows that the prevalence of hyperuricemia can be high in a developing country such as the Seychelles. Besides alcohol consumption and the use of antihypertensive therapy, mainly diuretics, serum uric acid is markedly associated with parameters of the metabolic syndrome, in particular serum triglycerides. Considering the growing incidence of obesity and metabolic syndrome worldwide and the potential link between hyperuricemia and cardiovascular complications, more emphasis should be put on the evolving prevalence of hyperuricemia in developing countries.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The predictive potential of six selected factors was assessed in 72 patients with primary myelodysplastic syndrome using univariate and multivariate logistic regression analysis of survival at 18 months. Factors were age (above median of 69 years), dysplastic features in the three myeloid bone marrow cell lineages, presence of chromosome defects, all metaphases abnormal, double or complex chromosome defects (C23), and a Bournemouth score of 2, 3, or 4 (B234). In the multivariate approach, B234 and C23 proved to be significantly associated with a reduction in the survival probability. The similarity of the regression coefficients associated with these two factors means that they have about the same weight. Consequently, the model was simplified by counting the number of factors (0, 1, or 2) present in each patient, thus generating a scoring system called the Lausanne-Bournemouth score (LB score). The LB score combines the well-recognized and easy-to-use Bournemouth score (B score) with the chromosome defect complexity, C23 constituting an additional indicator of patient outcome. The predicted risk of death within 18 months calculated from the model is as follows: 7.1% (confidence interval: 1.7-24.8) for patients with an LB score of 0, 60.1% (44.7-73.8) for an LB score of 1, and 96.8% (84.5-99.4) for an LB score of 2. The scoring system presented here has several interesting features. The LB score may improve the predictive value of the B score, as it is able to recognize two prognostic groups in the intermediate risk category of patients with B scores of 2 or 3. It has also the ability to identify two distinct prognostic subclasses among RAEB and possibly CMML patients. In addition to its above-described usefulness in the prognostic evaluation, the LB score may bring new insights into the understanding of evolution patterns in MDS. We used the combination of the B score and chromosome complexity to define four classes which may be considered four possible states of myelodysplasia and which describe two distinct evolutional pathways.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, is performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation or/and inference algorithms. The paper considers a data driven approach in modelling uncertainty in spatial predictions. Proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and describe stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. Semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependences from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.