42 results for Regression discontinuity
Abstract:
When actuaries are faced with the problem of pricing an insurance contract that contains different types of coverage, such as a motor insurance or homeowner's insurance policy, they usually assume that types of claim are independent. However, this assumption may not be realistic: several studies have shown that there is a positive correlation between types of claim. Here we introduce different regression models in order to relax the independence assumption, including zero-inflated models to account for excess of zeros and overdispersion. These models have been largely ignored for multivariate Poisson data, mainly because of their computational difficulties. Bayesian inference based on MCMC helps to solve this problem (and also lets us derive, for several quantities of interest, posterior summaries to account for uncertainty). Finally, these models are applied to an automobile insurance claims database with three different types of claims. We analyse the consequences for pure and loaded premiums when the independence assumption is relaxed by using different multivariate Poisson regression models and their zero-inflated versions.
Abstract:
In a recent paper Bermúdez [2009] used bivariate Poisson regression models for ratemaking in car insurance, and included zero-inflated models to account for the excess of zeros and the overdispersion in the data set. In the present paper, we revisit this model in order to consider alternatives. We propose a 2-finite mixture of bivariate Poisson regression models to demonstrate that the overdispersion in the data requires more structure if it is to be taken into account, and that a simple zero-inflated bivariate Poisson model does not suffice. At the same time, we show that a finite mixture of bivariate Poisson regression models embraces zero-inflated bivariate Poisson regression models as a special case. Additionally, we describe a model in which the mixing proportions are dependent on covariates when modelling the way in which each individual belongs to a separate cluster. Finally, an EM algorithm is provided in order to ensure the models’ ease-of-fit. These models are applied to the same automobile insurance claims data set as used in Bermúdez [2009] and it is shown that the modelling of the data set can be improved considerably.
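The special-case claim in this abstract can be sketched as follows. This is an illustrative derivation under assumed notation (the symbols f_BP, p and the parameter vectors are not taken from the paper): a two-component mixture of bivariate Poisson distributions collapses to a zero-inflated bivariate Poisson when the parameters of one component shrink to zero, because that component then puts all its mass on (0, 0).

```latex
% Assumed notation: f_{BP} is a bivariate Poisson pmf with parameter
% vector \lambda^{(k)}, and p \in [0, 1] is the mixing proportion.
f(y_1, y_2) = p \, f_{BP}(y_1, y_2; \lambda^{(1)})
            + (1 - p) \, f_{BP}(y_1, y_2; \lambda^{(2)})

% As every entry of \lambda^{(2)} tends to zero, the second component
% degenerates to a point mass at the origin:
\lim_{\lambda^{(2)} \to 0} f_{BP}(y_1, y_2; \lambda^{(2)})
  = \mathbf{1}\{ y_1 = 0,\ y_2 = 0 \}

% The mixture then takes the zero-inflated form:
f_{ZIBP}(y_1, y_2) = p \, f_{BP}(y_1, y_2; \lambda^{(1)})
                   + (1 - p) \, \mathbf{1}\{ y_1 = 0,\ y_2 = 0 \}
```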
Abstract:
This article focuses on business risk management in the insurance industry. A methodology for estimating the profit loss caused by each customer in the portfolio due to policy cancellation is proposed. Using data from a European insurance company, customer behaviour over time is analyzed in order to estimate the probability of policy cancellation and the resulting potential profit loss due to cancellation. Customers may have up to two different lines of business contracts: motor insurance and other diverse insurance (such as home contents, life or accident insurance). Implications for understanding customer cancellation behaviour as the core of business risk management are outlined.
Abstract:
Time series regression models are especially suitable in epidemiology for evaluating short-term effects of time-varying exposures on health. The problem is that the potential for confounding in time series regression is very high. Thus, it is important that trend and seasonality are properly accounted for. Our paper reviews the statistical models commonly used in time-series regression, especially those allowing for serial correlation, which makes them potentially useful for selected epidemiological purposes. In particular, we discuss the use of time-series regression for counts using a wide range of Generalised Linear Models as well as Generalised Additive Models. In addition, critical points in using statistical software for GAM were recently stressed, and reanalyses of time series data on air pollution and health were performed in order to update already published results. Applications are offered through an example on the relationship between asthma emergency admissions and photochemical air pollutants.
Abstract:
In CoDaWork’05, we presented an application of discriminant function analysis (DFA) to 4 different compositional datasets and modelled the first canonical variable using a segmented regression model solely based on an observation about the scatter plots. In this paper, multiple linear regressions are applied to different datasets to confirm the validity of our proposed model. In addition to dating the unknown tephras by calibration as discussed previously, another method of mapping the unknown tephras into samples of the reference set or missing samples in between consecutive reference samples is proposed. The application of these methodologies is demonstrated with both simulated and real datasets. This new proposed methodology provides an alternative, more acceptable approach for geologists as their focus is on mapping the unknown tephra with relevant eruptive events rather than estimating the age of unknown tephra. Key words: Tephrochronology; Segmented regression
Abstract:
This paper performs an empirical Decomposition of International Inequality in Ecological Footprint in order to quantify to what extent explanatory variables such as a country’s affluence, economic structure, demographic characteristics, climate and technology contributed to international differences in terms of natural resource consumption during the period 1993-2007. We use a Regression-Based Inequality Decomposition approach. As a result, the methodology extends qualitatively the results obtained in standard environmental impact regressions as it comprehends further social dimensions of the Sustainable Development concept, i.e. equity within generations. The results obtained point to prioritizing policies that take into account both future and present generations.
Abstract:
This paper performs an empirical Decomposition of International Inequality in Ecological Footprint in order to quantify to what extent explanatory variables such as a country’s affluence, economic structure, demographic characteristics, climate and technology contributed to international differences in terms of natural resource consumption during the period 1993-2007. We use a Regression-Based Inequality Decomposition approach. As a result, the methodology extends qualitatively the results obtained in standard environmental impact regressions as it comprehends further social dimensions of the Sustainable Development concept, i.e. equity within generations. The results obtained point to prioritizing policies that take into account both future and present generations. Keywords: Ecological Footprint Inequality, Regression-Based Inequality Decomposition, Intragenerational equity, Sustainable development.
Abstract:
We consider the application of normal theory methods to the estimation and testing of a general type of multivariate regression models with errors-in-variables, in the case where various data sets are merged into a single analysis and the observable variables deviate possibly from normality. The various samples to be merged can differ on the set of observable variables available. We show that there is a convenient way to parameterize the model so that, despite the possible non-normality of the data, normal-theory methods yield correct inferences for the parameters of interest and for the goodness-of-fit test. The theory described encompasses both the functional and structural model cases, and can be implemented using standard software for structural equations models, such as LISREL, EQS, LISCOMP, among others. An illustration with Monte Carlo data is presented.
Abstract:
Random coefficient regression models have been applied in different fields and they constitute a unifying setup for many statistical problems. The nonparametric study of this model started with Beran and Hall (1992) and it has become a fruitful framework. In this paper we propose and study statistics for testing a basic hypothesis concerning this model: the constancy of coefficients. The asymptotic behavior of the statistics is investigated and bootstrap approximations are used in order to determine the critical values of the test statistics. A simulation study illustrates the performance of the proposals.
Abstract:
The paper develops a method to solve higher-dimensional stochastic control problems in continuous time. A finite difference type approximation scheme is used on a coarse grid of low discrepancy points, while the value function at intermediate points is obtained by regression. The stability properties of the method are discussed, and applications are given to test problems of up to 10 dimensions. Accurate solutions to these problems can be obtained on a personal computer.
Abstract:
In the fixed design regression model, additional weights are considered for the Nadaraya-Watson and Gasser-Müller kernel estimators. We study their asymptotic behavior and the relationships between new and classical estimators. For a simple family of weights, and considering the IMSE as global loss criterion, we show some possible theoretical advantages. An empirical study illustrates the performance of the weighted estimators in finite samples.
Abstract:
In this paper we examine the determinants of wages and decompose the observed differences across genders into the "explained by different characteristics" and "explained by different returns" components using a sample of Spanish workers. Apart from the conditional expectation of wages, we estimate the conditional quantile functions for men and women and find that both the absolute wage gap and the part attributed to different returns at each of the quantiles, far from being well represented by their counterparts at the mean, are greater as we move up in the wage range.
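The decomposition at the mean described in this abstract can be illustrated with a minimal sketch. It assumes simulated data, a single hypothetical regressor (years of education), and the men's coefficients as the reference for pricing characteristics. None of these choices come from the paper, and the quantile version it studies is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def simulate(n, mean_educ, beta):
    """Hypothetical log-wage data: a constant plus one regressor (education)."""
    educ = rng.normal(mean_educ, 2.0, n)
    X = np.column_stack([np.ones(n), educ])
    y = X @ beta + rng.normal(0.0, 0.3, n)
    return X, y

# Simulated samples; all parameter values are illustrative assumptions.
X_m, y_m = simulate(500, 12.0, np.array([1.0, 0.10]))
X_f, y_f = simulate(500, 11.5, np.array([0.9, 0.09]))

b_m, b_f = ols(X_m, y_m), ols(X_f, y_f)
xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)

# Raw gap in mean log wages.
gap = y_m.mean() - y_f.mean()
# Part explained by different characteristics, priced at the men's returns.
explained = (xbar_m - xbar_f) @ b_m
# Part explained by different returns, evaluated at the women's characteristics.
unexplained = xbar_f @ (b_m - b_f)

print(f"gap={gap:.4f} explained={explained:.4f} returns={unexplained:.4f}")
```

Because each regression includes a constant, the two components add up exactly to the raw gap in means. Using the women's coefficients as the reference would give a different, equally valid split.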
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for LR and NN models. Both models were developed with cross validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared with error rate and the area under receiver operating characteristic curves (Az). The neural network obtained statistically higher Az than LR with cross validation. The remaining resampling validation methods did not reveal statistically significant differences between LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
Abstract:
This paper proposes a common and tractable framework for analyzing different definitions of fixed and random effects in a constant-slope variable-intercept model. It is shown that, regardless of whether effects (i) are treated as parameters or as an error term, (ii) are estimated in different stages of a hierarchical model, or whether (iii) correlation between effects and regressors is allowed, when the same information on effects is introduced into all estimation methods, the resulting slope estimator is also the same across methods. If different methods produce different results, it is ultimately because different information is being used for each method.
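One narrow, well-known instance of this equivalence can be checked numerically. The sketch below uses simulated panel data with hypothetical parameter values and is not the paper's framework. It compares the slope from a regression with one dummy per unit (effects treated as parameters) against the slope from the within transformation (effects removed by demeaning).

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods = 30, 5
unit = np.repeat(np.arange(n_units), n_periods)  # unit index for each row

# Simulated panel; the regressor is deliberately correlated with the effects.
alpha = rng.normal(0.0, 1.0, n_units)
x = rng.normal(0.0, 1.0, unit.size) + 0.5 * alpha[unit]
y = alpha[unit] + 2.0 * x + rng.normal(0.0, 0.5, unit.size)

# (a) Effects as parameters: regress y on x plus one dummy per unit.
D = (unit[:, None] == np.arange(n_units)).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
slope_lsdv = coef[0]

# (b) Within estimator: subtract unit means, then regress without a constant.
def demean(v):
    """Subtract each unit's own mean from its observations."""
    means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - means[unit]

x_w, y_w = demean(x), demean(y)
slope_within = (x_w @ y_w) / (x_w @ x_w)

print(slope_lsdv, slope_within)
```

The two slopes agree up to floating-point error, since both procedures use the same information about the unit effects. The example does not cover the random-effects or hierarchical cases the abstract also mentions.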