130 results for semi-parametric estimation
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
In the mid-1980s, many European countries introduced fixed-term contracts. Since then, their labor markets have become more dynamic. This paper studies the implications of such reforms for the duration distribution of unemployment, with particular emphasis on changes in duration dependence. I estimate a parametric duration model using cross-sectional data drawn from the Spanish Labor Force Survey from 1980 to 1994 to analyze the chances of leaving unemployment before and after the introduction of fixed-term contracts. I find that duration dependence has increased since the reform. Semi-parametric estimation of the model also shows that for long spells, the probability of leaving unemployment has decreased since the reform.
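The abstract does not specify the parametric duration model; a minimal sketch, assuming a Weibull specification with right-censored spells, is shown below. The shape parameter p captures duration dependence (p < 1: the exit hazard falls as the spell lengthens). All data and parameter values are synthetic illustrations, not figures from the paper.

```python
import numpy as np
from scipy.optimize import minimize

def weibull_nll(theta, t, censored):
    """Negative log-likelihood for Weibull spell durations with right-censoring.

    Hazard h(t) = p * lam^p * t^(p-1); p < 1 means negative duration
    dependence: the exit rate from unemployment falls as the spell lengthens.
    """
    lam, p = np.exp(theta)  # optimize on the log scale to keep both positive
    log_h = np.log(p) + p * np.log(lam) + (p - 1) * np.log(t)
    log_S = -((lam * t) ** p)  # log-survival function
    # Uncensored spells contribute h(t)*S(t); censored spells contribute S(t).
    return -np.sum((1 - censored) * log_h + log_S)

rng = np.random.default_rng(4)
t = rng.weibull(0.7, 3_000) / 0.1        # synthetic spells, true shape p = 0.7
censored = (t > 30.0).astype(float)      # spells still ongoing when observed
t = np.minimum(t, 30.0)

res = minimize(weibull_nll, np.zeros(2), args=(t, censored), method="Nelder-Mead")
p_hat = np.exp(res.x[1])                 # estimated duration-dependence parameter
```

With p_hat below one, the fitted hazard declines with spell length, which is the pattern of duration dependence the abstract discusses.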
Abstract:
In this paper we analyse the observed systematic differences in costs for teaching hospitals (TH henceforth) in Spain. Concern has been voiced about a bias in the financing of THs now that prospective budgets are in the arena of hospital finance, and claims for an adjustment that takes into account the legitimate extra costs of teaching on hospital expenditure are well grounded. We focus on estimating the impact of teaching status on average cost. We used a version of a multiproduct hospital cost function taking into account some relevant factors from which to derive the observed differences. We assume that the relationship between the explanatory and the dependent variables follows a flexible form for each of the explanatory variables. We also model the underlying covariance structure of the data, assuming two qualitatively different sources of variation: random effects and serial correlation. Random variation refers both to general level variation (through the random intercept) and to variation specifically related to teaching status. We postulate that the impact of the random effects is predominant over that of the serial correlation effects. The model is estimated by restricted maximum likelihood. Our results show that costs are 9% higher (15% in the case of median costs) in teaching than in non-teaching hospitals. That is, teaching status legitimately explains no more than half of the observed difference in actual costs. The impact of the teaching factor on costs depends on the number of residents, with an increase of 51.11% per resident for hospitals with fewer than 204 residents (the third quartile of the number of residents) and 41.84% for hospitals with more than 204 residents. In addition, the estimated dispersion is higher among teaching hospitals. As a result, given the considerable observed heterogeneity, results should be interpreted with caution.
From a policy-making point of view, we conclude that, since a higher relative burden of medical training falls on public hospitals, an explicit adjustment for the extra costs that the teaching factor imposes on hospital finance is needed before hospital competition for inpatient services takes place.
Abstract:
We present a real data set of claim amounts in which costs related to damage are recorded separately from those related to medical expenses. Only claims with positive costs are considered here. Two approaches to density estimation are presented: a classical parametric method and a semi-parametric method based on transformation kernel density estimation. We explore the data set with standard univariate methods. We also propose Bayesian methods for selecting the bandwidth and transformation parameters in the univariate case. We indicate how to compare the results of alternative methods, both by looking at the shape of the density over its whole domain and by examining the density estimates in the right tail.
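The transformation kernel idea can be sketched as follows: estimate the density on a transformed, roughly symmetric scale and map it back with the change-of-variables Jacobian. This minimal sketch uses a plain log transform and a Gaussian KDE; the paper's actual transformation family and Bayesian bandwidth selection are not reproduced, and the claim amounts are synthetic.

```python
import numpy as np
from scipy.stats import gaussian_kde

def log_transform_kde(costs, grid):
    """Transformation kernel density estimate for positive, right-skewed costs.

    A KDE is fitted to Y = log(X); the density is mapped back with the
    change-of-variables Jacobian: f_X(x) = f_Y(log x) / x.
    """
    kde = gaussian_kde(np.log(costs))   # default (Scott's rule) bandwidth
    return kde(np.log(grid)) / grid

rng = np.random.default_rng(0)
claims = rng.lognormal(mean=7.0, sigma=1.2, size=500)  # synthetic positive claims
grid = np.geomspace(claims.min(), claims.max(), 200)   # log-spaced evaluation grid
density = log_transform_kde(claims, grid)
```

The log-spaced grid resolves both the sharp mode and the long right tail, which is where a plain KDE on the raw scale tends to oversmooth.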
Abstract:
Drawing on PISA data of 2006, this study examines the impact of socio-economic school composition on science test score achievement for Spanish students in compulsory secondary schools. We define school composition in terms of the average parental human capital of students in the same school. These contextual peer effects are estimated using a semi-parametric methodology, which enables the spillovers to affect all the parameters of the educational production function. We also deal with the potential problem of self-selection of students into schools, using an artificial sorting that we argue is independent of unobserved student abilities. The results indicate that the association between socio-economic school composition and test score results is clearly positive and significantly higher when computed with the semi-parametric approach. However, we find that the endogenous sorting of students into schools plays a fundamental role, given that the spillovers are significantly reduced when this selection process is ruled out of our measure of school composition effects. Specifically, the estimations suggest that the contextual peer effects are moderately positive only in those schools where the socio-economic composition is considerably elevated. In addition, we find some evidence of asymmetry in how the external effects and the sorting process actually operate, which seem to affect males and females, as well as high- and low-performing students, in different ways.
Abstract:
Our objective is to analyse fraud as an operational risk for the insurance company. We study the effect of a fraud detection policy on the insurer's results, quantifying the loss risk from the perspective of claims auditing. From the point of view of operational risk, the study aims to analyse the effect of failing to detect fraudulent claims after investigation. We have chosen VaR as the risk measure, with a non-parametric estimation of the loss risk involved in the detection or non-detection of fraudulent claims. The most relevant conclusion is that auditing claims reduces the insurer's loss risk.
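In its simplest form, a non-parametric VaR estimate is just an empirical quantile of the loss distribution. A minimal sketch with synthetic audited and unaudited loss samples follows; the exponential loss scales are illustrative assumptions, not figures from the paper.

```python
import numpy as np

def empirical_var(losses, alpha=0.95):
    """Non-parametric Value-at-Risk: the empirical alpha-quantile of losses."""
    return float(np.quantile(np.asarray(losses), alpha))

rng = np.random.default_rng(1)
# Hypothetical loss samples: auditing claims lowers the scale of losses.
losses_audited = rng.exponential(scale=100.0, size=10_000)
losses_unaudited = rng.exponential(scale=150.0, size=10_000)

var_audited = empirical_var(losses_audited)      # close to 100 * ln(20), about 300
var_unaudited = empirical_var(losses_unaudited)  # close to 150 * ln(20), about 449
```

Comparing the two quantiles gives the kind of reduction in loss risk that the abstract attributes to the auditing policy.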
Abstract:
Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
Factors affecting hospital admission and recovery stay duration of in-patient motor victims in Spain
Abstract:
Hospital expenses are a major cost driver of healthcare systems in Europe, with motor injuries being the leading mechanism of hospitalization. This paper investigates the injury characteristics that explain the hospitalization of victims of traffic accidents that took place in Spain. Using a motor insurance database with 16,081 observations, a generalized Tobit regression model is applied to analyse the factors that influence both the likelihood of being admitted to hospital after a motor collision and the length of hospital stay in the event of admission. The consistency of Tobit estimates relies on the normality of the perturbation terms. Here a semi-parametric regression model was fitted to test the consistency of the estimates, concluding that a normal distribution of errors cannot be rejected. Among other results, it was found that older men with fractures and injuries located in the head and lower torso are more likely to be hospitalized after the collision, and that they also have a longer expected length of hospital recovery stay.
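The admission (selection) part of a generalized Tobit can be sketched as a probit fitted by maximum likelihood. The covariates, coefficients, and data below are hypothetical, and the length-of-stay equation is omitted; this is a sketch of the model structure, not the paper's estimation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def probit_nll(beta, X, y):
    """Negative log-likelihood of a probit model P(y=1|x) = Phi(x'beta)."""
    p = np.clip(norm.cdf(X @ beta), 1e-10, 1 - 1e-10)  # avoid log(0)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(2)
n = 2_000
age = rng.uniform(18, 80, n)                    # hypothetical covariates
fracture = rng.integers(0, 2, n).astype(float)
X = np.column_stack([np.ones(n), age, fracture])

# Synthetic admission outcome: probability rises with age and with a fracture.
y = (-2.0 + 0.02 * age + 1.0 * fracture + rng.normal(size=n) > 0).astype(float)

beta_hat = minimize(probit_nll, np.zeros(3), args=(X, y), method="BFGS").x
```

The fitted coefficients recover the signs assumed in the synthetic data: older victims and those with fractures are more likely to be admitted, mirroring the direction of the paper's findings.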
Abstract:
This comment corrects the errors in the estimation process that appear in Martins (2001). The first error is in the parametric probit estimation, as the previously presented results do not maximize the log-likelihood function. In the global maximum more variables become significant. As for the semiparametric estimation method, the kernel function used in Martins (2001) can take on both positive and negative values, which implies that the participation probability estimates may be outside the interval [0,1]. We have solved the problem by applying local smoothing in the kernel estimation, as suggested by Klein and Spady (1993).
Abstract:
This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and non-parametric methods. The methods are implemented using the statistical package R. Parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The non-parametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel-based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness-of-fit analysis and, importantly, statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis, so that readers, both students and educators, can fully explore the techniques described.
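The parametric side of such an analysis, fitting a lognormal severity distribution and checking its fit, can be sketched in a few lines. The paper's own code is in R (see its Appendix); the equivalent steps in Python, on synthetic claim data, are:

```python
import numpy as np
from scipy.stats import lognorm, kstest

rng = np.random.default_rng(3)
claims = rng.lognormal(mean=6.0, sigma=1.0, size=1_000)  # synthetic damage costs

# Lognormal MLE reduces to a normal fit on the log scale.
mu_hat = np.log(claims).mean()
sigma_hat = np.log(claims).std()

# Goodness of fit: Kolmogorov-Smirnov distance to the fitted lognormal.
fitted = lognorm(s=sigma_hat, scale=np.exp(mu_hat))
ks_stat, ks_pvalue = kstest(claims, fitted.cdf)
```

Note that because the parameters are estimated from the same sample, the KS p-value is conservative; the paper's classroom material treats goodness-of-fit analysis in more detail.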
Abstract:
A new parametric minimum distance time-domain estimator for ARFIMA processes is introduced in this paper. The proposed estimator minimizes the sum of squared correlations of residuals obtained after filtering a series through ARFIMA parameters. The estimator is easy to compute and is consistent and asymptotically normally distributed for fractionally integrated (FI) processes with an integration order d strictly greater than -0.75. Therefore, it can be applied to both stationary and non-stationary processes. Deterministic components are also allowed in the DGP. Furthermore, as a by-product, the estimation procedure provides an immediate check on the adequacy of the specified model. This is so because the criterion function, when evaluated at the estimated values, coincides with the Box-Pierce goodness of fit statistic. Empirical applications and Monte Carlo simulations supporting the analytical results and showing the good performance of the estimator in finite samples are also provided.
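For the simplest ARFIMA(0, d, 0) case, the criterion (a sum of squared residual autocorrelations, which scaled by n is the Box-Pierce statistic) can be sketched as follows. The truncation lengths, lag count, and simulated series are illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def frac_diff_coefs(d, n_coef):
    """AR(infinity) expansion coefficients of the fractional filter (1 - L)^d."""
    c = np.ones(n_coef)
    for k in range(1, n_coef):
        c[k] = c[k - 1] * (k - 1 - d) / k
    return c

def criterion(d, x, m=20, trunc=200):
    """Sum of squared autocorrelations of residuals after filtering by (1-L)^d.

    Multiplied by the sample size, this is the Box-Pierce statistic of the
    filtered residuals, which is why the criterion doubles as a model check.
    """
    e = np.convolve(x, frac_diff_coefs(d, trunc))[:len(x)][trunc:]  # drop burn-in
    e = e - e.mean()
    acf = np.array([e[k:] @ e[:-k] for k in range(1, m + 1)]) / (e @ e)
    return np.sum(acf ** 2)

# Simulate ARFIMA(0, 0.3, 0) by applying the inverse filter (1 - L)^(-0.3).
rng = np.random.default_rng(5)
n = 2_000
x = np.convolve(rng.normal(size=n), frac_diff_coefs(-0.3, n))[:n]

d_hat = minimize_scalar(criterion, bounds=(-0.4, 0.49), args=(x,),
                        method="bounded").x
```

The minimized criterion, times the residual sample size, can then be compared against a chi-squared reference as in a Box-Pierce adequacy check.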
Abstract:
Application of semi-distributed hydrological models to large, heterogeneous watersheds poses several problems. On the one hand, the spatial and temporal variability in catchment features should be adequately represented in the model parameterization, while maintaining model complexity at an acceptable level to take advantage of state-of-the-art calibration techniques. On the other hand, model complexity enhances uncertainty in adjusted model parameter values, therefore increasing uncertainty in the water routing across the watershed. This is critical for water quality applications, where not only streamflow, but also a reliable estimation of the surface versus subsurface contributions to the runoff is needed. In this study, we show how a regularized inversion procedure combined with a multiobjective function calibration strategy successfully solves the parameterization of a complex application of a water quality-oriented hydrological model. The final values of several optimized parameters showed significant and consistent differences across geological and landscape features. Although the number of optimized parameters was significantly increased by the spatial and temporal discretization of adjustable parameters, the uncertainty in water routing results remained at reasonable values. In addition, a stepwise numerical analysis showed that the effects on calibration performance due to the inclusion of different data types in the objective function could be inextricably linked. Thus, caution should be taken when adding or removing data from an aggregated objective function.
Abstract:
A parametric procedure for the blind inversion of nonlinear channels is proposed, based on a recent method of blind source separation in nonlinear mixtures. Experiments show that the proposed algorithms perform efficiently, even in the presence of hard distortion. The method, based on the minimization of the output mutual information, requires knowledge of the log-derivative of the input distribution (the so-called score function). Each algorithm consists of three adaptive blocks: one devoted to adaptive estimation of the score function, and two other blocks estimating the inverses of the linear and nonlinear parts of the channel, (quasi-)optimally adapted using the estimated score functions. This paper is mainly concerned with the nonlinear part, for which we propose two parametric models, the first based on a polynomial model and the second on a neural network, whereas [14, 15] proposed non-parametric approaches.
Abstract:
This paper proposes a spatial filtering technique for the reception of pilot-aided multirate multicode direct-sequence code division multiple access (DS/CDMA) systems such as wideband CDMA (WCDMA). These systems introduce a code-multiplexed pilot sequence that can be used for the estimation of the filter weights, but the presence of the traffic signal (transmitted at the same time as the pilot sequence) corrupts that estimation and degrades the performance of the filter significantly. This is caused by the fact that although the traffic and pilot signals are usually designed to be orthogonal, the frequency selectivity of the channel degrades this orthogonality at the receiving end. Here, we propose a semi-blind technique that eliminates the self-noise caused by the code-multiplexing of the pilot. We derive analytically the asymptotic performance of both the training-only and the semi-blind techniques and compare them with the actual simulated performance. It is shown, both analytically and via simulation, that high gains can be achieved with respect to training-only-based techniques.
Abstract:
This paper analyzes the asymptotic performance of maximum likelihood (ML) channel estimation algorithms in wideband code division multiple access (WCDMA) scenarios. We concentrate on systems with periodic spreading sequences (period larger than or equal to the symbol span) where the transmitted signal contains a code division multiplexed pilot for channel estimation purposes. First, the asymptotic covariances of the training-only, semi-blind conditional maximum likelihood (CML) and semi-blind Gaussian maximum likelihood (GML) channel estimators are derived. Then, these formulas are further simplified assuming randomized spreading and training sequences under the approximation of high spreading factors and high number of codes. The results provide a useful tool to describe the performance of the channel estimators as a function of basic system parameters such as number of codes, spreading factors, or traffic to training power ratio.
Abstract:
In this paper, the theory of hidden Markov models (HMM) is applied to the problem of blind (without training sequences) channel estimation and data detection. Within an HMM framework, the Baum–Welch (BW) identification algorithm is frequently used to obtain maximum-likelihood (ML) estimates of the corresponding model. However, such a procedure assumes the model (i.e., the channel response) to be static throughout the observation sequence. By introducing a parametric model for time-varying channel responses, a version of the algorithm that is more appropriate for mobile channels [time-dependent Baum–Welch (TDBW)] is derived. To compare algorithm behavior, a set of computer simulations for a GSM scenario is provided. Results indicate that, in comparison with other Baum–Welch (BW) versions of the algorithm, the TDBW approach attains a remarkable enhancement in performance, at the cost of only a moderate increase in computational complexity.