37 results for Non-parametric methods

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance: 100.00%

Abstract:

We present a real data set of claim amounts in which costs related to damage are recorded separately from those related to medical expenses. Only claims with positive costs are considered here. Two approaches to density estimation are presented: a classical parametric method and a semi-parametric method based on transformation kernel density estimation. We explore the data set with standard univariate methods. We also propose ways to select the bandwidth and transformation parameters in the univariate case based on Bayesian methods. We indicate how to compare the results of alternative methods, both by looking at the shape of the density over its whole domain and by exploring the density estimates in the right tail.

Relevance: 100.00%

Abstract:

This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and non-parametric methods. The methods are implemented using the statistical package R. The parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The non-parametric analysis involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel-based methods. We use a log-transformation and an optimal transformation, chosen from a class of transformations, that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness-of-fit analysis and, importantly, statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis so that readers, both students and educators, can fully explore the techniques described.
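The log-transformation idea mentioned in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the paper's R, it is not the authors' code, and the claim amounts, the function names and the normal-reference bandwidth rule are all assumptions made for illustration. The density of the log-claims is estimated with a Gaussian kernel and mapped back to the original scale with the change-of-variables factor 1/x.

```python
import math

def gaussian_kde(points, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(points)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(math.exp(-0.5 * ((y - p) / bandwidth) ** 2) for p in points)

    return density

def normal_reference_bandwidth(values):
    """Rule-of-thumb bandwidth 1.06 * sd * n^(-1/5); an assumed default, not the paper's choice."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def log_transform_kde(claims, bandwidth):
    """Density estimate on the original scale via a log transformation.

    If Y = log(X) has density g, then X has density f(x) = g(log x) / x for x > 0.
    """
    g = gaussian_kde([math.log(c) for c in claims], bandwidth)

    def density(x):
        if x <= 0:
            return 0.0  # only positive claim amounts are modelled
        return g(math.log(x)) / x

    return density

# Hypothetical positive claim amounts (made up for this sketch).
claims = [120.0, 340.0, 95.0, 2100.0, 560.0, 15000.0, 780.0, 430.0]
h = normal_reference_bandwidth([math.log(c) for c in claims])
f_hat = log_transform_kde(claims, h)
```

Smoothing on the log scale keeps all the estimated mass on positive amounts and widens the effective bandwidth in the right tail, which is where a fixed-bandwidth kernel estimate on the raw scale tends to be roughest.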

Relevance: 100.00%

Abstract:

Land cover classification is a key research field in remote sensing and land change science as thematic maps derived from remotely sensed data have become the basis for analyzing many socio-ecological issues. However, land cover classification remains a difficult task and it is especially challenging in heterogeneous tropical landscapes where nonetheless such maps are of great importance. The present study aims to establish an efficient classification approach to accurately map all broad land cover classes in a large, heterogeneous tropical area of Bolivia, as a basis for further studies (e.g., land cover-land use change). Specifically, we compare the performance of parametric (maximum likelihood), non-parametric (k-nearest neighbour and four different support vector machines - SVM), and hybrid classifiers, using both hard and soft (fuzzy) accuracy assessments. In addition, we test whether the inclusion of a textural index (homogeneity) in the classifications improves their performance. We classified Landsat imagery for two dates corresponding to dry and wet seasons and found that non-parametric classifiers, and particularly SVM classifiers, outperformed both parametric and hybrid classifiers. We also found that the use of the homogeneity index along with reflectance bands significantly increased the overall accuracy of all the classifications, but particularly of SVM algorithms. We observed that improvements in producer’s and user’s accuracies through the inclusion of the homogeneity index differed between land cover classes. Early-growth/degraded forests, pastures, grasslands and savanna were the classes most improved, especially with the SVM radial basis function and SVM sigmoid classifiers, though with both classifiers all land cover classes were mapped with producer’s and user’s accuracies of around 90%.
Our approach seems very well suited to accurately map land cover in tropical regions, thus having the potential to contribute to conservation initiatives, climate change mitigation schemes such as REDD+, and rural development policies.

Relevance: 100.00%

Abstract:

The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer's disease (AD) detection, and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine, in a pilot study, the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to help improve the diagnosis of AD and of its degree of severity. To this end, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human aspects have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non-invasive, low-cost and free of side effects. The experimental results obtained were very satisfactory and promising for early diagnosis and classification of AD patients.

Relevance: 100.00%

Abstract:

The aim of this paper is to analyse how ownership type has evolved, and what effects it has on bank performance, in those Central and Eastern European countries that have undergone the process of European integration most intensely in recent years. To this end, we analysed 242 banks from 12 countries (10 new EU members and 2 in the negotiation phase). To test for an ownership effect, we analyse the dimensions of banking efficiency, profitability, costs and intermediation, applying a range of parametric and non-parametric techniques. The results show that certain ownership effects exist. Among the main findings, privatised banks tend to show higher profitability than banks under other types of ownership, while foreign-owned banks show the lowest cost levels on average, although this difference is not statistically significant. We also analyse the importance of the presence of a strategic investor in bank ownership, and find an improvement that is not significant for profitability ratios but is significant for general administrative expenses.

Relevance: 100.00%

Abstract:

Within the European project Hydroptimet (INTERREG IIIB-MEDOCC programme), an intercomparison of limited area models (LAMs) is performed for intense events that caused extensive damage to people and territory. As the comparison is limited to single case studies, the work is not meant to provide a measure of the different models' skill, but to identify the key model factors needed for a good forecast of this kind of meteorological phenomenon. This work focuses on the Spanish flash-flood event also known as the "Montserrat-2000" event. The study is performed using forecast data from seven operational LAMs, made available to partners via the Hydroptimet ftp site, and observed data from the Catalonia rain gauge network. To improve the event analysis, satellite rainfall estimates have also been considered. For statistical evaluation of quantitative precipitation forecasts (QPFs), several non-parametric skill scores based on contingency tables have been used. Furthermore, for each model run it has been possible to identify the regions of Catalonia affected by misses and false alarms using contingency table elements. Moreover, the standard "eyeball" analysis of forecast and observed precipitation fields has been supported by the use of a state-of-the-art diagnostic method, the contiguous rain area (CRA) analysis. This method makes it possible to quantify the spatial shift error of the forecast and to identify the error sources that affected each model's forecasts. High-resolution modelling and domain size seem to play a key role in providing a skillful forecast. Further work is needed to support this statement, including verification using a wider observational data set.
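The abstract does not say which contingency-table scores were used, so the following is only a sketch of four widely used ones: probability of detection (POD), false alarm ratio (FAR), critical success index (CSI) and frequency bias. The rainfall values, the 20 mm threshold and the function names are assumptions made for illustration.

```python
def contingency_table(forecast, observed, threshold):
    """Count hits, misses, false alarms and correct negatives for one rain threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for f, o in zip(forecast, observed):
        forecast_yes = f >= threshold
        observed_yes = o >= threshold
        if forecast_yes and observed_yes:
            hits += 1
        elif observed_yes:
            misses += 1
        elif forecast_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def ratio(numerator, denominator):
    """Divide, returning NaN when the score is undefined."""
    return numerator / denominator if denominator else float("nan")

def skill_scores(hits, misses, false_alarms):
    """Common categorical scores; none of them uses the correct negatives."""
    return {
        "POD": ratio(hits, hits + misses),                  # fraction of observed events forecast
        "FAR": ratio(false_alarms, hits + false_alarms),    # fraction of forecast events not observed
        "CSI": ratio(hits, hits + misses + false_alarms),   # threat score
        "BIAS": ratio(hits + false_alarms, hits + misses),  # forecast events / observed events
    }

# Hypothetical 24 h rainfall totals (mm) at eight gauges; values are made up.
forecast = [12, 0, 35, 5, 60, 0, 22, 8]
observed = [15, 2, 10, 0, 55, 30, 25, 1]
hits, misses, false_alarms, correct_negatives = contingency_table(forecast, observed, threshold=20)
scores = skill_scores(hits, misses, false_alarms)
```

Because the counts are kept per gauge, the same loop can also be used to list which locations contributed the misses and false alarms.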

Relevance: 90.00%

Abstract:

Public authorities and road users alike are increasingly concerned by recent trends in road safety outcomes in Barcelona, which is the European city with the highest number of registered Powered Two-Wheel (PTW) vehicles per inhabitant. In this study we explore the determinants of motorcycle and moped accident severity in a large urban area, drawing on Barcelona’s local police database (2002-2008). We apply non-parametric regression techniques to characterize PTW accidents and parametric methods to investigate the factors influencing their severity. Our results show that PTW accident victims are more vulnerable, showing greater degrees of accident severity, than other traffic victims. Speed violations and alcohol consumption lead to the worst health outcomes. Demographic and environment-related risk factors, in addition to helmet use, play an important role in determining accident severity. Thus, this study furthers our understanding of the most vulnerable vehicle types, while our results have direct implications for local policy makers in their fight to reduce the severity of PTW accidents in large urban areas.

Relevance: 90.00%

Abstract:

This paper presents a comparative analysis of linear and mixed models for short-term forecasting of a real data series with a high percentage of missing data. The data are the series of significant wave heights recorded at regular three-hour intervals by a buoy placed in the Bay of Biscay. The series is interpolated with a linear predictor which minimizes the forecast mean square error. The linear models are seasonal ARIMA models, and the mixed models have a linear component and a non-linear seasonal component. The non-linear component is estimated by a non-parametric regression of the data versus time. Short-term forecasts, no more than two days ahead, are of interest because they can be used by the port authorities to notify the fleet. Several models are fitted and compared by their forecasting behavior.
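The abstract does not name the non-parametric regression estimator it uses. As one common possibility, the sketch below shows a Nadaraya-Watson kernel regression of a series against time that simply skips missing readings. The wave heights, the 3-hour bandwidth and the function name are made up for illustration.

```python
import math

def kernel_regression(times, values, bandwidth):
    """Nadaraya-Watson estimate of E[value | time] with a Gaussian kernel.

    Readings recorded as None (missing) are dropped before smoothing.
    """
    observed = [(t, v) for t, v in zip(times, values) if v is not None]

    def estimate(t):
        weights = [math.exp(-0.5 * ((t - ti) / bandwidth) ** 2) for ti, _ in observed]
        total = sum(weights)
        if total == 0.0:
            return float("nan")  # no observation close enough to carry weight
        return sum(w * v for w, (_, v) in zip(weights, observed)) / total

    return estimate

# Hypothetical significant wave heights (m) every three hours, with gaps.
hours = [0, 3, 6, 9, 12, 15, 18]
heights = [1.2, None, 1.5, 1.9, None, 2.2, 1.8]
smooth = kernel_regression(hours, heights, bandwidth=3.0)
filled = [smooth(t) for t in hours]  # smoothed values, including at the gaps
```

The estimate at any time is a weighted average of the observed readings, so it always lies between the smallest and largest observed value; the bandwidth controls how quickly the weights decay with distance in time.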

Relevance: 90.00%

Abstract:

Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessing the {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk. The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover. Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent, and even possess near optimal rates of convergence, when each model class has an infinite VC or pseudo dimension. For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.

Relevance: 90.00%

Abstract:

A parametric procedure for the blind inversion of nonlinear channels is proposed, based on a recent method of blind source separation in nonlinear mixtures. Experiments show that the proposed algorithms perform efficiently, even in the presence of hard distortion. The method, based on the minimization of the output mutual information, requires knowledge of the log-derivative of the input distribution (the so-called score function). Each algorithm consists of three adaptive blocks: one devoted to adaptive estimation of the score function, and two other blocks estimating the inverses of the linear and nonlinear parts of the channel, (quasi-)optimally adapted using the estimated score functions. This paper is mainly concerned with the nonlinear part, for which we propose two parametric models, the first based on a polynomial model and the second on a neural network, while [14, 15] proposed non-parametric approaches.

Relevance: 90.00%

Abstract:

We propose new methods for evaluating predictive densities that focus on the models' actual predictive ability in finite samples. The tests offer a simple way of evaluating the correct specification of predictive densities, either parametric or non-parametric. The results indicate that our tests are well sized and have good power in detecting mis-specification in predictive densities. An empirical application to the Survey of Professional Forecasters and a baseline Dynamic Stochastic General Equilibrium model shows the usefulness of our methodology.

Relevance: 90.00%

Abstract:

Cirrhosis is the final stage of most chronic liver diseases, and is almost invariably complicated by portal hypertension, which is the most important cause of morbidity and mortality in these patients. This review focuses on the non-invasive methods currently used in clinical practice for diagnosing liver cirrhosis and portal hypertension. The first-line techniques include physical examination, laboratory parameters, transient elastography and Doppler-US. More sophisticated imaging methods, which are less commonly employed, are CT scan and MRI, and new technologies currently under evaluation are MR elastography and acoustic radiation force imaging (ARFI). Although none of them can replace the invasive measurement of hepatic venous pressure gradient and the endoscopic screening of gastroesophageal varices, they notably facilitate the clinical management of patients with cirrhosis and portal hypertension, and provide valuable prognostic information.

Relevance: 90.00%

Abstract:

Background: To determine generic utilities for Spanish chronic obstructive pulmonary disease (COPD) patients stratified by different classifications: GOLD 2007, GOLD 2013, GesEPOC 2012 and BODEx index. Methods: Multicentre, observational, cross-sectional study. Patients were aged ≥40 years, with spirometrically confirmed COPD. Utility values were derived from EQ-5D-3L. Means, standard deviations (SD), medians and interquartile ranges (IQR) were computed based on the different classifications. Differences in median utilities between groups were assessed by non-parametric tests. Results: 346 patients were included, of which 85.5% were male with a mean age of 67.9 (SD = 9.7) years and a mean duration of COPD of 7.6 (SD = 5.8) years; 80.3% were ex-smokers and the mean smoking history was 54.2 (SD = 33.2) pack-years. Median utilities (IQR) by GOLD 2007 were 0.87 (0.22) for moderate; 0.80 (0.26) for severe and 0.67 (0.42) for very-severe patients (p < 0.001 for all comparisons). Median utilities by GOLD 2013 were group A: 1.0 (0.09); group B: 0.87 (0.13); group C: 1.0 (0.16); group D: 0.74 (0.29); comparisons were statistically significant (p < 0.001) except A vs C. Median utilities by GesEPOC phenotypes were 0.84 (0.33) for non-exacerbator; 0.80 (0.26) for COPD-asthma overlap; 0.71 (0.62) for exacerbator with emphysema; 0.72 (0.57) for exacerbator with chronic bronchitis (p < 0.001). Comparisons between patients with or without exacerbations and between patients with COPD-asthma overlap and exacerbator with chronic bronchitis were statistically significant (p < 0.001). Median utilities by BODEx index were: group 0-2: 0.89 (0.20); group 3-4: 0.80 (0.27); group 5-6: 0.67 (0.29); group 7-9: 0.41 (0.31). All comparisons were significant (p < 0.001) except between groups 3-4 and 5-6. Conclusion: Irrespective of the classification used, utilities were associated with disease severity.
Some clinical phenotypes were associated with worse utilities, probably related to a higher frequency of exacerbations. GOLD 2007 guidelines and BODEx index better discriminated patients with a worse health status than GOLD 2013 guidelines, while GOLD 2013 guidelines were better able to identify a smaller group of patients with the best health.

Relevance: 80.00%

Abstract:

Inductive learning aims at finding general rules that hold true in a database. Targeted learning seeks rules for predicting the value of a variable based on the values of others, as in the case of linear or non-parametric regression analysis. Non-targeted learning finds regularities without a specific prediction goal. We model the product of non-targeted learning as rules that state that a certain phenomenon never happens, or that certain conditions necessitate another. For all types of rules, there is a trade-off between the rule's accuracy and its simplicity. Thus rule selection can be viewed as a choice problem among pairs of degree of accuracy and degree of complexity. However, one cannot in general tell what the feasible set in the accuracy-complexity space is. Formally, we show that finding out whether a point belongs to this set is computationally hard. In particular, in the context of linear regression, finding a small set of variables that attains a certain value of R² is computationally hard. Computational complexity may explain why a person is not always aware of rules that, if asked, she would find valid. This, in turn, may explain why one can change other people's minds (opinions, beliefs) without providing new information.

Relevance: 80.00%

Abstract:

Our objective is to analyse fraud as an operational risk for the insurance company. We study the effect of a fraud detection policy on the insurer's results account, quantifying the loss risk from the perspective of claims auditing. From the point of view of operational risk, the study aims to analyse the effect of failing to detect fraudulent claims after investigation. We have chosen Value at Risk (VaR) as the risk measure, with a non-parametric estimation of the loss risk involved in the detection or non-detection of fraudulent claims. The most relevant conclusion is that auditing claims reduces loss risk in the insurance company.
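A non-parametric risk estimate of the kind mentioned in this abstract can be sketched as an empirical quantile of observed losses. This is only an illustration under stated assumptions: the abstract does not describe the estimator actually used, and the loss figures, the 85% level and the function name are made up.

```python
import math

def empirical_var(losses, level):
    """Empirical Value at Risk: the smallest observed loss x such that
    at least a fraction `level` of the sample is less than or equal to x."""
    ordered = sorted(losses)
    k = math.ceil(level * len(ordered))  # smallest rank k with k / n >= level
    return ordered[max(k, 1) - 1]

# Hypothetical losses per claim from undetected fraud, with and without auditing.
losses_with_audit = [0, 0, 0, 0, 150, 0, 900, 0, 2400, 0]
losses_without_audit = [0, 0, 300, 0, 150, 0, 900, 1200, 2400, 0]

var_with_audit = empirical_var(losses_with_audit, 0.85)
var_without_audit = empirical_var(losses_without_audit, 0.85)
```

No distribution is fitted to the losses: the estimate is read directly off the sorted sample, which is why it is called non-parametric. With samples this small the estimate is very coarse, so the sketch only shows the mechanics of the comparison.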