818 results for Robust Regression
Abstract:
This paper uses sequential stochastic dominance procedures to compare the joint distribution of health and income across space and time. To our knowledge, it is the first application of methods that compare multidimensional distributions of income and health using procedures that are robust to aggregation techniques. The paper's approach is more general than comparisons of health gradients and does not require the estimation of health-equivalent incomes. We illustrate the approach by contrasting Canada and the US using comparable data. Canada dominates the US over the lower bidimensional welfare distribution of health and income, though not generally in terms of the unidimensional distribution of health or income. The paper also finds that welfare for both Canadians and Americans has not unambiguously improved during the last decade over the joint distribution of income and health, even though the unidimensional distributions of income clearly improved during that period.
Abstract:
Lean meat percentage (LMP) is an important carcass quality parameter. The aim of this work is to obtain a calibration equation for Computed Tomography (CT) scans using Partial Least Squares (PLS) regression in order to predict the LMP of the carcass and of the different cuts, and to compare two methods for selecting the variables to include in the prediction equation: Variable Importance for Projection (VIP) and stepwise selection. The cross-validated root mean square error of prediction (RMSEPCV) of the LMP obtained with PLS was 0.82% with selection based on VIP values and 0.83% with stepwise selection. Predicting the LMP from a scan of the ham alone gave an RMSEPCV of 0.97%; scanning both the ham and the loin gave an RMSEPCV of 0.90%. The results indicate that for CT data both VIP and stepwise selection are good methods. Moreover, scanning only the ham allowed a good prediction of the LMP of the whole carcass.
Abstract:
Microsatellite instability (MSI) occurs in 10-20% of colorectal tumours and is associated with good prognosis. Here we describe the development and validation of a genomic signature that identifies, with high accuracy, colorectal cancer patients with MSI caused by DNA mismatch repair deficiency. Microsatellite status was determined for 276 stage II and III colorectal tumours. Full-genome expression data was used to identify genes that correlate with MSI status. A subset of these samples (n = 73) had sequencing data for 615 genes available. An MSI gene signature of 64 genes was developed and validated in two independent validation sets: the first consisting of frozen samples from 132 stage II patients, and the second consisting of FFPE samples from the PETACC-3 trial (n = 625). The 64-gene MSI signature identified MSI patients in the first validation set with a sensitivity of 90.3% and an overall accuracy of 84.8%, with an AUC of 0.942 (95% CI, 0.888-0.975). In the second validation set, the signature also showed excellent performance, with a sensitivity of 94.3% and an overall accuracy of 90.6%, with an AUC of 0.965 (95% CI, 0.943-0.988). Besides correctly identifying MSI patients, the gene signature identified a group of MSI-like patients that were MSS by standard assessment but MSI by signature assessment. The MSI signature could be linked to a deficient MMR phenotype, as both MSI and MSI-like patients showed a high mutation frequency (8.2% and 6.4% of 615 genes assayed, respectively) compared with patients classified as MSS (1.6% mutation frequency). The MSI signature showed prognostic power in stage II patients (n = 215), with a hazard ratio of 0.252 (p = 0.0145). Patients with an MSI-like phenotype also had improved survival compared with MSS patients. The MSI signature was translated to a diagnostic microarray and technically and clinically validated in FFPE and frozen samples.
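The sensitivity and overall accuracy figures quoted in this abstract are standard confusion-matrix summaries. As a minimal sketch of how such figures are computed, the Python below uses made-up counts (`tp`, `fn`, `tn`, `fp` are hypothetical and are not the study's data); the function names are illustrative as well.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of truly positive (e.g. MSI) cases that the test flags as positive."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all cases, positive and negative, that the test classifies correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts chosen only for illustration, NOT taken from the paper.
tp, fn, tn, fp = 45, 5, 40, 10
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.1%}")
```

Sensitivity ignores the negative cases entirely, which is why an abstract of this kind typically reports it alongside overall accuracy and an AUC.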
Abstract:
This paper explores the effects of two main sources of innovation, intramural and external R&D, on the productivity level in a sample of 3,267 Catalonian firms. The data set is based on the official innovation survey of Catalonia, which was part of the Spanish sample of CIS4 and covers the years 2002-2004. We compare empirical results obtained with standard OLS and with quantile regression techniques in both manufacturing and service industries. The quantile regression results suggest different patterns for the two innovation sources as we move across conditional quantiles. The elasticity of productivity with respect to intramural R&D activities decreases as we move up to higher productivity levels in both the manufacturing and service sectors, while the effects of external R&D rise in high-technology industries but are more ambiguous in low-technology and knowledge-intensive services.
JEL codes: O300, C100, O140
Keywords: Innovation sources, R&D, Productivity, Quantile Regression
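Quantile regression differs from OLS in the loss it minimizes: instead of squared residuals it uses the asymmetric "check" (pinball) loss, whose minimizer is a conditional quantile. The sketch below illustrates only the intercept-only case, where the minimizer is a sample quantile; the function names and the toy data are assumptions for illustration and have nothing to do with the paper's firm-level data.

```python
def pinball_loss(residual: float, tau: float) -> float:
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(ys: list[float], tau: float) -> float:
    """Constant prediction (restricted to observed values) minimizing total check loss."""
    return min(ys, key=lambda c: sum(pinball_loss(y - c, tau) for y in ys))


# Toy sample with one large outlier (illustrative values only).
ys = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(ys, 0.5))   # median-type fit, insensitive to the outlier
print(best_constant(ys, 0.25))  # lower-quartile-type fit
```

Because the loss grows linearly rather than quadratically, a single extreme observation moves the fitted quantile far less than it would move the mean, which is one reason the method appears in searches for robust regression.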
Abstract:
This work covers two aspects. First, it compares and summarizes the similarities and differences of state-of-the-art feature detectors and descriptors; second, it presents a novel approach for detecting intestinal content (in particular bubbles) in capsule endoscopy images. Feature detectors and descriptors that provide invariance to changes of perspective, scale, signal-to-noise ratio and lighting conditions are an important topic in current research, and the number of possible applications seems limitless. After analysing a selection of approaches presented in the literature, this work investigates their suitability for information extraction in capsule endoscopy images. Finally, a very well performing detector of intestinal content in capsule endoscopy images is presented. Accurate detection of intestinal content is crucial for all kinds of machine learning approaches and other analyses of capsule endoscopy studies, because intestinal content occludes the field of view of the capsule camera and the affected frames therefore need to be excluded from analysis. As a by-product of this investigation, a Feature Analysis Tool with a graphical user interface is presented; it executes and compares the discussed feature detectors and descriptors on arbitrary images with configurable parameters and visualizes their output. The presented bubble classifier is also part of this tool, and if a ground truth is available (it can also be generated with the tool), a detailed visualization of the validation result is produced.
Abstract:
When actuaries are faced with the problem of pricing an insurance contract that contains different types of coverage, such as a motor insurance or homeowner's insurance policy, they usually assume that the types of claim are independent. However, this assumption may not be realistic: several studies have shown that there is a positive correlation between types of claim. Here we introduce different regression models in order to relax the independence assumption, including zero-inflated models to account for excess zeros and overdispersion. These models have been largely ignored for multivariate Poisson data, mainly because of their computational difficulties. Bayesian inference based on MCMC helps to solve this problem (and also lets us derive, for several quantities of interest, posterior summaries to account for uncertainty). Finally, these models are applied to an automobile insurance claims database with three different types of claims. We analyse the consequences for pure and loaded premiums when the independence assumption is relaxed by using different multivariate Poisson regression models and their zero-inflated versions.
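To make the idea of zero inflation concrete, the sketch below implements the univariate zero-inflated Poisson (ZIP) distribution. This is a deliberate simplification: the abstract concerns multivariate Poisson regression models fitted by MCMC, which are not reproduced here. The parameter names `lam` (Poisson mean) and `pi` (extra probability of a structural zero) and the numeric values are assumptions for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Ordinary Poisson probability of observing k claims."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson: with probability pi the count is a structural zero,
    otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zip_mean_var(lam: float, pi: float, kmax: int = 80) -> tuple[float, float]:
    """Mean and variance by direct summation over k = 0..kmax-1 (truncated tail)."""
    probs = [zip_pmf(k, lam, pi) for k in range(kmax)]
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return mean, second - mean**2


mean, var = zip_mean_var(lam=2.0, pi=0.3)
print(f"P(0) Poisson = {poisson_pmf(0, 2.0):.3f}, P(0) ZIP = {zip_pmf(0, 2.0, 0.3):.3f}")
print(f"ZIP mean = {mean:.2f}, variance = {var:.2f}")
```

The variance exceeds the mean whenever `pi > 0`, so a single extra parameter captures both the excess of zeros and the overdispersion mentioned in the abstract.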
Abstract:
Humans can recognize categories of environmental sounds, including vocalizations produced by humans and animals and the sounds of man-made objects. Most neuroimaging investigations of environmental sound discrimination have studied subjects while consciously perceiving and often explicitly recognizing the stimuli. Consequently, it remains unclear to what extent auditory object processing occurs independently of task demands and consciousness. Studies in animal models have shown that environmental sound discrimination at a neural level persists even in anesthetized preparations, whereas data from anesthetized humans has thus far provided null results. Here, we studied comatose patients as a model of environmental sound discrimination capacities during unconsciousness. We included 19 comatose patients treated with therapeutic hypothermia (TH) during the first 2 days of coma, while recording nineteen-channel electroencephalography (EEG). At the level of each individual patient, we applied a decoding algorithm to quantify the differential EEG responses to human vs. animal vocalizations as well as to sounds of living vocalizations vs. man-made objects. Discrimination between vocalization types was accurate in 11 patients and discrimination between sounds from living and man-made sources in 10 patients. At the group level, the results were significant only for the comparison between vocalization types. These results lay the groundwork for disentangling truly preferential activations in response to auditory categories, and the contribution of awareness to auditory category discrimination.
Abstract:
1. Species distribution modelling is used increasingly in both applied and theoretical research to predict how species are distributed and to understand attributes of species' environmental requirements. In species distribution modelling, various statistical methods are used that combine species occurrence data with environmental spatial data layers to predict the suitability of any site for that species. While the number of data sharing initiatives involving species' occurrences in the scientific community has increased dramatically over the past few years, various data quality and methodological concerns related to using these data for species distribution modelling have not been addressed adequately.
2. We evaluated how uncertainty in georeferences and associated locational error in occurrences influence species distribution modelling using two treatments: (1) a control treatment where models were calibrated with original, accurate data and (2) an error treatment where data were first degraded spatially to simulate locational error. To incorporate error into the coordinates, we shifted each coordinate by a random number drawn from a normal distribution with a mean of zero and a standard deviation of 5 km. We evaluated the influence of error on the performance of 10 commonly used distributional modelling techniques applied to 40 species in four distinct geographical regions.
3. Locational error in occurrences reduced model performance in three of these regions; relatively accurate predictions of species distributions were possible for most species, even with degraded occurrences. Two species distribution modelling techniques, boosted regression trees and maximum entropy, were the best performing models in the face of locational errors. The results obtained with boosted regression trees were only slightly degraded by errors in location, and the results obtained with the maximum entropy approach were not affected by such errors.
4. Synthesis and applications. To use the vast array of occurrence data that exists currently for research and management relating to the geographical ranges of species, modellers need to know the influence of locational error on model quality and whether some modelling techniques are particularly robust to error. We show that certain modelling techniques are particularly robust to a moderate level of locational error and that useful predictions of species distributions can be made even when occurrence data include some error.
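The error treatment described in this abstract (each coordinate shifted by a draw from a normal distribution with mean zero and a standard deviation of 5 km) can be sketched as follows. The sketch assumes occurrence records are stored as projected (x, y) coordinates in kilometres; the function name, the `seed` argument and the sample points are hypothetical and not taken from the study.

```python
import random


def add_locational_error(points_km, sd_km=5.0, seed=None):
    """Return a degraded copy of the points: each x and each y is shifted
    independently by a Normal(0, sd_km) draw. Coordinates are assumed to be in km."""
    rng = random.Random(seed)  # local generator so results are reproducible
    return [(x + rng.gauss(0.0, sd_km), y + rng.gauss(0.0, sd_km))
            for x, y in points_km]


# Hypothetical occurrence records (projected coordinates, km).
occurrences = [(512.3, 4210.8), (498.1, 4225.0), (530.6, 4198.4)]
for original, degraded in zip(occurrences, add_locational_error(occurrences, seed=42)):
    print(original, "->", tuple(round(v, 1) for v in degraded))
```

A control treatment would keep `occurrences` unchanged, so that models calibrated on the two versions can be compared.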
Abstract:
Calomys callosus, a wild rodent, is a natural host of Trypanosoma cruzi. Twelve C. callosus were infected with 10^5 trypomastigotes of the F strain (a myotropic strain) of T. cruzi. Parasitemia decreased on the 21st day, becoming negative around the 40th day of infection. All animals survived but had positive parasitological tests until the end of the experiment. The infected animals developed severe inflammation in the myocardium and skeletal muscle. This process was pronounced from the 26th to the 30th day and gradually subsided from the 50th day, becoming absent or residual on the 64th day after infection. Collagen was identified by the picrosirius red method. Fibrogenesis developed early, but regression of fibrosis occurred between the 50th and 64th day. Ultrastructural study disclosed a predominance of macrophages and fibroblasts in the inflammatory infiltrates, with small numbers of lymphocytes. Macrophages had active phagocytosis and showed points of contact with altered muscle cells. Different degrees of matrix expansion were present, with granular and fibrillar deposits and collagen bundles. These alterations subsided by the 64th day. Macrophages seem to be the main immune effector cell in the C. callosus model of infection with T. cruzi. The mechanisms involved in the rapid fibrogenesis and its regression deserve further investigation.
Abstract:
Background: Individual signs and symptoms are of limited value for the diagnosis of influenza. Objective: To develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis. Methods: Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70%) and a validation set (30%). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set. Results: Model 1 has seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The areas under the receiver operating characteristic curve (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2 and 0.76 and 0.77 for Model 3. Model 2 classified 67% of patients in the validation group into a high- or low-risk group compared with only 38% for Model 1 and 54% for Model 3. Conclusions: A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.
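The AUROCC values reported for the three trees can be read as the probability that a randomly chosen patient with influenza receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. Ties matter for small trees, because every patient in the same terminal node gets the same score. The following is a minimal sketch of that pairwise definition; the function name and the risk scores are made up for illustration and are not the study's data.

```python
def auroc(scores_pos: list[float], scores_neg: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison:
    1 point when the positive case scores higher, 0.5 for a tie, 0 otherwise."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical node-level risk scores from a tree with three terminal nodes.
with_flu = [0.8, 0.8, 0.5, 0.2]
without_flu = [0.5, 0.2, 0.2, 0.2]
print(f"AUROC = {auroc(with_flu, without_flu):.3f}")
```

Computing this quantity separately on the development and validation sets, as the abstract describes, gives a simple check for overfitting: a large drop on the validation set suggests the tree has fitted noise.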
Abstract:
In a recent paper Bermúdez [2009] used bivariate Poisson regression models for ratemaking in car insurance, and included zero-inflated models to account for the excess of zeros and the overdispersion in the data set. In the present paper, we revisit this model in order to consider alternatives. We propose a 2-finite mixture of bivariate Poisson regression models to demonstrate that the overdispersion in the data requires more structure if it is to be taken into account, and that a simple zero-inflated bivariate Poisson model does not suffice. At the same time, we show that a finite mixture of bivariate Poisson regression models embraces zero-inflated bivariate Poisson regression models as a special case. Additionally, we describe a model in which the mixing proportions are dependent on covariates when modelling the way in which each individual belongs to a separate cluster. Finally, an EM algorithm is provided in order to ensure the models’ ease-of-fit. These models are applied to the same automobile insurance claims data set as used in Bermúdez [2009] and it is shown that the modelling of the data set can be improved considerably.
Abstract:
This article focuses on business risk management in the insurance industry. A methodology is proposed for estimating the profit loss caused by each customer in the portfolio due to policy cancellation. Using data from a European insurance company, customer behaviour over time is analyzed in order to estimate the probability of policy cancellation and the resulting potential profit loss. Customers may have contracts in up to two different lines of business: motor insurance and other diverse insurance (such as home contents, life or accident insurance). Implications for understanding customer cancellation behaviour as the core of business risk management are outlined.
Abstract:
Background: Thin melanomas (Breslow thickness ≤ 1 mm) are considered highly curable. The aim of this study was to evaluate the correlation between histological tumour regression and sentinel lymph node (SLN) involvement in thin melanomas. Patients and methods: This was a retrospective single-centre study of 34 patients with thin melanomas undergoing SLN biopsy between April 1998 and January 2005. Results: The study included 14 women and 20 men with a mean age of 56.3 years. Melanomas were located on the neck (n = 3), soles (n = 4), trunk (n = 13) and extremities (n = 14). Pathological examination showed 25 superficial spreading melanomas (SSM), four acral lentiginous melanomas, three in situ melanomas, one nodular melanoma and one unclassified melanoma, with a mean Breslow thickness of 0.57 mm. Histological tumour regression was observed in 26 of 34 cases and ulceration was found in one case. Clark levels were as follows: I (n = 3), II (n = 20), III (n = 9), IV (n = 2). Growth phase was available in 15 cases (seven radial and eight vertical). Mitotic rates, available in 24 cases, were: 0 (n = 9), 1 (n = 11), 2 (n = 2), 3 (n = 1), 6 (n = 1). One patient with histological tumour regression (2.9% of cases and 3.8% of cases with regressing tumours) had a metastatic SLN. One patient negative for SLN had a lung relapse and died of the disease. Mean follow-up was 26.2 months. Conclusion: The results of the present study and the analysis of the literature show that histological regression of the primary tumour does not seem predictive of a higher risk of SLN involvement in thin melanomas. This suggests that screening for SLN is not indicated in thin melanomas, even those with histological regression.