885 resultados para Bayesian ridge regression
Resumo:
Background Individual signs and symptoms are of limited value for the diagnosis of influenza. Objective To develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis. Methods Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70%) and a validation set (30%). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set. Results Model 1 has seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The area under the receiver operating characteristic curves (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2 and 0.76 and 0.77 for Model 3. Model 2 classified 67% of patients in the validation group into a high- or low-risk group compared with only 38% for Model 1 and 54% for Model 3. Conclusions A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.
Resumo:
Sampling issues represent a topic of ongoing interest to the forensic science community essentially because of their crucial role in laboratory planning and working protocols. For this purpose, forensic literature described thorough (Bayesian) probabilistic sampling approaches. These are now widely implemented in practice. They allow, for instance, to obtain probability statements that parameters of interest (e.g., the proportion of a seizure of items that present particular features, such as an illegal substance) satisfy particular criteria (e.g., a threshold or an otherwise limiting value). Currently, there are many approaches that allow one to derive probability statements relating to a population proportion, but questions on how a forensic decision maker - typically a client of a forensic examination or a scientist acting on behalf of a client - ought actually to decide about a proportion or a sample size, remained largely unexplored to date. The research presented here intends to address methodology from decision theory that may help to cope usefully with the wide range of sampling issues typically encountered in forensic science applications. The procedures explored in this paper enable scientists to address a variety of concepts such as the (net) value of sample information, the (expected) value of sample information or the (expected) decision loss. All of these aspects directly relate to questions that are regularly encountered in casework. Besides probability theory and Bayesian inference, the proposed approach requires some additional elements from decision theory that may increase the efforts needed for practical implementation. In view of this challenge, the present paper will emphasise the merits of graphical modelling concepts, such as decision trees and Bayesian decision networks. These can support forensic scientists in applying the methodology in practice. How this may be achieved is illustrated with several examples. The graphical devices invoked here also serve the purpose of supporting the discussion of the similarities, differences and complementary aspects of existing Bayesian probabilistic sampling criteria and the decision-theoretic approach proposed throughout this paper.
Resumo:
This study presents a classification criteria for two-class Cannabis seedlings. As the cultivation of drug type cannabis is forbidden in Switzerland, law enforcement authorities regularly ask laboratories to determine cannabis plant's chemotype from seized material in order to ascertain that the plantation is legal or not. In this study, the classification analysis is based on data obtained from the relative proportion of three major leaf compounds measured by gas-chromatography interfaced with mass spectrometry (GC-MS). The aim is to discriminate between drug type (illegal) and fiber type (legal) cannabis at an early stage of the growth. A Bayesian procedure is proposed: a Bayes factor is computed and classification is performed on the basis of the decision maker specifications (i.e. prior probability distributions on cannabis type and consequences of classification measured by losses). Classification rates are computed with two statistical models and results are compared. Sensitivity analysis is then performed to analyze the robustness of classification criteria.
Advanced mapping of environmental data: Geostatistics, Machine Learning and Bayesian Maximum Entropy
Resumo:
This book combines geostatistics and global mapping systems to present an up-to-the-minute study of environmental data. Featuring numerous case studies, the reference covers model dependent (geostatistics) and data driven (machine learning algorithms) analysis techniques such as risk mapping, conditional stochastic simulations, descriptions of spatial uncertainty and variability, artificial neural networks (ANN) for spatial data, Bayesian maximum entropy (BME), and more.
Resumo:
Nandrolone (19-nortestosterone) is a widely used anabolic steroid in sports where strength plays an essential role. Once nandrolone has been metabolised, two major metabolites are excreted in urine, 19-norandrosterone (NA) and 19-noretiocholanolone (NE). In 1997, in France, quite a few sportsmen had concentrations of 19-norandrosterone very close to the IOC cut off limit (2ng/ml). At that time, a debate took place about the capability of the human male body to produce by itself these metabolites without any intake of nandrolone or related compounds. The International Football Federation (FIFA) was very concerned with this problematic, especially because the World Cup was about to start in France. In this respect, a statistical study was held with all football players from the first and second divisions of the Swiss Football National League. All players gave a urine sample after effort and around 6% of them showed traces of 19-norandrosterone. These results were compared with amateur football players (control group) and around 6% of them had very small amounts of 19-norandrosterone and/or 19-noretiocholanolone in urine after effort, whereas none of them had detectable traces of one or the other metabolite before effort. The origin of these compounds in urine after a strenuous physical activity is still unknown, but three hypotheses can be put forward. First, an endogenous production of nandrolone metabolites takes place. Second, nandrolone metabolites are released from the fatty tissues after an intake of nandrolone, some related compounds or some contaminated nutritive supplements. Finally, the sportsmen may have taken something during or just before the football game.
Resumo:
Objectives: Imatinib has been increasingly proposed for therapeutic drug monitoring (TDM), as trough concentrations (Cmin) correlate with response rates in CML patients. This analysis aimed to evaluate the impact of imatinib exposure on optimal molecular response rates in a large European cohort of patients followed by centralized TDM.¦Methods: Sequential PK/PD analysis was performed in NONMEM 7 on 2230 plasma (PK) samples obtained along with molecular response (PD) data from 1299 CML patients. Model-based individual Bayesian estimates of exposure, parameterized as to initial dose adjusted and log-normalized Cmin (log-Cmin) or clearance (CL), were investigated as potential predictors of optimal molecular response, while accounting for time under treatment (stratified at 3 years), gender, CML phase, age, potentially interacting comedication, and TDM frequency. PK/PD analysis used mixed-effect logistic regression (iterative two-stage method) to account for intra-patient correlation.¦Results: In univariate analyses, CL, log-Cmin, time under treatment, TDM frequency, gender (all p<0.01) and CML phase (p=0.02) were significant predictors of the outcome. In multivariate analyses, all but log-Cmin remained significant (p<0.05). Our model estimates a 54.1% probability of optimal molecular response in a female patient with a median CL of 14.4 L/h, increasing by 4.7% with a 35% decrease in CL (percentile 10 of CL distribution), and decreasing by 6% with a 45% increased CL (percentile 90), respectively. Male patients were less likely than female to be in optimal response (odds ratio: 0.62, p<0.001), with an estimated probability of 42.3%.¦Conclusions: Beyond CML phase and time on treatment, expectedly correlated to the outcome, an effect of initial imatinib exposure on the probability of achieving optimal molecular response was confirmed in field-conditions by this multivariate analysis. Interestingly, male patients had a higher risk of suboptimal response, which might not exclusively derive from their 18.5% higher CL, but also from reported lower adherence to the treatment. A prospective longitudinal study would be desirable to confirm the clinical importance of identified covariates and to exclude biases possibly affecting this observational survey.
Resumo:
This paper tries to resolve some of the main shortcomings in the empirical literature of location decisions for new plants, i.e. spatial effects and overdispersion. Spatial effects are omnipresent, being a source of overdispersion in the data as well as a factor shaping the functional relationship between the variables that explain a firm’s location decisions. Using Count Data models, empirical researchers have dealt with overdispersion and excess zeros by developments of the Poisson regression model. This study aims to take this a step further, by adopting Bayesian methods and models in order to tackle the excess of zeros, spatial and non-spatial overdispersion and spatial dependence simultaneously. Data for Catalonia is used and location determinants are analysed to that end. The results show that spatial effects are determinant. Additionally, overdispersion is descomposed into an unstructured iid effect and a spatially structured effect. Keywords: Bayesian Analysis, Spatial Models, Firm Location. JEL Classification: C11, C21, R30.
Resumo:
In a recent paper Bermúdez [2009] used bivariate Poisson regression models for ratemaking in car insurance, and included zero-inflated models to account for the excess of zeros and the overdispersion in the data set. In the present paper, we revisit this model in order to consider alternatives. We propose a 2-finite mixture of bivariate Poisson regression models to demonstrate that the overdispersion in the data requires more structure if it is to be taken into account, and that a simple zero-inflated bivariate Poisson model does not suffice. At the same time, we show that a finite mixture of bivariate Poisson regression models embraces zero-inflated bivariate Poisson regression models as a special case. Additionally, we describe a model in which the mixing proportions are dependent on covariates when modelling the way in which each individual belongs to a separate cluster. Finally, an EM algorithm is provided in order to ensure the models’ ease-of-fit. These models are applied to the same automobile insurance claims data set as used in Bermúdez [2009] and it is shown that the modelling of the data set can be improved considerably.
Resumo:
The new mineral francoisite-(Ce), (Ce,Nd,Ca)[(UO(2))(3)O(OH)(PO(4))(2)]center dot 6H(2)O is the Ce-analog of francoisite-(Nd). It has been discovered simultaneously at the La Creusaz uranium deposit near Les Marecottes in Valais, Switzerland, and at the Number 2 uranium Workings, Radium Ridge near Mt. Painter, Arkaroola area, Northern Flinders Ranges in South Australia. Francoisite-(Ce) is a uranyl-bearing supergene mineral that results from the alteration under oxidative conditions of REE- and U(4+)-bearing hypogene minerals: allanite-(Ce), monazite-(Ce), +/- uraninite at Les Marecottes; monazite-(Ce), ishikawaite-samarskite, and an unknown primary U-mineral at Radium Ridge. The REE composition of francoisite-(Ce) results from a short aqueous transport of REE leached out of primary minerals [most likely monazite-(Ce) at Radium Ridge and allanite-(Ce) at La Creusaz], with fractionation among REE resulting mainly from aqueous transport, with only limited Ce loss due to oxidation to Ce(4+) during transport.
Resumo:
This article focuses on business risk management in the insurance industry. A methodology for estimating the profit loss caused by each customer in the portfolio due to policy cancellation is proposed. Using data from a European insurance company, customer behaviour over time is analyzed in order to estimate the probability of policy cancelation and the resulting potential profit loss due to cancellation. Customers may have up to two different lines of business contracts: motor insurance and other diverse insurance (such as, home contents, life or accident insurance). Implications for understanding customer cancellation behaviour as the core of business risk management are outlined.
Resumo:
Background: Thin melanomas (Breslow thickness <= 1 mm) are considered highly curable. The aim of this study was to evaluate the correlation between histological tumour regression and sentinel lymph node (SLN) involvement in thin melanomas. Patients and methods: This was a retrospective single-centre study of 34 patients with thin melanomas undergoing SLN biopsy between April 1998 and January 2005. Results: The study included 14 women and 20 men of mean age 56.3 years. Melanomas were located on the neck (n = 3), soles (n = 4), trunk (n = 13) and extremities (n = 14). Pathological examination showed 25 SSM, four acral lentiginous melanomas, three in situ melanomas, one nodular melanoma and one unclassified melanoma with a mean Breslow thickness of 0.57 mm. Histological tumour regression was observed in 26 over 34 cases and ulceration was found in one case. Clark levels were as follows: I (n = 3), II (n = 20), III (n = 9), IV (n = 2). Growth phase was available in 15 cases (seven radial and eight vertical). Mitotic rates, available in 24 cases, were: 0 (n = 9), 1 (n = 11), 2 (n = 2), 3 (n = 1), 6 (n = 1). One patient with histological tumour regression (2.9% of cases and 3.8% of cases with regressing tumours) had a metastatic SLN. One patient negative for SLN had a lung relapse and died of the disease. Mean follow-up was 26.2 months. Conclusion: The results of the present study and the analysis of the literature show that histological regression of the primary tumour does not seem predictive of higher risk of SLN involvement in thin melanomas. This suggests that screening for SLN is not indicated in thin melanomas, even those with histological regression.
Resumo:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending the corresponding approaches to the scale of a field site represents a major, and as-of-yet largely unresolved, challenge. To address this problem, we have developed downscaling procedure based on a non-linear Bayesian sequential simulation approach. The main objective of this algorithm is to estimate the value of the sparsely sampled hydraulic conductivity at non-sampled locations based on its relation to the electrical conductivity logged at collocated wells and surface resistivity measurements, which are available throughout the studied site. The in situ relationship between the hydraulic and electrical conductivities is described through a non-parametric multivariatekernel density function. Then a stochastic integration of low-resolution, large-scale electrical resistivity tomography (ERT) data in combination with high-resolution, local-scale downhole measurements of the hydraulic and electrical conductivities is applied. The overall viability of this downscaling approach is tested and validated by comparing flow and transport simulation through the original and the upscaled hydraulic conductivity fields. Our results indicate that the proposed procedure allows obtaining remarkably faithful estimates of the regional-scale hydraulic conductivity structure and correspondingly reliable predictions of the transport characteristics over relatively long distances.
Resumo:
Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging due to alcohol consumption being a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers), and the non-normality of data in the continuous part. Data come from a longitudinal study in Denmark with four waves (2003-2006) and 1771 individuals at baseline. Five techniques for missing data are compared: Last value carried forward (LVCF) was used as a single, and Hotdeck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple imputation methods. Predictive mean matching was used to account for non-normality, where instead of imputing regression estimates, "real" observed values from similar cases are imputed. Methods were also compared by means of a simulated dataset. The simulation showed that the Bayesian approach yielded the most unbiased estimates for imputation. The finding of no increase in consumption levels despite a higher availability remained unaltered. Copyright (C) 2011 John Wiley & Sons, Ltd.