124 results for sparse Bayesian regression
at Université de Lausanne, Switzerland
Abstract:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale for the purpose of improving predictions of groundwater flow and solute transport. However, extending corresponding approaches to the regional scale still represents one of the major challenges in the domain of hydrogeophysics. To address this problem, we have developed a regional-scale data integration methodology based on a two-step Bayesian sequential simulation approach. Our objective is to generate high-resolution stochastic realizations of the regional-scale hydraulic conductivity field in the common case where there exist spatially exhaustive but poorly resolved measurements of a related geophysical parameter, as well as highly resolved but spatially sparse collocated measurements of this geophysical parameter and the hydraulic conductivity. To integrate this multi-scale, multi-parameter database, we first link the low- and high-resolution geophysical data via a stochastic downscaling procedure. This is followed by relating the downscaled geophysical data to the high-resolution hydraulic conductivity distribution. After outlining the general methodology of the approach, we demonstrate its application to a realistic synthetic example where we consider as data high-resolution measurements of the hydraulic and electrical conductivities at a small number of borehole locations, as well as spatially exhaustive, low-resolution estimates of the electrical conductivity obtained from surface-based electrical resistivity tomography. The different stochastic realizations of the hydraulic conductivity field obtained using our procedure are validated by comparing their solute transport behaviour with that of the underlying "true" hydraulic conductivity field. 
We find that, even in the presence of strong subsurface heterogeneity, our proposed procedure allows for the generation of faithful representations of the regional-scale hydraulic conductivity structure and reliable predictions of solute transport over long, regional-scale distances.
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. Recent advances in machine learning offer a novel approach, alternative to geostatistics, for modelling the spatial distribution of petrophysical properties in complex reservoirs. The approach is based on semi-supervised learning, which handles both "labelled" observed data and "unlabelled" data, which have no measured value but describe prior knowledge and other relevant data in the form of manifolds in the input space where the modelled property is continuous. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic geological features and describe the stochastic variability and non-uniqueness of spatial properties. It is also able to capture and preserve key spatial dependencies such as connectivity of high-permeability geo-bodies, which is often difficult in contemporary petroleum reservoir studies. Semi-supervised SVR, as a data-driven algorithm, is designed to integrate various kinds of conditioning information and learn dependences from it. The semi-supervised SVR model is able to balance signal/noise levels and control the prior belief in available data. In this work, a stochastic semi-supervised SVR geomodel is integrated into a Bayesian framework to quantify uncertainty of reservoir production with multiple models fitted to past dynamic observations (production history). Multiple history-matched models are obtained using stochastic sampling and/or MCMC-based inference algorithms, which evaluate the posterior probability distribution. Uncertainty of the model is described by the posterior probability of the model parameters that represent key geological properties: spatial correlation size, continuity strength, and smoothness/variability of the spatial property distribution. 
The developed approach is illustrated with a fluvial reservoir case. The resulting probabilistic production forecasts are described by uncertainty envelopes. The paper compares the performance of the models with different combinations of unknown parameters and discusses sensitivity issues.
Abstract:
Uncertainty quantification of petroleum reservoir models is one of the present challenges, which is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. The paper considers a data-driven approach to modelling uncertainty in spatial predictions. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and describe the stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. Semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependences from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.
Abstract:
Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate these using simulation studies. Our comparative analysis involves using methods including generalized least squares, spatial filters, wavelet-revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some which violate standard regression assumptions. We assess the performance of each method using two measures and using statistical error rates for model selection. Methods that performed well included the generalized least squares family of models and a Bayesian implementation of the conditional autoregressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection when using the above methods. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.
Abstract:
We present the most comprehensive comparison to date of the predictive benefit of genetics in addition to currently used clinical variables, using genotype data for 33 single-nucleotide polymorphisms (SNPs) in 1,547 Caucasian men from the placebo arm of the REduction by DUtasteride of prostate Cancer Events (REDUCE®) trial. Moreover, we conducted a detailed comparison of three techniques for incorporating genetics into clinical risk prediction. The first method was a standard logistic regression model, which included separate terms for the clinical covariates and for each of the genetic markers. This approach ignores a substantial amount of external information concerning effect sizes for these Genome Wide Association Study (GWAS)-replicated SNPs. The second and third methods investigated two possible approaches to incorporating meta-analysed external SNP effect estimates - one via a weighted PCa 'risk' score based solely on the meta-analysis estimates, and the other incorporating both the current and prior data via informative priors in a Bayesian logistic regression model. All methods demonstrated a slight improvement in predictive performance upon incorporation of genetics. The two methods that incorporated external information showed the greatest increase in receiver-operating-characteristic AUC, from 0.61 to 0.64. The value of our methods comparison is likely to lie in observations of performance similarities, rather than differences, between three approaches of very different resource requirements. The two methods that included external information performed best, but only marginally despite substantial differences in complexity.
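The weighted risk score mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common construction in which each SNP's risk-allele count is weighted by an externally estimated per-allele log odds ratio; the abstract does not spell out the exact formula, and the odds ratios and genotypes below are hypothetical values, not data from the study.

```python
import numpy as np

def weighted_risk_score(genotypes, log_odds_ratios):
    """Sum over SNPs of (risk-allele count) x (external per-allele log odds ratio)."""
    genotypes = np.asarray(genotypes, dtype=float)      # shape (n_subjects, n_snps), entries 0, 1 or 2
    weights = np.asarray(log_odds_ratios, dtype=float)  # shape (n_snps,)
    return genotypes @ weights

# Hypothetical per-allele odds ratios for three SNPs (illustration only).
odds_ratios = np.array([1.20, 1.10, 1.35])

# Hypothetical risk-allele counts for three subjects.
genotypes = np.array([[0, 1, 2],
                      [2, 2, 0],
                      [0, 0, 0]])

scores = weighted_risk_score(genotypes, np.log(odds_ratios))
print(scores)
```

A score built this way enters the clinical model as a single covariate, whereas the standard logistic regression described above estimates one coefficient per SNP.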
Abstract:
PURPOSE: According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. METHOD: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). RESULTS: The automated classification groups lithological units well in terms of their IRC characteristics. In particular, the IRC differences in metamorphic rocks like gneiss are clearly revealed by this method. The maps produced by random forests soundly represent the regional differences in IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the variable influence evaluated by random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. CONCLUSION: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. 
Future approaches should consider taking into account further variables like soil gas radon measurements as well as more detailed geological information.
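The pair-wise Kolmogorov distance used for the clustering step can be sketched as follows. This is an illustrative implementation, assuming the distance is the largest absolute gap between two empirical cumulative distribution functions; the function names and the synthetic samples are hypothetical and do not come from the study.

```python
import numpy as np

def kolmogorov_distance(sample_a, sample_b):
    """Largest absolute difference between the two empirical CDFs."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    grid = np.concatenate([a, b])  # the maximum gap is attained at an observed value
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def pairwise_distances(samples):
    """Symmetric matrix of Kolmogorov distances between all pairs of samples."""
    n = len(samples)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = kolmogorov_distance(samples[i], samples[j])
    return dist

# Synthetic stand-ins for the IRC measurements of three lithological units.
rng = np.random.default_rng(0)
units = [rng.lognormal(mean=4.0, sigma=0.8, size=500),
         rng.lognormal(mean=4.1, sigma=0.8, size=400),
         rng.lognormal(mean=5.5, sigma=0.6, size=300)]
dist = pairwise_distances(units)
print(np.round(dist, 2))
```

A precomputed matrix of this kind is the input a k-medoids routine would need, since k-medoids works on pairwise dissimilarities rather than on coordinates.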
Abstract:
Over the past few decades, age estimation of living persons has represented a challenging task for many forensic services worldwide. In general, the process for age estimation includes the observation of the degree of maturity reached by some physical attributes, such as dentition or several ossification centers. The estimated chronological age or the probability that an individual belongs to a meaningful class of ages is then obtained from the observed degree of maturity by means of various statistical methods. Among these methods, those developed in a Bayesian framework offer to users the possibility of coherently dealing with the uncertainty associated with age estimation and of assessing in a transparent and logical way the probability that an examined individual is younger or older than a given age threshold. Recently, a Bayesian network for age estimation has been presented in the scientific literature; this kind of probabilistic graphical tool may facilitate the use of the probabilistic approach. Probabilities of interest in the network are assigned by means of transition analysis, a statistical parametric model, which links the chronological age and the degree of maturity by means of specific regression models, such as logit or probit models. Since different regression models can be employed in transition analysis, the aim of this paper is to study the influence of the model on the classification of individuals. The analysis was performed using a dataset related to the ossification status of the medial clavicular epiphysis, and the results indicate that the classification of individuals does not depend on the choice of the regression model.
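The reasoning in this abstract, a regression model for maturity given age combined with Bayes' theorem to obtain the probability of being above an age threshold, can be sketched in a simplified form. The sketch below assumes a single binary maturity indicator, a uniform prior over whole years from 10 to 30, and hypothetical regression coefficients; none of these values come from the paper, and the paper's Bayesian network is more elaborate than this.

```python
import math
import numpy as np

def logit_link(x):
    """Logistic CDF: probability of having reached the mature stage."""
    return 1.0 / (1.0 + np.exp(-x))

def probit_link(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def prob_at_least(threshold, ages, prior, likelihood):
    """Posterior probability that age >= threshold, given the observed stage."""
    posterior = prior * likelihood
    posterior = posterior / posterior.sum()
    return float(posterior[ages >= threshold].sum())

# Assumed age grid and uniform prior (illustration only).
ages = np.arange(10, 31)
prior = np.full(ages.size, 1.0 / ages.size)
prior_prob = float(prior[ages >= 18].sum())

# Hypothetical coefficients; both models place the 50% transition at age 18.
p_logit = prob_at_least(18, ages, prior, logit_link(-9.0 + 0.5 * ages))
p_probit = prob_at_least(18, ages, prior, probit_link(-5.4 + 0.3 * ages))

print(prior_prob, p_logit, p_probit)
```

Comparing `p_logit` and `p_probit` for the same observation is the kind of check the abstract describes: whether the choice of link function changes the probability that an individual is above the threshold.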
Abstract:
Aberrant blood vessels enable tumor growth, provide a barrier to immune infiltration, and serve as a source of protumorigenic signals. Targeting tumor blood vessels for destruction, or tumor vascular disruption therapy, can therefore provide significant therapeutic benefit. Here, we describe the ability of chimeric antigen receptor (CAR)-bearing T cells to recognize human prostate-specific membrane antigen (hPSMA) on endothelial targets in vitro as well as in vivo. CAR T cells were generated using the anti-PSMA scFv, J591, and the intracellular signaling domains: CD3ζ, CD28, and/or CD137/4-1BB. We found that all anti-hPSMA CAR T cells recognized and eliminated PSMA(+) endothelial targets in vitro, regardless of the signaling domain. T cells bearing the third-generation anti-hPSMA CAR, P28BBζ, were able to recognize and kill primary human endothelial cells isolated from gynecologic cancers. In addition, the P28BBζ CAR T cells mediated regression of hPSMA-expressing vascular neoplasms in mice. Finally, in murine models of ovarian cancers populated by murine vessels expressing hPSMA, the P28BBζ CAR T cells were able to ablate PSMA(+) vessels, cause secondary depletion of tumor cells, and reduce tumor burden. Taken together, these results provide a strong rationale for the use of CAR T cells as agents of tumor vascular disruption, specifically those targeting PSMA. Cancer Immunol Res; 3(1); 68-84. ©2014 AACR.
Abstract:
Purpose: Recent reports have suggested that intraabdominal postoperative infection is associated with higher rates of overall and local recurrence and cancer-specific mortality. However, the mechanisms responsible for this association are unknown. We hypothesized that the greater inflammatory response in patients with postoperative intraabdominal infection is associated with an increase in local and systemic angiogenesis. Methods: We designed a prospective cohort study with matched controls. Patients with postoperative intra-abdominal infection (abscess and/or anastomotic leakage) (group 1; n=17) after elective colorectal cancer resection operated on for cure were compared to patients with an uncomplicated postoperative course (group 2; n=17). IL-6 and VEGF levels were determined by ELISA in serum and peritoneal fluid at baseline, 48 hours and postoperative day 4 or at the time the peritoneal infection occurred. Results: No differences were observed in age, gender, preoperative CEA, tumor stage and location and type of procedure performed. Although there were no differences in serum IL-6 levels at 48 hours, this pro-inflammatory cytokine was higher in group 1 on postoperative day 4 (group 1: 21533 ± 27900 vs. group 2: 1130 ± 3563 pg/ml; p < 0.001). Serum VEGF levels were higher in group 1 on postoperative day 4 (group 1: 1212 ± 1025 vs. group 2: 408 ± 407 pg/ml; p < 0.01). Peritoneal fluid VEGF levels were also higher in group 1 at 48 hours (group 1: 4857 ± 4384 vs. group 2: 630 ± 461 pg/ml; p < 0.001) and postoperative day 4 (group 1: 32807 ± 98486 vs. group 2: 1002 ± 1229 pg/ml; p < 0.001). A positive correlation between serum IL-6 and serum VEGF levels was observed on postoperative day 4 (r=0.7; p<0.01). Conclusions: These results suggest that not only the inflammatory response but also the angiogenic pathways are stimulated in patients with intra-abdominal infection after surgery for colorectal cancer. 
The implications of this finding on long-term follow-up need to be evaluated.
Abstract:
PURPOSE: To present the long-term follow-up of 10 adolescents and young adults with documented cognitive and behavioral regression as children due to nonlesional focal, mainly frontal, epilepsy with continuous spike-waves during slow wave sleep (CSWS). METHODS: Past medical and electroencephalography (EEG) data were reviewed and neuropsychological tests exploring main cognitive functions were administered. KEY FINDINGS: After a mean duration of follow-up of 15.6 years (range, 8-23 years), none of the 10 patients had recovered fully, but four regained borderline to normal intelligence and were almost independent. Patients with prolonged global intellectual regression had the worst outcome, whereas those with more specific and short-lived deficits recovered best. The marked behavioral disorders resolved in all but one patient. Executive functions were neither severely nor homogeneously affected. Three patients with a frontal syndrome during the active phase (AP) disclosed only mild residual executive and social cognition deficits. The main cognitive gains occurred shortly after the AP, but qualitative improvements continued to occur. Long-term outcome correlated best with duration of CSWS. SIGNIFICANCE: Our findings emphasize that cognitive recovery after cessation of CSWS depends on the severity and duration of the initial regression. None of our patients had major executive and social cognition deficits with preserved intelligence, as reported in adults with early destructive lesions of the frontal lobes. Early recognition of epilepsy with CSWS and rapid introduction of effective therapy are crucial for the best possible outcome.
Abstract:
BACKGROUND: Only a few studies have explored the relation between coffee and tea intake and head and neck cancers, with inconsistent results. METHODS: We pooled individual-level data from nine case-control studies of head and neck cancers, including 5,139 cases and 9,028 controls. Logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (95% CI), adjusting for potential confounders. RESULTS: Caffeinated coffee intake was inversely related to the risk of cancer of the oral cavity and pharynx: the ORs were 0.96 (95% CI, 0.94-0.98) for an increment of 1 cup per day and 0.61 (95% CI, 0.47-0.80) in drinkers of >4 cups per day versus nondrinkers. This latter estimate was consistent for different anatomic sites (OR, 0.46; 95% CI, 0.30-0.71 for oral cavity; OR, 0.58; 95% CI, 0.41-0.82 for oropharynx/hypopharynx; and OR, 0.61; 95% CI, 0.37-1.01 for oral cavity/pharynx not otherwise specified) and across strata of selected covariates. No association of caffeinated coffee drinking was found with laryngeal cancer (OR, 0.96; 95% CI, 0.64-1.45 in drinkers of >4 cups per day versus nondrinkers). Data on decaffeinated coffee were too sparse for detailed analysis, but indicated no increased risk. Tea intake was not associated with head and neck cancer risk (OR, 0.99; 95% CI, 0.89-1.11 for drinkers versus nondrinkers). CONCLUSIONS: This pooled analysis of case-control studies supports the hypothesis of an inverse association between caffeinated coffee drinking and risk of cancer of the oral cavity and pharynx. IMPACT: Given widespread use of coffee and the relatively high incidence and low survival of head and neck cancers, the observed inverse association may have appreciable public health relevance.
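For readers unfamiliar with how an odds ratio and its 95% confidence interval follow from a logistic regression coefficient, a short sketch may help. It assumes the usual Wald construction, OR = exp(beta) with limits exp(beta ± 1.96·SE). The coefficient and standard error below are back-calculated from the rounded interval quoted above (0.61; 0.47-0.80), so they are approximations and not values reported by the study.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Back-calculate an approximate coefficient and standard error from the
# published, rounded estimate for drinkers of >4 cups per day.
beta = math.log(0.61)
se = (math.log(0.80) - math.log(0.47)) / (2 * 1.96)

odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

The interval is symmetric on the log scale rather than on the odds-ratio scale, which is why the published limits are not equidistant from 0.61.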
Abstract:
Purpose: While imatinib has revolutionized the treatment of chronic myeloid leukaemia (CML) and gastrointestinal stromal tumors (GIST), its pharmacokinetic-pharmacodynamic relationships have been poorly studied. This study aimed to explore the issue in oncologic patients, and to evaluate the specific influence of the target genotype in a GIST subpopulation. Patients and methods: Data from 59 patients (321 plasma samples) were collected during a previous pharmacokinetic study. Based on a purpose-built population model, individual post-hoc Bayesian estimates of pharmacokinetic parameters were derived and used to estimate drug exposure (AUC; area under curve). Free fraction parameters were deduced from a model incorporating plasma alpha1-acid glycoprotein levels. Associations between AUC (or clearance) and therapeutic response (coded on a 3-point scale), or tolerability (4-point scale), were explored by ordered logistic regression. Influence of KIT genotype on response was also assessed in GIST patients. Results: Total and free drug exposure correlated with the number of side effects (p < 0.005). A relationship with response was not evident in the whole patient set (with good-responders tending to receive lower doses and bad-responders higher doses). In GIST patients however, higher free drug exposure predicted better responses. A strong association was notably observed in patients harboring an exon 9 mutation or a wild type KIT, known to decrease tumor sensitivity towards imatinib (p < 0.005). Conclusions: Our results support further evaluation of the potential benefit of a therapeutic monitoring program for imatinib. Our data also suggest that stratification by genotype will be important in future trials.
Abstract:
Knowledge of the spatial distribution of hydraulic conductivity (K) within an aquifer is critical for reliable predictions of solute transport and the development of effective groundwater management and/or remediation strategies. While core analyses and hydraulic logging can provide highly detailed information, such information is inherently localized around boreholes that tend to be sparsely distributed throughout the aquifer volume. Conversely, larger-scale hydraulic experiments like pumping and tracer tests provide relatively low-resolution estimates of K in the investigated subsurface region. As a result, traditional hydrogeological measurement techniques contain a gap in terms of spatial resolution and coverage, and on their own they are often inadequate for characterizing heterogeneous aquifers. Geophysical methods have the potential to bridge this gap. The recent increased interest in the application of geophysical methods to hydrogeological problems is clearly evidenced by the formation and rapid growth of the domain of hydrogeophysics over the past decade (e.g., Rubin and Hubbard, 2005).