906 resultados para Probabilistic estimation
Resumo:
A method is proposed for the estimation of absolute binding free energy of interaction between proteins and ligands. Conformational sampling of the protein-ligand complex is performed by molecular dynamics (MD) in vacuo and the solvent effect is calculated a posteriori by solving the Poisson or the Poisson-Boltzmann equation for selected frames of the trajectory. The binding free energy is written as a linear combination of the buried surface upon complexation, SASbur, the electrostatic interaction energy between the ligand and the protein, Eelec, and the difference of the solvation free energies of the complex and the isolated ligand and protein, deltaGsolv. The method uses the buried surface upon complexation to account for the non-polar contribution to the binding free energy because it is less sensitive to the details of the structure than the van der Waals interaction energy. The parameters of the method are developed for a training set of 16 HIV-1 protease-inhibitor complexes of known 3D structure. A correlation coefficient of 0.91 was obtained with an unsigned mean error of 0.8 kcal/mol. When applied to a set of 25 HIV-1 protease-inhibitor complexes of unknown 3D structures, the method provides a satisfactory correlation between the calculated binding free energy and the experimental pIC5o without reparametrization.
Resumo:
Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.
Resumo:
A major issue in the application of waveform inversion methods to crosshole ground-penetrating radar (GPR) data is the accurate estimation of the source wavelet. Here, we explore the viability and robustness of incorporating this step into a recently published time-domain inversion procedure through an iterative deconvolution approach. Our results indicate that, at least in non-dispersive electrical environments, such an approach provides remarkably accurate and robust estimates of the source wavelet even in the presence of strong heterogeneity of both the dielectric permittivity and electrical conductivity. Our results also indicate that the proposed source wavelet estimation approach is relatively insensitive to ambient noise and to the phase characteristics of the starting wavelet. Finally, there appears to be little to no trade-off between the wavelet estimation and the tomographic imaging procedures.
Resumo:
This article extends existing discussion in literature on probabilistic inference and decision making with respect to continuous hypotheses that are prevalent in forensic toxicology. As a main aim, this research investigates the properties of a widely followed approach for quantifying the level of toxic substances in blood samples, and to compare this procedure with a Bayesian probabilistic approach. As an example, attention is confined to the presence of toxic substances, such as THC, in blood from car drivers. In this context, the interpretation of results from laboratory analyses needs to take into account legal requirements for establishing the 'presence' of target substances in blood. In a first part, the performance of the proposed Bayesian model for the estimation of an unknown parameter (here, the amount of a toxic substance) is illustrated and compared with the currently used method. The model is then used in a second part to approach-in a rational way-the decision component of the problem, that is judicial questions of the kind 'Is the quantity of THC measured in the blood over the legal threshold of 1.5 μg/l?'. This is pointed out through a practical example.
Resumo:
The physical disector is a method of choice for estimating unbiased neuron numbers; nevertheless, calibration is needed to evaluate each counting method. The validity of this method can be assessed by comparing the estimated cell number with the true number determined by a direct counting method in serial sections. We reconstructed a 1/5 of rat lumbar dorsal root ganglia taken from two experimental conditions. From each ganglion, images of 200 adjacent semi-thin sections were used to reconstruct a volumetric dataset (stack of voxels). On these stacks the number of sensory neurons was estimated and counted respectively by physical disector and direct counting methods. Also, using the coordinates of nuclei from the direct counting, we simulate, by a Matlab program, disector pairs separated by increasing distances in a ganglion model. The comparison between the results of these approaches clearly demonstrates that the physical disector method provides a valid and reliable estimate of the number of sensory neurons only when the distance between the consecutive disector pairs is 60 microm or smaller. In these conditions the size of error between the results of physical disector and direct counting does not exceed 6%. In contrast when the distance between two pairs is larger than 60 microm (70-200 microm) the size of error increases rapidly to 27%. We conclude that the physical dissector method provides a reliable estimate of the number of rat sensory neurons only when the separating distance between the consecutive dissector pairs is no larger than 60 microm.
Resumo:
Les précipitations journalières extrêmes centennales ont été estimées à partir d'analyses de Gumbel et de sept formule empiriques effectuées sur des séries de mesures pluviométriques à 151 endroits de la Suisse pour deux périodes de 50 ans. Ces estimations ont été comparées avec les valeurs journalières maximales mesurées durant les 100 dernières années (1911-2010) afin de tester l'efficacité de ces sept formules. Cette comparaison révèle que la formule de Weibull serait la meilleure pour estimer les précipitations journalières centennales à partir de la série de mesures pluviométriques 1961-2010, mais la moins bonne pour la série de mesures 1911-1960. La formule de Hazen serait la plus efficace pour cette dernière période. Ces différences de performances entre les formules empiriques pour les deux périodes étudiées résultent de l'augmentation des précipitations journalières maximales mesurées de 1911 à 2010 pour 90% des stations en Suisse. Mais les différences entre les pluies extrêmes estimées à partir des sept formules empiriques ne dépassent pas 6% en moyenne.
Resumo:
Captan and folpet are two fungicides largely used in agriculture, but biomonitoring data are mostly limited to measurements of captan metabolite concentrations in spot urine samples of workers, which complicate interpretation of results in terms of internal dose estimation, daily variations according to tasks performed, and most plausible routes of exposure. This study aimed at performing repeated biological measurements of exposure to captan and folpet in field workers (i) to better assess internal dose along with main routes-of-entry according to tasks and (ii) to establish most appropriate sampling and analysis strategies. The detailed urinary excretion time courses of specific and non-specific biomarkers of exposure to captan and folpet were established in tree farmers (n = 2) and grape growers (n = 3) over a typical workweek (seven consecutive days), including spraying and harvest activities. The impact of the expression of urinary measurements [excretion rate values adjusted or not for creatinine or cumulative amounts over given time periods (8, 12, and 24 h)] was evaluated. Absorbed doses and main routes-of-entry were then estimated from the 24-h cumulative urinary amounts through the use of a kinetic model. The time courses showed that exposure levels were higher during spraying than harvest activities. Model simulations also suggest a limited absorption in the studied workers and an exposure mostly through the dermal route. It further pointed out the advantage of expressing biomarker values in terms of body weight-adjusted amounts in repeated 24-h urine collections as compared to concentrations or excretion rates in spot samples, without the necessity for creatinine corrections.
Resumo:
The present research deals with an important public health threat, which is the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology as being closely associated to indoor radon. This association was indeed observed for the Swiss data but not probed to be the sole determinant for the spatial modeling. The statistical analysis of data, both at univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of Quan-tité Morisita Index (QMI) as a procedure to evaluate data clustering in function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-to-down resolution approach, from regional to local lev¬els in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with stationary conditions of geostatistical models. Common methods of spatial modeling such as Κ Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied for a par-ticular dataset. A bottom to top method complexity approach was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation was tested with the purpose of reducing noise at local scale (the CVMF). At the end of the chapter, a series of test for data consistency and methods robustness were performed. This lead to conclude about the importance of data splitting and the limitation of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the indoor radon pollution data uncertainty into consideration. The catego-rization transform was presented as a solution for extreme values modeling through clas-sification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. The sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Within the classification methods, probabilistic neural networks (PNN) show to be better adapted for modeling of high threshold categorization and for automation. Support vector machines (SVM) on the contrary performed well under balanced category conditions. In general, it was concluded that a particular prediction or estimation method is not better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to accomplish an efficient indoor radon decision making.
Resumo:
As a thorough aggregation of probability and graph theory, Bayesian networks currently enjoy widespread interest as a means for studying factors that affect the coherent evaluation of scientific evidence in forensic science. Paper I of this series of papers intends to contribute to the discussion of Bayesian networks as a framework that is helpful for both illustrating and implementing statistical procedures that are commonly employed for the study of uncertainties (e.g. the estimation of unknown quantities). While the respective statistical procedures are widely described in literature, the primary aim of this paper is to offer an essentially non-technical introduction on how interested readers may use these analytical approaches - with the help of Bayesian networks - for processing their own forensic science data. Attention is mainly drawn to the structure and underlying rationale of a series of basic and context-independent network fragments that users may incorporate as building blocs while constructing larger inference models. As an example of how this may be done, the proposed concepts will be used in a second paper (Part II) for specifying graphical probability networks whose purpose is to assist forensic scientists in the evaluation of scientific evidence encountered in the context of forensic document examination (i.e. results of the analysis of black toners present on printed or copied documents).
Resumo:
The knowledge of the relationship that links radiation dose and image quality is a prerequisite to any optimization of medical diagnostic radiology. Image quality depends, on the one hand, on the physical parameters such as contrast, resolution, and noise, and on the other hand, on characteristics of the observer that assesses the image. While the role of contrast and resolution is precisely defined and recognized, the influence of image noise is not yet fully understood. Its measurement is often based on imaging uniform test objects, even though real images contain anatomical backgrounds whose statistical nature is much different from test objects used to assess system noise. The goal of this study was to demonstrate the importance of variations in background anatomy by quantifying its effect on a series of detection tasks. Several types of mammographic backgrounds and signals were examined by psychophysical experiments in a two-alternative forced-choice detection task. According to hypotheses concerning the strategy used by the human observers, their signal to noise ratio was determined. This variable was also computed for a mathematical model based on the statistical decision theory. By comparing theoretical model and experimental results, the way that anatomical structure is perceived has been analyzed. Experiments showed that the observer's behavior was highly dependent upon both system noise and the anatomical background. The anatomy partly acts as a signal recognizable as such and partly as a pure noise that disturbs the detection process. This dual nature of the anatomy is quantified. It is shown that its effect varies according to its amplitude and the profile of the object being detected. The importance of the noisy part of the anatomy is, in some situations, much greater than the system noise. Hence, reducing the system noise by increasing the dose will not improve task performance. This observation indicates that the tradeoff between dose and image quality might be optimized by accepting a higher system noise. This could lead to a better resolution, more contrast, or less dose.
Resumo:
In the first part of the study, nine estimators of the first-order autoregressive parameter are reviewed and a new estimator is proposed. The relationships and discrepancies between the estimators are discussed in order to achieve a clear differentiation. In the second part of the study, the precision in the estimation of autocorrelation is studied. The performance of the ten lag-one autocorrelation estimators is compared in terms of Mean Square Error (combining bias and variance) using data series generated by Monte Carlo simulation. The results show that there is not a single optimal estimator for all conditions, suggesting that the estimator ought to be chosen according to sample size and to the information available of the possible direction of the serial dependence. Additionally, the probability of labelling an actually existing autocorrelation as statistically significant is explored using Monte Carlo sampling. The power estimates obtained are quite similar among the tests associated with the different estimators. These estimates evidence the small probability of detecting autocorrelation in series with less than 20 measurement times.
Resumo:
Abstract
Resumo:
Many accidents involving Iowa snowplows have happened in recent years. This study investigated the influence of time of day, sex of subject, type of snowplow sign and snowplow speed on the criteria of oncoming driver reaction time and his estimate of snowplow speed. Film strips were made of a car passing a snow-Plow under various experimental conditions. These experimental movie strips were viewed in the laboratory by college student drivers who were asked to indicate their reaction time to slow down and to estimate the speed of the snowplow being passed. The generally best sign condition for the snowplow was to have a striped rear sign and a speed-proportional flashing light in addition to the standard rotating beacon on top of the truck. Several recommendations were made.
Resumo:
A new model for dealing with decision making under risk by considering subjective and objective information in the same formulation is here presented. The uncertain probabilistic weighted average (UPWA) is also presented. Its main advantage is that it unifies the probability and the weighted average in the same formulation and considering the degree of importance that each case has in the analysis. Moreover, it is able to deal with uncertain environments represented in the form of interval numbers. We study some of its main properties and particular cases. The applicability of the UPWA is also studied and it is seen that it is very broad because all the previous studies that use the probability or the weighted average can be revised with this new approach. Focus is placed on a multi-person decision making problem regarding the selection of strategies by using the theory of expertons.