983 resultados para Probabilistic analysis


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Sampling issues represent a topic of ongoing interest to the forensic science community essentially because of their crucial role in laboratory planning and working protocols. For this purpose, forensic literature described thorough (Bayesian) probabilistic sampling approaches. These are now widely implemented in practice. They allow, for instance, to obtain probability statements that parameters of interest (e.g., the proportion of a seizure of items that present particular features, such as an illegal substance) satisfy particular criteria (e.g., a threshold or an otherwise limiting value). Currently, there are many approaches that allow one to derive probability statements relating to a population proportion, but questions on how a forensic decision maker - typically a client of a forensic examination or a scientist acting on behalf of a client - ought actually to decide about a proportion or a sample size, remained largely unexplored to date. The research presented here intends to address methodology from decision theory that may help to cope usefully with the wide range of sampling issues typically encountered in forensic science applications. The procedures explored in this paper enable scientists to address a variety of concepts such as the (net) value of sample information, the (expected) value of sample information or the (expected) decision loss. All of these aspects directly relate to questions that are regularly encountered in casework. Besides probability theory and Bayesian inference, the proposed approach requires some additional elements from decision theory that may increase the efforts needed for practical implementation. In view of this challenge, the present paper will emphasise the merits of graphical modelling concepts, such as decision trees and Bayesian decision networks. These can support forensic scientists in applying the methodology in practice. How this may be achieved is illustrated with several examples. The graphical devices invoked here also serve the purpose of supporting the discussion of the similarities, differences and complementary aspects of existing Bayesian probabilistic sampling criteria and the decision-theoretic approach proposed throughout this paper.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Part I of this series of articles focused on the construction of graphical probabilistic inference procedures, at various levels of detail, for assessing the evidential value of gunshot residue (GSR) particle evidence. The proposed models - in the form of Bayesian networks - address the issues of background presence of GSR particles, analytical performance (i.e., the efficiency of evidence searching and analysis procedures) and contamination. The use and practical implementation of Bayesian networks for case pre-assessment is also discussed. This paper, Part II, concentrates on Bayesian parameter estimation. This topic complements Part I in that it offers means for producing estimates useable for the numerical specification of the proposed probabilistic graphical models. Bayesian estimation procedures are given a primary focus of attention because they allow the scientist to combine (his/her) prior knowledge about the problem of interest with newly acquired experimental data. The present paper also considers further topics such as the sensitivity of the likelihood ratio due to uncertainty in parameters and the study of likelihood ratio values obtained for members of particular populations (e.g., individuals with or without exposure to GSR).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The development of statistical models for forensic fingerprint identification purposes has been the subject of increasing research attention in recent years. This can be partly seen as a response to a number of commentators who claim that the scientific basis for fingerprint identification has not been adequately demonstrated. In addition, key forensic identification bodies such as ENFSI [1] and IAI [2] have recently endorsed and acknowledged the potential benefits of using statistical models as an important tool in support of the fingerprint identification process within the ACE-V framework. In this paper, we introduce a new Likelihood Ratio (LR) model based on Support Vector Machines (SVMs) trained with features discovered via morphometric and spatial analyses of corresponding minutiae configurations for both match and close non-match populations often found in AFIS candidate lists. Computed LR values are derived from a probabilistic framework based on SVMs that discover the intrinsic spatial differences of match and close non-match populations. Lastly, experimentation performed on a set of over 120,000 publicly available fingerprint images (mostly sourced from the National Institute of Standards and Technology (NIST) datasets) and a distortion set of approximately 40,000 images, is presented, illustrating that the proposed LR model is reliably guiding towards the right proposition in the identification assessment of match and close non-match populations. Results further indicate that the proposed model is a promising tool for fingerprint practitioners to use for analysing the spatial consistency of corresponding minutiae configurations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well known nonparametric principles of probability density modeling using kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can be easily used in decision support systems dealing with problems of automatic classification. Support vector machine is an implementation of the principles of statistical learning theory for the classification tasks. Recently they were successfully applied for different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Radioactive soil-contamination mapping and risk assessment is a vital issue for decision makers. Traditional approaches for mapping the spatial concentration of radionuclides employ various regression-based models, which usually provide a single-value prediction realization accompanied (in some cases) by estimation error. Such approaches do not provide the capability for rigorous uncertainty quantification or probabilistic mapping. Machine learning is a recent and fast-developing approach based on learning patterns and information from data. Artificial neural networks for prediction mapping have been especially powerful in combination with spatial statistics. A data-driven approach provides the opportunity to integrate additional relevant information about spatial phenomena into a prediction model for more accurate spatial estimates and associated uncertainty. Machine-learning algorithms can also be used for a wider spectrum of problems than before: classification, probability density estimation, and so forth. Stochastic simulations are used to model spatial variability and uncertainty. Unlike regression models, they provide multiple realizations of a particular spatial pattern that allow uncertainty and risk quantification. This paper reviews the most recent methods of spatial data analysis, prediction, and risk mapping, based on machine learning and stochastic simulations in comparison with more traditional regression models. The radioactive fallout from the Chernobyl Nuclear Power Plant accident is used to illustrate the application of the models for prediction and classification problems. This fallout is a unique case study that provides the challenging task of analyzing huge amounts of data ('hard' direct measurements, as well as supplementary information and expert estimates) and solving particular decision-oriented problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article extends existing discussion in literature on probabilistic inference and decision making with respect to continuous hypotheses that are prevalent in forensic toxicology. As a main aim, this research investigates the properties of a widely followed approach for quantifying the level of toxic substances in blood samples, and to compare this procedure with a Bayesian probabilistic approach. As an example, attention is confined to the presence of toxic substances, such as THC, in blood from car drivers. In this context, the interpretation of results from laboratory analyses needs to take into account legal requirements for establishing the 'presence' of target substances in blood. In a first part, the performance of the proposed Bayesian model for the estimation of an unknown parameter (here, the amount of a toxic substance) is illustrated and compared with the currently used method. The model is then used in a second part to approach-in a rational way-the decision component of the problem, that is judicial questions of the kind 'Is the quantity of THC measured in the blood over the legal threshold of 1.5 μg/l?'. This is pointed out through a practical example.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The present research deals with an important public health threat, which is the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology as being closely associated to indoor radon. This association was indeed observed for the Swiss data but not probed to be the sole determinant for the spatial modeling. The statistical analysis of data, both at univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving windows methods. The use of Quan-tité Morisita Index (QMI) as a procedure to evaluate data clustering in function of the radon level was proposed. The existing methods of declustering were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase comes along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-to-down resolution approach, from regional to local lev¬els in order to find the appropriate scales for modeling. In this sense, data partition was optimized in order to cope with stationary conditions of geostatistical models. Common methods of spatial modeling such as Κ Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied for a par-ticular dataset. A bottom to top method complexity approach was adopted and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation was tested with the purpose of reducing noise at local scale (the CVMF). At the end of the chapter, a series of test for data consistency and methods robustness were performed. This lead to conclude about the importance of data splitting and the limitation of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulations thus allowed the use of multigaussian models and helped take the indoor radon pollution data uncertainty into consideration. The catego-rization transform was presented as a solution for extreme values modeling through clas-sification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. The sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function for data classification hardening. Within the classification methods, probabilistic neural networks (PNN) show to be better adapted for modeling of high threshold categorization and for automation. Support vector machines (SVM) on the contrary performed well under balanced category conditions. In general, it was concluded that a particular prediction or estimation method is not better under all conditions of scale and neighborhood definitions. Simulations should be the basis, while other methods can provide complementary information to accomplish an efficient indoor radon decision making.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new model for dealing with decision making under risk by considering subjective and objective information in the same formulation is here presented. The uncertain probabilistic weighted average (UPWA) is also presented. Its main advantage is that it unifies the probability and the weighted average in the same formulation and considering the degree of importance that each case has in the analysis. Moreover, it is able to deal with uncertain environments represented in the form of interval numbers. We study some of its main properties and particular cases. The applicability of the UPWA is also studied and it is seen that it is very broad because all the previous studies that use the probability or the weighted average can be revised with this new approach. Focus is placed on a multi-person decision making problem regarding the selection of strategies by using the theory of expertons.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

ABSTRACT The traditional method of net present value (NPV) to analyze the economic profitability of an investment (based on a deterministic approach) does not adequately represent the implicit risk associated with different but correlated input variables. Using a stochastic simulation approach for evaluating the profitability of blueberry (Vaccinium corymbosum L.) production in Chile, the objective of this study is to illustrate the complexity of including risk in economic feasibility analysis when the project is subject to several but correlated risks. The results of the simulation analysis suggest that the non-inclusion of the intratemporal correlation between input variables underestimate the risk associated with investment decisions. The methodological contribution of this study illustrates the complexity of the interrelationships between uncertain variables and their impact on the convenience of carrying out this type of business in Chile. The steps for the analysis of economic viability were: First, adjusted probability distributions for stochastic input variables (SIV) were simulated and validated. Second, the random values of SIV were used to calculate random values of variables such as production, revenues, costs, depreciation, taxes and net cash flows. Third, the complete stochastic model was simulated with 10,000 iterations using random values for SIV. This result gave information to estimate the probability distributions of the stochastic output variables (SOV) such as the net present value, internal rate of return, value at risk, average cost of production, contribution margin and return on capital. Fourth, the complete stochastic model simulation results were used to analyze alternative scenarios and provide the results to decision makers in the form of probabilities, probability distributions, and for the SOV probabilistic forecasts. The main conclusion shown that this project is a profitable alternative investment in fruit trees in Chile.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Over the past few decades, age estimation of living persons has represented a challenging task for many forensic services worldwide. In general, the process for age estimation includes the observation of the degree of maturity reached by some physical attributes, such as dentition or several ossification centers. The estimated chronological age or the probability that an individual belongs to a meaningful class of ages is then obtained from the observed degree of maturity by means of various statistical methods. Among these methods, those developed in a Bayesian framework offer to users the possibility of coherently dealing with the uncertainty associated with age estimation and of assessing in a transparent and logical way the probability that an examined individual is younger or older than a given age threshold. Recently, a Bayesian network for age estimation has been presented in scientific literature; this kind of probabilistic graphical tool may facilitate the use of the probabilistic approach. Probabilities of interest in the network are assigned by means of transition analysis, a statistical parametric model, which links the chronological age and the degree of maturity by means of specific regression models, such as logit or probit models. Since different regression models can be employed in transition analysis, the aim of this paper is to study the influence of the model in the classification of individuals. The analysis was performed using a dataset related to the ossifications status of the medial clavicular epiphysis and results support that the classification of individuals is not dependent on the choice of the regression model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objectives: The aim of the study was to combine clinical results from the European Cohort of the REVERSE study and costs associated with the addition of cardiac resynchronization therapy (CRT) to optimal medical therapy (OMT) in patients with mild symptomatic (NYHA I-II) or asymptomatic left ventricular dysfunction and markers of cardiac dyssynchrony in Spain. Methods: A Markov model was developed with CRT + OMT (CRT-ON) versus OMT only (CRT-OFF) based on a retrospective cost-effectiveness analysis. Raw data was derived from literature and expert opinion, reflecting clinical and economic consequences of patient"s management in Spain. Time horizon was 10 years. Both costs (euro 2010) and effects were discounted at 3 percent per annum. Results: CRT-ON showed higher total costs than CRT-OFF; however, CRT reduced the length of hospitalization in ICU by 94 percent (0.006 versus 0.091 days) and general ward in by 34 percent (0.705 versus 1.076 days). Surviving CRT-ON patients (88.2 percent versus 77.5 percent) remained in better functional class longer, and they achieved an improvement of 0.9 life years (LYGs) and 0.77 years quality-adjusted life years (QALYs). CRT-ON proved to be cost-effective after 6 years, except for the 7th year due to battery depletion. At 10 years, the results were 18,431 per LYG and 21,500 per QALY gained. Probabilistic sensitivity analysis showed CRT-ON was cost-effective in 75.4 percent of the cases at 10 years. Conclusions: The use of CRT added to OMT represents an efficient use of resources in patients suffering from heart failure in NYHA functional classes I and II.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study aimed to describe the probabilistic structure of the annual series of extreme daily rainfall (Preabs), available from the weather station of Ubatuba, State of São Paulo, Brazil (1935-2009), by using the general distribution of extreme value (GEV). The autocorrelation function, the Mann-Kendall test, and the wavelet analysis were used in order to evaluate the presence of serial correlations, trends, and periodical components. Considering the results obtained using these three statistical methods, it was possible to assume the hypothesis that this temporal series is free from persistence, trends, and periodicals components. Based on quantitative and qualitative adhesion tests, it was found that the GEV may be used in order to quantify the probabilities of the Preabs data. The best results of GEV were obtained when the parameters of this function were estimated using the method of maximum likelihood. The method of L-moments has also shown satisfactory results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mass transfer kinetics in osmotic dehydration is usually modeled by Fick's law, empirical models and probabilistic models. The aim of this study was to determine the applicability of Peleg model to investigate the mass transfer during osmotic dehydration of mackerel (Scomber japonicus) slices at different temperatures. Osmotic dehydration was performed on mackerel slices by cooking-infusion in solutions with glycerol and salt (a w = 0.64) at different temperatures: 50, 70, and 90 ºC. Peleg rate constant (K1) (h(g/gdm)-1) varied with temperature variation from 0.761 to 0.396 for water loss, from 5.260 to 2.947 for salt gain, and from 0.854 to 0.566 for glycerol intake. In all cases, it followed the Arrhenius relationship (R²>0.86). The Ea (kJ / mol) values obtained were 16.14; 14.21, and 10.12 for water, salt, and glycerol, respectively. The statistical parameters that qualify the goodness of fit (R²>0.91 and RMSE<0.086) indicate promising applicability of Peleg model.