958 resultados para Bayesian nonparametic
Resumo:
In this paper the issue of finding uncertainty intervals for queries in a Bayesian Network is reconsidered. The investigation focuses on Bayesian Nets with discrete nodes and finite populations. An earlier asymptotic approach is compared with a simulation-based approach, together with further alternatives, one based on a single sample of the Bayesian Net of a particular finite population size, and another which uses expected population sizes together with exact probabilities. We conclude that a query of a Bayesian Net should be expressed as a probability embedded in an uncertainty interval. Based on an investigation of two Bayesian Net structures, the preferred method is the simulation method. However, both the single sample method and the expected sample size methods may be useful and are simpler to compute. Any method at all is more useful than none, when assessing a Bayesian Net under development, or when drawing conclusions from an ‘expert’ system.
Resumo:
Bayesian networks (BNs) are tools for representing expert knowledge or evidence. They are especially useful for synthesising evidence or belief concerning a complex intervention, assessing the sensitivity of outcomes to different situations or contextual frameworks and framing decision problems that involve alternative types of intervention. Bayesian networks are useful extensions to logic maps when initiating a review or to facilitate synthesis and bridge the gap between evidence acquisition and decision-making. Formal elicitation techniques allow development of BNs on the basis of expert opinion. Such applications are useful alternatives to ‘empty’ reviews, which identify knowledge gaps but fail to support decision-making. Where review evidence exists, it can inform the development of a BN. We illustrate the construction of a BN using a motivating example that demonstrates how BNs can ensure coherence, transparently structure the problem addressed by a complex intervention and assess sensitivity to context, all of which are critical components of robust reviews of complex interventions. We suggest that BNs should be utilised to routinely synthesise reviews of complex interventions or empty reviews where decisions must be made despite poor evidence.
Resumo:
Gene expression is arguably the most important indicator of biological function. Thus identifying differentially expressed genes is one of the main aims of high throughout studies that use microarray and RNAseq platforms to study deregulated cellular pathways. There are many tools for analysing differentia gene expression from transciptomic datasets. The major challenge of this topic is to estimate gene expression variance due to the high amount of ‘background noise’ that is generated from biological equipment and the lack of biological replicates. Bayesian inference has been widely used in the bioinformatics field. In this work, we reveal that the prior knowledge employed in the Bayesian framework also helps to improve the accuracy of differential gene expression analysis when using a small number of replicates. We have developed a differential analysis tool that uses Bayesian estimation of the variance of gene expression for use with small numbers of biological replicates. Our method is more consistent when compared to the widely used cyber-t tool that successfully introduced the Bayesian framework to differential analysis. We also provide a user-friendly web based Graphic User Interface for biologists to use with microarray and RNAseq data. Bayesian inference can compensate for the instability of variance caused when using a small number of biological replicates by using pseudo replicates as prior knowledge. We also show that our new strategy to select pseudo replicates will improve the performance of the analysis. - See more at: http://www.eurekaselect.com/node/138761/article#sthash.VeK9xl5k.dpuf
Resumo:
Cancer is the leading contributor to the disease burden in Australia. This thesis develops and applies Bayesian hierarchical models to facilitate an investigation of the spatial and temporal associations for cancer diagnosis and survival among Queenslanders. The key objectives are to document and quantify the importance of spatial inequalities, explore factors influencing these inequalities, and investigate how spatial inequalities change over time. Existing Bayesian hierarchical models are refined, new models and methods developed, and tangible benefits obtained for cancer patients in Queensland. The versatility of using Bayesian models in cancer control are clearly demonstrated through these detailed and comprehensive analyses.
Resumo:
Ship seakeeping operability refers to the quantification of motion performance in waves relative to mission requirements. This is used to make decisions about preferred vessel designs, but it can also be used as comprehensive assessment of the benefits of ship-motion-control systems. Traditionally, operability computation aggregates statistics of motion computed over over the envelope of likely environmental conditions in order to determine a coefficient in the range from 0 to 1 called operability. When used for assessment of motion-control systems, the increase of operability is taken as the key performance indicator. The operability coefficient is often given the interpretation of the percentage of time operable. This paper considers an alternative probabilistic approach to this traditional computation of operability. It characterises operability not as a number to which a frequency interpretation is attached, but as a hypothesis that a vessel will attain the desired performance in one mission considering the envelope of likely operational conditions. This enables the use of Bayesian theory to compute the probability of that this hypothesis is true conditional on data from simulations. Thus, the metric considered is the probability of operability. This formulation not only adheres to recent developments in reliability and risk analysis, but also allows incorporating into the analysis more accurate descriptions of ship-motion-control systems since the analysis is not limited to linear ship responses in the frequency domain. The paper also discusses an extension of the approach to the case of assessment of increased levels of autonomy for unmanned marine craft.
Resumo:
This paper proposes new metrics and a performance-assessment framework for vision-based weed and fruit detection and classification algorithms. In order to compare algorithms, and make a decision on which one to use fora particular application, it is necessary to take into account that the performance obtained in a series of tests is subject to uncertainty. Such characterisation of uncertainty seems not to be captured by the performance metrics currently reported in the literature. Therefore, we pose the problem as a general problem of scientific inference, which arises out of incomplete information, and propose as a metric of performance the(posterior) predictive probabilities that the algorithms will provide a correct outcome for target and background detection. We detail the framework through which these predicted probabilities can be obtained, which is Bayesian in nature. As an illustration example, we apply the framework to the assessment of performance of four algorithms that could potentially be used in the detection of capsicums (peppers).
Resumo:
A flexible and simple Bayesian decision-theoretic design for dose-finding trials is proposed in this paper. In order to reduce the computational burden, we adopt a working model with conjugate priors, which is flexible to fit all monotonic dose-toxicity curves and produces analytic posterior distributions. We also discuss how to use a proper utility function to reflect the interest of the trial. Patients are allocated based on not only the utility function but also the chosen dose selection rule. The most popular dose selection rule is the one-step-look-ahead (OSLA), which selects the best-so-far dose. A more complicated rule, such as the two-step-look-ahead, is theoretically more efficient than the OSLA only when the required distributional assumptions are met, which is, however, often not the case in practice. We carried out extensive simulation studies to evaluate these two dose selection rules and found that OSLA was often more efficient than two-step-look-ahead under the proposed Bayesian structure. Moreover, our simulation results show that the proposed Bayesian method's performance is superior to several popular Bayesian methods and that the negative impact of prior misspecification can be managed in the design stage.
Resumo:
So far, most Phase II trials have been designed and analysed under a frequentist framework. Under this framework, a trial is designed so that the overall Type I and Type II errors of the trial are controlled at some desired levels. Recently, a number of articles have advocated the use of Bavesian designs in practice. Under a Bayesian framework, a trial is designed so that the trial stops when the posterior probability of treatment is within certain prespecified thresholds. In this article, we argue that trials under a Bayesian framework can also be designed to control frequentist error rates. We introduce a Bayesian version of Simon's well-known two-stage design to achieve this goal. We also consider two other errors, which are called Bayesian errors in this article because of their similarities to posterior probabilities. We show that our method can also control these Bayesian-type errors. We compare our method with other recent Bayesian designs in a numerical study and discuss implications of different designs on error rates. An example of a clinical trial for patients with nasopharyngeal carcinoma is used to illustrate differences of the different designs.
Resumo:
Stallard (1998, Biometrics 54, 279-294) recently used Bayesian decision theory for sample-size determination in phase II trials. His design maximizes the expected financial gains in the development of a new treatment. However, it results in a very high probability (0.65) of recommending an ineffective treatment for phase III testing. On the other hand, the expected gain using his design is more than 10 times that of a design that tightly controls the false positive error (Thall and Simon, 1994, Biometrics 50, 337-349). Stallard's design maximizes the expected gain per phase II trial, but it does not maximize the rate of gain or total gain for a fixed length of time because the rate of gain depends on the proportion: of treatments forwarding to the phase III study. We suggest maximizing the rate of gain, and the resulting optimal one-stage design becomes twice as efficient as Stallard's one-stage design. Furthermore, the new design has a probability of only 0.12 of passing an ineffective treatment to phase III study.
Resumo:
The minimum cost classifier when general cost functionsare associated with the tasks of feature measurement and classification is formulated as a decision graph which does not reject class labels at intermediate stages. Noting its complexities, a heuristic procedure to simplify this scheme to a binary decision tree is presented. The optimizationof the binary tree in this context is carried out using ynamicprogramming. This technique is applied to the voiced-unvoiced-silence classification in speech processing.
Resumo:
description and analysis of geographically indexed health data with respect to demographic, environmental, behavioural, socioeconomic, genetic, and infectious risk factors (Elliott andWartenberg 2004). Disease maps can be useful for estimating relative risk; ecological analyses, incorporating area and/or individual-level covariates; or cluster analyses (Lawson 2009). As aggregated data are often more readily available, one common method of mapping disease is to aggregate the counts of disease at some geographical areal level, and present them as choropleth maps (Devesa et al. 1999; Population Health Division 2006). Therefore, this chapter will focus exclusively on methods appropriate for areal data...
Resumo:
This paper proposes solutions to three issues pertaining to the estimation of finite mixture models with an unknown number of components: the non-identifiability induced by overfitting the number of components, the mixing limitations of standard Markov Chain Monte Carlo (MCMC) sampling techniques, and the related label switching problem. An overfitting approach is used to estimate the number of components in a finite mixture model via a Zmix algorithm. Zmix provides a bridge between multidimensional samplers and test based estimation methods, whereby priors are chosen to encourage extra groups to have weights approaching zero. MCMC sampling is made possible by the implementation of prior parallel tempering, an extension of parallel tempering. Zmix can accurately estimate the number of components, posterior parameter estimates and allocation probabilities given a sufficiently large sample size. The results will reflect uncertainty in the final model and will report the range of possible candidate models and their respective estimated probabilities from a single run. Label switching is resolved with a computationally light-weight method, Zswitch, developed for overfitted mixtures by exploiting the intuitiveness of allocation-based relabelling algorithms and the precision of label-invariant loss functions. Four simulation studies are included to illustrate Zmix and Zswitch, as well as three case studies from the literature. All methods are available as part of the R package Zmix, which can currently be applied to univariate Gaussian mixture models.
Resumo:
Being able to accurately predict the risk of falling is crucial in patients with Parkinson’s dis- ease (PD). This is due to the unfavorable effect of falls, which can lower the quality of life as well as directly impact on survival. Three methods considered for predicting falls are decision trees (DT), Bayesian networks (BN), and support vector machines (SVM). Data on a 1-year prospective study conducted at IHBI, Australia, for 51 people with PD are used. Data processing are conducted using rpart and e1071 packages in R for DT and SVM, con- secutively; and Bayes Server 5.5 for the BN. The results show that BN and SVM produce consistently higher accuracy over the 12 months evaluation time points (average sensitivity and specificity > 92%) than DT (average sensitivity 88%, average specificity 72%). DT is prone to imbalanced data so needs to adjust for the misclassification cost. However, DT provides a straightforward, interpretable result and thus is appealing for helping to identify important items related to falls and to generate fallers’ profiles.
Resumo:
This paper describes the development of a model, based on Bayesian networks, to estimate the likelihood that sheep flocks are infested with lice at shearing and to assist farm managers or advisers to assess whether or not to apply a lousicide treatment. The risk of lice comes from three main sources: (i) lice may have been present at the previous shearing and not eradicated; (ii) lice may have been introduced with purchased sheep; and (iii) lice may have entered with strays. A Bayesian network is used to assess the probability of each of these events independently and combine them for an overall assessment. Rubbing is a common indicator of lice but there are other causes too. If rubbing has been observed, an additional Bayesian network is used to assess the probability that lice are the cause. The presence or absence of rubbing and its possible cause are combined with these networks to improve the overall risk assessment.
Resumo:
Dynamic Bayesian Networks (DBNs) provide a versatile platform for predicting and analysing the behaviour of complex systems. As such, they are well suited to the prediction of complex ecosystem population trajectories under anthropogenic disturbances such as the dredging of marine seagrass ecosystems. However, DBNs assume a homogeneous Markov chain whereas a key characteristics of complex ecosystems is the presence of feedback loops, path dependencies and regime changes whereby the behaviour of the system can vary based on past states. This paper develops a method based on the small world structure of complex systems networks to modularise a non-homogeneous DBN and enable the computation of posterior marginal probabilities given evidence in forwards inference. It also provides an approach for an approximate solution for backwards inference as convergence is not guaranteed for a path dependent system. When applied to the seagrass dredging problem, the incorporation of path dependency can implement conditional absorption and allows release from the zero state in line with environmental and ecological observations. As dredging has a marked global impact on seagrass and other marine ecosystems of high environmental and economic value, using such a complex systems model to develop practical ways to meet the needs of conservation and industry through enhancing resistance and/or recovery is of paramount importance.