Digital Library

26 results for Local likelihood function

in CentAUR: Central Archive University of Reading - UK

Likelihood-free estimation of model evidence

Relevance: 90.00%

Abstract:

Statistical methods of inference typically require the likelihood function to be computable in a reasonable amount of time. The class of “likelihood-free” methods termed Approximate Bayesian Computation (ABC) is able to eliminate this requirement, replacing the evaluation of the likelihood with simulation from it. Likelihood-free methods have gained in efficiency and popularity in the past few years, following their integration with Markov Chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) in order to better explore the parameter space. They have been applied primarily to estimating the parameters of a given model, but can also be used to compare models. Here we present novel likelihood-free approaches to model comparison, based upon the independent estimation of the evidence of each model under study. Key advantages of these approaches over previous techniques are that they allow the exploitation of MCMC or SMC algorithms for exploring the parameter space, and that they do not require a sampler able to mix between models. We validate the proposed methods using a simple exponential family problem before providing a realistic problem from human population genetics: the comparison of different demographic models based upon genetic data from the Y chromosome.

Vertex splitting and connectivity augmentation in hypergraphs

Relevance: 80.00%

Abstract:

We consider problems of splitting and connectivity augmentation in hypergraphs. In a hypergraph G = (V + s, E), to split two edges su, sv is to replace them with a single edge uv. We are interested in doing this in such a way as to preserve a defined level of connectivity in V. The splitting technique is often used as a way of adding new edges into a graph or hypergraph, so as to augment the connectivity to some prescribed level. We begin by providing a short history of work done in this area. Then several preliminary results are given in a general form so that they may be used to tackle several problems. We then analyse the hypergraphs G = (V + s, E) for which there is no split preserving the local-edge-connectivity present in V. We provide two structural theorems, one of which implies a slight extension to Mader’s classical splitting theorem. We also provide a characterisation of the hypergraphs for which there is no such “good” split and a splitting result concerned with a specialisation of the local-connectivity function. We then use our splitting results to provide an upper bound on the smallest number of size-two edges we must add to any given hypergraph to ensure that in the resulting hypergraph we have λ(x, y) ≥ r(x, y) for all x, y in V, where r is an integer-valued, symmetric requirement function on V × V. This is the so-called “local-edge-connectivity augmentation problem” for hypergraphs. We also provide an extension to a theorem of Szigeti about augmenting to satisfy a requirement r, but using hyperedges. Next, in a result born of collaborative work with Zoltán Király from Budapest, we show that the local-connectivity augmentation problem is NP-complete for hypergraphs. Lastly we concern ourselves with an augmentation problem that includes a locational constraint.
The premise is that we are given a hypergraph H = (V, E) with a bipartition P = {P1, P2} of V and asked to augment it with size-two edges, so that the result is k-edge-connected and has no new edge contained in any Pi. We consider the splitting technique and describe the obstacles that prevent us forming “good” splits. From this we deduce results about which hypergraphs have a complete Pk-split. This leads to a minimax result on the optimal number of edges required and a polynomial algorithm to provide an optimal augmentation.
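The splitting operation defined in this abstract (replace edges su and sv with a single edge uv) can be sketched for the simplest case, where every edge has size two. This is only an illustration under that assumption: the function name `split_off` and the multiset-of-frozensets representation are hypothetical choices, not taken from the thesis, and the sketch does not check whether connectivity is preserved.

```python
from collections import Counter


def split_off(edges, s, u, v):
    """Replace one copy of edges su and sv with a single edge uv.

    `edges` is an iterable of two-element frozensets (a multigraph).
    Assumes u != v; no connectivity check is performed.
    """
    result = Counter(edges)
    for endpoint in (u, v):
        key = frozenset((s, endpoint))
        if result[key] == 0:
            raise ValueError(f"no edge between {s!r} and {endpoint!r}")
        result[key] -= 1
    result[frozenset((u, v))] += 1
    return +result  # unary plus drops edges whose count fell to zero


# A triangle on s, a, b: splitting sa and sb leaves two parallel ab edges.
triangle = [frozenset("sa"), frozenset("sb"), frozenset("ab")]
after_split = split_off(triangle, "s", "a", "b")
```

Repeating the operation until s has no incident edges corresponds to what the abstract calls a complete split; deciding which pairs can be split without lowering connectivity is the hard part that the theorems address.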

Bayesian model comparison with un-normalised likelihoods

Relevance: 80.00%

Abstract:

Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes factors (BFs) for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.

abctools: an R package for tuning approximate Bayesian computation analyses

Relevance: 80.00%

Abstract:

Approximate Bayesian computation (ABC) is a popular family of algorithms which perform approximate parameter inference when numerical evaluation of the likelihood function is not possible but data can be simulated from the model. They return a sample of parameter values which produce simulations close to the observed dataset. A standard approach is to reduce the simulated and observed datasets to vectors of summary statistics and accept when the difference between these is below a specified threshold. ABC can also be adapted to perform model choice. In this article, we present a new software package for R, abctools, which provides methods for tuning ABC algorithms. This includes recent dimension reduction algorithms to tune the choice of summary statistics, and coverage methods to tune the choice of threshold. We provide several illustrations of these routines on applications taken from the ABC literature.
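The standard approach described in this abstract (simulate, reduce to a summary statistic, accept when the difference is below a threshold) can be sketched as plain rejection ABC. This is a minimal Python illustration, not the abctools R interface: the function name `abc_rejection`, the toy normal model, the uniform prior and the threshold of 0.1 are all assumptions chosen for the example.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_sample, summary, threshold, n_draws, rng):
    """Keep prior draws whose simulated summary is within `threshold` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)              # draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))  # simulate data, reduce to a summary
        if abs(s_sim - s_obs) <= threshold:    # accept if close to the observation
            accepted.append(theta)
    return accepted


# Toy problem (illustrative only): infer the mean of a unit-variance normal.
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_sample = abc_rejection(
    observed,
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(50)],
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    summary=statistics.fmean,
    threshold=0.1,
    n_draws=5000,
    rng=rng,
)
```

The two tuning choices the abstract mentions appear here as the `summary` and `threshold` arguments: a smaller threshold gives a closer approximation to the posterior at the cost of fewer accepted draws.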

Estimating the spatial scales of regionalized variables by nested sampling, hierarchical analysis of variance and residual maximum likelihood

Relevance: 30.00%

Abstract:

The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
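The accumulation step mentioned in this abstract (summing the components of variance from the shortest lag upward to obtain a rough variogram) can be sketched as a running total. The function name `rough_variogram` and the numbers in the example are hypothetical; in practice the components would come from the hierarchical analysis of variance or from REML.

```python
from itertools import accumulate


def rough_variogram(lags, components):
    """Pair each stage's lag with the running total of variance components.

    `lags` and `components` are given per stage, in any order; the
    components are accumulated starting from the shortest lag.
    """
    if len(lags) != len(components):
        raise ValueError("need exactly one variance component per lag")
    ordered = sorted(zip(lags, components))  # shortest lag first
    sorted_lags = [lag for lag, _ in ordered]
    totals = accumulate(component for _, component in ordered)
    return list(zip(sorted_lags, totals))


# Illustrative four-stage design with lags in geometric progression.
points = rough_variogram([6, 60, 600, 6000], [0.5, 1.0, 0.25, 2.0])
```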

Assessing the uncertainty associated with intermittent rainfall fields

Relevance: 30.00%

Abstract:

In many practical situations where spatial rainfall estimates are needed, rainfall occurs as a spatially intermittent phenomenon. An efficient geostatistical method for rainfall estimation in the case of intermittency has previously been published and comprises the estimation of two independent components: a binary random function for modeling the intermittency and a continuous random function that models the rainfall inside the rainy areas. The final rainfall estimates are obtained as the product of the estimates of these two random functions. However, the published approach does not contain a method for estimation of uncertainties. The contribution of this paper is the presentation of the indicator maximum likelihood estimator from which the local conditional distribution of the rainfall value at any location may be derived using an ensemble approach. From the conditional distribution, representations of uncertainty such as the estimation variance and confidence intervals can be obtained. An approximation to the variance can be calculated more simply by assuming rainfall intensity is independent of location within the rainy area. The methodology has been validated using simulated and real rainfall data sets. The results of these case studies show good agreement between predicted uncertainties and measured errors obtained from the validation data.

On the relationship of normal modes to local modes in molecular vibrations

Relevance: 30.00%

Abstract:

A simple model for the effective vibrational Hamiltonian of the XH stretching vibrations in H2O, NH3 and CH4 is considered, based on a Morse potential function for the bond stretches plus potential and kinetic energy coupling between pairs of bond oscillators. It is shown that this model can be set up as a matrix in local mode basis functions, or as a matrix in normal mode basis functions, leading to identical results. The energy levels obtained exhibit normal mode patterns at low vibrational excitation, and local mode patterns at high excitation. When the Hamiltonian is set up in the normal mode basis it is shown that Darling-Dennison resonances must be included, and simple relations are found to exist between the x_rs, g_tt, and K_rrss anharmonic constants (where the Darling-Dennison coefficients are denoted K) due to their contributions from Morse anharmonicity in the bond stretches. The importance of the Darling-Dennison resonances is stressed. The relationship between the two alternative representations of this local mode/normal mode model is investigated, and the potential uses and limitations of the model are discussed.
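As background to the Morse anharmonicity mentioned in this abstract, the energy levels of a single, uncoupled Morse oscillator follow the standard two-term formula E(v) = ω(v + 1/2) − ωx(v + 1/2)². The sketch below evaluates it; the coupling between bond oscillators that the paper studies is not included, and the numerical values of `omega` and `omega_x` are illustrative round numbers, not constants from the paper.

```python
def morse_level(v, omega, omega_x):
    """Energy of level v for one Morse oscillator (same units as omega)."""
    return omega * (v + 0.5) - omega_x * (v + 0.5) ** 2


# Illustrative values only (wavenumber units): successive level spacings
# shrink by 2 * omega_x, which is the signature of Morse anharmonicity.
omega, omega_x = 3900.0, 80.0
spacings = [
    morse_level(v + 1, omega, omega_x) - morse_level(v, omega, omega_x)
    for v in range(3)
]
```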

Rational conversion of substrate and product specificity in a Salvia monoterpene synthase: structural insights into the evolution of terpene synthase function

Relevance: 30.00%

Abstract:

Terpene synthases are responsible for the biosynthesis of the complex chemical defense arsenal of plants and microorganisms. How do these enzymes, which all appear to share a common terpene synthase fold, specify the many different products made almost entirely from one of only three substrates? Elucidation of the structure of 1,8-cineole synthase from Salvia fruticosa (Sf-CinS1) combined with analysis of functional and phylogenetic relationships of enzymes within Salvia species identified active-site residues responsible for product specificity. Thus, Sf-CinS1 was successfully converted to a sabinene synthase with a minimum number of rationally predicted substitutions, while identification of the Asn side chain essential for water activation introduced 1,8-cineole and alpha-terpineol activity to Salvia pomifera sabinene synthase. A major contribution to product specificity in Sf-CinS1 appears to come from a local deformation within one of the helices forming the active site. This deformation is observed in all other mono- or sesquiterpene structures available, pointing to a conserved mechanism. Moreover, a single amino acid substitution enlarged the active-site cavity enough to accommodate the larger farnesyl pyrophosphate substrate and led to the efficient synthesis of sesquiterpenes, while alternate single substitutions of this critical amino acid yielded five additional terpene synthases.

Prediction of global and local model quality in CASP8 using the ModFOLD server

Relevance: 30.00%

Abstract:

The development of effective methods for predicting the quality of three-dimensional (3D) models is fundamentally important for the success of tertiary structure (TS) prediction strategies. Since CASP7, the Quality Assessment (QA) category has existed to gauge the ability of various model quality assessment programs (MQAPs) at predicting the relative quality of individual 3D models. For the CASP8 experiment, automated predictions were submitted in the QA category using two methods from the ModFOLD server: ModFOLD version 1.1 and ModFOLDclust. ModFOLD version 1.1 is a single-model machine learning based method, which was used for automated predictions of global model quality (QMODE1). ModFOLDclust is a simple clustering based method, which was used for automated predictions of both global and local quality (QMODE2). In addition, manual predictions of model quality were made using ModFOLD version 2.0, an experimental method that combines the scores from ModFOLDclust and ModFOLD v1.1. Predictions from the ModFOLDclust method were the most successful of the three in terms of the global model quality, whilst the ModFOLD v1.1 method was comparable in performance to other single-model based methods. In addition, the ModFOLDclust method performed well at predicting the per-residue, or local, model quality scores. Predictions of the per-residue errors in our own 3D models, selected using the ModFOLD v2.0 method, were also the most accurate compared with those from other methods. All of the MQAPs described are publicly accessible via the ModFOLD server at: http://www.reading.ac.uk/bioinf/ModFOLD/. The methods are also freely available to download from: http://www.reading.ac.uk/bioinf/downloads/.

Hypothesis about mechanisms through which nicotine might exert its effect on the interdependence of inflammation and gut barrier function in ulcerative colitis

Relevance: 30.00%

Abstract:

Ulcerative colitis (UC) is characterized by impairment of the epithelial barrier and the formation of ulcer-type lesions, which result in local leaks and generalized alterations of mucosal tight junctions. Ultimately, this results in increased basal permeability. Although disruption of the epithelial barrier in the gut is a hallmark of inflammatory bowel disease and intestinal infections, it remains unclear whether barrier breakdown is an initiating event of UC or rather a consequence of an underlying inflammation, evidenced by increased production of proinflammatory cytokines. UC is less common in smokers, suggesting that the nicotine in cigarettes may ameliorate disease severity. The mechanism behind this therapeutic effect is still not fully understood, and indeed it remains unclear if nicotine is the true protective agent in cigarettes. Nicotine is metabolized in the body into a variety of metabolites and can also be degraded to form various breakdown products. It is possible these metabolites or degradation products may be the true protective or curative agents. A greater understanding of the pharmacodynamics and kinetics of nicotine in relation to the immune system and enhanced knowledge of gut permeability defects in UC are required to establish the exact protective nature of nicotine and its metabolites in UC. This review suggests possible hypotheses for the protective mechanism of nicotine in UC, highlighting the relationship between gut permeability and inflammation, and indicates where in the pathogenesis of the disease nicotine may mediate its effect.

Performance of GPRS coding scheme detection under severe multipath and co-channel interference as a function of soft-bit width

Relevance: 30.00%

Abstract:

The General Packet Radio Service (GPRS) has been developed for the mobile radio environment to allow the migration from the traditional circuit-switched connection to a more efficient packet-based communication link, particularly for data transfer. GPRS requires the addition of not only the GPRS software protocol stack, but also more baseband functionality for the mobile, as new coding schemes have been defined along with uplink status flag detection, multislot operation and dynamic coding scheme detection. This paper concentrates on evaluating the performance of the GPRS coding scheme detection methods in the presence of a multipath fading channel with a single co-channel interferer as a function of various soft-bit data widths. It has been found that compressing the soft-bit data widths from the output of the equalizer to save memory can influence the likelihood decision of the coding scheme detect function and hence contribute to the overall performance loss of the system. Coding scheme detection errors can therefore force the channel decoder either to select the incorrect decoding scheme or to have no clear decision as to which coding scheme to use, resulting in the decoded radio block failing the block check sequence and contributing to the block error rate. For correct performance simulation, the performance of the full coding scheme detection must be taken into account.

Probability density function estimation using orthogonal forward regression

Relevance: 30.00%

Abstract:

Using the classical Parzen window estimate as the target function, the kernel density estimation is formulated as a regression problem and the orthogonal forward regression technique is adopted to construct sparse kernel density estimates. The proposed algorithm incrementally minimises a leave-one-out test error score to select a sparse kernel model, and a local regularisation method is incorporated into the density construction process to further enforce sparsity. The kernel weights are finally updated using the multiplicative nonnegative quadratic programming algorithm, which has the ability to reduce the model size further. Except for the kernel width, the proposed algorithm has no other parameters that need tuning, and the user is not required to specify any additional criterion to terminate the density construction procedure. Two examples are used to demonstrate the ability of this regression-based approach to effectively construct a sparse kernel density estimate with comparable accuracy to that of the full-sample optimised Parzen window density estimate.
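The classical Parzen window estimate that this abstract uses as its regression target places one kernel of equal weight on every sample. A minimal sketch follows, assuming a one-dimensional Gaussian kernel (the abstract does not state the kernel); the function name `parzen_density` is hypothetical. The sparse estimator the paper proposes would instead keep only a subset of kernels with non-negative weights, which is not shown here.

```python
import math


def parzen_density(x, samples, width):
    """Gaussian Parzen window estimate at x: the average of kernels centred on each sample."""
    norm = 1.0 / (len(samples) * width * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / width) ** 2) for s in samples)


# Illustrative data: evaluate the full-sample estimate at one point.
density_at_zero = parzen_density(0.0, [-1.0, 0.5, 2.0], width=0.5)
```

The kernel `width` is the one parameter the abstract says still needs tuning; every other quantity in the full-sample estimate is fixed by the data.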

A tunable radial basis function model for nonlinear system identification using particle swarm optimisation

Relevance: 30.00%

Abstract:

A tunable radial basis function (RBF) network model is proposed for nonlinear system identification using particle swarm optimisation (PSO). At each stage of orthogonal forward regression (OFR) model construction, PSO optimises one RBF unit's centre vector and diagonal covariance matrix by minimising the leave-one-out (LOO) mean square error (MSE). This PSO aided OFR automatically determines how many tunable RBF nodes are sufficient for modelling. Compared with the state-of-the-art local regularisation assisted orthogonal least squares algorithm based on the LOO MSE criterion for constructing fixed-node RBF network models, the PSO tuned RBF model construction produces more parsimonious RBF models with better generalisation performance and is computationally more efficient.

Nonlinear system identification using particle swarm optimisation tuned radial basis function models

Relevance: 30.00%

Abstract:

A novel particle swarm optimisation (PSO) tuned radial basis function (RBF) network model is proposed for identification of non-linear systems. At each stage of the orthogonal forward regression (OFR) model construction process, PSO is adopted to tune one RBF unit's centre vector and diagonal covariance matrix by minimising the leave-one-out (LOO) mean square error (MSE). This PSO aided OFR automatically determines how many tunable RBF nodes are sufficient for modelling. Compared with the state-of-the-art local regularisation assisted orthogonal least squares algorithm based on the LOO MSE criterion for constructing fixed-node RBF network models, the PSO tuned RBF model construction produces more parsimonious RBF models with better generalisation performance and is often more efficient in model construction. The effectiveness of the proposed PSO aided OFR algorithm for constructing tunable node RBF models is demonstrated using three real data sets.
