54 resultados para DNA Sequence, Hidden Markov Model, Bayesian Model, Sensitive Analysis, Markov Chain Monte Carlo


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper concerns the problem of agent trust in an electronic market place. We maintain that agent trust involves making decisions under uncertainty and therefore the phenomenon should be modelled probabilistically. We therefore propose a probabilistic framework that models agent interactions as a Hidden Markov Model (HMM). The observations of the HMM are the interaction outcomes and the hidden state is the underlying probability of a good outcome. The task of deciding whether to interact with another agent reduces to probabilistic inference of the current state of that agent given all previous interaction outcomes. The model is extended to include a probabilistic reputation system which involves agents gathering opinions about other agents and fusing them with their own beliefs. Our system is fully probabilistic and hence delivers the following improvements with respect to previous work: (a) the model assumptions are faithfully translated into algorithms; our system is optimal under those assumptions, (b) It can account for agents whose behaviour is not static with time (c) it can estimate the rate with which an agent's behaviour changes. The system is shown to significantly outperform previous state-of-the-art methods in several numerical experiments. Copyright © 2010, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper a Markov chain based analytical model is proposed to evaluate the slotted CSMA/CA algorithm specified in the MAC layer of IEEE 802.15.4 standard. The analytical model consists of two two-dimensional Markov chains, used to model the state transition of an 802.15.4 device, during the periods of a transmission and between two consecutive frame transmissions, respectively. By introducing the two Markov chains a small number of Markov states are required and the scalability of the analytical model is improved. The analytical model is used to investigate the impact of the CSMA/CA parameters, the number of contending devices, and the data frame size on the network performance in terms of throughput and energy efficiency. It is shown by simulations that the proposed analytical model can accurately predict the performance of slotted CSMA/CA algorithm for uplink, downlink and bi-direction traffic, with both acknowledgement and non-acknowledgement modes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

IEEE 802.11 standard has achieved huge success in the past decade and is still under development to provide higher physical data rate and better quality of service (QoS). An important problem for the development and optimization of IEEE 802.11 networks is the modeling of the MAC layer channel access protocol. Although there are already many theoretic analysis for the 802.11 MAC protocol in the literature, most of the models focus on the saturated traffic and assume infinite buffer at the MAC layer. In this paper we develop a unified analytical model for IEEE 802.11 MAC protocol in ad hoc networks. The impacts of channel access parameters, traffic rate and buffer size at the MAC layer are modeled with the assistance of a generalized Markov chain and an M/G/1/K queue model. The performance of throughput, packet delivery delay and dropping probability can be achieved. Extensive simulations show the analytical model is highly accurate. From the analytical model it is shown that for practical buffer configuration (e.g. buffer size larger than one), we can maximize the total throughput and reduce the packet blocking probability (due to limited buffer size) and the average queuing delay to zero by effectively controlling the offered load. The average MAC layer service delay as well as its standard deviation, is also much lower than that in saturated conditions and has an upper bound. It is also observed that the optimal load is very close to the maximum achievable throughput regardless of the number of stations or buffer size. Moreover, the model is scalable for performance analysis of 802.11e in unsaturated conditions and 802.11 ad hoc networks with heterogenous traffic flows. © 2012 KSI.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Protein-DNA interactions are an essential feature in the genetic activities of life, and the ability to predict and manipulate such interactions has applications in a wide range of fields. This Thesis presents the methods of modelling the properties of protein-DNA interactions. In particular, it investigates the methods of visualising and predicting the specificity of DNA-binding Cys2His2 zinc finger interaction. The Cys2His2 zinc finger proteins interact via their individual fingers to base pair subsites on the target DNA. Four key residue positions on the a- helix of the zinc fingers make non-covalent interactions with the DNA with sequence specificity. Mutating these key residues generates combinatorial possibilities that could potentially bind to any DNA segment of interest. Many attempts have been made to predict the binding interaction using structural and chemical information, but with only limited success. The most important contribution of the thesis is that the developed model allows for the binding properties of a given protein-DNA binding to be visualised in relation to other protein-DNA combinations without having to explicitly physically model the specific protein molecule and specific DNA sequence. To prove this, various databases were generated, including a synthetic database which includes all possible combinations of the DNA-binding Cys2His2 zinc finger interactions. NeuroScale, a topographic visualisation technique, is exploited to represent the geometric structures of the protein-DNA interactions by measuring dissimilarity between the data points. In order to verify the effect of visualisation on understanding the binding properties of the DNA-binding Cys2His2 zinc finger interaction, various prediction models are constructed by using both the high dimensional original data and the represented data in low dimensional feature space. Finally, novel data sets are studied through the selected visualisation models based on the experimental DNA-zinc finger protein database. The result of the NeuroScale projection shows that different dissimilarity representations give distinctive structural groupings, but clustering in biologically-interesting ways. This method can be used to forecast the physiochemical properties of the novel proteins which may be beneficial for therapeutic purposes involving genome targeting in general.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference is plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering, solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches from both a computational complexity perspective as well as in terms of clustering performance. We demonstrate the wide applicabiity of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. Similarly, we show how our algorithm can applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The standard GTM (generative topographic mapping) algorithm assumes that the data on which it is trained consists of independent, identically distributed (iid) vectors. For time series, however, the iid assumption is a poor approximation. In this paper we show how the GTM algorithm can be extended to model time series by incorporating it as the emission density in a hidden Markov model. Since GTM has discrete hidden states we are able to find a tractable EM algorithm, based on the forward-backward algorithm, to train the model. We illustrate the performance of GTM through time using flight recorder data from a helicopter.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models (HMMs) to identify the lag (or delay) between different variables for such data. We first present a method using maximum likelihood estimation and propose a simple algorithm which is capable of identifying associations between variables. We also adopt an information-theoretic approach and develop a novel procedure for training HMMs to maximise the mutual information between delayed time series. Both methods are successfully applied to real data. We model the oil drilling process with HMMs and estimate a crucial parameter, namely the lag for return.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The deficiencies of stationary models applied to financial time series are well documented. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We use a dynamic switching (modelled by a hidden Markov model) combined with a linear dynamical system in a hybrid switching state space model (SSSM) and discuss the practical details of training such models with a variational EM algorithm due to [Ghahramani and Hilton,1998]. The performance of the SSSM is evaluated on several financial data sets and it is shown to improve on a number of existing benchmark methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the analysis and prediction of many real-world time series, the assumption of stationarity is not valid. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We introduce a new model which combines a dynamic switching (controlled by a hidden Markov model) and a non-linear dynamical system. We show how to train this hybrid model in a maximum likelihood approach and evaluate its performance on both synthetic and financial data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis is concerned with approximate inference in dynamical systems, from a variational Bayesian perspective. When modelling real world dynamical systems, stochastic differential equations appear as a natural choice, mainly because of their ability to model the noise of the system by adding a variant of some stochastic process to the deterministic dynamics. Hence, inference in such processes has drawn much attention. Here two new extended frameworks are derived and presented that are based on basis function expansions and local polynomial approximations of a recently proposed variational Bayesian algorithm. It is shown that the new extensions converge to the original variational algorithm and can be used for state estimation (smoothing). However, the main focus is on estimating the (hyper-) parameters of these systems (i.e. drift parameters and diffusion coefficients). The new methods are numerically validated on a range of different systems which vary in dimensionality and non-linearity. These are the Ornstein-Uhlenbeck process, for which the exact likelihood can be computed analytically, the univariate and highly non-linear, stochastic double well and the multivariate chaotic stochastic Lorenz '63 (3-dimensional model). The algorithms are also applied to the 40 dimensional stochastic Lorenz '96 system. In this investigation these new approaches are compared with a variety of other well known methods such as the ensemble Kalman filter / smoother, a hybrid Monte Carlo sampler, the dual unscented Kalman filter (for jointly estimating the systems states and model parameters) and full weak-constraint 4D-Var. Empirical analysis of their asymptotic behaviour as a function of observation density or length of time window increases is provided.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work is concerned with approximate inference in dynamical systems, from a variational Bayesian perspective. When modelling real world dynamical systems, stochastic differential equations appear as a natural choice, mainly because of their ability to model the noise of the system by adding a variation of some stochastic process to the deterministic dynamics. Hence, inference in such processes has drawn much attention. Here a new extended framework is derived that is based on a local polynomial approximation of a recently proposed variational Bayesian algorithm. The paper begins by showing that the new extension of this variational algorithm can be used for state estimation (smoothing) and converges to the original algorithm. However, the main focus is on estimating the (hyper-) parameters of these systems (i.e. drift parameters and diffusion coefficients). The new approach is validated on a range of different systems which vary in dimensionality and non-linearity. These are the Ornstein–Uhlenbeck process, the exact likelihood of which can be computed analytically, the univariate and highly non-linear, stochastic double well and the multivariate chaotic stochastic Lorenz ’63 (3D model). As a special case the algorithm is also applied to the 40 dimensional stochastic Lorenz ’96 system. In our investigation we compare this new approach with a variety of other well known methods, such as the hybrid Monte Carlo, dual unscented Kalman filter, full weak-constraint 4D-Var algorithm and analyse empirically their asymptotic behaviour as a function of observation density or length of time window increases. In particular we show that we are able to estimate parameters in both the drift (deterministic) and the diffusion (stochastic) part of the model evolution equations using our new methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a hybrid generative/discriminative framework for semantic parsing which combines the hidden vector state (HVS) model and the hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. The HM-SVMs combine the advantages of the hidden Markov models and the support vector machines. By employing a modified K-means clustering method, a small set of most representative sentences can be automatically selected from an un-annotated corpus. These sentences together with their abstract annotations are used to train an HVS model which could be subsequently applied on the whole corpus to generate semantic parsing results. The most confident semantic parsing results are selected to generate a fully-annotated corpus which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator Data. Experimental results show that an improvement over the baseline HVS parser has been observed using the hybrid framework. When compared with the HM-SVMs trained from the fully-annotated corpus, the hybrid framework gave a comparable performance with only a small set of lightly annotated sentences. © 2008. Licensed under the Creative Commons.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a semiparametric smooth-coefficient stochastic production frontier model where all the coefficients are expressed as some unknown functions of environmental factors. The inefficiency term is multiplicatively decomposed into a scaling function of the environmental factors and a standard truncated normal random variable. A testing procedure is suggested for the relevance of the environmental factors. Monte Carlo study shows plausible ¯nite sample behavior of our proposed estimation and inference procedure. An empirical example is given, where both the semiparametric and standard parametric models are estimated and results are compared.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

IEEE 802.15.4 standard is a relatively new standard designed for low power low data rate wireless sensor networks (WSN), which has a wide range of applications, e.g., environment monitoring, e-health, home and industry automation. In this paper, we investigate the problems of hidden devices in coverage overlapped IEEE 802.15.4 WSNs, which is likely to arise when multiple 802.15.4 WSNs are deployed closely and independently. We consider a typical scenario of two 802.15.4 WSNs with partial coverage overlapping and propose a Markov-chain based analytical model to reveal the performance degradation due to the hidden devices from the coverage overlapping. Impacts of the hidden devices and network sleeping modes on saturated throughput and energy consumption are modeled. The analytic model is verified by simulations, which can provide the insights to network design and planning when multiple 802.15.4 WSNs are deployed closely. © 2013 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The dynamics of the non-equilibrium Ising model with parallel updates is investigated using a generalized mean field approximation that incorporates multiple two-site correlations at any two time steps, which can be obtained recursively. The proposed method shows significant improvement in predicting local system properties compared to other mean field approximation techniques, particularly in systems with symmetric interactions. Results are also evaluated against those obtained from Monte Carlo simulations. The method is also employed to obtain parameter values for the kinetic inverse Ising modeling problem, where couplings and local field values of a fully connected spin system are inferred from data. © 2014 IOP Publishing Ltd and SISSA Medialab srl.