876 resultados para Modeling Rapport Using Hidden Markov Models
Resumo:
An important aspect of the QTL mapping problem is the treatment of missing genotype data. If complete genotype data were available, QTL mapping would reduce to the problem of model selection in linear regression. However, in the consideration of loci in the intervals between the available genetic markers, genotype data is inherently missing. Even at the typed genetic markers, genotype data is seldom complete, as a result of failures in the genotyping assays or for the sake of economy (for example, in the case of selective genotyping, where only individuals with extreme phenotypes are genotyped). We discuss the use of algorithms developed for hidden Markov models (HMMs) to deal with the missing genotype data problem.
Resumo:
This paper consides the problem of extracting the relationships between two time series in a non-linear non-stationary environment with Hidden Markov Models (HMMs). We describe an algorithm which is capable of identifying associations between variables. The method is applied both to synthetic data and real data. We show that HMMs are capable of modelling the oil drilling process and that they outperform existing methods.
Resumo:
Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models to identify the lag (or delay) between different variables for such data. Adopting an information-theoretic approach, we develop a procedure for training HMMs to maximise the mutual information (MMI) between delayed time series. The method is used to model the oil drilling process. We show that cross-correlation gives no information and that the MMI approach outperforms maximum likelihood.
Resumo:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (like predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. A wealth of useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
Resumo:
Graphical techniques for modeling the dependencies of randomvariables have been explored in a variety of different areas includingstatistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics.Formalisms for manipulating these models have been developedrelatively independently in these research communities. In this paper weexplore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independencenetworks (PINs). The paper contains a self-contained review of the basic principles of PINs.It is shown that the well-known forward-backward (F-B) and Viterbialgorithms for HMMs are special cases of more general inference algorithms forarbitrary PINs. Furthermore, the existence of inference and estimationalgorithms for more general graphical models provides a set of analysistools for HMM practitioners who wish to explore a richer class of HMMstructures.Examples of relatively complex models to handle sensorfusion and coarticulationin speech recognitionare introduced and treated within the graphical model framework toillustrate the advantages of the general approach.
Resumo:
Existing methods of dive analysis, developed for fully aquatic animals, tend to focus on frequency of behaviors rather than transitions between them. They, therefore, do not account for the variability of behavior of semiaquatic animals, and the switching between terrestrial and aquatic environments. This is the first study to use hidden Markov models (HMM) to divide dives of a semiaquatic animal into clusters and thus identify the environmental predictors of transition between behavioral modes. We used 18 existing data sets of the dives of 14 American mink (Neovison vison) fitted with time-depth recorders in lowland England. Using HMM, we identified 3 behavioral states (1, temporal cluster of dives; 2, more loosely aggregated diving within aquatic activity; and 3, terminal dive of a cluster or a single, isolated dive). Based on the higher than expected proportion of dives in State 1, we conclude that mink tend to dive in clusters. We found no relationship between temperature and the proportion of dives in each state or between temperature and the rate of transition between states, meaning that in our study area, mink are apparently not adopting different diving strategies at different temperatures. Transition analysis between states has shown that there is no correlation between ambient temperature and the likelihood of mink switching from one state to another, that is, changing foraging modes. The variables provided good discrimination and grouped into consistent states well, indicating promise for further application of HMM and other state transition analyses in studies of semiaquatic animals.
Resumo:
Amplifications and deletions of chromosomal DNA, as well as copy-neutral loss of heterozygosity have been associated with diseases processes. High-throughput single nucleotide polymorphism (SNP) arrays are useful for making genome-wide estimates of copy number and genotype calls. Because neighboring SNPs in high throughput SNP arrays are likely to have dependent copy number and genotype due to the underlying haplotype structure and linkage disequilibrium, hidden Markov models (HMM) may be useful for improving genotype calls and copy number estimates that do not incorporate information from nearby SNPs. We improve previous approaches that utilize a HMM framework for inference in high throughput SNP arrays by integrating copy number, genotype calls, and the corresponding confidence scores when available. Using simulated data, we demonstrate how confidence scores control smoothing in a probabilistic framework. Software for fitting HMMs to SNP array data is available in the R package ICE.
Resumo:
In this paper, the theory of hidden Markov models (HMM) isapplied to the problem of blind (without training sequences) channel estimationand data detection. Within a HMM framework, the Baum–Welch(BW) identification algorithm is frequently used to find out maximum-likelihood (ML) estimates of the corresponding model. However, such a procedureassumes the model (i.e., the channel response) to be static throughoutthe observation sequence. By means of introducing a parametric model fortime-varying channel responses, a version of the algorithm, which is moreappropriate for mobile channels [time-dependent Baum-Welch (TDBW)] isderived. Aiming to compare algorithm behavior, a set of computer simulationsfor a GSM scenario is provided. Results indicate that, in comparisonto other Baum–Welch (BW) versions of the algorithm, the TDBW approachattains a remarkable enhancement in performance. For that purpose, onlya moderate increase in computational complexity is needed.
Resumo:
This master thesis deals with determining of innovative projects "viability". "Viability" is the probability of innovative project being implemented. Hidden Markov Models are used for evaluation of this factor. The problem of determining parameters of model, which produce given data sequence with the highest probability, are solving in this research. Data about innovative projects contained in reports of Russian programs "UMNIK", "START" and additional data obtained during study are used as input data for determining of model parameters. The Baum-Welch algorithm which is one implementation of expectation-maximization algorithm is used at this research for calculating model parameters. At the end part of the master thesis mathematical basics for practical implementation are given (in particular mathematical description of the algorithm and implementation methods for Markov models).
Resumo:
Amongst all the objectives in the study of time series, uncovering the dynamic law of its generation is probably the most important. When the underlying dynamics are not available, time series modelling consists of developing a model which best explains a sequence of observations. In this thesis, we consider hidden space models for analysing and describing time series. We first provide an introduction to the principal concepts of hidden state models and draw an analogy between hidden Markov models and state space models. Central ideas such as hidden state inference or parameter estimation are reviewed in detail. A key part of multivariate time series analysis is identifying the delay between different variables. We present a novel approach for time delay estimating in a non-stationary environment. The technique makes use of hidden Markov models and we demonstrate its application for estimating a crucial parameter in the oil industry. We then focus on hybrid models that we call dynamical local models. These models combine and generalise hidden Markov models and state space models. Probabilistic inference is unfortunately computationally intractable and we show how to make use of variational techniques for approximating the posterior distribution over the hidden state variables. Experimental simulations on synthetic and real-world data demonstrate the application of dynamical local models for segmenting a time series into regimes and providing predictive distributions.
Resumo:
Large-conductance Ca(2+)-activated K(+) channels (BK) play a fundamental role in modulating membrane potential in many cell types. The gating of BK channels and its modulation by Ca(2+) and voltage has been the subject of intensive research over almost three decades, yielding several of the most complicated kinetic mechanisms ever proposed. A large number of open and closed states disposed, respectively, in two planes, named tiers, characterize these mechanisms. Transitions between states in the same plane are cooperative and modulated by Ca(2+). Transitions across planes are highly concerted and voltage-dependent. Here we reexamine the validity of the two-tiered hypothesis by restricting attention to the modulation by Ca(2+). Large single channel data sets at five Ca(2+) concentrations were simultaneously analyzed from a Bayesian perspective by using hidden Markov models and Markov-chain Monte Carlo stochastic integration techniques. Our results support a dramatic reduction in model complexity, favoring a simple mechanism derived from the Monod-Wyman-Changeux allosteric model for homotetramers, able to explain the Ca(2+) modulation of the gating process. This model differs from the standard Monod-Wyman-Changeux scheme in that one distinguishes when two Ca(2+) ions are bound to adjacent or diagonal subunits of the tetramer.