869 resultados para Markov chains hidden Markov models Viterbi algorithm Forward-Backward algorithm maximum likelihood
Resumo:
Background Plasmodium vivax continues to be the most widely distributed malarial parasite species in tropical and sub-tropical areas, causing high morbidity indices around the world. Better understanding of the proteins used by the parasite during the invasion of red blood cells is required to obtain an effective vaccine against this disease. This study describes characterizing the P. vivax asparagine-rich protein (PvARP) and examines its antigenicity in natural infection. Methods The target gene in the study was selected according to a previous in silico analysis using profile hidden Markov models which identified P. vivax proteins that play a possible role in invasion. Transcription of the arp gene in the P. vivax VCG-1 strain was here evaluated by RT-PCR. Specific human antibodies against PvARP were used to confirm protein expression by Western blot as well as its subcellular localization by immunofluorescence. Recognition of recombinant PvARP by sera from P. vivax-infected individuals was evaluated by ELISA. Results VCG-1 strain PvARP is a 281-residue-long molecule, which is encoded by a single exon and has an N-terminal secretion signal, as well as a tandem repeat region. This protein is expressed in mature schizonts and is located on the surface of merozoites, having an apparent accumulation towards their apical pole. Sera from P. vivax-infected patients recognized the recombinant, thereby suggesting that this protein is targeted by the immune response during infection.
Resumo:
Accurately and reliably identifying the actual number of clusters present with a dataset of gene expression profiles, when no additional information on cluster structure is available, is a problem addressed by few algorithms. GeneMCL transforms microarray analysis data into a graph consisting of nodes connected by edges, where the nodes represent genes, and the edges represent the similarity in expression of those genes, as given by a proximity measurement. This measurement is taken to be the Pearson correlation coefficient combined with a local non-linear rescaling step. The resulting graph is input to the Markov Cluster (MCL) algorithm, which is an elegant, deterministic, non-specific and scalable method, which models stochastic flow through the graph. The algorithm is inherently affected by any cluster structure present, and rapidly decomposes a graph into cohesive clusters. The potential of the GeneMCL algorithm is demonstrated with a 5730 gene subset (IGS) of the Van't Veer breast cancer database, for which the clusterings are shown to reflect underlying biological mechanisms. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
An input variable selection procedure is introduced for the identification and construction of multi-input multi-output (MIMO) neurofuzzy operating point dependent models. The algorithm is an extension of a forward modified Gram-Schmidt orthogonal least squares procedure for a linear model structure which is modified to accommodate nonlinear system modeling by incorporating piecewise locally linear model fitting. The proposed input nodes selection procedure effectively tackles the problem of the curse of dimensionality associated with lattice-based modeling algorithms such as radial basis function neurofuzzy networks, enabling the resulting neurofuzzy operating point dependent model to be widely applied in control and estimation. Some numerical examples are given to demonstrate the effectiveness of the proposed construction algorithm.
Resumo:
The dynamics of inter-regional communication within the brain during cognitive processing – referred to as functional connectivity – are investigated as a control feature for a brain computer interface. EMDPL is used to map phase synchronization levels between all channel pair combinations in the EEG. This results in complex networks of channel connectivity at all time–frequency locations. The mean clustering coefficient is then used as a descriptive feature encapsulating information about inter-channel connectivity. Hidden Markov models are applied to characterize and classify dynamics of the resulting complex networks. Highly accurate levels of classification are achieved when this technique is applied to classify EEG recorded during real and imagined single finger taps. These results are compared to traditional features used in the classification of a finger tap BCI demonstrating that functional connectivity dynamics provide additional information and improved BCI control accuracies.
Resumo:
The multivariate skew-t distribution (J Multivar Anal 79:93-113, 2001; J R Stat Soc, Ser B 65:367-389, 2003; Statistics 37:359-363, 2003) includes the Student t, skew-Cauchy and Cauchy distributions as special cases and the normal and skew-normal ones as limiting cases. In this paper, we explore the use of Markov Chain Monte Carlo (MCMC) methods to develop a Bayesian analysis of repeated measures, pretest/post-test data, under multivariate null intercept measurement error model (J Biopharm Stat 13(4):763-771, 2003) where the random errors and the unobserved value of the covariate (latent variable) follows a Student t and skew-t distribution, respectively. The results and methods are numerically illustrated with an example in the field of dentistry.
Resumo:
Robotic mapping is the process of automatically constructing an environment representation using mobile robots. We address the problem of semantic mapping, which consists of using mobile robots to create maps that represent not only metric occupancy but also other properties of the environment. Specifically, we develop techniques to build maps that represent activity and navigability of the environment. Our approach to semantic mapping is to combine machine learning techniques with standard mapping algorithms. Supervised learning methods are used to automatically associate properties of space to the desired classification patterns. We present two methods, the first based on hidden Markov models and the second on support vector machines. Both approaches have been tested and experimentally validated in two problem domains: terrain mapping and activity-based mapping.
Resumo:
The objective of this article is to find out the influence of the parameters of the ARIMA-GARCH models in the prediction of artificial neural networks (ANN) of the feed forward type, trained with the Levenberg-Marquardt algorithm, through Monte Carlo simulations. The paper presents a study of the relationship between ANN performance and ARIMA-GARCH model parameters, i.e. the fact that depending on the stationarity and other parameters of the time series, the ANN structure should be selected differently. Neural networks have been widely used to predict time series and their capacity for dealing with non-linearities is a normally outstanding advantage. However, the values of the parameters of the models of generalized autoregressive conditional heteroscedasticity have an influence on ANN prediction performance. The combination of the values of the GARCH parameters with the ARIMA autoregressive terms also implies in ANN performance variation. Combining the parameters of the ARIMA-GARCH models and changing the ANN`s topologies, we used the Theil inequality coefficient to measure the prediction of the feed forward ANN.
Resumo:
Discriminative training of Gaussian Mixture Models (GMMs) for speech or speaker recognition purposes is usually based on the gradient descent method, in which the iteration step-size, ε, uses to be defined experimentally. In this letter, we derive an equation to adaptively determine ε, by showing that the second-order Newton-Raphson iterative method to find roots of equations is equivalent to the gradient descent algorithm. © 2010 IEEE.
Resumo:
Traditionally, ancillary services are supplied by large conventional generators. However, with the huge penetration of distributed generators (DGs) as a result of the growing interest in satisfying energy requirements, and considering the benefits that they can bring along to the electrical system and to the environment, it appears reasonable to assume that ancillary services could also be provided by DGs in an economical and efficient way. In this paper, a settlement procedure for a reactive power market for DGs in distribution systems is proposed. Attention is directed to wind turbines connected to the network through synchronous generators with permanent magnets and doubly-fed induction generators. The generation uncertainty of this kind of DG is reduced by running a multi-objective optimization algorithm in multiple probabilistic scenarios through the Monte Carlo method and by representing the active power generated by the DGs through Markov models. The objectives to be minimized are the payments of the distribution system operator to the DGs for reactive power, the curtailment of transactions committed in an active power market previously settled, the losses in the lines of the network, and a voltage profile index. The proposed methodology was tested using a modified IEEE 37-bus distribution test system. © 1969-2012 IEEE.
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
O processamento de voz tornou-se uma tecnologia cada vez mais baseada na modelagem automática de vasta quantidade de dados. Desta forma, o sucesso das pesquisas nesta área está diretamente ligado a existência de corpora de domínio público e outros recursos específicos, tal como um dicionário fonético. No Brasil, ao contrário do que acontece para a língua inglesa, por exemplo, não existe atualmente em domínio público um sistema de Reconhecimento Automático de Voz (RAV) para o Português Brasileiro com suporte a grandes vocabulários. Frente a este cenário, o trabalho tem como principal objetivo discutir esforços dentro da iniciativa FalaBrasil [1], criada pelo Laboratório de Processamento de Sinais (LaPS) da UFPA, apresentando pesquisas e softwares na área de RAV para o Português do Brasil. Mais especificamente, o presente trabalho discute a implementação de um sistema de reconhecimento de voz com suporte a grandes vocabulários para o Português do Brasil, utilizando a ferramenta HTK baseada em modelo oculto de Markov (HMM) e a criação de um módulo de conversão grafema-fone, utilizando técnicas de aprendizado de máquina.
Resumo:
Sistemas de reconhecimento e síntese de voz são constituídos por módulos que dependem da língua e, enquanto existem muitos recursos públicos para alguns idiomas (p.e. Inglês e Japonês), os recursos para Português Brasileiro (PB) ainda são escassos. Outro aspecto é que, para um grande número de tarefas, a taxa de erro dos sistemas de reconhecimento de voz atuais ainda é elevada, quando comparada à obtida por seres humanos. Assim, apesar do sucesso das cadeias escondidas de Markov (HMM), é necessária a pesquisa por novos métodos. Este trabalho tem como motivação esses dois fatos e se divide em duas partes. A primeira descreve o desenvolvimento de recursos e ferramentas livres para reconhecimento e síntese de voz em PB, consistindo de bases de dados de áudio e texto, um dicionário fonético, um conversor grafema-fone, um separador silábico e modelos acústico e de linguagem. Todos os recursos construídos encontram-se publicamente disponíveis e, junto com uma interface de programação proposta, têm sido usados para o desenvolvimento de várias novas aplicações em tempo-real, incluindo um módulo de reconhecimento de voz para a suíte de aplicativos para escritório OpenOffice.org. São apresentados testes de desempenho dos sistemas desenvolvidos. Os recursos aqui produzidos e disponibilizados facilitam a adoção da tecnologia de voz para PB por outros grupos de pesquisa, desenvolvedores e pela indústria. A segunda parte do trabalho apresenta um novo método para reavaliar (rescoring) o resultado do reconhecimento baseado em HMMs, o qual é organizado em uma estrutura de dados do tipo lattice. Mais especificamente, o sistema utiliza classificadores discriminativos que buscam diminuir a confusão entre pares de fones. Para cada um desses problemas binários, são usadas técnicas de seleção automática de parâmetros para escolher a representaçãao paramétrica mais adequada para o problema em questão.
Resumo:
In many movies of scientific fiction, machines were capable of speaking with humans. However mankind is still far away of getting those types of machines, like the famous character C3PO of Star Wars. During the last six decades the automatic speech recognition systems have been the target of many studies. Throughout these years many technics were developed to be used in applications of both software and hardware. There are many types of automatic speech recognition system, among which the one used in this work were the isolated word and independent of the speaker system, using Hidden Markov Models as the recognition system. The goals of this work is to project and synthesize the first two steps of the speech recognition system, the steps are: the speech signal acquisition and the pre-processing of the signal. Both steps were developed in a reprogrammable component named FPGA, using the VHDL hardware description language, owing to the high performance of this component and the flexibility of the language. In this work it is presented all the theory of digital signal processing, as Fast Fourier Transforms and digital filters and also all the theory of speech recognition using Hidden Markov Models and LPC processor. It is also presented all the results obtained for each one of the blocks synthesized e verified in hardware
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Bayesian phylogenetic analyses are now very popular in systematics and molecular evolution because they allow the use of much more realistic models than currently possible with maximum likelihood methods. There are, however, a growing number of examples in which large Bayesian posterior clade probabilities are associated with very short edge lengths and low values for non-Bayesian measures of support such as nonparametric bootstrapping. For the four-taxon case when the true tree is the star phylogeny, Bayesian analyses become increasingly unpredictable in their preference for one of the three possible resolved tree topologies as data set size increases. This leads to the prediction that hard (or near-hard) polytomies in nature will cause unpredictable behavior in Bayesian analyses, with arbitrary resolutions of the polytomy receiving very high posterior probabilities in some cases. We present a simple solution to this problem involving a reversible-jump Markov chain Monte Carlo (MCMC) algorithm that allows exploration of all of tree space, including unresolved tree topologies with one or more polytomies. The reversible-jump MCMC approach allows prior distributions to place some weight on less-resolved tree topologies, which eliminates misleadingly high posteriors associated with arbitrary resolutions of hard polytomies. Fortunately, assigning some prior probability to polytomous tree topologies does not appear to come with a significant cost in terms of the ability to assess the level of support for edges that do exist in the true tree. Methods are discussed for applying arbitrary prior distributions to tree topologies of varying resolution, and an empirical example showing evidence of polytomies is analyzed and discussed.