985 results for Probabilistic Models
Abstract:
This paper is concerned with synchronization of complex stochastic dynamical networks in the presence of noise and functional uncertainty. A probabilistic control method for adaptive synchronization is presented. All required probabilistic models of the network are assumed to be unknown and are therefore estimated as functions of the connectivity strength, the state, and the control values. Robustness of the probabilistic controller is proved via the Lyapunov method. Furthermore, based on the residual error of the network states, we introduce the definition of stochastic pinning controllability. A coupled map lattice with spatiotemporal chaos is taken as an example to illustrate all theoretical developments. The theoretical derivation is complemented by its validation on two representative examples.
Abstract:
Robust controllers for nonlinear stochastic systems with functional uncertainties can be consistently designed using probabilistic control methods. In this paper a generalised probabilistic controller design for the minimisation of the Kullback-Leibler divergence between the actual joint probability density function (pdf) of the closed loop control system and an ideal joint pdf is presented, emphasising how the uncertainty can be systematically incorporated in the absence of reliable systems models. To achieve this objective, all probabilistic models of the system are estimated from process data using mixture density networks (MDNs), where all the parameters of the estimated pdfs are taken to be state and control input dependent. Based on this dependency of the density parameters on the input values, explicit formulations for the construction of optimal generalised probabilistic controllers are obtained through the techniques of dynamic programming and adaptive critic methods. Using the proposed generalised probabilistic controller, the conditional joint pdfs can be made to follow the ideal ones. A simulation example is used to demonstrate the implementation of the algorithm and encouraging results are obtained.
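To make the objective named in this abstract concrete, the following is a minimal sketch of the Kullback-Leibler divergence for the simplest case, two univariate Gaussians. It is not the paper's controller: the function name and the Gaussian assumption are illustrative only, whereas the paper works with joint pdfs estimated by mixture density networks.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for p = N(mu_p, sigma_p^2) and q = N(mu_q, sigma_q^2).

    Illustrative assumption: both the actual and the ideal density are
    univariate Gaussians, so the divergence has a closed form.
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


if __name__ == "__main__":
    # The divergence is zero when the two densities coincide and grows
    # as the actual density moves away from the ideal one.
    print(kl_gaussian(0.0, 1.0, 0.0, 1.0))
    print(kl_gaussian(1.0, 1.0, 0.0, 1.0))
```

A controller that minimises this quantity pushes the closed-loop density towards the ideal one; note that the divergence is not symmetric in its two arguments.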
Abstract:
In this paper a new framework has been applied to the design of controllers which encompasses nonlinearity, hysteresis and arbitrary density functions of forward models and inverse controllers. Using mixture density networks, the probabilistic models of both the forward and inverse dynamics are estimated such that they are dependent on the state and the control input. The optimal control strategy is then derived which minimizes uncertainty of the closed loop system. In the absence of reliable plant models, the proposed control algorithm incorporates uncertainties in model parameters, observations, and latent processes. The local stability of the closed loop system has been established. The efficacy of the control algorithm is demonstrated on two nonlinear stochastic control examples with additive and multiplicative noise.
Abstract:
One of the important problems in machine learning is determining the complexity of the model to be learned. Too much complexity leads to overfitting, which amounts to finding structures that do not really exist in the data, while too little complexity leads to underfitting, meaning that the model is not expressive enough to capture all the structures present in the data. For some probabilistic models, model complexity takes the form of one or more hidden variables whose role is to explain the generative process of the data. Various approaches exist for identifying the appropriate number of hidden variables in a model. This thesis focuses on Bayesian nonparametric methods for determining both the number of hidden variables to use and their dimensionality. The popularization of Bayesian nonparametric statistics within the machine learning community is fairly recent. Their main appeal is that they offer highly flexible models whose complexity adjusts in proportion to the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms, and applications. This thesis presents our contributions to these three research topics in the context of learning hidden-variable models. First, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm for discovering the hidden components of the model, which we evaluate on two concrete robotics applications. Our results show that the proposed approach outperforms classical learning approaches in both performance and flexibility. Second, we propose the extended cascading Indian buffet process, a model that serves as a prior probability distribution over the space of directed acyclic graphs. In the context of Bayesian networks, this prior makes it possible to identify both the presence of hidden variables and the network structure among them. A Markov chain Monte Carlo inference algorithm is used for evaluation on structure identification and density estimation problems. Finally, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process, for learning graphs and orders. The advantage of the new model is that it allows connections between observable variables and takes the order of the variables into account. We present a reversible-jump Markov chain Monte Carlo inference algorithm for jointly learning graphs and orders. The evaluation is carried out on density estimation and independence testing problems. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with a completely arbitrary structure.
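As a rough illustration of why a Pitman-Yor prior lets the number of mixture components grow with the data, the sketch below samples a partition using the standard two-parameter Chinese restaurant construction. It is not the thesis's inference algorithm; the function name, parameter names, and example values are assumptions made for illustration.

```python
import random


def pitman_yor_crp(n_customers, discount, concentration, rng):
    """Sample table sizes from a two-parameter Chinese restaurant process.

    With K occupied tables and n seated customers, the next customer opens
    a new table with probability (concentration + discount * K) / (n + concentration)
    and otherwise joins table k with weight (n_k - discount).
    """
    if not 0.0 <= discount < 1.0 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    tables = []  # tables[k] = number of customers at table k
    for n in range(n_customers):
        if not tables:
            tables.append(1)  # the first customer always opens a table
            continue
        k_tables = len(tables)
        p_new = (concentration + discount * k_tables) / (n + concentration)
        if rng.random() < p_new:
            tables.append(1)
            continue
        # Pick an existing table with probability proportional to n_k - discount.
        u = rng.random() * (n - discount * k_tables)
        acc = 0.0
        for k, size in enumerate(tables):
            acc += size - discount
            if u < acc:
                tables[k] += 1
                break
        else:
            tables[-1] += 1  # guard against floating-point round-off
    return tables


if __name__ == "__main__":
    sizes = pitman_yor_crp(200, discount=0.5, concentration=1.0, rng=random.Random(0))
    print(len(sizes), "clusters for 200 points")
```

In a mixture of Gaussians, each table would correspond to one Gaussian component, so the number of components is not fixed in advance.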
Abstract:
Our research has shown that schedules can be built mimicking a human scheduler by using a set of rules that involve domain knowledge. This chapter presents a Bayesian Optimization Algorithm (BOA) for the nurse scheduling problem that chooses such suitable scheduling rules from a set for each nurse’s assignment. Based on the idea of using probabilistic models, the BOA builds a Bayesian network for the set of promising solutions and samples these networks to generate new candidate solutions. Computational results from 52 real data instances demonstrate the success of this approach. It is also suggested that the learning mechanism in the proposed algorithm may be suitable for other scheduling problems.
Abstract:
Human leukocyte antigen (HLA) haplotypes are frequently evaluated for population history inferences and association studies. However, the available typing techniques for the main HLA loci usually do not allow the determination of the allele phase and the constitution of a haplotype, which may be obtained by a very time-consuming and expensive family-based segregation study. Without the family-based study, computational inference by probabilistic models is necessary to obtain haplotypes. Several authors have used the expectation-maximization (EM) algorithm to determine HLA haplotypes, but high levels of erroneous inferences are expected because of the genetic distance among the main HLA loci and the presence of several recombination hotspots. In order to evaluate the efficiency of computational inference methods, 763 unrelated individuals stratified into three different datasets had their haplotypes manually defined in a family-based study of HLA-A, -B, -DRB1 and -DQB1 segregation, and these haplotypes were compared with the data obtained by the following three methods: the Expectation-Maximization (EM) and Excoffier-Laval-Balding (ELB) algorithms using the Arlequin 3.11 software, and the PHASE method. When comparing the methods, we observed that all algorithms showed a poor performance for haplotype reconstruction with distant loci, estimating incorrect haplotypes for 38%-57% of the samples considering all algorithms and datasets. We suggest that computational haplotype inferences involving low-resolution HLA-A, HLA-B, HLA-DRB1 and HLA-DQB1 haplotypes should be considered with caution.
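The EM idea behind haplotype inference can be sketched on a deliberately small case. The example below assumes two biallelic loci (alleles A/a and B/b), where only the double heterozygote has ambiguous phase; real HLA loci are highly multi-allelic, and this is not the Arlequin or PHASE implementation. All names are hypothetical.

```python
def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """Estimate frequencies of haplotypes AB, Ab, aB, ab by EM.

    genotypes: list of (g_a, g_b), the counts (0, 1 or 2) of allele A
    and allele B carried by each unphased individual.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    haps = ["AB", "Ab", "aB", "ab"]
    fixed = {h: 0.0 for h in haps}  # counts from phase-unambiguous genotypes
    n_double_het = 0
    for g_a, g_b in genotypes:
        if g_a == 1 and g_b == 1:
            n_double_het += 1  # phase is AB/ab or Ab/aB, resolved in the E-step
            continue
        first = ("A" if g_a >= 1 else "a") + ("B" if g_b >= 1 else "b")
        second = ("A" if g_a == 2 else "a") + ("B" if g_b == 2 else "b")
        fixed[first] += 1
        fixed[second] += 1

    total = 2.0 * len(genotypes)
    freqs = {h: 0.25 for h in haps}
    for _ in range(n_iter):
        # E-step: posterior probability that a double heterozygote is AB/ab.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        counts = dict(fixed)
        counts["AB"] += n_double_het * w
        counts["ab"] += n_double_het * w
        counts["Ab"] += n_double_het * (1.0 - w)
        counts["aB"] += n_double_het * (1.0 - w)
        new_freqs = {h: counts[h] / total for h in haps}
        converged = max(abs(new_freqs[h] - freqs[h]) for h in haps) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs
```

The E-step relies on the current frequency estimates to split ambiguous individuals, which is one reason errors are expected when loci are far apart and recombination weakens the association between alleles.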
Abstract:
This paper presents a methodology for reconfiguring distribution networks in the presence of outages, in order to choose the reconfiguration with the lowest power losses. The methodology is based on statistical failure and repair data of the distribution power system components and uses fuzzy-probabilistic modelling for system component outage parameters. Fuzzy membership functions of system component outage parameters are obtained from statistical records. A hybrid method of fuzzy set and Monte Carlo simulation based on the fuzzy-probabilistic models captures both the randomness and the fuzziness of component outage parameters. Once the system states have been obtained by Monte Carlo simulation, a logical programming algorithm is applied to get all possible reconfigurations for every system state. A distribution power flow is then applied to evaluate the line flows and bus voltages, to identify any overloading and/or voltage violation, and to select the feasible reconfiguration with the lowest power losses. To illustrate the application of the proposed methodology to a practical case, the paper includes a case study that considers a real distribution network.
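As a rough illustration of combining fuzzy outage parameters with Monte Carlo state sampling (not the authors' implementation), the sketch below represents each component's outage probability as a triangular fuzzy number, takes an alpha-cut to obtain an interval, and samples up/down states at each interval bound. The triangular membership shape, the function names, and the numeric values are assumptions made for illustration.

```python
import random


def alpha_cut(tri, alpha):
    """Interval of a triangular fuzzy number (low, mode, high) at level alpha."""
    low, mode, high = tri
    return (low + alpha * (mode - low), high - alpha * (high - mode))


def sample_state(outage_probs, rng):
    """One system state: True means the component is in service."""
    return [rng.random() >= p for p in outage_probs]


def estimate_any_outage(outage_probs, n_samples, rng):
    """Monte Carlo estimate of the probability that at least one component is out."""
    hits = sum(
        1 for _ in range(n_samples) if not all(sample_state(outage_probs, rng))
    )
    return hits / n_samples


def fuzzy_outage_bounds(tri_probs, alpha, n_samples, rng):
    """Estimates at the lower and upper ends of the alpha-cut of every component."""
    cuts = [alpha_cut(tri, alpha) for tri in tri_probs]
    lower = estimate_any_outage([c[0] for c in cuts], n_samples, rng)
    upper = estimate_any_outage([c[1] for c in cuts], n_samples, rng)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical outage probabilities for three components.
    components = [(0.01, 0.02, 0.04), (0.005, 0.01, 0.02), (0.02, 0.03, 0.05)]
    print(fuzzy_outage_bounds(components, alpha=0.5, n_samples=20000, rng=random.Random(42)))
```

Each sampled state would then be passed to the reconfiguration and power flow steps described in the abstract, which are outside the scope of this sketch.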
Fuzzy Monte Carlo mathematical model for load curtailment minimization in transmission power systems
Abstract:
This paper presents a methodology that is based on statistical failure and repair data of the transmission power system components and uses fuzzy-probabilistic modeling for system component outage parameters. Statistical records are used to develop the fuzzy membership functions of system component outage parameters. The proposed hybrid method of fuzzy set and Monte Carlo simulation based on the fuzzy-probabilistic models captures both the randomness and the fuzziness of component outage parameters. Once the system states have been obtained by Monte Carlo simulation, a network contingency analysis is performed to identify any overloading or voltage violation in the network. This is followed by a remedial action algorithm, based on optimal power flow, that reschedules generation and alleviates constraint violations while avoiding load curtailment where possible and otherwise minimizing the total load curtailment, for the states identified by the contingency analysis. To illustrate the application of the proposed methodology to a practical case, the paper includes a case study for the IEEE 24-bus Reliability Test System (RTS) 1996.
Abstract:
OBJECTIVE: To determine health care costs and economic burden of epidemiological changes in diseases related to tobacco consumption. METHODS: A time-series analysis in Mexico (1994-2005) was carried out on seven health interventions: chronic obstructive pulmonary diseases, lung cancer with and without surgical intervention, asthma in smokers and non-smokers, full treatment course with nicotine gum, and full treatment course with nicotine patch. In accordance with the Box-Jenkins methodology, probabilistic models were developed to forecast the expected changes in the epidemiologic profile and the expected changes in health care services required for selected interventions. Health care costs were estimated following the instrumentation methods and validated with a consensus technique. RESULTS: A comparison of the economic impact in 2006 vs. 2008 showed a 20-90% increase in expected cases depending on the disease (p<0.05), and a 25-93% increase in financial requirements (p<0.01). The study data suggest that changes in the demand for health services for patients with respiratory diseases related to tobacco consumption will continue showing an increasing trend. CONCLUSIONS: In economic terms, the growing number of cases expected during the study period indicates a process of internal competition and adds an element of intrinsic competition in the management of preventive and curative interventions. The study results support the assumption that if preventive programs remain unchanged, the increasing demands for curative health care may cause great financial and management challenges to the health care system of middle-income countries like Mexico.
Abstract:
This thesis presents the Fuzzy Monte Carlo Model for Transmission Power Systems Reliability based studies (FMC-TRel) methodology, which is based on statistical failure and repair data of the transmission power system components and uses fuzzy-probabilistic modeling for system component outage parameters. Statistical records are used to develop the fuzzy membership functions of system component outage parameters. The proposed hybrid method of fuzzy set and Monte Carlo simulation based on the fuzzy-probabilistic models captures both the randomness and the fuzziness of component outage parameters. Once the system states have been obtained, a network contingency analysis is performed to identify any overloading or voltage violation in the network. This is followed by a remedial action algorithm, based on Optimal Power Flow, that reschedules generation and alleviates constraint violations while avoiding load curtailment where possible and otherwise minimizing the total load curtailment, for the states identified by the contingency analysis. For the system states that cause load curtailment, an optimization approach is applied to reduce the probability of occurrence of these states while minimizing the costs to achieve that reduction. This methodology is of great importance for supporting transmission system operator decision making, namely in the identification of critical components and in the planning of future investments in the transmission power system. A case study based on the IEEE 24-bus Reliability Test System (RTS) 1996 is presented to illustrate in detail the application of the proposed methodology.
Abstract:
Most research on distributed generation and smart grids is dedicated to studies of network operation parameters, reliability, and related topics. However, many of these works use traditional test systems, for instance, IEEE test systems. This paper proposes voltage magnitude and reliability studies in the presence of fault conditions, considering realistic conditions found in countries like Brazil. The methodology considers a hybrid method of fuzzy set and Monte Carlo simulation based on the fuzzy-probabilistic models and a remedial action algorithm which is based on optimal power flow. To illustrate the application of the proposed method, the paper includes a case study that considers a real 12-bus sub-transmission network.
Abstract:
Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, for predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (like predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. A wealth of useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
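The decoding step mentioned in this abstract is commonly done with the Viterbi algorithm. The sketch below is a generic log-space Viterbi decoder, not MAMOT's code or file format; the dictionary-based model layout and all names are assumptions made for illustration.

```python
import math


def _log(p):
    """Logarithm that maps probability zero to minus infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable state path for an observation sequence.

    start_p[s], trans_p[r][s] and emit_p[s][symbol] are probabilities.
    """
    if not obs:
        return []
    score = [{s: _log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        score.append({})
        back.append({})
        for s in states:
            # Best predecessor state for s at time t.
            prev, best = max(
                ((r, score[t - 1][r] + _log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            score[t][s] = best + _log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy two-state model: "bg" mostly emits 'a', "motif" mostly emits 'b'.
    states = ["bg", "motif"]
    start = {"bg": 0.5, "motif": 0.5}
    trans = {"bg": {"bg": 0.8, "motif": 0.2}, "motif": {"bg": 0.2, "motif": 0.8}}
    emit = {"bg": {"a": 0.9, "b": 0.1}, "motif": {"a": 0.1, "b": 0.9}}
    print(viterbi("aabb", states, start, trans, emit))
```

Working in log space avoids numerical underflow on long sequences, which matters for genome-scale inputs.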
Abstract:
Several studies have reported high performance of simple decision heuristics in multi-attribute decision making. In this paper, we focus on situations where attributes are binary and analyze the performance of Deterministic-Elimination-By-Aspects (DEBA) and similar decision heuristics. We consider non-increasing weights and two probabilistic models for the attribute values: one where attribute values are independent Bernoulli random variables; the other one where they are binary random variables with inter-attribute positive correlations. Using these models, we show that good performance of DEBA is explained by the presence of cumulative as opposed to simple dominance. We therefore introduce the concepts of cumulative dominance compliance and fully cumulative dominance compliance and show that DEBA satisfies those properties. We derive a lower bound for the probability with which cumulative dominance compliant heuristics will choose a best alternative and show that, even with many attributes, this is not small. We also derive an upper bound for the expected loss of fully cumulative dominance compliant heuristics and show that this is moderate even when the number of attributes is large. Both bounds are independent of the values of the weights.
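For readers unfamiliar with the heuristic, the sketch below shows DEBA as it is commonly described for binary attributes: attributes are examined in order of non-increasing weight, and alternatives lacking the current attribute are eliminated whenever at least one remaining alternative has it. The function name and data layout are assumptions, and tie-breaking among survivors is left to the caller.

```python
def deba(alternatives):
    """Deterministic elimination by aspects for binary attributes.

    alternatives: list of equal-length tuples of 0/1 values, with
    attributes already ordered by non-increasing weight.
    Returns the indices of the alternatives that survive elimination.
    """
    if not alternatives:
        return []
    remaining = list(range(len(alternatives)))
    for j in range(len(alternatives[0])):
        with_aspect = [i for i in remaining if alternatives[i][j] == 1]
        if with_aspect:
            # Keep only alternatives that have this attribute; if none
            # has it, the attribute does not discriminate and is skipped.
            remaining = with_aspect
        if len(remaining) == 1:
            break
    return remaining


if __name__ == "__main__":
    options = [(1, 0, 1), (1, 1, 0), (0, 1, 1)]
    print(deba(options))
```

The heuristic never uses the weight values themselves, only their order, which is consistent with the abstract's remark that the bounds do not depend on the weights.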
Abstract:
One of the most important issues in molecular biology is to understand the regulatory mechanisms that control gene expression. Gene expression is often regulated by proteins called transcription factors, which bind to short (5 to 20 base pairs), degenerate segments of DNA. Experimental efforts towards understanding the sequence specificity of transcription factors are laborious and expensive, but can be substantially accelerated with the use of computational predictions. This thesis describes the use of algorithms and resources for transcription factor binding site analysis in addressing quantitative modelling, where probabilistic models are built to represent binding properties of a transcription factor and can be used to find new functional binding sites in genomes. Initially, an open-access database (HTPSELEX) was created, holding high quality binding sequences for two eukaryotic families of transcription factors, namely CTF/NF1 and LEF1/TCF. The binding sequences were elucidated using a recently described experimental procedure called HTP-SELEX, which allows generation of a large number (> 1000) of binding sites using mass sequencing technology. For each HTP-SELEX experiment we also provide accurate primary experimental information about the protein material used, details of the wet lab protocol, an archive of sequencing trace files, and assembled clone sequences of binding sequences. The database also offers reasonably large SELEX libraries obtained with conventional low-throughput protocols. The database is available at http://wwwisrec.isb-sib.ch/htpselex/ and ftp://ftp.isrec.isb-sib.ch/pub/databases/htpselex. The Expectation-Maximisation (EM) algorithm is one of the frequently used methods to estimate probabilistic models to represent the sequence specificity of transcription factors. We present computer simulations in order to estimate the precision of EM-estimated models as a function of data set parameters (like length of initial sequences, number of initial sequences, percentage of nonbinding sequences). We observed a remarkable robustness of the EM algorithm with regard to length of training sequences and the degree of contamination. The HTPSELEX database and the benchmarked results of the EM algorithm formed part of the foundation for the subsequent project, in which a statistical framework based on hidden Markov models was developed to represent sequence specificity of the transcription factors CTF/NF1 and LEF1/TCF using the HTP-SELEX experiment data. The hidden Markov model framework is capable of both predicting and classifying CTF/NF1 and LEF1/TCF binding sites. A covariance analysis of the binding sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism. We next tested the LEF1/TCF model by computing binding scores for a set of LEF1/TCF binding sequences for which relative affinities were determined experimentally using non-linear regression. The predicted and experimentally determined binding affinities were in good correlation.
Abstract:
Mass transfer kinetics in osmotic dehydration is usually modeled by Fick's law, empirical models and probabilistic models. The aim of this study was to determine the applicability of the Peleg model to investigate the mass transfer during osmotic dehydration of mackerel (Scomber japonicus) slices at different temperatures. Osmotic dehydration was performed on mackerel slices by cooking-infusion in solutions with glycerol and salt (a_w = 0.64) at different temperatures: 50, 70, and 90 ºC. The Peleg rate constant K1, in h (g/g dm)^-1, varied with temperature from 0.761 to 0.396 for water loss, from 5.260 to 2.947 for salt gain, and from 0.854 to 0.566 for glycerol intake. In all cases, it followed the Arrhenius relationship (R²>0.86). The Ea (kJ/mol) values obtained were 16.14, 14.21, and 10.12 for water, salt, and glycerol, respectively. The statistical parameters that qualify the goodness of fit (R²>0.91 and RMSE<0.086) indicate promising applicability of the Peleg model.
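For orientation, the sketch below writes out the Peleg equation in its usual form, M(t) = M0 ± t / (K1 + K2·t), together with a two-point Arrhenius estimate of the activation energy. It is not the study's fitting procedure: the function names are hypothetical, the Arrhenius law is assumed to apply to the initial rate 1/K1, and the pairing of the first and last reported K1 values with 50 ºC and 90 ºC is an assumption.

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def peleg(t, m0, k1, k2, sign=1.0):
    """Peleg model: content at time t, with sign=+1 for gain and -1 for loss.

    k1 is the Peleg rate constant (initial rate = 1/k1) and k2 the
    capacity constant (equilibrium change = 1/k2).
    """
    return m0 + sign * t / (k1 + k2 * t)


def arrhenius_ea_two_point(k1_a, temp_a_c, k1_b, temp_b_c):
    """Activation energy in kJ/mol from K1 at two temperatures (Celsius).

    Assumes 1/K1 = A * exp(-Ea / (R * T)), so that
    ln(K1_a / K1_b) = (Ea / R) * (1/T_a - 1/T_b).
    """
    t_a = temp_a_c + 273.15
    t_b = temp_b_c + 273.15
    return R_GAS * math.log(k1_a / k1_b) / (1.0 / t_a - 1.0 / t_b) / 1000.0


if __name__ == "__main__":
    # Endpoint K1 values reported for water loss, assumed to be at 50 and 90 C.
    print(arrhenius_ea_two_point(0.761, 50.0, 0.396, 90.0))
```

Under these assumptions the two-point estimate for water loss comes out at about 16 kJ/mol, in the same range as the reported 16.14 kJ/mol, which was presumably obtained by regression over all three temperatures.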