873 results for DNA Sequence, Hidden Markov Model, Bayesian Model, Sensitive Analysis, Markov Chain Monte Carlo
Abstract:
Objective: Effective management of multi-resistant organisms is an important issue for hospitals both in Australia and overseas. This study investigates the utility of using Bayesian Network (BN) analysis to examine relationships between risk factors and colonization with Vancomycin Resistant Enterococcus (VRE). Design: Bayesian Network Analysis was performed using infection control data collected over a period of 36 months (2008-2010). Setting: Princess Alexandra Hospital (PAH), Brisbane. Outcome of interest: Number of new VRE isolates. Methods: A BN is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). BN enables multiple interacting agents to be studied simultaneously. The initial BN model was constructed based on the infectious disease physician's expert knowledge and current literature. Continuous variables were dichotomised using third quartile values of the 2008 data. BN was used to examine the probabilistic relationships between VRE isolates and risk factors, and to establish which factors were associated with an increased probability of a high number of VRE isolates. Software: Netica (version 4.16). Results: Preliminary analysis revealed that VRE transmission and VRE prevalence were the most influential factors in predicting a high number of VRE isolates. Interestingly, several factors (hand hygiene and cleaning) known from the literature to be associated with VRE prevalence did not appear to be as influential as expected in this BN model. Conclusions: This preliminary work has shown that Bayesian Network Analysis is a useful tool in examining clinical infection prevention issues, where there is often a web of factors that influence outcomes. This BN model can be restructured easily, enabling various combinations of agents to be studied.
Abstract:
A long query provides more useful hints for searching relevant documents, but it is likely to introduce noise that degrades retrieval performance. To mitigate this adverse effect, it is important to reduce noisy terms and to introduce and boost additional relevant terms. This paper presents a comprehensive framework, called Aspect Hidden Markov Model (AHMM), which integrates query reduction and expansion for retrieval with long queries. It optimizes the probability distribution of query terms by utilizing intra-query term dependencies as well as the relationships between query terms and words observed in relevance feedback documents. Empirical evaluation on three large-scale TREC collections demonstrates that our approach, which is automatic, achieves salient improvements over various strong baselines, and also reaches performance comparable to a state-of-the-art method based on user’s interactive query term reduction and expansion.
Abstract:
This paper develops maximum likelihood (ML) estimation schemes for finite-state semi-Markov chains in white Gaussian noise. We assume that the semi-Markov chain is characterised by transition probabilities of known parametric form with unknown parameters. We reformulate this hidden semi-Markov model (HSM) problem in the scalar case as a two-vector homogeneous hidden Markov model (HMM) problem in which the state consists of the signal augmented by the time to last transition. With this reformulation we apply the expectation maximisation (EM) algorithm to obtain ML estimates of the transition probability parameters, Markov state levels and noise variance. To demonstrate our proposed schemes, motivated by neuro-biological applications, we use a damped sinusoidal parameterised function for the transition probabilities.
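The reformulation described above can be illustrated with a minimal sketch that builds the transition matrix of an ordinary Markov chain over augmented states (signal level, time since last transition). It is not the paper's construction: the function names, the cap on the elapsed time, the jump matrix, and the damped-sinusoid coefficients are all hypothetical choices made for illustration.

```python
import math

def damped_sinusoid_leave_prob(d, base=0.3, amp=0.2, decay=0.1, omega=0.5):
    """Hypothetical probability of leaving a state after d steps spent in it."""
    p = base + amp * math.exp(-decay * d) * math.sin(omega * d)
    return min(1.0, max(0.0, p))

def augmented_transition_matrix(jump, leave_prob, max_dur):
    """Transition matrix over augmented states (level i, time d since last jump).

    jump[i][j] is the probability of landing in level j given a jump out of
    level i (zero diagonal, rows sum to 1). The elapsed time d is capped at
    max_dur, which is an assumption made to keep the state space finite.
    """
    n = len(jump)

    def index(i, d):
        return i * max_dur + (d - 1)

    P = [[0.0] * (n * max_dur) for _ in range(n * max_dur)]
    for i in range(n):
        for d in range(1, max_dur + 1):
            leave = leave_prob(d)
            row = P[index(i, d)]
            # Stay in level i: the elapsed time grows by one (up to the cap).
            row[index(i, min(d + 1, max_dur))] += 1.0 - leave
            # Jump to another level j: the elapsed time resets to 1.
            for j in range(n):
                if j != i:
                    row[index(j, 1)] += leave * jump[i][j]
    return P

# Hypothetical three-level chain.
jump = [[0.0, 0.7, 0.3],
        [0.5, 0.0, 0.5],
        [0.4, 0.6, 0.0]]
P = augmented_transition_matrix(jump, damped_sinusoid_leave_prob, max_dur=20)
```

Because the augmented chain is homogeneous, standard HMM machinery such as forward-backward recursions and EM can in principle be run on `P`, at the cost of a state space that grows with the duration cap.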
Abstract:
In this paper, we propose a risk-sensitive approach to parameter estimation for hidden Markov models (HMMs). The parameter estimation approach considered exploits estimation of various functions of the state, based on model estimates. We propose certain practical suboptimal risk-sensitive filters to estimate the various functions of the state during transients, rather than optimal risk-neutral filters as in earlier studies. The estimates are asymptotically optimal, if asymptotically risk neutral, and can give significantly improved transient performance, which is a very desirable objective for certain engineering applications. To demonstrate the improvement in estimation, simulation studies are presented that compare parameter estimation based on risk-sensitive filters with estimation based on risk-neutral filters.
Abstract:
Most of the existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high dimensional data. An important example is the Potts model, which is commonly used in image analysis. Images encountered in real world applications can have millions of pixels, therefore scalability is a major concern. We apply ABC with a synthetic likelihood to the hidden Potts model with additive Gaussian noise. Using a pre-processing step, we fit a binding function to model the relationship between the model parameters and the synthetic likelihood parameters. Our numerical experiments demonstrate that the precomputed binding function dramatically improves the scalability of ABC, reducing the average runtime required for model fitting from 71 hours to only 7 minutes. We also illustrate the method by estimating the smoothing parameter for remotely sensed satellite imagery. Without precomputation, Bayesian inference is impractical for datasets of that scale.
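For readers unfamiliar with ABC, the pseudo-data simulation step mentioned above can be sketched with the basic rejection sampler. This is the textbook algorithm, not the synthetic-likelihood method or the precomputed binding function of the abstract. The toy Gaussian model, the uniform prior, the sample-mean summary statistic, and the tolerance are all assumptions chosen for illustration.

```python
import random
import statistics

def abc_rejection(observed, n_draws, eps, rng):
    """Basic ABC rejection for the mean of a unit-variance Gaussian (toy model)."""
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a parameter from the (assumed) prior
        # Simulate pseudo-data of the same size as the observed data.
        pseudo = [rng.gauss(mu, 1.0) for _ in range(len(observed))]
        # Keep the draw only if the summaries are within the tolerance.
        if abs(statistics.fmean(pseudo) - obs_summary) < eps:
            accepted.append(mu)
    return accepted

rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(30)]
posterior_draws = abc_rejection(observed, n_draws=5000, eps=0.1, rng=rng)
```

Every candidate parameter costs one full simulation of the data, which is why the approach becomes expensive when each data set is an image with millions of pixels.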
Abstract:
The total entropy utility function is considered for the dual purpose of Bayesian design for model discrimination and parameter estimation. A sequential design setting is proposed where it is shown how to efficiently estimate the total entropy utility for a wide variety of data types. Utility estimation relies on forming particle approximations to a number of intractable integrals which is afforded by the use of the sequential Monte Carlo algorithm for Bayesian inference. A number of motivating examples are considered for demonstrating the performance of total entropy in comparison to utilities for model discrimination and parameter estimation. The results suggest that the total entropy utility selects designs which are efficient under both experimental goals with little compromise in achieving either goal. As such, the total entropy utility is advocated as a general utility for Bayesian design in the presence of model uncertainty.
Abstract:
Mutations of UDP-N-acetyl-alpha-D-galactosamine polypeptide N-acetyl galactosaminyl transferase 3 (GALNT3) result in familial tumoural calcinosis (FTC) and the hyperostosis-hyperphosphataemia syndrome (HHS), which are autosomal recessive disorders characterised by soft-tissue calcification and hyperphosphataemia. To facilitate in vivo studies of these heritable disorders of phosphate homeostasis, we embarked on establishing a mouse model by assessing progeny of mice treated with the chemical mutagen N-ethyl-N-nitrosourea (ENU), and identified a mutant mouse, TCAL, with autosomal recessive inheritance of ectopic calcification, which involved multiple tissues, and hyperphosphataemia; the phenotype was designated TCAL and the locus, Tcal. TCAL males were infertile with loss of Sertoli cells and spermatozoa, and increased testicular apoptosis. Genetic mapping localized Tcal to chromosome 2 (62.64-71.11 Mb) which contained the Galnt3. DNA sequence analysis identified a Galnt3 missense mutation (Trp589Arg) in TCAL mice. Transient transfection of wild-type and mutant Galnt3-enhanced green fluorescent protein (EGFP) constructs in COS-7 cells revealed endoplasmic reticulum retention of the Trp589Arg mutant and Western blot analysis of kidney homogenates demonstrated defective glycosylation of Galnt3 in Tcal/Tcal mice. Tcal/Tcal mice had normal plasma calcium and parathyroid hormone concentrations; decreased alkaline phosphatase activity and intact Fgf23 concentrations; and elevation of circulating 1,25-dihydroxyvitamin D. Quantitative reverse transcriptase-PCR (qRT-PCR) revealed that Tcal/Tcal mice had increased expression of Galnt3 and Fgf23 in bone, but that renal expression of Klotho, 25-hydroxyvitamin D-1α-hydroxylase (Cyp27b1), and the sodium-phosphate co-transporters type-IIa and -IIc was similar to that in wild-type mice. Thus, TCAL mice have the phenotypic features of FTC and HHS, and provide a model for these disorders of phosphate metabolism. © 2012 Esapa et al.
Abstract:
Chronic kidney disease (CKD) is characterized by renal fibrosis that can lead to end-stage renal failure, and studies have supported a strong genetic influence on the risk of developing CKD. However, investigations of the underlying molecular mechanisms are hampered by the lack of suitable hereditary models in animals. We therefore sought to establish hereditary mouse models for CKD and renal fibrosis by investigating mice treated with the chemical mutagen N-ethyl-N-nitrosourea, and identified a mouse with autosomal recessive renal failure, designated RENF. Three-week old RENF mice were smaller than their littermates, whereas at birth they had been of similar size. RENF mice, at 4-weeks of age, had elevated concentrations of plasma urea and creatinine, indicating renal failure, which was associated with small and irregularly shaped kidneys. Genetic studies using DNA from 10 affected mice and 91 single nucleotide polymorphisms mapped the Renf locus to a 5.8Mbp region on chromosome 17E1.3. DNA sequencing of the xanthine dehydrogenase (Xdh) gene revealed a nonsense mutation at codon 26 that co-segregated with affected RENF mice. The Xdh mutation resulted in loss of hepatic XDH and renal Cyclooxygenase-2 (COX-2) expression. XDH mutations in man cause xanthinuria with undetectable plasma uric acid levels and three RENF mice had plasma uric acid levels below the limit of detection. Histological analysis of RENF kidney sections revealed abnormal arrangement of glomeruli, intratubular casts, cellular infiltration in the interstitial space, and interstitial fibrosis. TUNEL analysis of RENF kidney sections showed extensive apoptosis predominantly affecting the tubules. Thus, we have established a mouse model for autosomal recessive early-onset renal failure due to a nonsense mutation in Xdh that is a model for xanthinuria in man. 
This mouse model could help to increase our understanding of the molecular mechanisms associated with renal fibrosis and the specific roles of XDH and uric acid. © 2012 Piret et al.
Abstract:
The various types of chain folding and possible intraloop as well as interloop base pairing in human telomeric DNA containing d(TTAG(3)) repeats have been investigated by model-building, molecular mechanics, and molecular dynamics techniques. Model-building and molecular mechanics studies indicate that it is possible to build a variety of energetically favorable folded-back structures with the two TTA loops on same side and the 5' end thymines in the two loops forming TATA tetrads involving a number of different intraloop as well as interloop A:T pairing schemes. In these folded-back structures, although both intraloop and interloop Watson-Crick pairing is feasible, no structure is possible with interloop Hoogsteen pairing. MD studies of representative structures indicate that the guanine-tetraplex stem is very rigid and, while the loop regions are relatively much more flexible, most of the hydrogen bonds remain intact throughout the 350-ps in vacuo simulation. The various possible TTA loop structures, although they are energetically similar, have characteristic inter proton distances, which could give rise to unique cross-peaks in two-dimensional nuclear Overhauser effect spectroscopy (NOESY) experiments. These folded-back structures with A:T pairings in the loop region help in rationalizing the data from chemical probing and other biochemical studies on human telomeric DNA.
Abstract:
Approximate Bayesian computation (ABC) has become a popular technique to facilitate Bayesian inference from complex models. In this article we present an ABC approximation designed to perform biased filtering for a Hidden Markov Model when the likelihood function is intractable. We use a sequential Monte Carlo (SMC) algorithm to both fit and sample from our ABC approximation of the target probability density. This approach is shown empirically to be more accurate with respect to the original filter than competing methods. The theoretical bias of our method is investigated; it is shown that the bias goes to zero at the expense of increased computational effort. Our approach is illustrated on a constrained sequential lasso for portfolio allocation to 15 constituents of the FTSE 100 share index.
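The idea of filtering without evaluating the likelihood can be sketched with a generic ABC bootstrap particle filter: each particle simulates a pseudo-observation and is kept only if it lands within a tolerance of the real observation. This is not the algorithm of the article. The linear Gaussian toy model, the indicator kernel, the tolerance, and the fallback used when no particle is accepted are assumptions for illustration.

```python
import random

def abc_particle_filter(ys, n_particles, eps, rng, phi=0.9, sd_x=0.5, sd_y=0.5):
    """Filtering means for x_t = phi * x_{t-1} + noise, y_t = x_t + noise (toy model)."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    means = []
    for y in ys:
        # Propagate every particle through the state dynamics.
        particles = [phi * x + rng.gauss(0.0, sd_x) for x in particles]
        # Simulate one pseudo-observation per particle; the weight is 1 if it
        # falls within eps of the real observation and 0 otherwise.
        weights = [1.0 if abs(x + rng.gauss(0.0, sd_y) - y) < eps else 0.0
                   for x in particles]
        if sum(weights) == 0.0:
            # Assumed fallback: keep all particles if none is accepted.
            weights = [1.0] * n_particles
        total = sum(weights)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Multinomial resampling in proportion to the ABC weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means

# Simulate a short hidden trajectory and its observations.
rng = random.Random(1)
xs, ys, x = [], [], 0.0
for _ in range(50):
    x = 0.9 * x + rng.gauss(0.0, 0.5)
    xs.append(x)
    ys.append(x + rng.gauss(0.0, 0.5))

filter_means = abc_particle_filter(ys, n_particles=300, eps=1.0, rng=rng)
```

Shrinking `eps` reduces the bias introduced by the tolerance but lowers the acceptance rate, so more particles are needed, which mirrors the trade-off between bias and computational effort noted in the abstract.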
Abstract:
Many problems in control and signal processing can be formulated as sequential decision problems for general state space models. However, except for some simple models one cannot obtain analytical solutions and has to resort to approximation. In this thesis, we have investigated problems where Sequential Monte Carlo (SMC) methods can be combined with a gradient based search to provide solutions to online optimisation problems. We summarise the main contributions of the thesis as follows. Chapter 4 focuses on solving the sensor scheduling problem when cast as a controlled Hidden Markov Model. We consider the case in which the state, observation and action spaces are continuous. This general case is important as it is the natural framework for many applications. In sensor scheduling, our aim is to minimise the variance of the estimation error of the hidden state with respect to the action sequence. We present a novel SMC method that uses a stochastic gradient algorithm to find optimal actions. This is in contrast to existing works in the literature that only solve approximations to the original problem. In Chapter 5 we present how SMC can be used to solve a risk sensitive control problem. We adopt the use of the Feynman-Kac representation of a controlled Markov chain flow and exploit the properties of the logarithmic Lyapunov exponent, which lead to a policy gradient solution for the parameterised problem. The resulting SMC algorithm has a structure similar to that of the Recursive Maximum Likelihood (RML) algorithm for online parameter estimation. In Chapters 6, 7 and 8, dynamic graphical models were combined with state space models for the purpose of online decentralised inference. We have concentrated more on the distributed parameter estimation problem using two Maximum Likelihood techniques, namely Recursive Maximum Likelihood (RML) and Expectation Maximization (EM). 
The resulting algorithms can be interpreted as an extension of the Belief Propagation (BP) algorithm to compute likelihood gradients. In order to design an SMC algorithm, Chapter 8 uses nonparametric approximations for Belief Propagation. The algorithms were successfully applied to solve the sensor localisation problem for sensor networks of small and medium size.
Abstract:
DNA recognition is an essential biological process responsible for the regulation of cellular functions including protein synthesis and cell division and is implicated in the mechanism of action of some anticancer drugs. Studies directed towards defining the elements responsible for sequence specific DNA recognition through the study of the interactions of synthetic organic ligands with DNA are described.
DNA recognition by poly-N-methylpyrrolecarboxamides was studied by the synthesis and characterization of a series of molecules where the number of contiguous N-methylpyrrolecarboxamide units was increased from 2 to 9. The effect of this incremental change in structure on DNA recognition has been investigated at base pair resolution using affinity cleaving and MPE•Fe(II) footprinting techniques. These studies led to a quantitative relationship between the number of amides in the molecule and the DNA binding site size. This relationship is called the n + 1 rule and it states that a poly-N-methylpyrrolecarboxamide molecule with n amides will bind n + 1 base pairs of DNA. This rule is consistent with a model where the carboxamides of these compounds form three center bridging hydrogen bonds between adjacent base pairs on opposite strands of the helix. The poly-N-methylpyrrolecarboxamide recognition element was found to preferentially bind poly dA•poly dT stretches; however, both binding site selection and orientation were found to be affected by flanking sequences. Cleavage of large DNA is also described.
One approach towards the design of molecules that bind large sequences of double helical DNA sequence specifically is to couple DNA binding subunits of similar or diverse base pair specificity. Bis-EDTA-distamycin-fumaramide (BEDF) is an octaamide dimer of two tri-N-methylpyrrolecarboxamide subunits linked by fumaramide. DNA recognition by BEDF was compared to P7E, an octaamide molecule containing seven consecutive pyrroles. These two compounds were found to recognize the same sites on pBR322 with approximately the same affinities, demonstrating that fumaramide is an effective linking element for N-methylpyrrolecarboxamide recognition subunits. Further studies involved the synthesis and characterization of a trimer of tetra-N-methylpyrrolecarboxamide subunits linked by β-alanine ((P4)_(3)E). This trimerization produced a molecule which is capable of recognizing 16 base pairs of A•T DNA, more than a turn and a half of the DNA helix.
DNA footprinting is a powerful direct method for determining the binding sites of proteins and small molecules on heterogeneous DNA. It was found that attachment of EDTA•Fe(II) to spermine creates a molecule, SE•Fe(II), which binds and cleaves DNA sequence neutrally. This lack of specificity provides evidence that at the nucleotide level polyamines recognize heterogeneous DNA independent of sequence and allows SE•Fe(II) to be used as a footprinting reagent. SE•Fe(II) was compared with two other small molecule footprinting reagents, EDTA•Fe(II) and MPE•Fe(II).
Abstract:
A series of eight related analogs of distamycin A has been synthesized. Footprinting and affinity cleaving reveal that only two of the analogs, pyridine-2-carboxamide-netropsin (2-PyN) and 1-methylimidazole-2-carboxamide-netropsin (2-ImN), bind to DNA with a specificity different from that of the parent compound. A new class of sites, represented by a TGACT sequence, is a strong site for 2-PyN binding, and the major recognition site for 2-ImN on DNA. Both compounds recognize the G•C bp specifically, although A's and T's in the site may be interchanged without penalty. Additional A•T bp outside the binding site increase the binding affinity. The compounds bind in the minor groove of the DNA sequence, but protect both grooves from dimethylsulfate. The binding evidence suggests that 2-PyN or 2-ImN binding induces a DNA conformational change.
In order to understand this sequence specific complexation better, the Ackers quantitative footprinting method for measuring individual site affinity constants has been extended to small molecules. MPE•Fe(II) cleavage reactions over a 10^5 range of free ligand concentrations are analyzed by gel electrophoresis. The decrease in cleavage is calculated by densitometry of a gel autoradiogram. The apparent fraction of DNA bound is then calculated from the amount of cleavage protection. The data are fitted to a theoretical curve using non-linear least squares techniques. Affinity constants at four individual sites are determined simultaneously. The distamycin A analog binds solely at A•T rich sites. Affinities range from 10^(6) to 10^(7) M^(-1). The data for parent compound D fit closely to a monomeric binding curve. 2-PyN binds both A•T sites and the TGTCA site with an apparent affinity constant of 10^(5) M^(-1). 2-ImN binds A•T sites with affinities less than 5 x 10^(4) M^(-1). The affinity of 2-ImN for the TGTCA site does not change significantly from the 2-PyN value. At the TGTCA site, the experimental data fit a dimeric binding curve better than a monomeric curve. Both 2-PyN and 2-ImN have substantially lower DNA affinities than closely related compounds.
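The comparison of monomeric and dimeric binding curves can be sketched as a one-parameter least squares fit. The abstract does not give the equations, so both isotherms below are assumptions: a simple 1:1 isotherm and a fully cooperative two-ligand isotherm. The data are synthetic and noise-free, and a grid search over log10 K stands in for a proper non-linear least squares routine.

```python
def monomeric(K, L):
    """Assumed 1:1 isotherm: fraction bound = K*L / (1 + K*L)."""
    return K * L / (1.0 + K * L)

def dimeric(K, L):
    """Assumed fully cooperative 2:1 isotherm: (K*L)^2 / (1 + (K*L)^2)."""
    x = (K * L) ** 2
    return x / (1.0 + x)

def fit_log10_K(model, conc, theta, lo=3.0, hi=8.0, step=0.01):
    """Grid search for the log10 K that minimises the sum of squared errors."""
    best_logK, best_sse = None, float("inf")
    for k in range(int(round((hi - lo) / step)) + 1):
        logK = lo + k * step
        K = 10.0 ** logK
        sse = sum((model(K, L) - t) ** 2 for L, t in zip(conc, theta))
        if sse < best_sse:
            best_logK, best_sse = logK, sse
    return best_logK, best_sse

# Synthetic titration spanning a 10^5 range of free ligand concentration (M).
conc = [10.0 ** (-7.5 + 0.5 * i) for i in range(11)]
theta = [dimeric(1e5, L) for L in conc]  # hypothetical site with K = 10^5 M^-1

logK_mono, sse_mono = fit_log10_K(monomeric, conc, theta)
logK_dim, sse_dim = fit_log10_K(dimeric, conc, theta)
```

On these synthetic data the cooperative model recovers the generating constant and leaves a much smaller residual than the 1:1 model, which is the kind of evidence that distinguishes the two curves. With real densitometry data, noise and the simultaneous fitting of several sites would have to be handled as well.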
In order to probe the requirements of this new binding site, fourteen other derivatives have been synthesized and tested. All compounds that recognize the TGTCA site have a heterocyclic aromatic nitrogen ortho to the N or C-terminal amide of the netropsin subunit. Specificity is strongly affected by the overall length of the small molecule. Only compounds that consist of at least three aromatic rings linked by amides exhibit TGTCA site binding. Specificity is only weakly altered by substitution on the pyridine ring, which correlates best with steric factors. A model is proposed for TGTCA site binding that has as its key feature hydrogen bonding to both G's by the small molecule. The specificity is determined by the sequence dependence of the distance between G's.
One derivative of 2-PyN exhibits pH dependent sequence specificity. At low pH, 4-dimethylaminopyridine-2-carboxamide-netropsin binds tightly to A•T sites. At high pH, 4-Me_(2)NPyN binds most tightly to the TGTCA site. In aqueous solution, this compound protonates at the pyridine nitrogen at pH 6. Thus presence of the protonated form correlates with A•T specificity.
The binding site of a class of eukaryotic transcriptional activators typified by yeast protein GCN4 and the mammalian oncogene Jun contains a strong 2-ImN binding site. Specificity requirements for the protein and small molecule are similar. GCN4 and 2-ImN bind simultaneously to the same binding site. GCN4 alters the cleavage pattern of 2-ImN-EDTA derivative at only one of its binding sites. The details of the interaction suggest that GCN4 alters the conformation of an AAAAAAA sequence adjacent to its binding site. The presence of a yeast counterpart to Jun partially blocks 2-ImN binding. The differences do not appear to be caused by direct interactions between 2-ImN and the proteins, but by induced conformational changes in the DNA protein complex. It is likely that the observed differences in complexation are involved in the varying sequence specificity of these proteins.
Abstract:
Pyrrole–Imidazole polyamides are programmable, cell-permeable small molecules that bind in the minor groove of double-stranded DNA sequence-specifically. Polyamide binding has been shown to alter the local helical structure of DNA, disrupt protein-DNA interactions, and modulate endogenous gene expression. Py–Im polyamides targeted to the androgen receptor-DNA interface have been observed to decrease expression of androgen-regulated genes, upregulate p53, and induce apoptosis in a hormone-sensitive prostate cancer cell line. Here we report that androgen response element (ARE)-targeted polyamides induced DNA replication stress in a hormone-insensitive prostate cancer cell line. The ATR checkpoint kinase was activated in response to this stress, causing phosphorylation of MCM2, and FANCD2 was monoubiquitinated. Surprisingly, little single-stranded DNA was exhibited, and the ATR targets RPA2 and Chk1 were not phosphorylated. We conclude that polyamide induces relatively low level replication stress, and suggest inhibition of the replicative helicase as a putative mechanism based on in vitro assays. We also demonstrate polyamide-induced inhibition of DNA replication in cell free extracts from X. laevis oocytes. In this system, inhibition of chromatin decondensation is observed, preventing DNA replication initiation. Finally, we show that Py-Im polyamides targeted to the ARE and ETS binding sequence downregulate AR- and ERG-driven signaling in a prostate cancer cell line harboring the TMPRSS2-ERG fusion. In a mouse xenograft model, ARE-targeted polyamide treatment reduced growth of the tumor.
Abstract:
Feature-based vocoders, e.g., STRAIGHT, offer a way to manipulate the perceived characteristics of the speech signal in speech transformation and synthesis. For the harmonic model, which provides excellent perceived quality, features for the amplitude parameters already exist (e.g., Line Spectral Frequencies (LSF), Mel-Frequency Cepstral Coefficients (MFCC)). However, because of the wrapping of the phase parameters, phase features are more difficult to design. To randomize the phase of the harmonic model during synthesis, a voicing feature is commonly used, which distinguishes voiced and unvoiced segments. However, voice production allows smooth transitions between voiced and unvoiced states, which makes voicing segmentation sometimes tricky to estimate. In this article, two phase features are suggested to represent the phase of the harmonic model in a uniform way, without voicing decision. The synthesis quality of the resulting vocoder has been evaluated, using subjective listening tests, in the context of resynthesis, pitch scaling, and Hidden Markov Model (HMM)-based synthesis. The experiments show that the suggested signal model is comparable to STRAIGHT or even better in some scenarios. They also reveal some limitations of the harmonic framework itself in the case of high fundamental frequencies.