Digital Library

5 results for MAXIMIZATION

at Duke University

Finite mixture distributions, sequential likelihood and the EM algorithm

Relevance:

10.00%

Abstract:

A popular way to account for unobserved heterogeneity is to assume that the data are drawn from a finite mixture distribution. A barrier to using finite mixture models is that parameters that could previously be estimated in stages must now be estimated jointly: using mixture distributions destroys any additive separability of the log-likelihood function. We show, however, that an extension of the EM algorithm reintroduces additive separability, thus allowing one to estimate parameters sequentially during each maximization step. In establishing this result, we develop a broad class of estimators for mixture models. Returning to the likelihood problem, we show that, relative to full information maximum likelihood, our sequential estimator can generate large computational savings with little loss of efficiency.
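As a rough illustration of the additive separability mentioned in the abstract, the sketch below runs the standard EM algorithm on a two-component Gaussian mixture. It is not the sequential estimator developed in the paper: the function name `em_two_gaussians`, the Gaussian components, the starting values, and the simulated data are all assumptions made for this example. The point it shows is that once the E-step fixes the membership weights, the M-step updates the mixing weight, the means, and the standard deviations separately.

```python
import numpy as np

def em_two_gaussians(x, n_iter=200):
    """Standard EM for a two-component 1-D Gaussian mixture (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Crude starting values (an assumption): means at the sample quartiles.
    pi = 0.5
    mu = np.percentile(x, [25, 75]).astype(float)
    sigma = np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        # The common 1/sqrt(2*pi) factor cancels in the ratio.
        d0 = (1 - pi) * np.exp(-0.5 * ((x - mu[0]) / sigma[0]) ** 2) / sigma[0]
        d1 = pi * np.exp(-0.5 * ((x - mu[1]) / sigma[1]) ** 2) / sigma[1]
        r = d1 / (d0 + d1)
        # M-step: with r held fixed, the expected log-likelihood is a sum of
        # separate terms, so each block of parameters is updated on its own.
        pi = r.mean()
        w = np.stack([1 - r, r])
        mu = (w * x).sum(axis=1) / w.sum(axis=1)
        sigma = np.sqrt((w * (x - mu[:, None]) ** 2).sum(axis=1) / w.sum(axis=1))
    return pi, mu, sigma

# Simulated data (hypothetical): 60% from N(-2, 1), 40% from N(3, 1).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 1.0, 600), rng.normal(3.0, 1.0, 400)])
pi_hat, mu_hat, sigma_hat = em_two_gaussians(data)
print(pi_hat, mu_hat, sigma_hat)
```

In a richer model each of these M-step updates could be a separate estimation stage, which is the kind of stage-by-stage estimation the abstract describes.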


Probabilistic choice: A simple invariance.

Relevance:

10.00%

Abstract:

When subjects must choose repeatedly between two or more alternatives, each of which dispenses reward on a probabilistic basis (two-armed bandit), their behavior is guided by the two possible outcomes, reward and nonreward. The simplest stochastic choice rule is that the probability of choosing an alternative increases following a reward and decreases following a nonreward (reward following). We show experimentally and theoretically that animal subjects behave as if the absolute magnitudes of the changes in choice probability caused by reward and nonreward do not depend on the response which produced the reward or nonreward (source independence), and that the effects of reward and nonreward are in constant ratio under fixed conditions (effect-ratio invariance), properties that fit the definition of satisficing. Our experimental results are either not predicted by, or are inconsistent with, other theories of free-operant choice such as Bush-Mosteller, molar maximization, momentary maximizing, and melioration (matching).
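The reward-following rule described in the abstract can be sketched as a small simulation. This is a toy version under stated assumptions, not the model fitted in the paper: the additive update, the clipping to [0, 1], the step sizes `up` and `down`, and the function name `simulate_reward_following` are all hypothetical. The step size depends only on the outcome, never on which alternative produced it (source independence), and the ratio `up / down` is held constant (effect-ratio invariance).

```python
import random

def simulate_reward_following(p_reward, n_trials=1000, up=0.02, down=0.01, seed=0):
    """Toy two-armed bandit with a reward-following choice rule."""
    rng = random.Random(seed)
    p_a = 0.5  # current probability of choosing alternative "A"
    history = []
    for _ in range(n_trials):
        choice = "A" if rng.random() < p_a else "B"
        rewarded = rng.random() < p_reward[choice]
        # The magnitude of the change depends only on the outcome:
        # `up` after a reward, `down` after a nonreward.
        step = up if rewarded else -down
        # A positive step moves probability toward the chosen alternative,
        # a negative step moves it away.
        if choice == "A":
            p_a += step
        else:
            p_a -= step
        p_a = min(max(p_a, 0.0), 1.0)  # keep it a valid probability
        history.append(p_a)
    return history

# Hypothetical reward probabilities: "A" pays off far more often than "B".
history = simulate_reward_following({"A": 0.8, "B": 0.2})
print(history[-1])
```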


Designing Climate Mitigation Policy

Relevance:

10.00%

Abstract:

This paper provides an exhaustive review of critical issues in the design of climate mitigation policy by pulling together key findings and controversies from diverse literatures on mitigation costs, damage valuation, policy instrument choice, technological innovation, and international climate policy. We begin with the broadest issue of how high assessments suggest the near and medium term price on greenhouse gases would need to be, both under cost-effective stabilization of global climate and under net benefit maximization or Pigouvian emissions pricing. The remainder of the paper focuses on the appropriate scope of regulation, issues in policy instrument choice, complementary technology policy, and international policy architectures.


Cross-Domain Multitask Learning with Latent Probit Models

Relevance:

10.00%

Abstract:

Learning multiple tasks across heterogeneous domains is a challenging problem since the feature space may not be the same for different tasks. We assume the data in multiple tasks are generated from a latent common domain via sparse domain transforms and propose a latent probit model (LPM) to jointly learn the domain transforms and the shared probit classifier in the common domain. To learn meaningful task relatedness and avoid over-fitting in classification, we introduce sparsity in the domain transform matrices, as well as in the common classifier. We derive theoretical bounds for the estimation error of the classifier in terms of the sparsity of domain transforms. An expectation-maximization algorithm is derived for learning the LPM. The effectiveness of the approach is demonstrated on several real datasets.
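To make the idea of a shared probit classifier in a common domain concrete, here is a minimal prediction-only sketch. It assumes, purely for illustration, that each task has a sparse linear map from its own feature space into the common domain and that the class probability is the standard normal CDF of a linear score there. The function name `probit_prob`, the matrices, and the weights are hypothetical, and the EM learning procedure of the paper is not reproduced.

```python
import math

def probit_prob(x, transform, w):
    """P(y = 1 | x) = Phi(w . (transform @ x)) for one task (illustrative)."""
    # Map the task-specific features into the common domain.
    z = [sum(a * xi for a, xi in zip(row, x)) for row in transform]
    # Linear score of the shared classifier in the common domain.
    score = sum(wi * zi for wi, zi in zip(w, z))
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(score / math.sqrt(2.0)))

# Two tasks with different feature dimensions (2 and 3), both mapped to a
# 2-D common domain by sparse matrices (hypothetical values).
transform_task1 = [[1.0, 0.0], [0.0, 1.0]]
transform_task2 = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
w = [1.5, 0.0]  # shared, sparse classifier weights (hypothetical)

p1 = probit_prob([0.0, 2.0], transform_task1, w)
p2 = probit_prob([1.0, 5.0, -1.0], transform_task2, w)
print(p1, p2)
```

The zero entries in the transforms and in `w` stand in for the sparsity the abstract mentions: the second feature of task 2 never reaches the common domain, and the second common coordinate does not affect the prediction.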


Computational Inference of Genome-Wide Protein-DNA Interactions Using High-Throughput Genomic Data

Relevance:

10.00%

Abstract:

Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.

We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.

We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.

Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). Through incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of protein binding landscape and that the most accurate inference comes from modeling all available datasets.

This dissertation provides a foundation for future research by taking a step toward the genome-wide inference of protein-DNA interaction landscape through data integration.
