112 resultados para Hidden homelessness
Resumo:
A nonparametric Bayesian extension of Factor Analysis (FA) is proposed where observed data $\mathbf{Y}$ is modeled as a linear superposition, $\mathbf{G}$, of a potentially infinite number of hidden factors, $\mathbf{X}$. The Indian Buffet Process (IBP) is used as a prior on $\mathbf{G}$ to incorporate sparsity and to allow the number of latent features to be inferred. The model's utility for modeling gene expression data is investigated using randomly generated data sets based on a known sparse connectivity matrix for E. Coli, and on three biological data sets of increasing complexity.
Resumo:
Using an entropy argument, it is shown that stochastic context-free grammars (SCFG's) can model sources with hidden branching processes more efficiently than stochastic regular grammars (or equivalently HMM's). However, the automatic estimation of SCFG's using the Inside-Outside algorithm is limited in practice by its O(n3) complexity. In this paper, a novel pre-training algorithm is described which can give significant computational savings. Also, the need for controlling the way that non-terminals are allocated to hidden processes is discussed and a solution is presented in the form of a grammar minimization procedure. © 1990.
Resumo:
Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
Resumo:
We extend previous work on fully unsupervised part-of-speech tagging. Using a non-parametric version of the HMM, called the infinite HMM (iHMM), we address the problem of choosing the number of hidden states in unsupervised Markov models for PoS tagging. We experiment with two non-parametric priors, the Dirichlet and Pitman-Yor processes, on the Wall Street Journal dataset using a parallelized implementation of an iHMM inference algorithm. We evaluate the results with a variety of clustering evaluation metrics and achieve equivalent or better performances than previously reported. Building on this promising result we evaluate the output of the unsupervised PoS tagger as a direct replacement for the output of a fully supervised PoS tagger for the task of shallow parsing and compare the two evaluations. © 2009 ACL and AFNLP.
Resumo:
Commentators suggest that to survive in developed economies manufacturing firms have to move up the value chain, innovating and creating ever more sophisticated products and services, so they do not have to compete on the basis of cost. While this strategy is proving increasingly popular with policy makers and academics there is limited empirical evidence to explore the extent to which it is being adopted in practice. And if so, what the impact of this servitization of manufacturing might be. This paper seeks to fill a gap in the literature by presenting empirical evidence on the range and extent of servitization. Data are drawn from the OSIRIS database on 10,028 firms incorporated in 25 different countries. The paper presents an analysis of these data which suggests that: [i] manufacturing firms in developed economies are adopting a range of servitization strategies-12 separate approaches to servitization are identified; [ii] these 12 categories can be used to extend the traditional three options for servitization-product oriented Product-Service Systems, use oriented Product-Service Systems and result oriented Product-Service Systems, by adding two new categories "integration oriented Product-Service Systems" and "service oriented Product-Service Systems"; [iii] while the manufacturing firms that have servitized are larger than traditional manufacturing firms in terms of sales revenues, at the aggregate level they also generate lower profits as a % of sales; [iv] these findings are moderated by firm size (measured in terms of numbers of employees). In smaller firms servitization appears to pay off while in larger firms it proves more problematic; and [v] there are some hidden risks associated with servitization-the sample contains a greater proportion of bankrupt servitized firms than would be expected. © Springer Science + Business Media, LLC 2009.
Resumo:
Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervised speaker adaptation, previous work has used a supplementary set of acoustic models to estimate the transcription of the adaptation data. This paper first presents an approach to the unsupervised speaker adaptation task for HMM-based speech synthesis models which avoids the need for such supplementary acoustic models. This is achieved by defining a mapping between HMM-based synthesis models and ASR-style models, via a two-pass decision tree construction process. Second, it is shown that this mapping also enables unsupervised adaptation of HMM-based speech synthesis models without the need to perform linguistic analysis of the estimated transcription of the adaptation data. Third, this paper demonstrates how this technique lends itself to the task of unsupervised cross-lingual adaptation of HMM-based speech synthesis models, and explains the advantages of such an approach. Finally, listener evaluations reveal that the proposed unsupervised adaptation methods deliver performance approaching that of supervised adaptation.
Resumo:
Spread Transform (ST) is a quantization watermarking algorithm in which vectors of the wavelet coefficients of a host work are quantized, using one of two dithered quantizers, to embed hidden information bits; Loo had some success in applying such a scheme to still images. We extend ST to the video watermarking problem. Visibility considerations require that each spreading vector refer to corresponding pixels in each of several frames, that is, a multi-frame embedding approach. Use of the hierarchical complex wavelet transform (CWT) for a visual mask reduces computation and improves robustness to jitter and valumetric scaling. We present a method of recovering temporal synchronization at the detector, and give initial results demonstrating the robustness and capacity of the scheme.
Resumo:
Statistical dependencies among wavelet coefficients are commonly represented by graphical models such as hidden Markov trees (HMTs). However, in linear inverse problems such as deconvolution, tomography, and compressed sensing, the presence of a sensing or observation matrix produces a linear mixing of the simple Markovian dependency structure. This leads to reconstruction problems that are non-convex optimizations. Past work has dealt with this issue by resorting to greedy or suboptimal iterative reconstruction methods. In this paper, we propose new modeling approaches based on group-sparsity penalties that leads to convex optimizations that can be solved exactly and efficiently. We show that the methods we develop perform significantly better in de-convolution and compressed sensing applications, while being as computationally efficient as standard coefficient-wise approaches such as lasso. © 2011 IEEE.
Resumo:
An increasingly common scenario in building speech synthesis and recognition systems is training on inhomogeneous data. This paper proposes a new framework for estimating hidden Markov models on data containing both multiple speakers and multiple languages. The proposed framework, speaker and language factorization, attempts to factorize speaker-/language-specific characteristics in the data and then model them using separate transforms. Language-specific factors in the data are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum-likelihood linear regression. Experimental results on statistical parametric speech synthesis show that the proposed framework enables data from multiple speakers in different languages to be used to: train a synthesis system; synthesize speech in a language using speaker characteristics estimated in a different language; and adapt to a new language. © 2012 IEEE.
Resumo:
We present a new haplotype-based approach for inferring local genetic ancestry of individuals in an admixed population. Most existing approaches for local ancestry estimation ignore the latent genetic relatedness between ancestral populations and treat them as independent. In this article, we exploit such information by building an inheritance model that describes both the ancestral populations and the admixed population jointly in a unified framework. Based on an assumption that the common hypothetical founder haplotypes give rise to both the ancestral and the admixed population haplotypes, we employ an infinite hidden Markov model to characterize each ancestral population and further extend it to generate the admixed population. Through an effective utilization of the population structural information under a principled nonparametric Bayesian framework, the resulting model is significantly less sensitive to the choice and the amount of training data for ancestral populations than state-of-the-art algorithms. We also improve the robustness under deviation from common modeling assumptions by incorporating population-specific scale parameters that allow variable recombination rates in different populations. Our method is applicable to an admixed population from an arbitrary number of ancestral populations and also performs competitively in terms of spurious ancestry proportions under a general multiway admixture assumption. We validate the proposed method by simulation under various admixing scenarios and present empirical analysis results from a worldwide-distributed dataset from the Human Genome Diversity Project.
Resumo:
In this paper, we consider Bayesian interpolation and parameter estimation in a dynamic sinusoidal model. This model is more flexible than the static sinusoidal model since it enables the amplitudes and phases of the sinusoids to be time-varying. For the dynamic sinusoidal model, we derive a Bayesian inference scheme for the missing observations, hidden states and model parameters of the dynamic model. The inference scheme is based on a Markov chain Monte Carlo method known as Gibbs sampler. We illustrate the performance of the inference scheme to the application of packet-loss concealment of lost audio and speech packets. © EURASIP, 2010.
Resumo:
Construction industry is a sector that is renowned for the slow uptake of new technologies. This is usually due to the conservative nature of this sector that relies heavily on tried and tested and successful old business practices. However, there is an eagerness in this industry to adopt Building Information Modelling (BIM) technologies to capture and record accurate information about a building project. But vast amounts of information and knowledge about the construction process is typically hidden within informal social interactions that take place in the work environment. In this paper we present a vision where smartphones and tablet devices carried by construction workers are used to capture the interaction and communication between workers in the field. Informal chats about decisions taken in the field, impromptu formation of teams, identification of key persons for certain tasks, and tracking the flow of information across the project community, are some pieces of information that could be captured by employing social sensing in the field. This information can not only be used during the construction to improve the site processes but it can also be exploited by the end user during maintenance of the building. We highlight the challenges that need to be overcome for this mobile and social sensing system to become a reality. © 2012 ACM.
Resumo:
We show that the sensor self-localization problem can be cast as a static parameter estimation problem for Hidden Markov Models and we implement fully decentralized versions of the Recursive Maximum Likelihood and on-line Expectation-Maximization algorithms to localize the sensor network simultaneously with target tracking. For linear Gaussian models, our algorithms can be implemented exactly using a distributed version of the Kalman filter and a novel message passing algorithm. The latter allows each node to compute the local derivatives of the likelihood or the sufficient statistics needed for Expectation-Maximization. In the non-linear case, a solution based on local linearization in the spirit of the Extended Kalman Filter is proposed. In numerical examples we demonstrate that the developed algorithms are able to learn the localization parameters. © 2012 IEEE.
Resumo:
In this paper we formulate the nonnegative matrix factorisation (NMF) problem as a maximum likelihood estimation problem for hidden Markov models and propose online expectation-maximisation (EM) algorithms to estimate the NMF and the other unknown static parameters. We also propose a sequential Monte Carlo approximation of our online EM algorithm. We show the performance of the proposed method with two numerical examples. © 2012 IFAC.
Resumo:
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman's coalescent, Dirichlet diffusion trees and Wishart processes.