2 results for parallelization

in Cambridge University Engineering Department Publications Database


Relevance:

20.00%

Publisher:

Abstract:

This paper compares parallel and distributed implementations of an iterative machine learning algorithm based on Gibbs sampling. Distributed implementations run under Hadoop on facility computing clouds. The probabilistic model under study is the infinite HMM [1], in which parameters are learnt using an instance of blocked Gibbs sampling, with a step consisting of a dynamic program. We apply this model to learn part-of-speech tags from newswire text in an unsupervised fashion. Our focus here, however, is on runtime performance rather than NLP-relevant scores, as embodied by iteration duration and ease of development, deployment and debugging. © 2010 IEEE.
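To make the "dynamic program inside a blocked Gibbs step" concrete, the following is a minimal sketch of forward filtering, backward sampling for a finite HMM. It is an illustration under stated assumptions, not the paper's implementation: the infinite HMM needs extra machinery to handle an unbounded state space, and the function name `ffbs` and all parameter names are hypothetical.

```python
import random


def ffbs(pi0, A, B, obs, rng):
    """Draw one state sequence from p(states | obs) for a finite HMM.

    Assumed (hypothetical) inputs:
      pi0[k]  : initial probability of state k
      A[i][j] : probability of moving from state i to state j
      B[k][v] : probability that state k emits symbol v
      obs     : non-empty list of observed symbol indices
      rng     : a random.Random instance
    """
    K, T = len(pi0), len(obs)

    # Forward filtering: alpha[t][k] = p(state_t = k | obs[0..t]).
    alpha = []
    for t, o in enumerate(obs):
        if t == 0:
            pred = list(pi0)
        else:
            prev = alpha[-1]
            pred = [sum(prev[i] * A[i][j] for i in range(K)) for j in range(K)]
        unnorm = [pred[k] * B[k][o] for k in range(K)]
        z = sum(unnorm)
        alpha.append([u / z for u in unnorm])

    # Backward sampling: draw the last state, then each earlier state
    # conditioned on the state already drawn for the following position.
    states = [0] * T
    states[-1] = rng.choices(range(K), weights=alpha[-1])[0]
    for t in range(T - 2, -1, -1):
        w = [alpha[t][i] * A[i][states[t + 1]] for i in range(K)]
        states[t] = rng.choices(range(K), weights=w)[0]
    return states
```

Given fixed parameters, the state sequences of different sentences are conditionally independent, so a call such as `ffbs(pi0, A, B, sentence, random.Random(0))` could be mapped over sentences in parallel. Whether the paper partitions the work this way is not stated in the abstract.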

Relevance:

10.00%

Publisher:

Abstract:

This paper describes a structured SVM framework suitable for noise-robust medium/large vocabulary speech recognition. Several theoretical and practical extensions to previous work on small vocabulary tasks are detailed. The joint feature space based on word models is extended to allow context-dependent triphone models to be used. Interpreting the structured SVM as a large margin log-linear model illustrates that there is an implicit assumption that the prior of the discriminative parameter is a zero mean Gaussian. However, depending on the definition of the likelihood feature space, a non-zero prior mean may be more appropriate. A general Gaussian prior is incorporated into the large margin training criterion in a form that allows the cutting plane algorithm to be directly applied. To further speed up the training process, a 1-slack algorithm, caching of competing hypotheses and parallelization strategies are also proposed. The performance of structured SVMs is evaluated on a noise-corrupted medium vocabulary speech recognition task: AURORA 4. © 2011 IEEE.
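As a rough illustration of how a non-zero-mean Gaussian prior can enter a large margin criterion, the sketch below writes a generic n-slack structured SVM objective with the regulariser centred on a prior mean. All notation is assumed rather than taken from the paper: $\alpha$ is the discriminative parameter, $\mu$ the prior mean (with identity covariance assumed for simplicity), $\phi$ the joint feature function, $O_i$ and $w_i$ the $i$-th observation sequence and its reference transcription, and $\mathcal{L}$ a loss between hypotheses.

```latex
\min_{\alpha,\ \xi_i \ge 0}\quad
  \frac{1}{2}\,\lVert \alpha - \mu \rVert^{2}
  + \frac{C}{N}\sum_{i=1}^{N} \xi_i
\qquad \text{s.t.}\qquad
  \alpha^{\top}\phi(O_i, w_i) - \alpha^{\top}\phi(O_i, w)
  \;\ge\; \mathcal{L}(w, w_i) - \xi_i
  \quad \forall i,\ \forall w \neq w_i
```

Setting $\mu = 0$ recovers the usual $\tfrac{1}{2}\lVert\alpha\rVert^{2}$ regulariser, which corresponds to the zero mean Gaussian prior mentioned in the abstract. A 1-slack variant would replace the per-utterance slacks $\xi_i$ with a single shared slack variable.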