8 resultados para weighted model
em Cambridge University Engineering Department Publications Database
Resumo:
In speech recognition systems language model (LMs) are often constructed by training and combining multiple n-gram models. They can be either used to represent different genres or tasks found in diverse text sources, or capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimum change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences. ©2010 IEEE.
Resumo:
Language models (LMs) are often constructed by building multiple individual component models that are combined using context independent interpolation weights. By tuning these weights, using either perplexity or discriminative approaches, it is possible to adapt LMs to a particular task. This paper investigates the use of context dependent weighting in both interpolation and test-time adaptation of language models. Depending on the previous word contexts, a discrete history weighting function is used to adjust the contribution from each component model. As this dramatically increases the number of parameters to estimate, robust weight estimation schemes are required. Several approaches are described in this paper. The first approach is based on MAP estimation where interpolation weights of lower order contexts are used as smoothing priors. The second approach uses training data to ensure robust estimation of LM interpolation weights. This can also serve as a smoothing prior for MAP adaptation. A normalized perplexity metric is proposed to handle the bias of the standard perplexity criterion to corpus size. A range of schemes to combine weight information obtained from training data and test data hypotheses are also proposed to improve robustness during context dependent LM adaptation. In addition, a minimum Bayes' risk (MBR) based discriminative training scheme is also proposed. An efficient weighted finite state transducer (WFST) decoding algorithm for context dependent interpolation is also presented. The proposed technique was evaluated using a state-of-the-art Mandarin Chinese broadcast speech transcription task. Character error rate (CER) reductions up to 7.3 relative were obtained as well as consistent perplexity improvements. © 2012 Elsevier Ltd. All rights reserved.
Resumo:
Discrete element modeling is being used increasingly to simulate flow in fluidized beds. These models require complex measurement techniques to provide validation for the approximations inherent in the model. This paper introduces the idea of modeling the experiment to ensure that the validation is accurate. Specifically, a 3D, cylindrical gas-fluidized bed was simulated using a discrete element model (DEM) for particle motion coupled with computational fluid dynamics (CFD) to describe the flow of gas. The results for time-averaged, axial velocity during bubbling fluidization were compared with those from magnetic resonance (MR) experiments made on the bed. The DEM-CFD data were postprocessed with various methods to produce time-averaged velocity maps for comparison with the MR results, including a method which closely matched the pulse sequence and data processing procedure used in the MR experiments. The DEM-CFD results processed with the MR-type time-averaging closely matched experimental MR results, validating the DEM-CFD model. Analysis of different averaging procedures confirmed that MR time-averages of dynamic systems correspond to particle-weighted averaging, rather than frame-weighted averaging, and also demonstrated that the use of Gaussian slices in MR imaging of dynamic systems is valid. © 2013 American Chemical Society.
Resumo:
Flow measurement data at the district meter area (DMA) level has the potential for burst detection in the water distribution systems. This work investigates using a polynomial function fitted to the historic flow measurements based on a weighted least-squares method for automatic burst detection in the U.K. water distribution networks. This approach, when used in conjunction with an expectationmaximization (EM) algorithm, can automatically select useful data from the historic flow measurements, which may contain normal and abnormal operating conditions in the distribution network, e.g., water burst. Thus, the model can estimate the normal water flow (nonburst condition), and hence the burst size on the water distribution system can be calculated from the difference between the measured flow and the estimated flow. The distinguishing feature of this method is that the burst detection is fully unsupervised, and the burst events that have occurred in the historic data do not affect the procedure and bias the burst detection algorithm. Experimental validation of the method has been carried out using a series of flushing events that simulate burst conditions to confirm that the simulated burst sizes are capable of being estimated correctly. This method was also applied to eight DMAs with known real burst events, and the results of burst detections are shown to relate to the water company's records of pipeline reparation work. © 2014 American Society of Civil Engineers.