Biblioteca Digital

One important issue in designing state-of-the-art large vocabulary continuous speech recognition (LVCSR) systems is the choice of acoustic units. Context-dependent (CD) phones remain the dominant form of acoustic unit. They capture co-articulatory effects in speech via explicit modelling. However, for other, more complicated phonological processes, they rely on the implicit modelling ability of the underlying statistical models. Alternatively, acoustic models can be built on higher-level linguistic units, for example syllables, to capture these complex patterns explicitly. When sufficient training data is available, this approach may show an advantage over implicit acoustic modelling. In this paper a wide range of acoustic units is investigated to improve LVCSR system performance. Significant error rate reductions of up to 7.1% relative (0.8% absolute) were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using word and syllable position dependent triphone and quinphone models. © 2011 IEEE.
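
The relative and absolute figures quoted in this abstract can be related with a little arithmetic. The sketch below is only an illustration: the abstract does not state the baseline error rate, so the value computed here is inferred from the two rounded percentages and should be read as approximate. The function name `baseline_from_gains` is hypothetical.

```python
def baseline_from_gains(relative_gain_pct, absolute_gain_pct):
    """Infer the baseline error rate (in %) from a relative and an absolute gain.

    relative gain = absolute gain / baseline, so baseline = absolute / relative.
    """
    return absolute_gain_pct / (relative_gain_pct / 100.0)


# Figures quoted in the abstract: 7.1% relative, 0.8% absolute.
baseline = baseline_from_gains(7.1, 0.8)
improved = baseline - 0.8

# Both values are approximate because the inputs are rounded.
print(f"implied baseline error rate: about {baseline:.1f}%")
print(f"implied improved error rate: about {improved:.1f}%")
```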

Independent component analysis based frontal face detection

Self-similar breakup of near-inviscid liquids

Abstract:

The final stages of pinchoff and breakup of dripping droplets of near-inviscid Newtonian fluids are studied experimentally for pure water and ethanol. High-speed imaging and image analysis are used to determine the angle and the minimum neck size of the cone-shaped extrema of the ligaments attached to dripping droplets in the final microseconds before pinchoff. The angle is shown to steadily approach the value of 18.0 ± 0.4°, independently of the initial flow conditions or the type of breakup. The filament thins and necks following a τ^(2/3) law in terms of the time remaining until pinchoff, regardless of the initial conditions. The observed behavior confirms theoretical predictions. © 2012 American Physical Society.
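
As an illustration of how a power-law thinning exponent might be checked against measurements, the sketch below fits a straight line in log-log space. It is not the authors' analysis: the data are synthetic, generated from an assumed prefactor of 0.5 in arbitrary units, and the function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(taus, widths):
    """Least-squares fit of widths = prefactor * taus**exponent in log-log space."""
    xs = [math.log(t) for t in taus]
    ys = [math.log(w) for w in widths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return exponent, prefactor


# Synthetic, noise-free data (NOT from the paper): times to pinchoff in seconds
# and neck widths generated from an assumed 0.5 * tau**(2/3) law.
taus = [1e-6 * k for k in range(1, 21)]
widths = [0.5 * t ** (2.0 / 3.0) for t in taus]

exponent, prefactor = fit_power_law(taus, widths)
print(f"fitted exponent: {exponent:.4f}, fitted prefactor: {prefactor:.4f}")
```

With real image-analysis data the fitted exponent would scatter around the expected value, and the fit would normally be restricted to the last few microseconds before pinchoff.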

Syllable language models for Mandarin speech recognition: exploiting character language models.

Abstract:

Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of text from a wide collection of sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions of up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linear combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.
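
A log-linear combination of two language models can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's adapted history dependent multi-level LM: the candidate words, their probabilities, the fixed weight of 0.7, and the function name `log_linear_combine` are all made up for the example.

```python
import math


def log_linear_combine(word_probs, char_probs, word_weight=0.7):
    """Combine two distributions over the same candidates log-linearly.

    score(w) = word_weight * log P_word(w) + (1 - word_weight) * log P_char(w),
    then renormalise so the combined scores sum to one.
    """
    char_weight = 1.0 - word_weight
    scores = {
        w: word_weight * math.log(word_probs[w]) + char_weight * math.log(char_probs[w])
        for w in word_probs
    }
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return {w: math.exp(s - log_z) for w, s in scores.items()}


# Hypothetical probabilities for three candidate words given the same history.
# "char" values stand for the character-level LM score of each word's characters.
word_lm = {"cand_a": 0.6, "cand_b": 0.3, "cand_c": 0.1}
char_lm = {"cand_a": 0.5, "cand_b": 0.4, "cand_c": 0.1}

combined = log_linear_combine(word_lm, char_lm)
for candidate, prob in combined.items():
    print(f"{candidate}: {prob:.4f}")
```

In practice the interpolation weight would be tuned on held-out data rather than fixed, and the abstract indicates that the combination in the paper also depends on the history.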

48 results for Word and image
