7 results for De-normalisation
in Cambridge University Engineering Department Publications Database
Abstract:
Human listeners can identify vowels regardless of speaker size, although the sound waves for an adult and a child speaking the 'same' vowel would differ enormously. The differences are mainly due to differences in vocal tract length (VTL) and glottal pulse rate (GPR), which are both related to body size. Automatic speech recognition machines are notoriously bad at understanding children when they have been trained on adult speech. In this paper, we propose that the auditory system adapts its analysis of speech sounds, dynamically and automatically, to the GPR and VTL of the speaker on a syllable-to-syllable basis. We illustrate how this rapid adaptation might be performed with the aid of a computational version of the auditory image model, and we propose that an auditory preprocessor of this form would improve the robustness of speech recognisers.
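The paper's mechanism is a computational auditory image model, which is far beyond a few lines of code; as a toy analogue only, the Python sketch below shows the geometric fact the proposal exploits: a change in VTL dilates the spectral envelope along the frequency axis, so dividing out an estimated scale factor aligns child and adult spectra. All names and values here are invented for illustration.

    import numpy as np

    def normalise_vtl(freqs, magnitudes, vtl_ratio):
        """Map a speaker's magnitude spectrum toward a reference speaker.

        vtl_ratio = speaker VTL / reference VTL. A shorter vocal tract
        (ratio < 1, e.g. a child) places formants at higher frequencies,
        so its spectrum is compressed back down by the ratio.
        """
        # S_norm(f) = S_obs(f / vtl_ratio): interpolate the observed
        # spectrum onto the rescaled frequency axis.
        return np.interp(freqs, freqs * vtl_ratio, magnitudes)

    # Toy example: a child formant near 1 kHz maps down toward 800 Hz.
    freqs = np.linspace(0.0, 8000.0, 512)
    child = np.exp(-((freqs - 1000.0) / 150.0) ** 2)
    adult_like = normalise_vtl(freqs, child, vtl_ratio=0.8)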
2D PIV measurements in the near field of grid turbulence using stitched fields from multiple cameras
Abstract:
We present measurements of grid turbulence using 2D particle image velocimetry taken immediately downstream of the grid at a Reynolds number of Re_M = 16,500, where M is the rod spacing. A long field of view of 14M × 4M in the down- and cross-stream directions was achieved by stitching together the fields of multiple cameras. Two uniform biplanar grids were selected to have the same M and pressure drop but different rod diameter D and cross-section. A large data set (10^4 vector fields) was obtained to ensure good convergence of second-order statistics. Estimates of the dissipation rate ε of turbulent kinetic energy (TKE) were found to be sensitive to the number of mean-squared velocity-gradient terms included, but not to whether the turbulence was assumed to adhere to isotropy or axisymmetry. The resolution dependency of different turbulence statistics was assessed with a procedure that does not rely on the dissipation scale η. The streamwise evolution of the TKE components and ε was found to collapse across grids when the rod diameter was included in the normalisation. We argue that this should be the case between all regular grids when the other relevant dimensionless quantities are matched and the flow has become homogeneous across the stream. Two-point space correlation functions at x/M = 1 show evidence of complex wake interactions which exhibit a strong Reynolds-number dependence. However, these differences in initial conditions disappear quickly, indicating rapid cross-stream homogenisation. On the other hand, isotropy was, as expected, not established by x/M = 12 for any case studied. © Springer-Verlag 2012.
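For readers unfamiliar with how such gradient-term estimates are assembled, the sketch below computes the textbook isotropic surrogate ε = 15ν⟨(∂u/∂x)²⟩ from a stack of planar velocity snapshots. It is only the simplest of the estimator family the paper compares, and the variable names and data are invented.

    import numpy as np

    def dissipation_isotropic(u, dx, nu):
        """epsilon = 15 * nu * <(du/dx)^2>, the one-term isotropic
        estimate, from velocity fields u with shape (snapshots, ny, nx)."""
        dudx = np.gradient(u, dx, axis=-1)    # streamwise gradient du/dx
        return 15.0 * nu * np.mean(dudx ** 2)

    # Synthetic stand-in: in practice on the order of 10^4 PIV vector
    # fields are averaged to converge this second-order statistic.
    rng = np.random.default_rng(0)
    u = rng.normal(size=(1000, 64, 256))      # (snapshots, y, x)
    eps = dissipation_isotropic(u, dx=1e-3, nu=1.5e-5)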
Abstract:
The task in keyword spotting (KWS) is to hypothesise times at which any of a set of key terms occurs in audio. An important aspect of such systems is the scores assigned to these hypotheses, the accuracy of which has a significant impact on performance. Estimating these scores may be formulated as a confidence estimation problem, where a measure of confidence is assigned to each key-term hypothesis. In this work, a set of discriminative features is defined and combined using a conditional random field (CRF) model for improved confidence estimation. An extension to this model that directly addresses the problem of score normalisation across key terms is also introduced. The implicit score normalisation which results from applying this approach to separate systems in a hybrid configuration yields further benefits. Results are presented which show notable improvements in KWS performance using the techniques presented in this work. © 2013 IEEE.
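As a minimal stand-in for the feature-combination step (the paper trains a CRF, which additionally models dependencies between hypotheses; a plain logistic model is used here only to show how discriminative features map to a calibrated confidence), with all features and data invented:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # One row per key-term hypothesis: [lattice posterior, duration
    # in seconds, keyword length in phones]; y = 1 for a true hit,
    # 0 for a false alarm.
    X = np.array([[0.92, 0.45, 6],
                  [0.30, 0.20, 4],
                  [0.75, 0.60, 8],
                  [0.15, 0.10, 3]])
    y = np.array([1, 0, 1, 0])

    model = LogisticRegression().fit(X, y)
    confidence = model.predict_proba(X)[:, 1]  # P(hit | features)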
Abstract:
Spatial normalisation is a key element of statistical parametric mapping and related techniques for analysing cohort statistics on voxel arrays and surfaces. The normalisation process involves aligning each individual specimen to a template using some sort of registration algorithm. Any misregistration will result in data being mapped onto the template at the wrong location. At best, this will introduce spatial imprecision into the subsequent statistical analysis. At worst, when the misregistration varies systematically with a covariate of interest, it may lead to false statistical inference. Since misregistration generally depends on the specimen's shape, we investigate here the effect of allowing for shape as a confound in the statistical analysis, with shape represented by the dominant modes of variation observed in the cohort. In a series of experiments on synthetic surface data, we demonstrate how allowing for shape can reveal true effects that were previously masked by systematic misregistration, and also guard against misinterpreting systematic misregistration as a true effect. We introduce some heuristics for disentangling misregistration effects from true effects, and demonstrate the approach's practical utility in a case study of the cortical bone distribution in 268 human femurs.
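As a sketch of the central idea, the fragment below summarises cohort shape by its dominant PCA modes and enters them as confound columns in a vertex-wise linear model. The data are synthetic and this is not the authors' pipeline.

    import numpy as np

    rng = np.random.default_rng(1)
    n_subj, n_vert, n_modes = 268, 2000, 5

    shapes = rng.normal(size=(n_subj, 3 * n_vert))  # registered vertex coords
    covariate = rng.normal(size=n_subj)             # covariate of interest
    bone = rng.normal(size=(n_subj, n_vert))        # values mapped to template

    # Dominant modes of shape variation via SVD of the centred shapes.
    centred = shapes - shapes.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    modes = centred @ vt[:n_modes].T                # per-subject mode scores

    # Vertex-wise GLM: intercept + covariate of interest + shape confounds.
    X = np.column_stack([np.ones(n_subj), covariate, modes])
    beta, *_ = np.linalg.lstsq(X, bone, rcond=None)
    effect = beta[1]    # shape-adjusted covariate effect at each vertex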