301 results for Speech synthesis Data processing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The use of hidden Markov models is placed in a connectionist framework, and an alternative approach to improving their ability to discriminate between classes is described. Using a network style of training, a measure of discrimination based on the a posteriori probability of state occupation is proposed, and the theory for its optimization using error back-propagation and gradient ascent is presented. The method is shown to be numerically well behaved, and results are presented which demonstrate that when using a simple threshold test on the probability of state occupation, the proposed optimization scheme leads to improved recognition performance.
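
The a posteriori probability of state occupation mentioned above can be illustrated with the standard scaled forward-backward recursion. This is a minimal sketch for a discrete-observation HMM, not the paper's training procedure: the function names, the discrete emission table `B`, and the default threshold of 0.5 are assumptions made for illustration.

```python
def state_occupation(pi, A, B, obs):
    """Return gamma[t][i] = P(state i at time t | obs) for a discrete HMM.

    pi[i]   : initial state probabilities
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k (assumed discrete)
    obs     : list of observed symbol indices
    """
    n, T = len(pi), len(obs)

    # Forward pass, rescaling each frame to sum to 1 to avoid underflow.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            row = [
                sum(alpha[t - 1][j] * A[j][i] for j in range(n)) * B[i][obs[t]]
                for i in range(n)
            ]
        c = sum(row)
        scale.append(c)
        alpha.append([x / c for x in row])

    # Backward pass, reusing the forward scale factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            ) / scale[t + 1]

    # Combine and normalise to obtain the state occupation posteriors.
    gamma = []
    for t in range(T):
        row = [alpha[t][i] * beta[t][i] for i in range(n)]
        z = sum(row)
        gamma.append([x / z for x in row])
    return gamma


def passes_threshold(gamma, t, state, threshold=0.5):
    """Hypothetical threshold test on the occupation probability."""
    return gamma[t][state] > threshold
```

Under these assumptions, a decision could be taken by calling `passes_threshold` on the posterior of a designated state at a chosen frame; how the paper selects the state, frame, and threshold is not stated in the abstract.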

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a new architecture which integrates recurrent input transformations (RIT) and continuous density HMMs. The basic HMM structure is extended to accommodate recurrent neural networks which transform the input observations before they enter the Gaussian output distributions associated with the states of the HMM. During training, the parameters of both the HMM and the RIT are optimized simultaneously according to the Maximum Mutual Information (MMI) criterion. Results are presented for the E-set recognition task which demonstrate the ability of recurrent input transformations to exploit longer-term correlations in the speech signal and to give improved discrimination.
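
For readers unfamiliar with the MMI criterion, the quantity being maximised for one training utterance is commonly written as the log posterior of the correct class given the observations. The sketch below evaluates that objective from per-class log-likelihoods; the function name and the inputs are illustrative assumptions, and it does not reproduce the paper's joint optimisation of HMM and RIT parameters.

```python
import math


def mmi_objective(log_likelihoods, log_priors, correct):
    """Log posterior of the correct class for one utterance.

    log_likelihoods[c] : log p(O | model c), e.g. from an HMM forward pass
    log_priors[c]      : log P(c)
    correct            : index of the labelled class
    """
    joint = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
    # Log-sum-exp over all competing classes, shifted for numerical stability.
    m = max(joint)
    log_denominator = m + math.log(sum(math.exp(j - m) for j in joint))
    return joint[correct] - log_denominator
```

Maximum likelihood training would raise only `log_likelihoods[correct]`; the MMI objective additionally penalises likelihood assigned to the competing classes through the denominator.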

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A model of the auditory periphery assembled from analog network submodels of all the relevant anatomical structures is described. There is bidirectional coupling between the networks representing the outer ear, middle ear and cochlea. A simple voltage source representation of the outer hair cells provides level-dependent basilar membrane curves. The networks are translated into efficient computational modules by means of wave digital filtering. A feedback unit regulates the average firing rate at the output of an inner hair cell module via a simplified model of the dynamics of the descending paths to the peripheral ear. This leads to a digital model of the entire auditory periphery with applications to both speech and hearing research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reports our experiences with a phoneme recognition system for the TIMIT database which uses multiple mixture continuous density monophone HMMs trained using MMI. A comprehensive set of results is presented comparing the ML and MMI training criteria for both diagonal and full covariance models. These results, obtained using simple monophone HMMs, show clear performance gains achieved by MMI training, and are comparable to the best reported by others, including those which use context-dependent models. In addition, the paper discusses a number of performance and implementation issues which are crucial to successful MMI training.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

An optical fiber strain sensing technique, based on Brillouin Optical Time Domain Reflectometry (BOTDR), was used to obtain the full deformation profile of a secant pile wall during construction of an adjacent basement in London. Details of the installation of the sensors and of the data processing are described. By installing optical fiber down opposite sides of the pile, the distributed strain profiles obtained can be used to give both the axial and lateral movements along the pile. Measurements obtained from the BOTDR were found to be in good agreement with inclinometer data from the adjacent piles. The relative merits of the two techniques are discussed. © 2007 ASCE.
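
One common way to turn strain profiles from two opposite sides of a pile into movements is to average them for the axial component and difference them for bending. The sketch below follows that approach under stated assumptions: the function names, the trapezoidal integration, and the boundary condition of zero movement and zero rotation at the first depth are illustrative choices, not details taken from the paper.

```python
def cumulative_trapezoid(x, y):
    """Running trapezoidal integral of y over x, starting at zero."""
    out = [0.0]
    for i in range(1, len(y)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (x[i] - x[i - 1]))
    return out


def pile_movements(depths, strain_a, strain_b, fibre_spacing):
    """Estimate axial and lateral movement profiles along a pile.

    depths        : positions along the pile, increasing from the reference end
    strain_a/b    : strains measured on opposite sides of the pile
    fibre_spacing : distance between the two fibres across the section
    Assumes zero movement and zero rotation at depths[0] (hypothetical).
    """
    # Mean strain gives the axial component; the difference gives curvature.
    axial_strain = [(a + b) / 2 for a, b in zip(strain_a, strain_b)]
    curvature = [(a - b) / fibre_spacing for a, b in zip(strain_a, strain_b)]

    axial = cumulative_trapezoid(depths, axial_strain)
    rotation = cumulative_trapezoid(depths, curvature)
    lateral = cumulative_trapezoid(depths, rotation)
    return axial, lateral
```

Because the lateral profile comes from integrating curvature twice, errors in the assumed boundary conditions accumulate along the pile, which is one reason a comparison against inclinometer data is useful.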

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper proposes a Bayesian method for polyphonic music description. The method first divides an input audio signal into a series of sections called snapshots, and then estimates parameters such as the fundamental frequencies and amplitudes of the notes contained in each snapshot. The parameter estimation process is based on frequency-domain modelling and Gibbs sampling. Experimental results obtained from audio signals of test note patterns are encouraging; the accuracy is better than 80% for the estimation of fundamental frequencies in terms of semitones and instrument names when the number of simultaneous notes is two.
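
Scoring fundamental frequencies "in terms of semitones" can be read as comparing estimates and ground truth after rounding both to the nearest equal-tempered semitone. The sketch below shows that reading only; the A4 = 440 Hz reference, the MIDI-style note numbering, and the scoring rule itself are assumptions, since the abstract does not describe the evaluation in detail.

```python
import math


def nearest_semitone(freq_hz, ref_hz=440.0, ref_note=69):
    """Map a frequency to the nearest equal-tempered note number.

    Uses the MIDI convention (A4 = 440 Hz = note 69) as an assumed reference.
    """
    return round(ref_note + 12 * math.log2(freq_hz / ref_hz))


def semitone_accuracy(estimated_hz, true_hz):
    """Fraction of estimates that land on the same semitone as the truth."""
    hits = sum(
        nearest_semitone(e) == nearest_semitone(t)
        for e, t in zip(estimated_hz, true_hz)
    )
    return hits / len(true_hz)
```

With this rule, an estimate of 442 Hz for a true 440 Hz note counts as correct, while an estimate more than half a semitone away does not.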