Biblioteca Digital

Boltzmann machines offer a new and exciting approach to automatic speech recognition, and provide a rigorous mathematical formalism for parallel computing arrays. In this paper we briefly summarize Boltzmann machine theory, and present results showing their ability to recognize both static and time-varying speech patterns. A machine with 2000 units was able to distinguish between the 11 steady-state vowels in English with an accuracy of 85%. The stability of the learning algorithm and methods of preprocessing and coding speech data before feeding it to the machine are also discussed. A new type of unit called a carry input unit, which involves a type of state-feedback, was developed for the processing of time-varying patterns and this was tested on a few short sentences. Use is made of the implications of recent work into associative memory, and the modelling of neural arrays to suggest a good configuration of Boltzmann machines for this sort of pattern recognition.
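As a rough illustration of the Boltzmann machine theory that this abstract summarizes, the sketch below shows the standard stochastic update for binary units: a unit switches on with a probability given by a logistic function of its energy gap divided by a temperature. The function names, the tiny two-unit network, and the temperature value are assumptions made for this example; they are not taken from the paper, which reports a machine with 2000 units.

```python
import math
import random


def energy(weights, biases, state):
    """Global energy E = -sum_{i<j} w_ij s_i s_j - sum_i b_i s_i for 0/1 states."""
    n = len(state)
    pair_term = sum(weights[i][j] * state[i] * state[j]
                    for i in range(n) for j in range(i + 1, n))
    bias_term = sum(biases[i] * state[i] for i in range(n))
    return -(pair_term + bias_term)


def on_probability(weights, biases, state, i, temperature):
    """Probability that unit i switches on, given the states of the other units."""
    # Energy gap: E(unit i off) - E(unit i on).
    gap = biases[i] + sum(weights[i][j] * state[j]
                          for j in range(len(state)) if j != i)
    return 1.0 / (1.0 + math.exp(-gap / temperature))


def gibbs_sweep(weights, biases, state, temperature, rng):
    """Update every unit once, in order, by sampling from its on-probability."""
    for i in range(len(state)):
        p = on_probability(weights, biases, state, i, temperature)
        state[i] = 1 if rng.random() < p else 0
    return state


if __name__ == "__main__":
    # Hypothetical two-unit network with one symmetric excitatory link.
    w = [[0.0, 2.0], [2.0, 0.0]]
    b = [0.0, 0.0]
    print(gibbs_sweep(w, b, [0, 1], temperature=1.0, rng=random.Random(0)))
```

Lowering the temperature makes each update closer to a deterministic threshold decision, while raising it makes the units behave more randomly.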

Modified Kanerva model. Results for real time word recognition

Abstract:

This paper describes results obtained using the modified Kanerva model to perform word recognition in continuous speech after being trained on the multi-speaker Alvey 'Hotel' speech corpus. Theoretical discoveries have recently enabled us to increase the speed of execution of part of the model by two orders of magnitude over that previously reported by Prager & Fallside. The memory required for the operation of the model has been similarly reduced. The recognition accuracy reaches 95% without syntactic constraints when tested on different data from seven trained speakers. Real time simulation of a model with 9,734 active units is now possible in both training and recognition modes using the Alvey PARSIFAL transputer array. The modified Kanerva model is a static network consisting of a fixed nonlinear mapping (location matching) followed by a single layer of conventional adaptive links. A section of preprocessed speech is transformed by the non-linear mapping to a high dimensional representation. From this intermediate representation a simple linear mapping is able to perform complex pattern discrimination to form the output, indicating the nature of the speech features present in the input window.
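The two-stage structure described in this abstract (a fixed location-matching stage followed by one layer of adaptive links) can be sketched roughly as follows. This is only an illustration under stated assumptions: binary inputs, a Hamming-distance activation radius, and a plain delta-rule update are choices made for the example, and all function names are hypothetical. The abstract does not specify these details.

```python
import random


def make_locations(num_locations, input_bits, rng):
    """Fixed random binary addresses; these are never trained."""
    return [[rng.randint(0, 1) for _ in range(input_bits)]
            for _ in range(num_locations)]


def location_match(locations, pattern, radius):
    """Fixed nonlinear mapping: a location is active (1) when its address
    lies within `radius` bits (Hamming distance) of the input pattern."""
    return [1 if sum(a != b for a, b in zip(loc, pattern)) <= radius else 0
            for loc in locations]


def predict(weights, active):
    """Single adaptive layer: weighted sum over the active locations."""
    return sum(w * a for w, a in zip(weights, active))


def train_delta_rule(weights, examples, learning_rate, epochs):
    """Adjust only the weights of active locations, in proportion to the error."""
    for _ in range(epochs):
        for active, target in examples:
            error = target - predict(weights, active)
            for k, a in enumerate(active):
                if a:
                    weights[k] += learning_rate * error
    return weights


if __name__ == "__main__":
    rng = random.Random(1)
    locations = make_locations(num_locations=50, input_bits=16, rng=rng)
    pattern = [rng.randint(0, 1) for _ in range(16)]
    active = location_match(locations, pattern, radius=6)
    weights = train_delta_rule([0.0] * 50, [(active, 1.0)], 0.05, 200)
    print(sum(active), round(predict(weights, active), 3))
```

Because only the final layer is trained, each update is a simple linear correction, which is consistent with the fast training that the abstract attributes to the model.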

Continuous speech recognition for the TIMIT database using neural networks

Abstract:

Four types of neural networks which have previously been established for speech recognition and tested on a small, seven-speaker, 100-sentence database are applied to the TIMIT database. The networks are a recurrent network phoneme recognizer, a modified Kanerva model morph recognizer, a compositional representation phoneme-to-word recognizer, and a modified Kanerva model morph-to-word recognizer. The major result is for the recurrent net, giving a phoneme recognition accuracy of 57% from the si and sx sentences. The Kanerva morph recognizer achieves 66.2% accuracy for a small subset of the sa and sx sentences. The results for the word recognizers are incomplete.

The modified Kanerva model for automatic speech recognition

Abstract:

A parallel processing network derived from Kanerva's associative memory theory (Kanerva 1984) is shown to be able to train rapidly on connected speech data and recognize further speech data with a label error rate of 0.68%. This modified Kanerva model can be trained substantially faster than other networks with comparable pattern discrimination properties. Kanerva presented his theory of a self-propagating search in 1984, and showed theoretically that large-scale versions of his model would have powerful pattern matching properties. This paper describes how the design for the modified Kanerva model is derived from Kanerva's original theory. Several designs are tested to discover which form may be implemented fastest while still maintaining versatile recognition performance. A method is developed to deal with the time-varying nature of the speech signal by recognizing static patterns together with a fixed quantity of contextual information. In order to recognize speech features in different contexts, a network must be able to model disjoint pattern classes. This type of modelling cannot be performed by a single layer of links. Network research was once held back by the inability of single-layer networks to solve this sort of problem, and by the lack of a training algorithm for multi-layer networks. Rumelhart, Hinton & Williams (1985) provided one solution by demonstrating the "back propagation" training algorithm for multi-layer networks. An alternative is used in the modified Kanerva model: a non-linear fixed transformation maps the pattern space into a space of higher dimensionality in which the speech features are linearly separable. A single-layer network may then be used to perform the recognition. The advantage of this solution over the multi-layer approach lies in the greater power and speed of the single-layer network training algorithm. © 1989.
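The point about disjoint pattern classes can be illustrated with the classic XOR problem, which no single layer of links can solve in its original two-dimensional space. In the sketch below, a fixed (untrained) non-linear transformation adds one extra dimension, after which a single linear threshold unit separates the classes. The particular transformation (a product feature) and the hand-picked weights are assumptions chosen for clarity; they are not the location-matching transformation used in the paper.

```python
def lift(x1, x2):
    """Fixed non-linear map from 2-D pattern space to a 3-D feature space."""
    return (x1, x2, x1 * x2)


def single_layer(features, weights, threshold):
    """One linear threshold unit: fires when the weighted sum exceeds the threshold."""
    total = sum(w * f for w, f in zip(weights, features))
    return 1 if total > threshold else 0


# Hand-picked weights for the lifted space (an assumption for this example):
# x1 + x2 - 2*x1*x2 equals 1 exactly when the inputs differ.
WEIGHTS = (1.0, 1.0, -2.0)
THRESHOLD = 0.5


def classify_xor(x1, x2):
    return single_layer(lift(x1, x2), WEIGHTS, THRESHOLD)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, classify_xor(a, b))
```

The class containing (0, 1) and (1, 0) is disjoint in the original space, yet a single plane separates it from the other class once the third coordinate is available.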

A comparison of the Boltzmann machine and the back propagation network as recognizers of static speech patterns

Efficient Implementation of Spatially-Varying 3-D Ultrasound Deconvolution

Abstract:

There are occasions when ultrasound beamforming is performed with only a subset of the total data that will eventually be available. The most obvious example is a mechanically-swept (wobbler) probe in which the three-dimensional data block is formed from a set of individual B-scans. In these circumstances, non-blind deconvolution can be used to improve the resolution of the data. Unfortunately, most of these situations involve large blocks of three-dimensional data. Furthermore, the ultrasound blur function varies spatially with distance from the transducer. These two facts make the deconvolution process time-consuming to implement. This paper addresses this problem, producing spatially-varying deconvolution of large blocks of three-dimensional data in a matter of seconds. We present two approaches, one based on hardware and the other based on software. We compare the time each takes to achieve similar results and discuss the computational resources and form of blur model that each requires.

Real-time quasi-static ultrasound elastography

Deconvolution and elastography based on three-dimensional ultrasound

Abstract:

This paper is in two parts and addresses two ways of getting more information out of the RF signal from three-dimensional (3D) mechanically-swept medical ultrasound. The first topic is the use of non-blind deconvolution to improve the clarity of the data, particularly in the direction perpendicular to the individual B-scans. The second topic is strain imaging. We present a robust and efficient approach to the estimation and display of axial strain information. For deconvolution, we calculate an estimate of the point-spread function at each depth in the image using Field II. This is used as part of an Expectation Maximisation (EM) framework in which the ultrasound scatterer field is modelled as the product of (a) a piecewise smooth function and (b) a fine-grain varying function. In the E step, a Wiener filter is used to estimate the scatterer field based on an assumed piecewise smooth component. In the M step, wavelet de-noising is used to estimate the piecewise smooth component from the scatterer field. For strain imaging, we use a quasi-static approach with efficient phase-based algorithms. Our contributions lie in robust and efficient 3D displacement tracking, point-wise quality-weighted averaging, and a stable display that shows not only strain but also an indication of the quality of the data at each point in the image. This enables clinicians to see where the strain estimate is reliable and where it is mostly noise. For deconvolution, we present in-vivo images and simulations with quantitative performance measures. With the blurred 3D data taken as 0 dB, we get an improvement in signal to noise ratio of 4.6 dB with a Wiener filter alone, 4.36 dB with the ForWaRD algorithm and 5.18 dB with our EM algorithm. For strain imaging we show images based on 2D and 3D data and describe how full 3D analysis can be performed in about 20 seconds on a typical computer. We will also present initial results of our clinical study to explore the applications of our system in our local hospital. © 2008 IEEE.
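For readers unfamiliar with the Wiener filter that this abstract refers to, the sketch below shows the general idea of non-blind Wiener deconvolution on a one-dimensional signal with a known, circular blur kernel. It is a toy illustration only: the naive DFT, the single spatially-invariant kernel, and the constant noise-to-signal parameter are assumptions for the example. It is not the depth-dependent, three-dimensional EM procedure that the paper describes.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        out.append(total / n if inverse else total)
    return out


def circular_blur(signal, kernel):
    """Circular convolution of a signal with a kernel of the same length."""
    n = len(signal)
    return [sum(kernel[j] * signal[(i - j) % n] for j in range(n))
            for i in range(n)]


def wiener_deconvolve(blurred, kernel, noise_to_signal):
    """Frequency-domain Wiener estimate: X = conj(H) * Y / (|H|^2 + NSR)."""
    Y = dft(blurred)
    H = dft(kernel)
    X = [h.conjugate() * y / (abs(h) ** 2 + noise_to_signal)
         for h, y in zip(H, Y)]
    return [value.real for value in dft(X, inverse=True)]


if __name__ == "__main__":
    scatterers = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 2.0, 0.0]
    blur = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]  # hypothetical blur function
    observed = circular_blur(scatterers, blur)
    restored = wiener_deconvolve(observed, blur, noise_to_signal=1e-9)
    print([round(v, 3) for v in restored])
```

A larger `noise_to_signal` value suppresses frequencies where the blur response is weak, trading resolution for robustness to noise.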

Ultrasonic imaging of 3D displacement vectors using a simulated 2D array and beamsteering

Abstract:

Most quasi-static ultrasound elastography methods image only the axial strain, derived from displacements measured in the direction of ultrasound propagation. In other directions, the beam lacks high resolution phase information and displacement estimation is therefore less precise. However, these estimates can be improved by steering the ultrasound beam through multiple angles and combining displacements measured along the different beam directions. Previously, beamsteering has only considered the 2D case to improve the lateral displacement estimates. In this paper, we extend this to 3D using a simulated 2D array to steer both laterally and elevationally in order to estimate the full 3D displacement vector over a volume. The method is tested on simulated and phantom data using a simulated 6-10 MHz array, and the precision of displacement estimation is measured with and without beamsteering. In simulations, we found a statistically significant improvement in the precision of lateral and elevational displacement estimates: lateral precision 35.69 μm unsteered, 3.70 μm steered; elevational precision 38.67 μm unsteered, 3.64 μm steered. Similar results were found in the phantom data: lateral precision 26.51 μm unsteered, 5.78 μm steered; elevational precision 28.92 μm unsteered, 11.87 μm steered. We conclude that volumetric 3D beamsteering improves the precision of lateral and elevational displacement estimates. © 2012 Elsevier B.V. All rights reserved.
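The idea of combining displacements measured along different beam directions can be sketched as a small least-squares problem: each steered beam measures the projection of the true 3D displacement onto its own direction, and three or more non-coplanar directions determine the full vector. The steering-angle parameterisation, the angles, and the function names below are assumptions made for illustration; the abstract does not state how the authors combine their measurements.

```python
import math


def beam_direction(lateral_deg, elevational_deg):
    """Unit vector of a beam steered away from the axial (z) direction."""
    tx = math.tan(math.radians(lateral_deg))
    ty = math.tan(math.radians(elevational_deg))
    norm = math.sqrt(tx * tx + ty * ty + 1.0)
    return (tx / norm, ty / norm, 1.0 / norm)


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def estimate_displacement(directions, measurements):
    """Least-squares 3D displacement from along-beam measurements,
    solved through the normal equations (A^T A) v = A^T m."""
    ata = [[sum(u[i] * u[j] for u in directions) for j in range(3)]
           for i in range(3)]
    atm = [sum(u[i] * m for u, m in zip(directions, measurements))
           for i in range(3)]
    return solve_linear(ata, atm)


if __name__ == "__main__":
    angles = [(0, 0), (10, 0), (-10, 0), (0, 10), (0, -10)]  # hypothetical
    dirs = [beam_direction(a, b) for a, b in angles]
    true_v = (3.0, -2.0, 10.0)  # lateral, elevational, axial
    meas = [sum(u[k] * true_v[k] for k in range(3)) for u in dirs]
    print([round(v, 6) for v in estimate_displacement(dirs, meas)])
```

With small steering angles the beam directions are nearly parallel, so the lateral and elevational components are more sensitive to measurement noise than the axial one.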
