322 resultados para Speech Processing


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the design and implementation of a unique undergraduate program in signal processing at the Queensland University of Technology (QUT). The criteria that influenced the choice of the subjects and the laboratories developed to support them are presented. A recently established Signal Processing Research Centre (SPRC) has played an important role in the development of the signal processing teaching program. The SPRC also provides training opportunities for postgraduate studies and research.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The School of Electrical and Electronic Systems Engineering at Queensland University of Technology, Brisbane, Australia (QUT), offers three bachelor degree courses in electrical and computer engineering. In all its courses there is a strong emphasis on signal processing. A newly established Signal Processing Research Centre (SPRC) has played an important role in the development of the signal processing units in these courses. This paper describes the unique design of the undergraduate program in signal processing at QUT, the laboratories developed to support it, and the criteria that influenced the design.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper discusses the principal domains of auto- and cross-trispectra. It is shown that the cumulant and moment based trispectra are identical except on certain planes in trifrequency space. If these planes are avoided, their principal domains can be derived by considering the regions of symmetry of the fourth order spectral moment. The fourth order averaged periodogram will then serve as an estimate for both cumulant and moment trispectra. Statistics of estimates of normalised trispectra or tricoherence are also discussed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An approach to pattern recognition using invariant parameters based on higher-order spectra is presented. In particular, bispectral invariants are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale- and amplification-invariant. A minimal set of these invariants is selected as the feature vector for pattern classification. Pattern recognition using higher-order spectral invariants is fast, suited for parallel implementation, and works for signals corrupted by Gaussian noise. The classification technique is shown to distinguish two similar but different bolts given their one-dimensional profiles

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A new approach to recognition of images using invariant features based on higher-order spectra is presented. Higher-order spectra are translation invariant because translation produces linear phase shifts which cancel. Scale and amplification invariance are satisfied by the phase of the integral of a higher-order spectrum along a radial line in higher-order frequency space because the contour of integration maps onto itself and both the real and imaginary parts are affected equally by the transformation. Rotation invariance is introduced by deriving invariants from the Radon transform of the image and using the cyclic-shift invariance property of the discrete Fourier transform magnitude. Results on synthetic and actual images show isolated, compact clusters in feature space and high classification accuracies

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common data base of 256× 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95% compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16× 16 pixel low-pass filtered and decimated versions of the same data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

An application of image processing techniques to recognition of hand-drawn circuit diagrams is presented. The scanned image of a diagram is pre-processed to remove noise and converted to bilevel. Morphological operations are applied to obtain a clean, connected representation using thinned lines. The diagram comprises of nodes, connections and components. Nodes and components are segmented using appropriate thresholds on a spatially varying object pixel density. Connection paths are traced using a pixel-stack. Nodes are classified using syntactic analysis. Components are classified using a combination of invariant moments, scalar pixel-distribution features, and vector relationships between straight lines in polygonal representations. A node recognition accuracy of 82% and a component recognition accuracy of 86% was achieved on a database comprising 107 nodes and 449 components. This recogniser can be used for layout “beautification” or to generate input code for circuit analysis and simulation packages

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to its ability to adequately represent a speaker based on sparse training data, as well as an improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We have developed digital image registration program for a MC 68000 based fundus image processing system (FIPS). FIPS not only is capable of executing typical image processing algorithms in spatial as well as Fourier domain, the execution time for many operations has been made much quicker by using a hybrid of "C", Fortran and MC6000 assembly languages.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The phase of an analytic signal constructed from the autocorrelation function of a signal contains significant information about the shape of the signal. Using Bedrosian's (1963) theorem for the Hilbert transform it is proved that this phase is robust to multiplicative noise if the signal is baseband and the spectra of the signal and the noise do not overlap. Higher-order spectral features are interpreted in this context and shown to extract nonlinear phase information while retaining robustness. The significance of the result is that prior knowledge of the spectra is not required.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the feasibility of the application of an Imputer in a multiple choice answer sheet marking system based on image processing techniques.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Gray‘s (2000) revised Reinforcement Sensitivity Theory (r-RST) was used to investigate personality effects on information processing biases to gain-framed and loss-framed anti-speeding messages and the persuasiveness of these messages. The r-RST postulates that behaviour is regulated by two major motivational systems: reward system or punishment system. It was hypothesised that both message processing and persuasiveness would be dependent upon an individual‘s sensitivity to reward or punishment. Student drivers (N = 133) were randomly assigned to view one of four anti-speeding messages or no message (control group). Individual processing differences were then measured using a lexical decision task, prior to participants completing a personality and persuasion questionnaire. Results indicated that participants who were more sensitive to reward showed a marginally significant (p = .050) tendency to report higher intentions to comply with the social gain-framed message and demonstrate a cognitive processing bias towards this message, than those with lower reward sensitivity.