3 resultados para speech signals
em Aston University Research Archive
Resumo:
Speech comprises dynamic and heterogeneous acoustic elements, yet it is heard as a single perceptual stream even when accompanied by other sounds. The relative contributions of grouping “primitives” and of speech-specific grouping factors to the perceptual coherence of speech are unclear, and the acoustical correlates of the latter remain unspecified. The parametric manipulations possible with simplified speech signals, such as sine-wave analogues, make them attractive stimuli to explore these issues. Given that the factors governing perceptual organization are generally revealed only where competition operates, the second-formant competitor (F2C) paradigm was used, in which the listener must resist competition to optimize recognition [Remez et al., Psychol. Rev. 101, 129-156 (1994)]. Three-formant (F1+F2+F3) sine-wave analogues were derived from natural sentences and presented dichotically (one ear = F1+F2C+F3; opposite ear = F2). Different versions of F2C were derived from F2 using separate manipulations of its amplitude and frequency contours. F2Cs with time-varying frequency contours were highly effective competitors, regardless of their amplitude characteristics. In contrast, F2Cs with constant frequency contours were completely ineffective. Competitor efficacy was not due to energetic masking of F3 by F2C. These findings indicate that modulation of the frequency, but not the amplitude, contour is critical for across-formant grouping.
Resumo:
There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD.
Resumo:
This thesis describes the investigation of an adaptive method of attenuation control for digital speech signals in an analogue-digital environment and its effects on the transmission performance of a national telecommunication network. The first part gives the design of a digital automatic gain control, able to operate upon a P.C.M. signal in its companded form and whose operation is based upon the counting of peaks of the digital speech signal above certain threshold levels. A study was ma.de of a digital automatic gain control (d.a.g.c.) in open-loop configuration and closed-loop configuration. The former was adopted as the means for carrying out the automatic control of attenuation. It was simulated and tested, both objectively and subjectively. The final part is the assessment of the effects on telephone connections of a d.a.g.c. that introduces gains of 6 dB or 12 dB. This work used a Telephone Connection Assessment Model developed at The University of Aston in Birmingham. The subjective tests showed that the d.a.g.c. gives advantage for listeners when the speech level is very low. The benefit is not great when speech is only a little quieter than preferred. The assessment showed that, when a standard British Telecom earphone is used, insertion of gain is desirable if speech voltage across the earphone terminals is below an upper limit of -38 dBV. People commented upon the presence of an adaptive-like effect during the tests. This could be the reason why they voted against the insertion of gain at level only little quieter than preferred, when they may otherwise have judged it to be desirable. A telephone connection with a d.a.g.c. in has a degree of difficulty less than half of that without it. The score Excellent plus Good is 10-30% greater.