906 results for sinusoidal signals
Abstract:
Automatic Speech Recognition (ASR) has matured into a technology which is becoming more common in our everyday lives, and is emerging as a necessity to minimise driver distraction when operating in-car systems such as navigation and infotainment. In "noise-free" environments, word recognition performance of these systems has been shown to approach 100%; however, this performance degrades rapidly as the level of background noise is increased. Speech enhancement is a popular method for making ASR systems more robust. Single-channel spectral subtraction was originally designed to improve human speech intelligibility and many attempts have been made to optimise this algorithm in terms of signal-based metrics such as maximised Signal-to-Noise Ratio (SNR) or minimised speech distortion. Such metrics assess enhancement performance for intelligibility, not speech recognition, which makes them sub-optimal for ASR applications. This research investigates two methods for closely coupling subtractive-type enhancement algorithms with ASR: (a) a computationally-efficient Mel-filterbank noise subtraction technique based on likelihood-maximisation (LIMA), and (b) introducing phase spectrum information to enable spectral subtraction in the complex frequency domain. Likelihood-maximisation uses gradient-descent to optimise parameters of the enhancement algorithm to best fit the acoustic speech model given a word sequence known a priori. Whilst this technique is shown to improve the ASR word accuracy performance, it is also identified to be particularly sensitive to non-noise mismatches between the training and testing data. Phase information has long been ignored in spectral subtraction as it is deemed to have little effect on human intelligibility. In this work it is shown that phase information is important in obtaining highly accurate estimates of clean speech magnitudes which are typically used in ASR feature extraction.
Phase Estimation via Delay Projection is proposed based on the stationarity of sinusoidal signals, and demonstrates the potential to produce improvements in ASR word accuracy in a wide range of SNR. Throughout the dissertation, consideration is given to practical implementation in vehicular environments which resulted in two novel contributions: a LIMA framework which takes advantage of the grounding procedure common to speech dialogue systems, and a resource-saving formulation of frequency-domain spectral subtraction for realisation in field-programmable gate array hardware. The techniques proposed in this dissertation were evaluated using the Australian English In-Car Speech Corpus which was collected as part of this work. This database is the first of its kind within Australia and captures real in-car speech of 50 native Australian speakers in seven driving conditions common to Australian environments.
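The role of phase in spectral subtraction can be illustrated on a single frequency bin. The sketch below is a generic textbook form and not the LIMA or delay-projection method of the dissertation; the function names, the `floor` parameter, and the assumption that the noise magnitude and phase are known exactly are all hypothetical simplifications.

```python
import cmath


def magnitude_subtraction(noisy_bin, noise_magnitude, floor=0.0):
    """Subtract the noise magnitude from the noisy magnitude, ignoring phase.

    The result is clipped at `floor` so it never becomes negative.
    """
    return max(abs(noisy_bin) - noise_magnitude, floor)


def complex_subtraction(noisy_bin, noise_magnitude, noise_phase):
    """Subtract the noise as a complex number (magnitude and phase).

    Assumes, for illustration only, that the noise phase is available.
    """
    noise_bin = cmath.rect(noise_magnitude, noise_phase)
    return abs(noisy_bin - noise_bin)
```

For a clean bin of magnitude 1 at phase 0 and a noise bin of magnitude 1 at phase 90 degrees, the noisy magnitude is √2, so magnitude-only subtraction returns about 0.414, whereas complex subtraction recovers the clean magnitude of 1.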
Abstract:
This thesis deals with the problem of the instantaneous frequency (IF) estimation of sinusoidal signals. This topic plays a significant role in signal processing and communications. Depending on the type of the signal, two major approaches are considered. For IF estimation of single-tone or digitally-modulated sinusoidal signals (like frequency shift keying signals) the approach of digital phase-locked loops (DPLLs) is considered, and this is Part-I of this thesis. For FM signals the approach of time-frequency analysis is considered, and this is Part-II of the thesis. In Part-I we have utilized sinusoidal DPLLs with a non-uniform sampling scheme as this type is widely used in communication systems. The digital tanlock loop (DTL) has introduced significant advantages over other existing DPLLs. In the last 10 years many efforts have been made to improve DTL performance. However, this loop and all of its modifications utilize a Hilbert transformer (HT) to produce a signal-independent 90-degree phase-shifted version of the input signal. The Hilbert transformer can be realized approximately using a finite impulse response (FIR) digital filter. This realization introduces further complexity in the loop in addition to approximations and frequency limitations on the input signal. We have tried to avoid practical difficulties associated with the conventional tanlock scheme while keeping its advantages. A time-delay is utilized in the tanlock scheme of DTL to produce a signal-dependent phase shift. This gave rise to the time-delay digital tanlock loop (TDTL). Fixed point theorems are used to analyze the behavior of the new loop. As such TDTL combines the two major approaches in DPLLs: the non-linear approach of sinusoidal DPLL based on fixed point analysis, and the linear tanlock approach based on the arctan phase detection. TDTL preserves the main advantages of the DTL despite its reduced structure. An application of TDTL in FSK demodulation is also considered.
This idea of replacing HT by a time-delay may be of interest in other signal processing systems. Hence we have analyzed and compared the behaviors of the HT and the time-delay in the presence of additive Gaussian noise. Based on the above analysis, the behavior of the first and second-order TDTLs has been analyzed in additive Gaussian noise. Since DPLLs need time for locking, they are normally not efficient in tracking the continuously changing frequencies of non-stationary signals, i.e. signals with time-varying spectra. Non-stationary signals are of importance in synthetic and real life applications. An example is the frequency-modulated (FM) signals widely used in communication systems. Part-II of this thesis is dedicated to the IF estimation of non-stationary signals. For such signals the classical spectral techniques break down, due to the time-varying nature of their spectra, and more advanced techniques should be utilized. For the purpose of instantaneous frequency estimation of non-stationary signals there are two major approaches: parametric and non-parametric. We chose the non-parametric approach which is based on time-frequency analysis. This approach is computationally less expensive and more effective in dealing with multicomponent signals, which are the main aim of this part of the thesis. A time-frequency distribution (TFD) of a signal is a two-dimensional transformation of the signal to the time-frequency domain. Multicomponent signals can be identified by multiple energy peaks in the time-frequency domain. Many real life and synthetic signals are of multicomponent nature and there is little in the literature concerning IF estimation of such signals. This is why we have concentrated on multicomponent signals in Part-II. An adaptive algorithm for IF estimation using the quadratic time-frequency distributions has been analyzed. A class of time-frequency distributions that are more suitable for this purpose has been proposed.
The kernels of this class are time-only or one-dimensional, rather than the time-lag (two-dimensional) kernels. Hence this class has been named the T-class. If the parameters of these TFDs are properly chosen, they are more efficient than the existing fixed-kernel TFDs in terms of resolution (energy concentration around the IF) and artifact reduction. The T-distributions have been used in the IF adaptive algorithm and proved to be efficient in tracking rapidly changing frequencies. They also enable direct amplitude estimation for the components of a multicomponent
Abstract:
Advanced bus-clamping pulse width modulation (ABCPWM) techniques are advantageous in terms of line current distortion and inverter switching loss in voltage source inverter-fed applications. However, the PWM waveforms corresponding to these techniques are not amenable to carrier-based generation. The modulation process in ABCPWM methods is analyzed here from a "per-phase" perspective. It is shown that three sets of descendant modulating functions (or modified modulating functions) can be generated from the three-phase sinusoidal signals. Each set of the modified modulating functions can be used to produce the PWM waveform of a given phase in a computationally efficient manner. Theoretical results and experimental investigations on a 5 hp motor drive are presented.
Abstract:
The use of special units for logarithmic ratio quantities is reviewed. The neper is used with a natural logarithm (logarithm to the base e) to express the logarithm of the amplitude ratio of two pure sinusoidal signals, particularly in the context of linear systems where it is desired to represent the gain or loss in amplitude of a single-frequency signal between the input and output. The bel, and its more commonly used submultiple, the decibel, are used with a decadic logarithm (logarithm to the base 10) to measure the ratio of two power-like quantities, such as a mean square signal or a mean square sound pressure in acoustics. Thus two distinctly different quantities are involved. In this review we define the quantities first, without reference to the units, as is standard practice in any system of quantities and units. We show that two different definitions of the quantity power level, or logarithmic power ratio, are possible. We show that this leads to two different interpretations for the meaning and numerical values of the units bel and decibel. We review the question of which of these alternative definitions is actually used, or is used by implication, by workers in the field. Finally, we discuss the relative advantages of the alternative definitions.
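The two logarithmic quantities described above can be written out as a short sketch. The function names are hypothetical, and the decibel function uses the common convention of ten times the decadic logarithm of a power ratio; the review itself notes that more than one definition of power level is possible.

```python
import math


def amplitude_level_nepers(amplitude_out, amplitude_in):
    """Natural logarithm of an amplitude ratio, expressed in nepers."""
    return math.log(amplitude_out / amplitude_in)


def power_level_decibels(power_out, power_in):
    """Ten times the decadic logarithm of a power ratio, in decibels."""
    return 10.0 * math.log10(power_out / power_in)
```

An amplitude ratio of e gives a level of 1 Np, and a power ratio of 100 gives a level of 20 dB.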
Abstract:
Human tremor can be defined as a somewhat rhythmic and quick movement of one or more body parts. In some people, it is a symptom of a neurological disorder. From the mathematical point of view, human tremor can be defined as a weighted contribution of different sinusoidal signals which causes oscillations of some parts of the body. This sinusoidal pattern repeats over time, but its amplitude and frequency change slowly. This is why amplitude and frequency are considered important factors in tremor characterization, and thus in its diagnosis. In this paper, a tool for the prediagnosis of human tremor is presented. This tool uses a low cost device (<$40) and allows the main factors of the tremor to be computed accurately. Real cases have been tested using the algorithms developed in this investigation. The patients suffered from different tremor severities, and the components of amplitude and frequency were computed using a series of tests. These additional measures will help the experts to make better diagnoses, allowing them to focus on specific stages of the test or get an overview of these tests. From the experiments, we found that not all tests are valid for every patient to give a diagnosis. Guided by years of experience, the expert will decide which test or set of tests is the most appropriate for a patient.
Abstract:
Purpose: To develop a signal processing paradigm for extracting ERG responses to temporal sinusoidal modulation with contrasts ranging from below perceptual threshold to suprathreshold contrasts. To estimate the magnitude of intrinsic noise in ERG signals at different stimulus contrasts. Methods: Photopic test stimuli were generated using a 4-primary Maxwellian view optical system. The 4-primary lights were sinusoidally temporally modulated in-phase (36 Hz; 2.5 to 50% Michelson contrast). The stimuli were presented in 1 s epochs separated by a 1 ms blank interval and repeated 160 times (160.16 s duration) during the recording of the continuous flicker ERG from the right eye using DTL fiber electrodes. After artefact rejection, the ERG signal was extracted using Fourier methods in each of the 1 s epochs where a stimulus was presented. The signal processing allows for computation of the intrinsic noise distribution in addition to the signal-to-noise ratio (SNR). Results: We provide the initial report that the ERG intrinsic noise distribution is independent of stimulus contrast whereas SNR decreases linearly with decreasing contrast until the noise limit at ~2.5%. The 1 ms blank intervals between epochs de-correlated the ERG signal at the line frequency (50 Hz) and thus increased the SNR of the averaged response. We confirm that response amplitude increases linearly with stimulus contrast. The phase response shows a shallow positive relationship with stimulus contrast. Conclusions: This new technique will enable recording of intrinsic noise in ERG signals above and below perceptual visual threshold and is suitable for measurement of continuous rod and cone ERGs across a range of temporal frequencies, and post-receptoral processing in the primary retinogeniculate pathways at low stimulus contrasts. The intrinsic noise distribution may have application as a biomarker for detecting changes in disease progression or treatment efficacy.
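The abstract does not specify which Fourier method was applied to each epoch. One plausible reading, sketched here as an assumption, is a single-bin discrete Fourier transform evaluated at the stimulus frequency for every 1 s epoch. The function name and the sample rate used in the usage note are hypothetical.

```python
import cmath
import math


def single_bin_amplitude_phase(epoch, sample_rate_hz, freq_hz):
    """Estimate amplitude and phase of one sinusoidal component in an epoch.

    The estimate is exact for a cosine when the epoch spans a whole number
    of cycles at freq_hz (for example 36 cycles in a 1 s epoch at 36 Hz).
    """
    n = len(epoch)
    acc = 0j
    for k, sample in enumerate(epoch):
        # Correlate the epoch with a complex exponential at freq_hz.
        acc += sample * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
    coeff = 2.0 * acc / n  # scale so that |coeff| equals the cosine amplitude
    return abs(coeff), cmath.phase(coeff)
```

Calling the function on each epoch with `freq_hz=36.0` yields one amplitude and phase estimate per epoch; the spread of those per-epoch estimates is one way a noise distribution could then be examined.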
Abstract:
The objective of the author's on-going research is to explore the feasibility of determining reliable in situ curves of shear modulus as a function of strain using the dynamic test. The purpose of this paper is limited to investigating what material stiffness is measured from a dynamic test, focusing on the harmonic excitation test. A one-dimensional discrete model with nonlinear material properties is used for this purpose. When a sinusoidal load is applied, the cross-correlation of signals from different depths estimates a wave velocity close to the one calculated from the secant modulus in the stress-strain loops under steady-state conditions. The variables that contributed to changing the average slope of the stress-strain loop also influence the estimate of the wave velocity from cross-correlation. Copyright ASCE 2007.
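The cross-correlation step can be sketched as follows: the lag that maximises the correlation between records from two depths gives the travel time, and the sensor spacing divided by that time gives the wave velocity. This is a minimal illustration rather than the paper's procedure; the function names, the sample rate, and the sensor spacing are hypothetical, and for a sinusoidal load `max_lag` should stay below one period so that the peak is unambiguous.

```python
def delay_by_cross_correlation(upper, lower, max_lag):
    """Return the lag (in samples) at which `lower` best matches `upper`.

    A fixed window of len(upper) - max_lag samples is used for every lag so
    that all candidate lags are compared over the same number of samples.
    Both records must have the same length.
    """
    window = len(upper) - max_lag

    def correlation(lag):
        return sum(upper[k] * lower[k + lag] for k in range(window))

    return max(range(max_lag + 1), key=correlation)


def wave_velocity(spacing_m, lag_samples, sample_rate_hz):
    """Velocity from sensor spacing and the travel time lag / sample rate."""
    return spacing_m * sample_rate_hz / lag_samples
```

The resulting velocity can be compared with the value implied by a modulus through the standard relation v = sqrt(G / ρ), where ρ is the mass density.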
Abstract:
The application of inverse filtering techniques for high-quality singing voice analysis/synthesis is discussed. In the context of source-filter models, inverse filtering provides a noninvasive method to extract the voice source, and thus to study voice quality. Although this approach is widely used in speech synthesis, this is not the case in singing voice. Several studies have proved that inverse filtering techniques fail in the case of singing voice, the reasons being unclear. In order to shed light on this problem, we will consider here an additional feature of singing voice, not present in speech: the vibrato. Vibrato has been traditionally studied by sinusoidal modeling. As an alternative, we will introduce here a novel noninteractive source filter model that incorporates the mechanisms of vibrato generation. This model will also allow the comparison of the results produced by inverse filtering techniques and by sinusoidal modeling, as they apply to singing voice and not to speech. In this way, the limitations of these conventional techniques, described in previous literature, will be explained. Both synthetic signals and singer recordings are used to validate and compare the techniques presented in the paper.
Abstract:
The aim of the current work is the development of a method to characterize force sensors under sinusoidal excitations using a primary standard as the source of traceability. During this work the influence factors have been studied, and a method to minimise their contributions, as well as the corrections to be performed under dynamic conditions, has been established. These results will allow an adequate characterization of force sensors under sinusoidal excitations, which will be essential for their subsequent proper use under dynamic conditions. The traceability of the sensor characterization is based on the direct definition of force as mass multiplied by acceleration. To do so, the sensor is loaded with different calibrated loads and is maintained under different sinusoidal accelerations by means of a vibration shaker system that is able to generate accelerations up to 100 m/s² with frequencies from 5 Hz up to 2400 Hz. The acceleration is measured by means of a laser vibrometer with traceability to the units of time and length. A multiple channel data acquisition system is also required to simultaneously acquire the electrical output signals of the involved instruments in real time.
Abstract:
Owls and other animals, including humans, use the difference in arrival time of sounds between the ears to determine the direction of a sound source in the horizontal plane. When an interaural time difference (ITD) is conveyed by a narrowband signal such as a tone, human beings may fail to derive the direction represented by that ITD. This is because they cannot distinguish the true ITD contained in the signal from its phase equivalents that are ITD ± nT, where T is the period of the stimulus tone and n is an integer. This uncertainty is called phase-ambiguity. All ITD-sensitive neurons in birds and mammals respond to an ITD and its phase equivalents when the ITD is contained in narrowband signals. It is not known, however, if these animals show phase-ambiguity in the localization of narrowband signals. The present work shows that barn owls (Tyto alba) experience phase-ambiguity in the localization of tones delivered by earphones. We used sound-induced head-turning responses to measure the sound-source directions perceived by two owls. In both owls, head-turning angles varied as a sinusoidal function of ITD. One owl always pointed to the direction represented by the smaller of the two ITDs, whereas a second owl always chose the direction represented by the larger ITD (i.e., ITD − T).
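The phase equivalents ITD ± nT can be enumerated directly. The sketch below lists every equivalent that falls within a chosen range of delays; the function name and the `limit_us` bound are hypothetical and are not taken from the study.

```python
def phase_equivalents(itd_us, tone_hz, limit_us):
    """List ITD + n*T (n any integer) with absolute value within limit_us.

    All delays are in microseconds; T is the period of the tone.
    """
    period_us = 1e6 / tone_hz
    # Enough integer steps on either side to cover the whole allowed range.
    n_max = int((limit_us + abs(itd_us)) // period_us) + 1
    candidates = [itd_us + n * period_us for n in range(-n_max, n_max + 1)]
    return [c for c in candidates if abs(c) <= limit_us]
```

For a 5 kHz tone (T = 200 µs), a true ITD of 50 µs, and a limit of 250 µs, the candidates are −150, 50, and 250 µs. A narrowband signal alone does not indicate which of these is the true ITD.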
Abstract:
All signals that appear to be periodic have some sort of variability from period to period regardless of how stable they appear to be in a data plot. A true sinusoidal time series is a deterministic function of time that never changes and thus has zero bandwidth around the sinusoid's frequency. A zero bandwidth is impossible in nature since all signals have some intrinsic variability over time. Deterministic sinusoids are used to model cycles as a mathematical convenience. Hinich [IEEE J. Oceanic Eng. 25 (2) (2000) 256-261] introduced a parametric statistical model, called the randomly modulated periodicity (RMP), that allows one to capture the intrinsic variability of a cycle. As with a deterministic periodic signal, the RMP can have a number of harmonics. The likelihood ratio test for this model when the amplitudes and phases are known is given in [M.J. Hinich, Signal Processing 83 (2003) 1349-1352]. A method for detecting an RMP whose amplitudes and phases are unknown random processes plus a stationary noise process is addressed in this paper. The only assumption on the additive noise is that it has finite dependence and finite moments. Using simulations based on a simple RMP model we show a case where the new method can detect the signal when the signal is not detectable in a standard waterfall spectrogram display. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
* Supported by INTAS 2000-626, INTAS YSF 03-55-1969, INTAS INNO 182, and TIC 2003-09319-c03-03.