926 results for signal noise


Abstract:

Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. Heart rate variability analysis is an important tool to observe the heart's ability to respond to normal regulatory impulses that affect its rhythm. A computer-based intelligent system for analysis of cardiac states is very useful in diagnostics and disease management. Like many bio-signals, HRV signals are nonlinear in nature. Higher order spectral analysis (HOS) is known to be a good tool for the analysis of nonlinear systems and provides good noise immunity. In this work, we studied the HOS of the HRV signals of normal heartbeat and seven classes of arrhythmia. We present some general characteristics for each of these classes of HRV signals in the bispectrum and bicoherence plots. We also extracted features from the HOS and performed an analysis of variance (ANOVA) test. The results are very promising for cardiac arrhythmia classification with a number of features yielding a p-value < 0.02 in the ANOVA test.

Abstract:

This paper presents an analysis of the phasor measurement method for tracking the fundamental power frequency, to determine whether it has the performance necessary to cope with the requirements of power system protection and control. To this end, several computer simulations representing the conditions of a typical power system signal, especially those highly distorted by harmonics, noise and offset, are provided to evaluate the response of the Phasor Measurement (PM) technique. A new method, which can shorten the estimation delay, has also been proposed for the PM method to work with signals free of even-order harmonics.
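
As an illustration of the general idea, the following is a minimal sketch of one textbook phasor-based frequency tracker, not necessarily the method analysed in the paper. It assumes a full-cycle DFT window at the nominal frequency and estimates frequency from the phase rotation between two windows one sample apart. The function names, the 1600 Hz sampling rate and the test signal (fundamental plus third harmonic plus DC offset) are all hypothetical.

```python
import cmath
import math

def fundamental_phasor(window):
    """Full-cycle DFT estimate of the fundamental phasor (peak amplitude and phase)."""
    n = len(window)
    acc = sum(x * cmath.exp(-2j * math.pi * k / n) for k, x in enumerate(window))
    return 2.0 * acc / n

def track_frequency(signal, fs, f_nominal):
    """Estimate frequency from the phase rotation between two windows one sample apart."""
    n = round(fs / f_nominal)  # samples per nominal cycle
    p0 = fundamental_phasor(signal[0:n])
    p1 = fundamental_phasor(signal[1:n + 1])
    dphi = cmath.phase(p1 * p0.conjugate())  # radians advanced in one sample
    return dphi * fs / (2.0 * math.pi)

# Hypothetical test signal: 50 Hz fundamental, third harmonic and a DC offset.
fs, f0 = 1600.0, 50.0
sig = [10.0 * math.cos(2 * math.pi * f0 * i / fs + 0.3)
       + 3.0 * math.cos(2 * math.pi * 3 * f0 * i / fs)
       + 2.0
       for i in range(64)]
p = fundamental_phasor(sig[:32])
f_est = track_frequency(sig, fs, f0)
print(abs(p), cmath.phase(p), f_est)
```

When the signal is exactly at the nominal frequency, the full-cycle window rejects the DC offset and integer harmonics. Off-nominal frequencies introduce leakage, so this simple estimator is then only approximate.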

Abstract:

In an automotive environment, the performance of a speech recognition system is affected by environmental noise if the speech signal is acquired directly from a microphone. Speech enhancement techniques are therefore necessary to improve the speech recognition performance. In this paper, a field-programmable gate array (FPGA) implementation of dual-microphone delay-and-sum beamforming (DASB) for speech enhancement is presented. As the first step towards a cost-effective solution, the implementation described in this paper uses a relatively high-end FPGA device to facilitate the verification of various design strategies and parameters. Experimental results show that the proposed design can produce output waveforms close to those generated by a theoretical (floating-point) model with modest usage of FPGA resources. Speech recognition experiments are also conducted on enhanced in-car speech waveforms produced by the FPGA in order to compare recognition performance with the floating-point representation running on a PC.
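
The comparison between a floating-point model and a fixed-point hardware datapath can be sketched as follows. This is an illustrative two-microphone delay-and-sum beamformer with an integer sample delay and a 16-bit (Q15) variant. The word length, the delay of 3 samples and all function names are assumptions for the example, not details taken from the paper.

```python
import math

def delay_and_sum(mic_a, mic_b, delay_b):
    """Floating-point reference: average mic_a with mic_b advanced by delay_b samples."""
    n = min(len(mic_a), len(mic_b) - delay_b)
    return [0.5 * (mic_a[i] + mic_b[i + delay_b]) for i in range(n)]

def to_q15(x):
    """Quantise a float in [-1, 1) to a signed 16-bit integer (Q15), with saturation."""
    return max(-32768, min(32767, int(round(x * 32768))))

def delay_and_sum_q15(mic_a, mic_b, delay_b):
    """Integer datapath: add the aligned samples, then halve with an arithmetic shift."""
    n = min(len(mic_a), len(mic_b) - delay_b)
    return [(mic_a[i] + mic_b[i + delay_b]) >> 1 for i in range(n)]

# Hypothetical input: the same wavefront reaches mic_b three samples after mic_a.
source = [0.5 * math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
mic_a = source
mic_b = [0.0] * 3 + source

ref = delay_and_sum(mic_a, mic_b, 3)
fixed = delay_and_sum_q15([to_q15(v) for v in mic_a], [to_q15(v) for v in mic_b], 3)
max_err = max(abs(f / 32768.0 - r) for f, r in zip(fixed, ref))
print(max_err)
```

In this sketch the fixed-point output differs from the floating-point reference by at most about one least-significant bit, which is the kind of closeness a hardware verification would check.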

Abstract:

Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions, where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory-only condition and two visual conditions: normal vision and simulated cataracts. The light-scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory-only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentences under the auditory-only condition. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias. Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated cataracts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.
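
To make the -16 dB figure concrete, here is a small sketch of how a noise track can be scaled digitally so that the RMS-based speech-to-noise ratio hits a target value. The study used a live speaker with prerecorded babble, so this is only an illustration of the arithmetic. The signals, the `mix_at_snr` helper and the RMS-based definition of SNR are assumptions.

```python
import math
import random

def rms(x):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that 20*log10(rms(speech) / rms(scaled noise)) equals snr_db, then add."""
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20.0))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled)]
    return mixture, scaled

# Hypothetical stand-ins for a speech recording and a babble recording.
random.seed(0)
speech = [math.sin(2 * math.pi * 3 * t / 400) for t in range(400)]
babble = [random.gauss(0.0, 1.0) for _ in range(400)]

mixture, scaled_babble = mix_at_snr(speech, babble, -16.0)
measured_snr_db = 20.0 * math.log10(rms(speech) / rms(scaled_babble))
print(measured_snr_db)
```

At -16 dB the noise RMS is about 6.3 times the speech RMS, which gives a sense of how difficult the auditory-only condition is.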

Abstract:

This paper proposes a clustered approach for blind beamforming from ad-hoc microphone arrays. In such arrangements, microphone placement is arbitrary and the speaker may be close to one, all or a subset of microphones at a given time. Practical issues with such a configuration mean that some microphones might be better discarded due to poor input signal-to-noise ratio (SNR) or undesirable spatial aliasing effects from large inter-element spacings when beamforming. Large inter-microphone spacings may also lead to inaccuracies in delay estimation during blind beamforming. In such situations, using a cluster of microphones (i.e., a sub-array), closely located both to each other and to the desired speech source, may provide more robust enhancement than the full array. This paper proposes a method for blind clustering of microphones based on the magnitude squared coherence function, and evaluates the method on a database recorded using various ad-hoc microphone arrangements.
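
The following sketch shows how magnitude squared coherence (MSC) could drive a simple grouping of microphones. It is an assumption-laden illustration, not the paper's clustering algorithm. MSC is estimated by averaging over non-overlapping rectangular segments, microphones are linked when their frequency-averaged MSC exceeds a hypothetical threshold of 0.5, and clusters are the resulting connected components.

```python
import cmath
import math
import random

def dft(x):
    """Naive DFT returning the non-negative frequency bins of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def mean_msc(x, y, seg_len=32):
    """Frequency-averaged MSC, estimated from non-overlapping segments."""
    bins = seg_len // 2 + 1
    pxx, pyy, pxy = [0.0] * bins, [0.0] * bins, [0j] * bins
    for start in range(0, min(len(x), len(y)) - seg_len + 1, seg_len):
        fx = dft(x[start:start + seg_len])
        fy = dft(y[start:start + seg_len])
        for k in range(bins):
            pxx[k] += abs(fx[k]) ** 2
            pyy[k] += abs(fy[k]) ** 2
            pxy[k] += fx[k] * fy[k].conjugate()
    msc = [abs(pxy[k]) ** 2 / (pxx[k] * pyy[k]) for k in range(bins)]
    return sum(msc) / bins

def cluster_microphones(signals, threshold=0.5):
    """Group microphones whose pairwise mean MSC exceeds the threshold."""
    label = list(range(len(signals)))
    for i in range(len(signals)):
        for j in range(i + 1, len(signals)):
            if mean_msc(signals[i], signals[j]) > threshold:
                old, new = label[j], label[i]
                label = [new if l == old else l for l in label]
    clusters = {}
    for idx, l in enumerate(label):
        clusters.setdefault(l, []).append(idx)
    return sorted(clusters.values())

# Hypothetical scene: mics 0 and 1 pick up the same source, mic 2 only unrelated noise.
random.seed(0)
source = [random.gauss(0.0, 1.0) for _ in range(513)]
mic0 = [source[t + 1] + 0.1 * random.gauss(0.0, 1.0) for t in range(512)]
mic1 = [source[t] + 0.1 * random.gauss(0.0, 1.0) for t in range(512)]  # one-sample delay
mic2 = [random.gauss(0.0, 1.0) for _ in range(512)]
clusters = cluster_microphones([mic0, mic1, mic2])
print(clusters)
```

Microphones that observe the same source coherently end up in one cluster, while a microphone dominated by unrelated noise stays on its own. A practical system would typically use windowed, overlapping segments and a data-driven threshold.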

Abstract:

Automatic Speech Recognition (ASR) has matured into a technology which is becoming more common in our everyday lives, and is emerging as a necessity to minimise driver distraction when operating in-car systems such as navigation and infotainment. In “noise-free” environments, word recognition performance of these systems has been shown to approach 100%; however, this performance degrades rapidly as the level of background noise is increased. Speech enhancement is a popular method for making ASR systems more robust. Single-channel spectral subtraction was originally designed to improve human speech intelligibility, and many attempts have been made to optimise this algorithm in terms of signal-based metrics such as maximised Signal-to-Noise Ratio (SNR) or minimised speech distortion. Such metrics are used to assess enhancement performance for intelligibility, not speech recognition, therefore making them sub-optimal for ASR applications. This research investigates two methods for closely coupling subtractive-type enhancement algorithms with ASR: (a) a computationally efficient Mel-filterbank noise subtraction technique based on likelihood-maximisation (LIMA), and (b) introducing phase spectrum information to enable spectral subtraction in the complex frequency domain. Likelihood-maximisation uses gradient descent to optimise parameters of the enhancement algorithm to best fit the acoustic speech model given a word sequence known a priori. Whilst this technique is shown to improve ASR word accuracy, it is also found to be particularly sensitive to non-noise mismatches between the training and testing data. Phase information has long been ignored in spectral subtraction as it is deemed to have little effect on human intelligibility. In this work it is shown that phase information is important in obtaining highly accurate estimates of clean speech magnitudes, which are typically used in ASR feature extraction.
Phase Estimation via Delay Projection is proposed based on the stationarity of sinusoidal signals, and demonstrates the potential to produce improvements in ASR word accuracy across a wide range of SNRs. Throughout the dissertation, consideration is given to practical implementation in vehicular environments, which resulted in two novel contributions: a LIMA framework which takes advantage of the grounding procedure common to speech dialogue systems, and a resource-saving formulation of frequency-domain spectral subtraction for realisation in field-programmable gate array hardware. The techniques proposed in this dissertation were evaluated using the Australian English In-Car Speech Corpus, which was collected as part of this work. This database is the first of its kind within Australia and captures real in-car speech of 50 native Australian speakers in seven driving conditions common to Australian environments.
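
For readers unfamiliar with the baseline, here is a minimal sketch of conventional single-frame magnitude spectral subtraction, the classic technique that reuses the noisy phase. It does not implement LIMA or Phase Estimation via Delay Projection. The frame length, the spectral floor parameter and the synthetic tones are assumptions chosen so that the example is deterministic.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract a per-bin noise magnitude estimate, keeping the noisy phase unchanged."""
    out = []
    for x_k, n_k in zip(dft(frame), noise_mag):
        mag = max(abs(x_k) - n_k, floor * abs(x_k))  # floor avoids negative magnitudes
        out.append(cmath.rect(mag, cmath.phase(x_k)))
    return idft(out)

# Hypothetical frame: a "speech" tone in bin 4 plus a "noise" tone in bin 10.
n = 64
clean = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
noise = [0.2 * math.cos(2 * math.pi * 10 * t / n + 1.0) for t in range(n)]
noisy = [c + v for c, v in zip(clean, noise)]

noise_mag = [abs(v) for v in dft(noise)]  # idealised noise estimate
enhanced = spectral_subtract(noisy, noise_mag)
max_residual = max(abs(e - c) for e, c in zip(enhanced, clean))
print(max_residual)
```

Recovery is essentially exact here only because speech and noise occupy different bins. When they overlap in frequency, the noisy phase no longer matches the clean phase, which is the limitation that motivates using phase information.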

Abstract:

There is a need in industry for a commodity polyethylene film with controllable degradation properties that will degrade in an environmentally neutral way, for applications such as shopping bags and packaging film. Additives such as starch have been shown to accelerate the degradation of plastic films; however, control of degradation is required so that the film will retain its mechanical properties during storage and use, and then degrade when no longer required. By the addition of a photocatalyst, it is hoped that polymer film will break down with exposure to sunlight. Furthermore, it is desired that the polymer film will degrade in the dark, after a short initial exposure to sunlight. Research has been undertaken into the photo- and thermo-oxidative degradation processes of 25 µm thick LLDPE (linear low density polyethylene) film containing titania from different manufacturers. Films were aged in a suntest or in an oven at 50 °C, and the oxidation product formation was followed using IR spectroscopy. Degussa P25, Kronos 1002, and various organic-modified and doped titanias of the types Sachtleben Hombitan and Huntsman Tioxide incorporated into LLDPE films were assessed for photoactivity. Degussa P25 was found to be the most photoactive with UVA and UVC exposure. Surface modification of titania was found to reduce photoactivity. Crystal phase is thought to be among the most important factors when assessing the photoactivity of titania as a photocatalyst for degradation. Pre-irradiation with UVA or UVC for 24 hours of the film containing 3% Degussa P25 titania prior to aging in an oven resulted in embrittlement in ca. 200 days. The multivariate data analysis technique PCA (principal component analysis) was used as an exploratory tool to investigate the IR spectral data. Oxidation products formed in similar relative concentrations across all samples, confirming that titania was catalysing the oxidation of the LLDPE film without changing the oxidation pathway.
PCA was also employed to compare rates of degradation in different films. PCA enabled the discovery of water vapour trapped inside cavities formed by oxidation by titania particles. Imaging ATR/FTIR spectroscopy with high lateral resolution was used in a novel experiment to examine the heterogeneous nature of oxidation of a model polymer compound caused by the presence of titania particles. A model polymer containing Degussa P25 titania was solvent cast onto the internal reflection element of the imaging ATR/FTIR and the oxidation under UVC was examined over time. Sensitisation of 5 µm domains by titania resulted in areas of relatively high oxidation product concentration. The suitability of transmission IR with a synchrotron light source for the study of polymer film oxidation was assessed at the Australian Synchrotron in Melbourne, Australia. Challenges such as interference fringes and poor signal-to-noise ratio need to be addressed before this can become a routine technique.

Abstract:

Virtual 3D models of long bones are increasingly being used for implant design and research applications. The current gold standard for the acquisition of such data is Computed Tomography (CT) scanning. Due to radiation exposure, CT is generally limited to the imaging of clinical cases and cadaver specimens. Magnetic Resonance Imaging (MRI) does not involve ionising radiation and therefore can be used to image selected healthy human volunteers for research purposes. The feasibility of MRI as an alternative to CT for the acquisition of morphological bone data of the lower extremity has been demonstrated in recent studies [1, 2]. Some of the current limitations of MRI are long scanning times and difficulties with image segmentation in certain anatomical regions due to poor contrast between bone and surrounding muscle tissues. Higher field strength scanners promise to offer faster imaging times or better image quality. In this study, the quality of images acquired at 1.5T is quantitatively compared with that of images acquired at 3T.

The femora of five human volunteers were scanned using 1.5T and 3T MRI scanners from the same manufacturer (Siemens) with similar imaging protocols. A 3D FLASH sequence was used with TE = 4.66 ms, flip angle = 15° and voxel size = 0.5 × 0.5 × 1 mm. PA-Matrix and body matrix coils were used to cover the lower limb and pelvis respectively. Signal-to-noise ratio (SNR) [3] and contrast-to-noise ratio (CNR) [3] of the axial images from the proximal, shaft and distal regions were used to assess the quality of images from the 1.5T and 3T scanners. The SNR was calculated for the muscle and bone marrow in the axial images. The CNR was calculated for the muscle-to-cortex and cortex-to-bone-marrow interfaces, respectively.

Preliminary results (one volunteer) show that the SNR of muscle for the shaft and distal regions was higher in 3T images (11.65 and 17.60) than 1.5T images (8.12 and 8.11).
For the proximal region the SNR of muscles was higher in 1.5T images (7.52) than 3T images (6.78). The SNR of bone marrow was slightly higher in 1.5T images for both proximal and shaft regions, while it was lower in the distal region compared to 3T images. The CNR between muscle and bone of all three regions was higher in 3T images (4.14, 6.55 and 12.99) than in 1.5T images (2.49, 3.25 and 9.89). The CNR between bone-marrow and bone was slightly higher in 1.5T images (4.87, 12.89 and 10.07) compared to 3T images (3.74, 10.83 and 10.15). These results show that the 3T images generated higher contrast between bone and the muscle tissue than the 1.5T images. It is expected that this improvement of image contrast will significantly reduce the time required for the mainly manual segmentation of the MR images. Future work will focus on optimizing the 3T imaging protocol for reducing chemical shift and susceptibility artifacts.
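
The abstract takes its SNR and CNR definitions from its reference [3], which is not reproduced here. The sketch below therefore assumes one commonly used region-of-interest (ROI) definition: SNR as the mean tissue intensity divided by the standard deviation of a background noise ROI, and CNR as the absolute difference of two tissue means divided by the same noise standard deviation. All pixel values are invented for illustration.

```python
from statistics import mean, pstdev

def snr(tissue_roi, noise_roi):
    """Assumed definition: mean tissue intensity over background noise standard deviation."""
    return mean(tissue_roi) / pstdev(noise_roi)

def cnr(roi_a, roi_b, noise_roi):
    """Assumed definition: absolute mean difference of two tissues over noise standard deviation."""
    return abs(mean(roi_a) - mean(roi_b)) / pstdev(noise_roi)

# Hypothetical pixel intensities sampled from three ROIs of one axial slice.
muscle = [100, 102, 98, 100]
cortex = [20, 22, 18, 20]
background = [0, 4, 0, 4]

muscle_snr = snr(muscle, background)
muscle_cortex_cnr = cnr(muscle, cortex, background)
print(muscle_snr, muscle_cortex_cnr)
```

Under this definition a higher CNR between muscle and cortical bone means the boundary is easier to segment, which is the practical benefit discussed for the 3T images.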

Abstract:

Acoustically, car cabins are extremely noisy and, as a consequence, audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy for circumventing this problem by means of audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system that is able to accurately locate and track the driver's face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.

Abstract:

Acoustically, car cabins are extremely noisy and, as a consequence, existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g. lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVASR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVSR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality is a better approach to improving the robustness and effectiveness of voice-only recognition systems in car cabin environments.

Abstract:

The detection of voice activity is a challenging problem, especially when the level of acoustic noise is high. Most current approaches only utilise the audio signal, making them susceptible to acoustic noise. An obvious approach to overcome this is to use the visual modality. The current state-of-the-art visual feature extraction technique is one that uses a cascade of visual features (i.e. 2D-DCT, feature mean normalisation, interstep LDA). In this paper, we investigate the effectiveness of this technique for the task of visual voice activity detection (VAD), and analyse each stage of the cascade and quantify the relative improvement in performance gained by each successive stage. The experiments were conducted on the CUAVE database and our results highlight that the dynamics of the visual modality can be used to good effect to improve visual voice activity detection performance.

Abstract:

This paper investigates the impact of carrier frequency offset (CFO) on Single Carrier wireless communication systems with Frequency Domain Equalization (SC-FDE). We show that CFO in SC-FDE systems causes irrecoverable channel estimation error, which leads to inter-symbol-interference (ISI). The impact of CFO on SC-FDE and OFDM is compared in the presence of CFO and channel estimation errors. Closed form expressions of signal to interference and noise ratio (SINR) are derived for both systems, and verified by simulation results. We find that when channel estimation errors are considered, SC-FDE is similarly or even more sensitive to CFO, compared to OFDM. In particular, in SC-FDE systems, CFO mainly deteriorates the system performance via degrading the channel estimation. Both analytical and simulation results highlight the importance of accurate CFO estimation in SC-FDE systems.

Abstract:

In this chapter we propose clipping with amplitude and phase corrections to reduce the peak-to-average power ratio (PAR) of orthogonal frequency division multiplexed (OFDM) signals in high-speed wireless local area networks defined in the IEEE 802.11a physical layer. The proposed techniques can be implemented with a small modification at the transmitter, and the receiver remains standard compliant. A PAR reduction of as much as 4 dB can be achieved by selecting a suitable clipping ratio and a correction factor depending on the constellation used. Out-of-band noise (OBN) is also reduced.
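
To show what a clipping ratio does to the PAR, here is a minimal sketch of plain amplitude clipping on a single OFDM symbol. It does not include the amplitude and phase corrections proposed in the chapter. The 64-point symbol with QPSK on every subcarrier is a simplification (IEEE 802.11a uses 52 of its 64 subcarriers), and the clipping ratio of 1.4 is an arbitrary illustrative value.

```python
import cmath
import math
import random

def ofdm_symbol(freq_symbols):
    """Time-domain OFDM symbol via a naive inverse DFT with unitary scaling."""
    n = len(freq_symbols)
    return [sum(freq_symbols[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))
            / math.sqrt(n)
            for t in range(n)]

def papr_db(x):
    """Peak-to-average power ratio in dB."""
    powers = [abs(v) ** 2 for v in x]
    return 10.0 * math.log10(max(powers) / (sum(powers) / len(powers)))

def clip(x, clipping_ratio):
    """Limit |x| to clipping_ratio times the RMS level, preserving each sample's phase."""
    rms = math.sqrt(sum(abs(v) ** 2 for v in x) / len(x))
    limit = clipping_ratio * rms
    return [v if abs(v) <= limit else cmath.rect(limit, cmath.phase(v)) for v in x]

# Hypothetical symbol: random QPSK on all 64 subcarriers.
random.seed(1)
qpsk = [complex(random.choice((-1, 1)), random.choice((-1, 1))) / math.sqrt(2)
        for _ in range(64)]
symbol = ofdm_symbol(qpsk)
clipped = clip(symbol, 1.4)
print(papr_db(symbol), papr_db(clipped))
```

Clipping never increases the PAR, but it distorts the constellation and spreads energy outside the signal band, which is why corrections and out-of-band noise matter in practice.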

Abstract:

Multiresolution techniques are used extensively in the signal processing literature. This paper has two parts. In the first part, we derive a relationship between the general degradation model (Y=BX+W) at coarse and fine resolutions. In the second part, we develop a signal restoration scheme in a multiresolution framework and demonstrate through experiments that knowledge of the relationship between the degradation models at different resolutions helps in obtaining a computationally efficient restoration scheme.

Abstract:

Microphone arrays have been used in various applications to capture conversations, such as in meetings and teleconferences. In many cases, the microphone and likely source locations are known a priori, and calculating beamforming filters is therefore straightforward. In ad-hoc situations, however, when the microphones have not been systematically positioned, this information is not available and beamforming must be achieved blindly. In achieving this, a commonly neglected issue is whether it is optimal to use all of the available microphones, or only an advantageous subset of these. This paper commences by reviewing different approaches to blind beamforming, characterising them by the way they estimate the signal propagation vector and the spatial coherence of noise in the absence of prior knowledge of microphone and speaker locations. Following this, a novel clustered approach to blind beamforming is motivated and developed. Without using any prior geometrical information, microphones are first grouped into localised clusters, which are then ranked according to their relative distance from a speaker. Beamforming is then performed using either the closest microphone cluster, or a weighted combination of clusters. The clustered algorithms are compared to the full set of microphones in experiments on a database recorded on different ad-hoc array geometries. These experiments evaluate the methods in terms of signal enhancement as well as performance on a large vocabulary speech recognition task.