994 results for spontaneous noise
Abstract:
This correspondence presents a microphone array shape calibration procedure for diffuse noise environments. The procedure estimates intermicrophone distances by fitting the measured noise coherence with its theoretical model and then estimates the array geometry using classical multidimensional scaling. The technique is validated on noise recordings from two office environments.
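The first step of the procedure, fitting a measured noise coherence with a theoretical model to recover an intermicrophone distance, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a spherically isotropic diffuse field, whose coherence between two omnidirectional microphones is sin(x)/x with x = 2πfd/c, and it uses a simple grid search. The function names, the speed of sound and the synthetic "measurement" are all hypothetical. The subsequent multidimensional scaling step is not shown.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for room-temperature air


def diffuse_coherence(freq_hz, distance_m, c=SPEED_OF_SOUND):
    """Theoretical coherence of a spherically isotropic diffuse field (assumed model)."""
    x = 2.0 * math.pi * freq_hz * distance_m / c
    return 1.0 if x == 0.0 else math.sin(x) / x


def fit_distance(freqs_hz, measured, d_max=1.0, step=0.001):
    """Grid-search the distance whose model coherence best matches the measurement."""
    best_d, best_err = 0.0, float("inf")
    for i in range(int(round(d_max / step)) + 1):
        d = i * step
        # Sum of squared differences between model and measured coherence
        err = sum((diffuse_coherence(f, d) - g) ** 2
                  for f, g in zip(freqs_hz, measured))
        if err < best_err:
            best_d, best_err = d, err
    return best_d


# Synthetic example: a noiseless "measurement" generated from the model itself
freqs = [100.0 * k for k in range(1, 41)]  # 100 Hz to 4 kHz
true_distance = 0.15                        # metres, hypothetical
measured = [diffuse_coherence(f, true_distance) for f in freqs]
estimate = fit_distance(freqs, measured)
```

Repeating this fit for every microphone pair would give the distance matrix that classical multidimensional scaling takes as input.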
Abstract:
Purpose: To investigate speed regulation during overground running on undulating terrain. Methods: Following an initial laboratory session to calculate physiological thresholds, eight experienced runners completed a spontaneously paced time trial over 3 laps of an outdoor course involving uphill, downhill and level sections. A portable gas analyser, GPS receiver and activity monitor were used to collect physiological, speed and stride frequency data. Results: Participants ran 23% slower on uphills and 13.8% faster on downhills compared with level sections. Speeds on level sections were significantly different for 78.4 ± 7.0 seconds following an uphill and 23.6 ± 2.2 seconds following a downhill. Speed changes were primarily regulated by stride length, which was 20.5% shorter uphill and 16.2% longer downhill, while stride frequency was relatively stable. Oxygen consumption averaged 100.4% of runners’ individual ventilatory thresholds on uphills, 78.9% on downhills and 89.3% on level sections. 89% of group-level speed was predicted using a modified gradient factor. Individuals adopted distinct pacing strategies, both across laps and as a function of gradient. Conclusions: Speed was best predicted using a weighted factor to account for prior and current gradients. Oxygen consumption (VO2) limited runners’ speeds only on uphill sections, and was maintained in line with individual ventilatory thresholds. Running speed showed larger individual variation on downhill sections, while speed on the level was systematically influenced by the preceding gradient. Runners who varied their pace more as a function of gradient showed a more consistent level of oxygen consumption. These results suggest that optimising time on the level sections after hills offers the greatest potential to minimise overall time when running over undulating terrain.
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimise parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracies under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present – namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is seen during evaluation.
Abstract:
A software tool (DRONE) has been developed to evaluate road traffic noise over a large area, taking into account dynamic network traffic flow and buildings. For more precise estimation of noise in urban networks, where vehicles are mainly in stop-and-go running conditions, vehicle sound power level (for acceleration/deceleration cruising and ideal vehicle) is incorporated in DRONE. The calculation performance of DRONE is increased by evaluating the noise in two steps: first estimating the unit noise database and then integrating it with traffic simulation. Details of the process from traffic simulation to contour maps are discussed in the paper, and the implementation of DRONE on Tsukuba city is presented.
Abstract:
This paper discusses the areawide Dynamic ROad traffic NoisE (DRONE) simulator, and its implementation as a tool for noise abatement policy evaluation. DRONE involves integrating a road traffic noise estimation model with a traffic simulator to estimate road traffic noise in urban networks. An integrated traffic simulation-noise estimation model provides an interface for direct input of traffic flow properties from simulation model to noise estimation model that in turn estimates the noise on a spatial and temporal scale. The output from DRONE is linked with a geographical information system for visual representation of noise levels in the form of noise contour maps.
Abstract:
A road traffic noise prediction model (ASJ MODEL-1998) has been integrated with a road traffic simulator (AVENUE) to produce the Dynamic areawide Road traffic NoisE simulator-DRONE. This traffic-noise-GIS based integrated tool is upgraded to predict noise levels in built-up areas. The integration of traffic simulation with a noise model provides dynamic access to traffic flow characteristics and hence automated and detailed predictions of traffic noise. The prediction is not only on the spatial scale but also on temporal scale. The linkage with GIS gives a visual representation to noise pollution in the form of dynamic areawide traffic noise contour maps. The application of DRONE on a real world built-up area is also presented.
Abstract:
The QUT-NOISE-TIMIT corpus consists of 600 hours of noisy speech sequences designed to enable a thorough evaluation of voice activity detection (VAD) algorithms across a wide variety of common background noise scenarios. In order to construct the final mixed-speech database, a collection of over 10 hours of background noise was conducted across 10 unique locations covering 5 common noise scenarios, to create the QUT-NOISE corpus. This background noise corpus was then mixed with speech events chosen from the TIMIT clean speech corpus over a wide variety of noise lengths, signal-to-noise ratios (SNRs) and active speech proportions to form the mixed-speech QUT-NOISE-TIMIT corpus. The evaluation of five baseline VAD systems on the QUT-NOISE-TIMIT corpus is conducted to validate the data and show that the variety of noise available will allow for better evaluation of VAD systems than existing approaches in the literature.
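Mixing clean speech with background noise at a chosen signal-to-noise ratio can be sketched as below. This is a minimal illustration, not the corpus construction procedure itself: it assumes the SNR is defined on mean sample power over the whole segment, and the function names and test signals are hypothetical.

```python
import math
import random


def mean_power(samples):
    """Average squared amplitude of a sample sequence."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Scale the noise so that speech power / noise power matches snr_db, then add.

    Returns (mixed, scaled_noise). Assumes the noise is at least as long as the speech.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled_noise)]
    return mixed, scaled_noise


# Hypothetical signals: a sine wave standing in for speech, uniform noise as background
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 5.0 * t / 1000.0) for t in range(1000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
mixed, scaled_noise = mix_at_snr(speech, noise, snr_db=5.0)
achieved_snr_db = 10.0 * math.log10(mean_power(speech) / mean_power(scaled_noise))
```

In practice the power of the speech is often measured only over active speech regions, which is a design choice this sketch does not make.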
Abstract:
A combined specular reflection and diffusion model using the radiosity technique was developed to calculate road traffic noise levels on residential balconies. The model can handle numerous geometrical configurations for a single balcony situated in the centre of a street canyon. The width, length and height of the balcony and the street can be altered. The model was used to calculate noise levels for three different sets of geometrical and acoustic absorption characteristics for a balcony. The calculated results are presented in this paper.
Abstract:
The theory of nonlinear dynamic systems provides some new methods to handle complex systems. Chaos theory offers new concepts, algorithms and methods for processing, enhancing and analyzing the measured signals. In recent years, researchers have been applying the concepts from this theory to bio-signal analysis. In this work, the complex dynamics of bio-signals such as the electrocardiogram (ECG) and electroencephalogram (EEG) are analyzed using the tools of nonlinear systems theory. In modern industrialized countries, several hundred thousand people die every year due to sudden cardiac death. The electrocardiogram (ECG) is an important bio-signal representing the sum total of millions of cardiac cell depolarization potentials. It contains important insight into the state of health and nature of the disease afflicting the heart. Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. Heart rate variability analysis is an important tool to observe the heart's ability to respond to normal regulatory impulses that affect its rhythm. A computer-based intelligent system for analysis of cardiac states is very useful in diagnostics and disease management. Like many bio-signals, HRV signals are non-linear in nature. Higher order spectral analysis (HOS) is known to be a good tool for the analysis of non-linear systems and provides good noise immunity. In this work, we studied the HOS of the HRV signals of normal heartbeat and four classes of arrhythmia. This thesis presents some general characteristics for each of these classes of HRV signals in the bispectrum and bicoherence plots. Several features were extracted from the HOS and subjected to an Analysis of Variance (ANOVA) test. The results are very promising for cardiac arrhythmia classification, with a number of features yielding a p-value < 0.02 in the ANOVA test.
An automated intelligent system for the identification of cardiac health is very useful in healthcare technology. In this work, seven features were extracted from the heart rate signals using HOS and fed to a support vector machine (SVM) for classification. The performance evaluation protocol in this thesis uses 330 subjects consisting of five different kinds of cardiac disease conditions. The classifier achieved a sensitivity of 90% and a specificity of 89%. This system is ready to run on larger data sets. In EEG analysis, the search for hidden information for identification of seizures has a long history. Epilepsy is a pathological condition characterized by spontaneous and unforeseeable occurrence of seizures, during which the perception or behavior of patients is disturbed. An automatic early detection of the seizure onsets would help the patients and observers to take appropriate precautions. Various methods have been proposed to predict the onset of seizures based on EEG recordings. The use of nonlinear features motivated by the higher order spectra (HOS) has been reported to be a promising approach to differentiate between normal, background (pre-ictal) and epileptic EEG signals. In this work, these features are used to train both a Gaussian mixture model (GMM) classifier and a Support Vector Machine (SVM) classifier. Results show that the classifiers were able to achieve 93.11% and 92.67% classification accuracy, respectively, with selected HOS based features. About 2 hours of EEG recordings from 10 patients were used in this study. This thesis introduces unique bispectrum and bicoherence plots for various cardiac conditions and for normal, background and epileptic EEG signals. These plots reveal distinct patterns. The patterns are useful for visual interpretation by those without a deep understanding of spectral analysis such as medical practitioners. 
It includes original contributions in extracting features from HRV and EEG signals using HOS and entropy, in analyzing the statistical properties of such features on real data and in automated classification using these features with GMM and SVM classifiers.
Abstract:
This paper presents a method of voice activity detection (VAD) suitable for high noise scenarios, based on the fusion of two complementary systems. The first system uses a proposed non-Gaussianity score (NGS) feature based on normal probability testing. The second system employs a histogram distance score (HDS) feature that detects changes in the signal through conducting a template-based similarity measure between adjacent frames. The decision outputs by the two systems are then merged using an open-by-reconstruction fusion stage. Accuracy of the proposed method was compared to several baseline VAD methods on a database created using real recordings of a variety of high-noise environments.
Abstract:
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The score outputs by the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. Accuracy of AZR was compared to state-of-the-art and standardised VAD methods and was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database created using real recordings from high-noise environments.
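The idea of taking the maximum peak of a normalised time-domain autocorrelation, and of scoring a detector by half-total error rate, can be sketched roughly as follows. This is an illustration under assumptions, not the published AZR method: the normalisation by zero-lag energy, the lag search range and all function names are hypothetical, and HTER is taken here as the plain average of the miss rate and the false alarm rate.

```python
import math


def normalised_autocorrelation(frame):
    """Autocorrelation of a frame, divided by its zero-lag energy (assumed normalisation)."""
    n = len(frame)
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return [0.0] * n
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
            for lag in range(n)]


def max_peak(frame, min_lag, max_lag):
    """Largest normalised autocorrelation value within an assumed pitch-lag range."""
    r = normalised_autocorrelation(frame)
    return max(r[min_lag:max_lag + 1])


def half_total_error_rate(miss_rate, false_alarm_rate):
    """HTER taken as the mean of the two error rates."""
    return 0.5 * (miss_rate + false_alarm_rate)


# Hypothetical voiced-like frame: a sine with a period of 40 samples
periodic_frame = [math.sin(2.0 * math.pi * t / 40.0) for t in range(400)]
periodic_score = max_peak(periodic_frame, min_lag=20, max_lag=100)
example_hter = half_total_error_rate(miss_rate=0.1, false_alarm_rate=0.3)
```

A strongly periodic frame yields a score near 1 at its pitch lag, while aperiodic noise tends to score lower, which is what makes such a feature a candidate for voiced speech detection.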
Abstract:
Since the launch of the ‘Clean Delhi, Green Delhi’ campaign in 2003, slums have become a significant social and political issue in India’s capital city. Through this campaign, the state, in collaboration with Delhi’s middle class through the ‘Bhagidari system’ (literally translated as ‘participatory system’), aims to transform Delhi into a ‘world-class city’ that offers a sanitised, aesthetically appealing urban experience to its citizens and Western visitors. In 2007, Delhi won the bid to host the 2010 Commonwealth Games; since then, this agenda has acquired an urgent, almost violent, impetus to transform Delhi into an environmentally friendly, aesthetically appealing and ‘truly international city’. Slums and slum-dwellers, with their ‘filth, dirt, and noise’, have no place in this imagined city. The violence inflicted upon slum-dwellers, including the denial of their judicial rights, is justified on these accounts. In addition, the juridical discourse since 2000 has re-problematised slums as ‘nuisance’. The rising antagonism of the middle classes against the poor, supported by the state’s ambition to have a ‘world-class city’, has allowed a new rhetoric to situate the slums in the city. These representations articulate slums as homogenised spaces of experience and identity. The ‘illegal’ status of slum-dwellers, as encroachers upon public space, is stretched to involve ‘social, cultural, and moral’ decadence and depravity. This thesis is an ethnographic exploration of everyday life in a prominent slum settlement in Delhi. It sensually examines the social, cultural and political materiality of slums, and the relationship of slums with the middle class. In doing so, it highlights the politics of sensorial ordering of slums as ‘filthy, dirty, and noisy’ by the middle classes to calcify their position as ‘others’ in order to further segregate, exclude and discriminate against the slums.
The ethnographic experience in the slums, however, highlights a complex sensorial ordering and politics of its own. Not only are the interactions between diverse communities in slums highly restricted and sensually ordained, but the middle class is identified as a sensual ‘other’, and its sensual practices prohibited. This is significant in two ways. First, it highlights the multiplicity of social, cultural experience and engagement in the slums, thereby challenging its homogenised representation. Second, the ethnographic exploration allowed me to frame a distinct sense of self amongst the slums, which is denied in mainstream discourses, and allowed me to identify the slums’ own ‘others’, the middle class being one of them. This thesis highlights sound – its production, performances and articulations – as an act with social, cultural, and political implications and manifestations. ‘Noise’ can be understood as a political construct to identify ‘others’ – and both slum-dwellers and the middle classes identify different sonic practices as noise to situate the ‘other’ sonically. It is within this context that this thesis frames the position of Listener and Hearer, which corresponds to their social-political positions. These positions can be, and are, resisted and circumvented through sonic practices. For instance, amplification tactics in the Karimnagar slums, which are understood as ‘uncultured, callous activities to just create more noise’ by the slums’ middle-class neighbours, also serve definite purposes in shaping and navigating the space through the slums’ soundscapes, asserting a presence that is otherwise denied. Such tactics allow the residents to define their sonic territories and scope of sonic performances; they are significant in terms of exerting one’s position, territory and identity, and they are very important in subverting hierarchies.
The residents of the Karimnagar slums have to negotiate many social, cultural, moral and political prejudices in their everyday lives. Their identity is constantly under scrutiny and threat. However, the sonic cultures and practices in the Karimnagar slums allow their residents to exert a definite sonic presence – which the middle class has to hear. The articulation of noise and silence is an act manifesting, referencing and resisting social, cultural, and political power and hierarchies.
Abstract:
Spontaneous facial expressions differ from posed ones in appearance, timing and accompanying head movements. Still images cannot provide timing or head movement information directly. However, the distances between key points on a face, extracted from a still image using active shape models, can indirectly capture some movement and pose changes. This information is superposed on information about non-rigid facial movement that is also part of the expression. Does geometric information improve the discrimination between spontaneous and posed facial expressions arising from discrete emotions? We investigate the performance of a machine vision system for discrimination between posed and spontaneous versions of six basic emotions that uses SIFT appearance-based features and FAP geometric features. Experimental results on the NVIE database demonstrate that fusion of geometric information leads only to marginal improvement over appearance features. Using fusion features, surprise is the easiest emotion to distinguish (83.4% accuracy), while disgust is the most difficult (76.1%). Our results show that the important facial regions differ between discriminating posed from spontaneous versions of one emotion and classifying that emotion against other emotions. The distribution of the selected SIFT features shows that the mouth is more important for sadness, while the nose is more important for surprise; however, both the nose and mouth are important for disgust, fear, and happiness. Eyebrows, eyes, nose and mouth are important for anger.