The characteristics of moving sound sources have strong implications for the listener's distance perception and estimation of velocity. Modifications of typical sound emissions, as are currently occurring with the shift towards electromobility, affect pedestrian safety in road traffic. Thus, investigations of the relevant cues for velocity and distance perception of moving sound sources are of interest not only to the psychoacoustic community but also for several applications, such as virtual reality, noise pollution and road traffic safety. This article describes a series of psychoacoustic experiments in this field. Dichotic and diotic stimuli from a set of real-life recordings of a passing passenger car and a motorcycle were presented to test subjects, who were asked to determine the velocity of the object and its minimum distance from the listener. The results of these psychoacoustic experiments show that the estimated velocity is strongly linked to the object's distance. Furthermore, binaural cues were shown to contribute significantly to the perception of velocity. A further experiment showed that, independently of the type of vehicle, the main parameter for distance determination is the maximum sound pressure level at the listener's position. The article suggests a system architecture for the adequate consideration of moving sound sources in virtual auditory environments. Virtual environments can thus be used to investigate the influence of new vehicle powertrain concepts and the related sound emissions of these vehicles on pedestrians' ability to estimate the distance and velocity of moving objects.
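To illustrate why the maximum sound pressure level can serve as a distance cue, here is a minimal sketch of a pass-by under strongly simplifying assumptions that are not taken from the article: a point source of constant sound power, a straight path, and free-field spherical spreading only (no ground reflection, directivity or Doppler effect). The function name and all numeric values are hypothetical.

```python
import math

def passby_levels(source_power_level_db, speed_mps, min_distance_m, times_s):
    """Sound pressure level at a stationary listener for each time in times_s.

    Assumes free-field spherical spreading from a point source:
    Lp = Lw - 20*log10(r) - 11 dB, with t = 0 at the point of closest approach.
    """
    levels = []
    for t in times_s:
        # Distance from listener to source on a straight path.
        r = math.hypot(min_distance_m, speed_mps * t)
        levels.append(source_power_level_db - 20.0 * math.log10(r) - 11.0)
    return levels

# Hypothetical pass-by: 100 dB sound power level, 13.9 m/s (about 50 km/h).
times = [i * 0.5 for i in range(-10, 11)]
near = passby_levels(100.0, 13.9, 5.0, times)
far = passby_levels(100.0, 13.9, 10.0, times)
peak_near = max(near)
peak_far = max(far)
```

In this idealized model the peak level depends only on the source power and the minimum distance, so doubling the minimum distance lowers the peak by about 6 dB regardless of speed.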


Barn owls can localize a sound source using either the map of auditory space contained in the optic tectum or the auditory forebrain. The auditory thalamus, nucleus ovoidalis (N.Ov), is situated between these two auditory areas, and its inactivation precludes the use of the auditory forebrain for sound localization. We examined the sources of inputs to the N.Ov as well as their patterns of termination within the nucleus. We also examined the response of single neurons within the N.Ov to tonal stimuli and sound localization cues. Afferents to the N.Ov originated with a diffuse population of neurons located bilaterally within the lateral shell, core, and medial shell subdivisions of the central nucleus of the inferior colliculus. Additional afferent input originated from the ipsilateral ventral nucleus of the lateral lemniscus. No afferent input was provided to the N.Ov from the external nucleus of the inferior colliculus or the optic tectum. The N.Ov was tonotopically organized with high frequencies represented dorsally and low frequencies ventrally. Although neurons in the N.Ov responded to localization cues, there was no apparent topographic mapping of these cues within the nucleus, in contrast to the tectal pathway. However, nearly all possible types of binaural response to sound localization cues were represented. These findings suggest that in the thalamo-telencephalic auditory pathway, sound localization is subserved by a nontopographic representation of auditory space.


Our current understanding of the sound-generating mechanism in the songbird vocal organ, the syrinx, is based on indirect evidence and theoretical treatments. The classical avian model of sound production postulates that the medial tympaniform membranes (MTM) are the principal sound generators. We tested the role of the MTM in sound generation and studied the songbird syrinx more directly by filming it endoscopically. After we surgically incapacitated the MTM as a vibratory source, zebra finches and cardinals were not only able to vocalize, but sang nearly normal song. This result shows clearly that the MTM are not the principal sound source. The endoscopic images of the intact songbird syrinx during spontaneous and brain stimulation-induced vocalizations illustrate the dynamics of syringeal reconfiguration before phonation and suggest a different model for sound production. Phonation is initiated by rostrad movement and stretching of the syrinx. At the same time, the syrinx is closed through movement of two soft tissue masses, the medial and lateral labia, into the bronchial lumen. Sound production always is accompanied by vibratory motions of both labia, indicating that these vibrations may be the sound source. However, because of the low temporal resolution of the imaging system, the frequency and phase of labial vibrations could not be assessed in relation to that of the generated sound. Nevertheless, in contrast to the previous model, these observations show that both labia contribute to aperture control and strongly suggest that they play an important role as principal sound generators.


One of the most popular techniques for creating spatialized virtual sounds is based on the use of Head-Related Transfer Functions (HRTFs). HRTFs are signal processing models that represent the modifications undergone by the acoustic signal as it travels from a sound source to each of the listener's eardrums. These modifications are due to the interaction of the acoustic waves with the listener's torso, shoulders, head and pinnae, or outer ears. As such, HRTFs are somewhat different for each listener. For a listener to perceive synthesized 3-D sound cues correctly, the synthesized cues must be similar to the listener's own HRTFs. One can measure individual HRTFs using specialized recording systems; however, these systems are prohibitively expensive and restrict the portability of the 3-D sound system. HRTF-based systems also face several computational challenges. This dissertation presents an alternative method for the synthesis of binaural spatialized sounds. The sound entering the pinna undergoes several reflective, diffractive and resonant phenomena, which determine the HRTF. Using signal processing tools, such as Prony's signal modeling method, an appropriate set of time delays and a resonant frequency were used to approximate the measured Head-Related Impulse Responses (HRIRs). Statistical analysis was used to derive empirical equations describing how the reflections and resonances are determined by the shape and size of the pinna features obtained from 3-D images of 15 experimental subjects modeled in the project. These equations were used to yield “Model HRTFs” that can create elevation effects. Listening tests conducted on 10 subjects show that these model HRTFs are 5% more effective than generic HRTFs when it comes to localizing sounds in the frontal plane. 
The number of reversals (perception of a sound source above the horizontal plane when it is actually below the plane, and vice versa) was also reduced by 5.7%, showing the perceptual effectiveness of this approach. The model is simple yet versatile because it relies on easy-to-measure parameters to create an individualized HRTF. This low-order parameterized model also reduces the computational and storage demands, while maintaining a sufficient number of perceptually relevant spectral cues.


Social structure is a key determinant of population biology and is central to the way animals exploit their environment. The risk of predation is often invoked as an important factor influencing the evolution of social structure in cetaceans and other mammals, but little direct information is available about how cetaceans actually respond to predators or other perceived threats. The playback of sounds to an animal is a powerful tool for assessing behavioral responses to predators, but quantifying behavioral responses to playback experiments requires baseline knowledge of normal behavioral patterns and variation. The central goal of my dissertation is to describe baseline foraging behavior for the western Atlantic short-finned pilot whales (Globicephala macrorhynchus) and examine the role of social organization in their response to predators. To accomplish this I used multi-sensor digital acoustic tags (DTAGs), satellite-linked time-depth recorders (SLTDR), and playback experiments to study foraging behavior and behavioral response to predators in pilot whales. Fine-scale foraging strategies and population-level patterns were identified by estimating the body size and examining the location and movement around feeding events using data collected with DTAGs deployed on 40 pilot whales in the summers of 2008-2014 off the coast of Cape Hatteras, North Carolina. Pilot whales were found to forage throughout the water column and performed feeding buzzes at depths ranging from 29 to 1176 meters. The results indicated potential habitat segregation in foraging depth in short-finned pilot whales, with larger individuals foraging on average at greater depths. The calculated aerobic dive limit for large adult males was approximately 6 minutes longer than that of females and likely facilitated the difference in foraging depth. Furthermore, the buzz frequency and speed around feeding attempts indicate that this population of pilot whales is likely targeting multiple small prey items. 
Using these results, I built decision trees to inform foraging dive classification in coarse, long-term dive data collected with SLTDRs deployed on 6 pilot whales in the summers of 2014 and 2015 in the same area off the coast of North Carolina. I used these long-term foraging records to compare diurnal foraging rates and depths, to classify bouts with a maximum likelihood method, and to evaluate behavioral aerobic dive limits (ADLB) through examination of dive durations and inter-dive intervals. Dive duration was the best predictor of foraging, with dives >400.6 seconds classified as foraging, and a 96% classification accuracy. There were no diurnal patterns in foraging depth or rates, and the average duration of bouts was 2.94 hours, with maximum bout durations lasting up to 14 hours. The results indicated that pilot whales forage in relatively long bouts, and the ADLB indicate that pilot whales rarely, if ever, exceed their aerobic limits. To evaluate the response to predators I used controlled playback experiments to examine the behavioral responses of 10 of the tagged short-finned pilot whales off Cape Hatteras, North Carolina and 4 Risso’s dolphins (Grampus griseus) off Southern California to the calls of mammal-eating killer whales (MEK). Both species responded to a subset of MEK calls with increased movement, swim speed and increased cohesion of the focal groups, but the two species exhibited different directional movement and vocal responses. Pilot whales increased their call rate and approached the sound source, but Risso’s dolphins exhibited no change in their vocal behavior and moved in a rapid, directed manner away from the source. Thus, at least to a subset of mammal-eating killer whale calls, these two study species reacted in a manner that is consistent with their patterns of social organization. 
Pilot whales, which live in relatively permanent groups bound by strong social bonds, responded in a manner that built on their high levels of social cohesion. In contrast, Risso’s dolphins exhibited an exaggerated flight response and moved rapidly away from the sound source. The fact that both species responded strongly to a select number of MEK calls suggests that structural features of signals play critical contextual roles in the probability of response to potential threats in odontocete cetaceans.


Sound localisation is defined as the ability to identify the position of a sound source. The brain employs two cues to achieve this functionality for the horizontal plane: interaural time difference (ITD), processed by neurons in the medial superior olive (MSO), and interaural intensity difference (IID), processed by neurons of the lateral superior olive (LSO), both located in the superior olivary complex of the auditory pathway. This paper presents spiking neuron architectures of the MSO and LSO. An implementation of the Jeffress model using spiking neurons is presented as a representation of the MSO, while a spiking neuron architecture showing how neurons of the medial nucleus of the trapezoid body interact with LSO neurons to determine the azimuthal angle is discussed. Experimental results to support this work are presented.
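The delay-line idea behind the Jeffress model can be illustrated with a minimal signal-level sketch. It is not the spiking architecture presented in the paper: each candidate lag stands in for one coincidence detector, and a plain cross-correlation score replaces spike coincidence counting. The function name, sample rate and test signal are assumptions made for illustration.

```python
import random

def estimate_itd(left, right, sample_rate_hz, max_lag):
    """Return the ITD in seconds whose internal delay best aligns both ears.

    Each lag plays the role of one coincidence detector on a delay line.
    A positive result means the sound reached the left ear first.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]  # coincidence strength at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate_hz

# Hypothetical test signal: broadband noise reaching the right ear
# 12 samples (0.25 ms at 48 kHz) after the left ear.
rng = random.Random(0)
sample_rate_hz = 48_000
left = [rng.gauss(0.0, 1.0) for _ in range(960)]
delay = 12
right = [0.0] * delay + left[:-delay]
itd = estimate_itd(left, right, sample_rate_hz, max_lag=40)
```

The lag with the highest score corresponds to the detector whose internal delay cancels the external one, which is the place code the Jeffress model proposes.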


The relationship between neuronal acuity and behavioral performance was assessed in the barn owl (Tyto alba), a nocturnal raptor renowned for its ability to localize sounds and for the topographic representation of auditory space found in the midbrain. We measured discrimination of sound-source separation using a newly developed procedure involving the habituation and recovery of the pupillary dilation response (PDR). The smallest discriminable change of source location was found to be about two times finer in azimuth than in elevation. Recordings from neurons in its midbrain space map revealed that their spatial tuning, like the spatial discrimination behavior, was also better in azimuth than in elevation by a factor of about two. Because the PDR behavioral assay is mediated by the same circuitry whether discrimination is assessed in azimuth or in elevation, this difference in vertical and horizontal acuity is likely to reflect a true difference in sensory resolution, without additional confounding effects of differences in motor performance in the two dimensions. Our results, therefore, are consistent with the hypothesis that the acuity of the midbrain space map determines auditory spatial discrimination.


An inverse problem for the wave equation is a mathematical formulation of the problem of converting measurements of sound waves into information about the wave speed governing the propagation of the waves. This doctoral thesis extends the theory of inverse problems for the wave equation to cases with partial measurement data and also considers detection of discontinuous interfaces in the wave speed. A possible application of the theory is obstetric sonography, in which ultrasound measurements are transformed into an image of the fetus in its mother's uterus. The wave speed inside the body cannot be directly observed, but sound waves can be produced outside the body and their echoes from the body can be recorded. The present work contains five research articles. In the first and the fifth articles we show that it is possible to determine the wave speed uniquely by using sound sources and receivers that are far apart. This extends a previously known result which requires the sound waves to be produced and recorded in the same place. Our result is motivated by a possible application to reflection seismology, which seeks to create an image of the Earth's crust from recordings of echoes stimulated, for example, by explosions. For this purpose, the receivers typically cannot lie near the powerful sound sources. In the second article we present a sound source that allows us to recover many essential features of the wave speed from the echo produced by the source. Moreover, these features are known to determine the wave speed under certain geometric assumptions. Previously known results permitted the same features to be recovered only by sequential measurement of echoes produced by multiple different sources. The reduced number of measurements could increase the number of possible applications of acoustic probing. In the third and fourth articles we develop an acoustic probing method to locate discontinuous interfaces in the wave speed. 
These interfaces typically correspond to interfaces between different materials and their locations are of interest in many applications. There are many previous approaches to this problem but none of them exploits sound sources varying freely in time. Our use of more variable sources could allow more robust implementation of the probing.


Binaural hearing studies show that the auditory system uses the phase-difference information in the auditory stimuli for localization of a sound source. Motivated by this finding, we present a method for demodulation of amplitude-modulated-frequency-modulated (AM-FM) signals using a signal and its arbitrary phase-shifted version. The demodulation is achieved using two allpass filters, whose impulse responses are related through the fractional Hilbert transform (FrHT). The allpass filters are obtained by cosine-modulation of a zero-phase flat-top prototype halfband lowpass filter. The outputs of the filters are combined to construct an analytic signal (AS) from which the AM and FM are estimated. We show that, under certain assumptions on the signal and the filter structures, the AM and FM can be obtained exactly. The AM-FM calculations are based on the quasi-eigenfunction approximation. We then extend the concept to the demodulation of multicomponent signals using uniform and non-uniform cosine-modulated filterbank (FB) structures consisting of flat bandpass filters, including the uniform cosine-modulated, equivalent rectangular bandwidth (ERB), and constant-Q filterbanks. We validate the theoretical calculations by applying them to synthesized AM-FM signals and compare the performance in the presence of noise with three other multiband demodulation techniques, namely, the Teager-energy-based approach, Gabor's AS approach, and the linear transduction filter approach. We also show demodulation results for real signals.
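For readers unfamiliar with analytic-signal demodulation, the following minimal sketch shows how AM and FM are read off once an analytic signal is available. It uses the conventional FFT-based construction (zeroing negative frequencies, i.e. the ordinary Hilbert transform), not the fractional-Hilbert allpass filter pair proposed in the paper. All function names and signal parameters are assumptions chosen for illustration, and a slow pure-Python DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(x):
    """Double positive-frequency bins and zero negative ones (len(x) even)."""
    n = len(x)
    spectrum = dft(x)
    for k in range(1, n // 2):
        spectrum[k] *= 2
    for k in range(n // 2 + 1, n):
        spectrum[k] = 0
    return dft(spectrum, inverse=True)

def demodulate(x, sample_rate_hz):
    """Return (AM envelope, instantaneous frequency in Hz) of a real signal."""
    z = analytic_signal(x)
    am = [abs(v) for v in z]
    # Phase increment between neighbouring samples gives the frequency
    # at the midpoint between them.
    fm = [cmath.phase(z[i + 1] * z[i].conjugate()) * sample_rate_hz / (2 * math.pi)
          for i in range(len(z) - 1)]
    return am, fm

# Hypothetical test signal: 32 Hz carrier, 2 Hz AM and 2 Hz FM, one second long.
fs = 256.0
t = [i / fs for i in range(256)]
true_am = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * ti) for ti in t]
x = [a * math.cos(2 * math.pi * 32 * ti + 0.5 * math.sin(2 * math.pi * 2 * ti))
     for a, ti in zip(true_am, t)]
am, fm = demodulate(x, fs)
```

The test signal is periodic in the analysis window and narrowband, so the recovered envelope matches `true_am` closely; for signals that violate these conditions the estimates degrade.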


The generation of sound by turbulent boundary layer flow at low Mach number over a rough wall is investigated by applying a theoretical model which describes the scattering of the turbulence near field into sound by roughness elements. Attention is focused on the numerical method used to approximately quantify the absolute level of the roughness noise radiated to the far field. Empirical models for the source statistics are obtained by scaling smooth-wall data through increased skin friction velocity and boundary layer thickness for the rough surface. Numerical integration is performed to determine the roughness noise, and it reproduces the spectral characteristics of the available empirical formula and experimental data. Experiments are conducted to measure the radiated sound from two rough plates in an open jet using four 1/2'' free-field condenser microphones. The measured noise spectra of the rough plates lie above that of a smooth plate in the 1-2.5 kHz frequency range and exhibit encouraging agreement with the predicted spectra. Also, a phased microphone array is utilized to localize the sound source, and it confirms that the rough plates generate higher source strengths in this frequency range. A parametric study illustrates that the roughness height and roughness density significantly affect the far-field radiated roughness noise, with the roughness height having the dominant effect. The estimates of the roughness noise for a Boeing 757 sized aircraft wing show that in the high-frequency region the sound radiated from surface roughness may exceed that from the trailing edge, and higher overall sound pressure levels for the roughness noise are also observed.


An approach which combines direct numerical simulation (DNS) with the Lighthill acoustic analogy theory is used to study the potential noise sources during the transition process of a Mach 2.25 flat plate boundary layer. The quadrupole sound sources due to the flow fluctuations and the dipole sound sources due to the fluctuating surface stress are obtained. Numerical results suggest that formation of the high shear layers leads to a dramatic amplification of amplitude of the fluctuating quadrupole sound sources. Compared with the quadrupole sound source, the energy of dipole sound source is concentrated in the relatively low frequency range.


This work describes the theory needed to obtain the quantity called supersonic intensity, which aims to identify the regions of a noise source that effectively contribute to the sound power, thereby filtering out the portion associated with recirculating and evanescent sound waves. The Fourier approach for obtaining the supersonic intensity of sources with separable geometries is presented, along with the existing numerical formulation for obtaining an equivalent of the supersonic intensity for sound sources with arbitrary geometries. The main original contribution of this work is a technique for computing an equivalent of the supersonic intensity, here called useful acoustic intensity, capable of identifying the regions of a vibrating surface of arbitrary geometry that effectively contribute to the sound power that will be radiated. Unlike the existing numerical formulation, the proposed model is more direct and formulated entirely on the vibrating surface, where the sound power is obtained through an operator (a matrix) that relates the radiated sound power to the distribution of normal velocity on the vibrating surface, obtained using the finite element method. This operator, here called the power operator, is Hermitian, a fact crucial to obtaining the useful acoustic intensity after applying an eigenvalue and eigenvector decomposition to the power operator together with the proposed truncation criterion. Examples of application of the useful acoustic intensity to vibrating surfaces with the geometry of a plate, a capped cylinder, and an automotive muffler are presented, and the results are compared with those obtained via supersonic intensity (plate) and via the existing numerical technique (cylinder), showing that the useful acoustic intensity brings, as an additional benefit, a reduction in computational time compared with the existing numerical technique.
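The role of the Hermitian property can be made explicit with a short derivation. The notation below is assumed for illustration and does not come from the abstract: $\mathbf{v}$ is the vector of surface normal velocities, $\mathbf{A}$ the power operator and $W$ the radiated sound power, with any constant factor absorbed into $\mathbf{A}$. The truncation index $m$ is a placeholder, since the abstract does not state the proposed truncation criterion, nor how the useful intensity is formed from the filtered field.

```latex
% Assumed notation: v = surface normal velocities, A = Hermitian power
% operator, W = radiated sound power, m = placeholder truncation index.
\begin{align}
W &= \mathbf{v}^{H}\mathbf{A}\,\mathbf{v}, \qquad \mathbf{A}=\mathbf{A}^{H},\\
\mathbf{A} &= \sum_{i=1}^{N}\lambda_i\,\boldsymbol{\varphi}_i\boldsymbol{\varphi}_i^{H},
\qquad \lambda_i\in\mathbb{R},\quad
\boldsymbol{\varphi}_i^{H}\boldsymbol{\varphi}_j=\delta_{ij},\\
W &= \sum_{i=1}^{N}\lambda_i\,\bigl|\boldsymbol{\varphi}_i^{H}\mathbf{v}\bigr|^{2},\\
\mathbf{v}_{u} &= \sum_{i=1}^{m}\bigl(\boldsymbol{\varphi}_i^{H}\mathbf{v}\bigr)\,\boldsymbol{\varphi}_i,
\qquad \lambda_1\ge\lambda_2\ge\dots\ge\lambda_N .
\end{align}
```

Because $\mathbf{A}$ is Hermitian, its eigenvalues are real and its eigenvectors can be chosen orthonormal, so the power splits into independent modal contributions. Keeping only the modes with the largest eigenvalues yields a filtered velocity $\mathbf{v}_{u}$ that retains the efficiently radiating part of the vibration.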


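Based on the working principle of acousto-optic communication using laser Doppler vibrometry, a new, compact frequency-discrimination circuit for laser Doppler vibrometry signals is designed. Following the heterodyne detection principle, the circuit mixes the local oscillator output with the detected signal to obtain one signal channel, and mixes the local oscillator output shifted by 90° with the detected signal to obtain a second channel; these two channels are used to obtain the Doppler frequency shift and the vibration frequency of the sound source. Experiments using a water surface excited by a loudspeaker as a simulated vibration source show that the circuit can effectively measure vibration frequencies in the range 300 Hz to 10 kHz, demonstrating that it can be used for underwater optical-acoustic communication.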
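The two-channel mixing scheme can be sketched in software as quadrature (I/Q) demodulation. This is only a numerical analogue under assumed conditions, not the circuit described in the paper: the carrier frequency, sample rate, phase deviation, block-average low-pass filter and DFT peak search are all hypothetical choices made for illustration.

```python
import cmath
import math

def iq_phase(signal, carrier_hz, sample_rate_hz):
    """Mix with the local oscillator and its 90-degree-shifted copy.

    Each mixer output is averaged over one carrier period (a crude low-pass),
    and the phase is recovered with atan2. Returns (phases, phase sample rate).
    """
    block = round(sample_rate_hz / carrier_hz)  # samples per carrier period
    phases = []
    for start in range(0, len(signal) - block + 1, block):
        i_sum = q_sum = 0.0
        for n in range(start, start + block):
            angle = 2 * math.pi * carrier_hz * n / sample_rate_hz
            i_sum += signal[n] * math.cos(angle)   # in-phase channel
            q_sum += -signal[n] * math.sin(angle)  # channel with LO shifted by 90 degrees
        phases.append(math.atan2(q_sum, i_sum))
    return phases, sample_rate_hz / block

def dominant_frequency(samples, sample_rate_hz):
    """Frequency in Hz of the strongest non-DC DFT bin."""
    n = len(samples)
    mean = sum(samples) / n
    best_bin, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        acc = sum((s - mean) * cmath.exp(-2j * math.pi * k * t / n)
                  for t, s in enumerate(samples))
        if abs(acc) > best_power:
            best_bin, best_power = k, abs(acc)
    return best_bin * sample_rate_hz / n

# Hypothetical detected signal: 20 kHz carrier, phase-modulated by a 1 kHz
# vibration with 1 rad peak deviation, sampled at 200 kHz for 10 ms.
sample_rate_hz = 200_000
carrier_hz = 20_000
vibration_hz = 1_000
detected = [math.cos(2 * math.pi * carrier_hz * n / sample_rate_hz
                     + 1.0 * math.sin(2 * math.pi * vibration_hz * n / sample_rate_hz))
            for n in range(2000)]
phases, phase_rate_hz = iq_phase(detected, carrier_hz, sample_rate_hz)
estimated_hz = dominant_frequency(phases, phase_rate_hz)
```

Using both channels makes the sign of the phase recoverable, which a single mixer cannot provide; the peak deviation is kept below π here so that no phase unwrapping is needed.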


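The application background of the laser Doppler vibrometer (LDV) in underwater acousto-optic communication is introduced, and the working principle of the laser Doppler vibrometer and its two coherent detection schemes are described. Using homodyne coherent detection, a fiber-optic laser Doppler vibrometer was designed and built. To demonstrate that the system can be applied to underwater acousto-optic communication, experiments were carried out to detect the frequency and intensity of acoustic waves emitted by an underwater sound source. Analysis of the experimental data shows that: first, the system can detect the frequency of the acoustic waves emitted by the underwater source, with a measurement standard deviation of less than 8 Hz for 10 acoustic frequencies near 7 kHz; second, the strength of the signal detected by the system has an exponential relationship with the sound pressure level of the underwater source, and for the 3.5 kHz and 7 kHz acoustic frequency bands used for underwater target communication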


In engineering, computational modeling plays an important role in product design and in the development of noise attenuation techniques. In this context, this thesis investigates the acoustic intensity generated by sound radiation from vibrating surfaces. Specifically, the research focuses on identifying the regions of a sound source that effectively contribute to the sound power radiated to the far field when the excitation frequency is below the critical coincidence frequency. The theoretical foundations of two different approaches are described. The first, called supersonic intensity (analytical), is computed via Fourier transforms for sound sources with separable geometries. The second, called useful intensity (numerical), is computed through the classical boundary element method for sources with arbitrary geometries. In both, the regions are identified by filtering out the non-propagating (evanescent) waves. The work is centered on two proposals. The first is the presentation, implementation, and analysis of a new numerical technique for computing the useful intensity. This technique is a variant of the boundary element method (BEM), based on the fact that the approximations for the acoustic variables, pressure and normal velocity, are taken as constant over each element, and also on the particular way of obtaining the constant velocity as the average of a certain number of velocities interior to each element. For this reason, the technique is named the Average Velocity Boundary Element Method (AVBEM). The second is the derivation of the closed-form solution of the normal velocity field for rectangular plates with eight different combinations of classical boundary conditions. The supersonic intensity is then estimated and compared with the acoustic intensity. 
In the numerical experiments, the useful intensity obtained via the classical BEM is compared with that obtained via AVBEM to illustrate the computational efficiency of the technique proposed here, which brings the additional benefit that a coarser mesh can be used in the simulations, resulting in significant savings in computational resources.