974 resultados para Acoustic signal classification
Resumo:
The applications of Automatic Vowel Recognition (AVR), which is a sub-part of fundamental importance in most of the speech processing systems, vary from automatic interpretation of spoken language to biometrics. State-of-the-art systems for AVR are based on traditional machine learning models such as Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), however, such classifiers can not deal with efficiency and effectiveness at the same time, existing a gap to be explored when real-time processing is required. In this work, we present an algorithm for AVR based on the Optimum-Path Forest (OPF), which is an emergent pattern recognition technique recently introduced in literature. Adopting a supervised training procedure and using speech tags from two public datasets, we observed that OPF has outperformed ANNs, SVMs, plus other classifiers, in terms of training time and accuracy. ©2010 IEEE.
Resumo:
The effect of snoring on the cardiovascular system is not well-known. In this study we analyzed the Heart Rate Variability (HRV) differences between light and heavy snorers. The experiments are done on the full-whole-night polysomnography (PSG) with ECG and audio channels from patient group (heavy snorer) and control group (light snorer), which are gender- and age-paired, totally 30 subjects. A feature Snoring Density (SND) of audio signal as classification criterion and HRV features are computed. Mann-Whitney statistical test and Support Vector Machine (SVM) classification are done to see the correlation. The result of this study shows that snoring has close impact on the HRV features. This result can provide a deeper insight into the physiological understand of snoring. © 2011 CCAL.
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
In the present study, we evaluated peripheral and central auditory pathways in professional musicians (with and without hearing loss) compared to non-musicians. The goal was to verify if music exposure could affect auditory pathways as a whole. This is a prospective study that compared the results obtained between three groups (musicians with and without hearing loss and non-musicians). Thirty-two male individuals participated and they were assessed by: Immittance measurements, pure-tone air conduction thresholds at all frequencies from 0.25 to 20 kHz, Transient Evoked Otoacoustic Emissions, Auditory Brainstem Response (ABR), and Cognitive Potential. The musicians showed worse hearing thresholds in both conventional and high frequency audiometry when compared to the non-musicians; the mean amplitude of Transient Evoked Otoacoustic Emissions was smaller in the musicians group, but the mean latencies of Auditory Brainstem Response and Cognitive Potential were diminished in the musicians when compared to the non-musicians. Our findings suggest that the population of musicians is at risk for developing music-induced hearing loss. However, the electrophysiological evaluation showed that latency waves of ABR and P300 were diminished in musicians, which may suggest that the auditory training to which these musicians are exposed acts as a facilitator of the acoustic signal transmission to the cortex.
Resumo:
[EN] Migrant biota transports carbon to the mesopelagic zone due to their feeding at the shallower layers and their defecation, respiration, excretion and mortality at depth. The so-called active flux has been considered a small number compared to gravitational sinking. Recent assessments in subtropical waters show an important effect due to predation by interzonal diel vertical migrants (DVMs). The consumption and subsequent transport of epipelagic zooplankton by DVMs (mainly micronekton) to the mesopelagic zone seemed similar to the mean gravitational export. However, the consequences of this active transport to the bathypelagic zone are almost unknown. Here, we show the effect of the Atlantic and Pacific equatorial upwelling systems on the vertical distribution of acoustic backscatter from the surface to bathypelagic depths. The enhancement of the acoustic signal below the upwelling zone was observed to reach 4000 m depth, coinciding with high abundances and activity of bacteria at those depths. The results suggest an active carbon transport from the epipelagic driven by zooplankton and micronekton, enhancing the efficiency of the biological pump and giving an insight about the fate of an increased productivity at the shallower layers of the ocean
Resumo:
In the present thesis we address the problem of detecting and localizing a small spherical target with characteristic electrical properties inside a volume of cylindrical shape, representing female breast, with MWI. One of the main works of this project is to properly extend the existing linear inversion algorithm from planar slice to volume reconstruction; results obtained, under the same conditions and experimental setup are reported for the two different approaches. Preliminar comparison and performance analysis of the reconstruction algorithms is performed via numerical simulations in a software-created environment: a single dipole antenna is used for illuminating the virtual breast phantom from different positions and, for each position, the corresponding scattered field value is registered. Collected data are then exploited in order to reconstruct the investigation domain, along with the scatterer position, in the form of image called pseudospectrum. During this process the tumor is modeled as a dielectric sphere of small radius and, for electromagnetic scattering purposes, it's treated as a point-like source. To improve the performance of reconstruction technique, we repeat the acquisition for a number of frequencies in a given range: the different pseudospectra, reconstructed from single frequency data, are incoherently combined with MUltiple SIgnal Classification (MUSIC) method which returns an overall enhanced image. We exploit multi-frequency approach to test the performance of 3D linear inversion reconstruction algorithm while varying the source position inside the phantom and the height of antenna plane. Analysis results and reconstructed images are then reported. Finally, we perform 3D reconstruction from experimental data gathered with the acquisition system in the microwave laboratory at DIFA, University of Bologna for a recently developed breast-phantom prototype; obtained pseudospectrum and performance analysis for the real model are reported.
Resumo:
Edges are important cues defining coherent auditory objects. As a model of auditory edges, sound on- and offset are particularly suitable to study their neural underpinnings because they contrast a specific physical input against no physical input. Change from silence to sound, that is onset, has extensively been studied and elicits transient neural responses bilaterally in auditory cortex. However, neural activity associated with sound onset is not only related to edge detection but also to novel afferent inputs. Edges at the change from sound to silence, that is offset, are not confounded by novel physical input and thus allow to examine neural activity associated with sound edges per se. In the first experiment, we used silent acquisition functional magnetic resonance imaging and found that the offset of pulsed sound activates planum temporale, superior temporal sulcus and planum polare of the right hemisphere. In the planum temporale and the superior temporal sulcus, offset response amplitudes were related to the pulse repetition rate of the preceding stimulation. In the second experiment, we found that these offset-responsive regions were also activated by single sound pulses, onset of sound pulse sequences and single sound pulse omissions within sound pulse sequences. However, they were not active during sustained sound presentation. Thus, our data show that circumscribed areas in right temporal cortex are specifically involved in identifying auditory edges. This operation is crucial for translating acoustic signal time series into coherent auditory objects.
Resumo:
In the last years, many analyses from acoustic signal processing have been used for different applications. In most cases, these sensor systems are based on the determination of times of flight for signals from every transducer. This paper presents a flat plate generalization method for impact detection and location over linear links or bars-based structures. The use of three piezoelectric sensors allow to achieve the position and impact time while the use of additional sensors lets cover a larger area of detection and avoid wrong timing difference measurements. An experimental setup and some experimental results are briefly presented.
Resumo:
At the level of the cochlear nucleus (CN), the auditory pathway divides into several parallel circuits, each of which provides a different representation of the acoustic signal. Here, the representation of the power spectrum of an acoustic signal is analyzed for two CN principal cells—chopper neurons of the ventral CN and type IV neurons of the dorsal CN. The analysis is based on a weighting function model that relates the discharge rate of a neuron to first- and second-order transformations of the power spectrum. In chopper neurons, the transformation of spectral level into rate is a linear (i.e., first-order) or nearly linear function. This transformation is a predominantly excitatory process involving multiple frequency components, centered in a narrow frequency range about best frequency, that usually are processed independently of each other. In contrast, type IV neurons encode spectral information linearly only near threshold. At higher stimulus levels, these neurons are strongly inhibited by spectral notches, a behavior that cannot be explained by level transformations of first- or second-order. Type IV weighting functions reveal complex excitatory and inhibitory interactions that involve frequency components spanning a wider range than that seen in choppers. These findings suggest that chopper and type IV neurons form parallel pathways of spectral information transmission that are governed by two different mechanisms. Although choppers use a predominantly linear mechanism to transmit tonotopic representations of spectra, type IV neurons use highly nonlinear processes to signal the presence of wide-band spectral features.
Resumo:
One of the most popular techniques for creating spatialized virtual sounds is based on the use of Head-Related Transfer Functions (HRTFs). HRTFs are signal processing models that represent the modifications undergone by the acoustic signal as it travels from a sound source to each of the listener's eardrums. These modifications are due to the interaction of the acoustic waves with the listener's torso, shoulders, head and pinnae, or outer ears. As such, HRTFs are somewhat different for each listener. For a listener to perceive synthesized 3-D sound cues correctly, the synthesized cues must be similar to the listener's own HRTFs. ^ One can measure individual HRTFs using specialized recording systems, however, these systems are prohibitively expensive and restrict the portability of the 3-D sound system. HRTF-based systems also face several computational challenges. This dissertation presents an alternative method for the synthesis of binaural spatialized sounds. The sound entering the pinna undergoes several reflective, diffractive and resonant phenomena, which determine the HRTF. Using signal processing tools, such as Prony's signal modeling method, an appropriate set of time delays and a resonant frequency were used to approximate the measured Head-Related Impulse Responses (HRIRs). Statistical analysis was used to find out empirical equations describing how the reflections and resonances are determined by the shape and size of the pinna features obtained from 3D images of 15 experimental subjects modeled in the project. These equations were used to yield “Model HRTFs” that can create elevation effects. ^ Listening tests conducted on 10 subjects show that these model HRTFs are 5% more effective than generic HRTFs when it comes to localizing sounds in the frontal plane. The number of reversals (perception of sound source above the horizontal plane when actually it is below the plane and vice versa) was also reduced by 5.7%, showing the perceptual effectiveness of this approach. The model is simple, yet versatile because it relies on easy to measure parameters to create an individualized HRTF. This low-order parameterized model also reduces the computational and storage demands, while maintaining a sufficient number of perceptually relevant spectral cues. ^
Resumo:
This thesis investigated the risk of accidental release of hydrocarbons during transportation and storage. Transportation of hydrocarbons from an offshore platform to processing units through subsea pipelines involves risk of release due to pipeline leakage resulting from corrosion, plastic deformation caused by seabed shakedown or damaged by contact with drifting iceberg. The environmental impacts of hydrocarbon dispersion can be severe. Overall safety and economic concerns of pipeline leakage at subsea environment are immense. A large leak can be detected by employing conventional technology such as, radar, intelligent pigging or chemical tracer but in a remote location like subsea or arctic, a small chronic leak may be undetected for a period of time. In case of storage, an accidental release of hydrocarbon from the storage tank could lead pool fire; further it could escalate to domino effects. This chain of accidents may lead to extremely severe consequences. Analyzing past accident scenarios it is observed that more than half of the industrial domino accidents involved fire as a primary event, and some other factors for instance, wind speed and direction, fuel type and engulfment of the compound. In this thesis, a computational fluid dynamics (CFD) approach is taken to model the subsea pipeline leak and the pool fire from a storage tank. A commercial software package ANSYS FLUENT Workbench 15 is used to model the subsea pipeline leakage. The CFD simulation results of four different types of fluids showed that the static pressure and pressure gradient along the axial length of the pipeline have a sharp signature variation near the leak orifice at steady state condition. Transient simulation is performed to obtain the acoustic signature of the pipe near leak orifice. The power spectral density (PSD) of acoustic signal is strong near the leak orifice and it dissipates as the distance and orientation from the leak orifice increase. The high-pressure fluid flow generates more noise than the low-pressure fluid flow. In order to model the pool fire from the storage tank, ANSYS CFX Workbench 14 is used. The CFD results show that the wind speed has significant contribution on the behavior of pool fire and its domino effects. The radiation contours are also obtained from CFD post processing, which can be applied for risk analysis. The outcome of this study will be helpful for better understanding of the domino effects of pool fire in complex geometrical settings of process industries. The attempt to reduce and prevent risks is discussed based on the results obtained from the numerical simulations of the numerical models.
Resumo:
Speech perception routinely takes place in noisy or degraded listening environments, leading to ambiguity in the identity of the speech token. Here, I present one review paper and two experimental papers that highlight cognitive and visual speech contributions to the listening process, particularly in challenging listening environments. First, I survey the literature linking audiometric age-related hearing loss and cognitive decline and review the four proposed causal mechanisms underlying this link. I argue that future research in this area requires greater consideration of the functional overlap between hearing and cognition. I also present an alternative framework for understanding causal relationships between age-related declines in hearing and cognition, with emphasis on the interconnected nature of hearing and cognition and likely contributions from multiple causal mechanisms. I also provide a number of testable hypotheses to examine how impairments in one domain may affect the other. In my first experimental study, I examine the direct contribution of working memory (through a cognitive training manipulation) on speech in noise comprehension in older adults. My results challenge the efficacy of cognitive training more generally, and also provide support for the contribution of sentence context in reducing working memory load. My findings also challenge the ubiquitous use of the Reading Span test as a pure test of working memory. In a second experimental (fMRI) study, I examine the role of attention in audiovisual speech integration, particularly when the acoustic signal is degraded. I demonstrate that attentional processes support audiovisual speech integration in the middle and superior temporal gyri, as well as the fusiform gyrus. My results also suggest that the superior temporal sulcus is sensitive to intelligibility enhancement, regardless of how this benefit is obtained (i.e., whether it is obtained through visual speech information or speech clarity). In addition, I also demonstrate that both the cingulo-opercular network and motor speech areas are recruited in difficult listening conditions. Taken together, these findings augment our understanding of cognitive contributions to the listening process and demonstrate that memory, working memory, and executive control networks may flexibly be recruited in order to meet listening demands in challenging environments.
Resumo:
Les techniques des directions d’arrivée (DOA) sont une voie prometteuse pour accroitre la capacité des systèmes et les services de télécommunications en permettant de mieux estimer le canal radio-mobile. Elles permettent aussi de suivre précisément des usagers cellulaires pour orienter les faisceaux d’antennes dans leur direction. S’inscrivant dans ce contexte, ce présent mémoire décrit étape par étape l’implémentation de l’algorithme de haut niveau MUSIC (MUltiple SIgnal Classification) sur une plateforme FPGA afin de déterminer en temps réel l’angle d’arrivée d’une ou des sources incidentes à un réseau d’antennes. Le concept du prototypage rapide des lois de commande (RCP) avec les outils de XilinxTM System generator (XSG) et du MBDK (Model Based Design Kit) de NutaqTM est le concept de développement utilisé. Ce concept se base sur une programmation de code haut niveau à travers des modèles, pour générer automatiquement un code de bas niveau. Une attention particulière est portée sur la méthode choisie pour résoudre le problème de la décomposition en valeurs et vecteurs propres de la matrice complexe de covariance par l’algorithme de Jacobi. L’architecture mise en place implémentant cette dernière dans le FPGA (Field Programmable Gate Array) est détaillée. Par ailleurs, il est prouvé que MUSIC ne peut effectuer une estimation intéressante de la position des sources sans une calibration préalable du réseau d’antennes. Ainsi, la technique de calibration par matrice G utilisée dans ce projet est présentée, en plus de son modèle d’implémentation. Enfin, les résultats expérimentaux du système mis à l’épreuve dans un environnement réel en présence d’une source puis de deux sources fortement corrélées sont illustrés et analysés.