945 resultados para Acoustic Arrays, Array Signal Processing, Calibration, Speech Enhancement


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a rectangular array antenna with a suitable signal-processing algorithm that is able to steer the beam in azimuth over a wide frequency band. In the previous approach, which was reported in the literature, an inverse discrete Fourier transform technique was proposed for obtaining the signal weighting coefficients. This approach was demonstrated for large arrays in which the physical parameters of the antenna elements were not considered. In this paper, a modified signal-weighting algorithm that works for arbitrary-size arrays is described. Its validity is demonstrated in examples of moderate-size arrays with real antenna elements. It is shown that in some cases, the original beam-forming algorithm fails, while the new algorithm is able to form the desired radiation pattern over a wide frequency band. The performance of the new algorithm is assessed for two cases when the mutual coupling between array elements is both neglected and taken into account.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.

In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.

In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.

Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.

We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has different a time-frequency scale representation, and from this, one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to the MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal acoustic content of speech signals.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

By modification of the classical retrodirective arrays (RDAs) architecture a directional modulation (DM) transmitter can be realized without the need for synthesis. Importantly, through analytical analysis and exemplar simulations, it is proved that, besides the conventional DM application scenario, i.e., secure transmission to one legitimate receiver located along one spatial direction in free space, the proposed synthesis-free DM transmitter should also perform well for systems where there are more than one legitimate receivers positioned along different directions in free space, and where one or more legitimate receivers exist in a multipath environment. None of these have ever been achieved before using synthesis-free DM arrangements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The study of acoustic communication in animals often requires not only the recognition of species specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed in recognizing other sound types.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The canonical representation of speech constitutes a perfect reconstruction (PR) analysis-synthesis system. Its parameters are the autoregressive (AR) model coefficients, the pitch period and the voiced and unvoiced components of the excitation represented as transform coefficients. Each set of parameters may be operated on independently. A time-frequency unvoiced excitation (TFUNEX) model is proposed that has high time resolution and selective frequency resolution. Improved time-frequency fit is obtained by using for antialiasing cancellation the clustering of pitch-synchronous transform tracks defined in the modulation transform domain. The TFUNEX model delivers high-quality speech while compressing the unvoiced excitation representation about 13 times over its raw transform coefficient representation for wideband speech.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Three-dimensional (3D) synthetic aperture radar (SAR) imaging via multiple-pass processing is an extension of interferometric SAR imaging. It exploits more than two flight passes to achieve a desired resolution in elevation. In this paper, a novel approach is developed to reconstruct a 3D space-borne SAR image with multiple-pass processing. It involves image registration, phase correction and elevational imaging. An image model matching is developed for multiple image registration, an eigenvector method is proposed for the phase correction and the elevational imaging is conducted using a Fourier transform or a super-resolution method for enhancement of elevational resolution. 3D SAR images are obtained by processing simulated data and real data from the first European Remote Sensing satellite (ERS-1) with the proposed approaches.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes an implementation of a long distance echo canceller, operating on full-duplex with hands-free and in real-time with a single Digital Signal Processor (DSP). The proposed solution is based on short length adaptive filters centered on the positions of the most significant echoes, which are tracked by time delay estimators, for which we use a new approach. To deal with double talking situations a speech detector is employed. The floating-point DSP TMS320C6713 from Texas Instruments is used with software written in C++, with compiler optimizations for fast execution. The resulting algorithm enables long distance echo cancellation with low computational requirements, suited for embbeded systems. It reaches greater echo return loss enhancement and shows faster convergence speed when compared to the conventional approach. The experimental results approach the CCITT G.165 recommendation levels.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nowadays, existing 3D scanning cameras and microscopes in the market use digital or discrete sensors, such as CCDs or CMOS for object detection applications. However, these combined systems are not fast enough for some application scenarios since they require large data processing resources and can be cumbersome. Thereby, there is a clear interest in exploring the possibilities and performances of analogue sensors such as arrays of position sensitive detectors with the final goal of integrating them in 3D scanning cameras or microscopes for object detection purposes. The work performed in this thesis deals with the implementation of prototype systems in order to explore the application of object detection using amorphous silicon position sensors of 32 and 128 lines which were produced in the clean room at CENIMAT-CEMOP. During the first phase of this work, the fabrication and the study of the static and dynamic specifications of the sensors as well as their conditioning in relation to the existing scientific and technological knowledge became a starting point. Subsequently, relevant data acquisition and suitable signal processing electronics were assembled. Various prototypes were developed for the 32 and 128 array PSD sensors. Appropriate optical solutions were integrated to work together with the constructed prototypes, allowing the required experiments to be carried out and allowing the achievement of the results presented in this thesis. All control, data acquisition and 3D rendering platform software was implemented for the existing systems. All these components were combined together to form several integrated systems for the 32 and 128 line PSD 3D sensors. The performance of the 32 PSD array sensor and system was evaluated for machine vision applications such as for example 3D object rendering as well as for microscopy applications such as for example micro object movement detection. Trials were also performed involving the 128 array PSD sensor systems. Sensor channel non-linearities of approximately 4 to 7% were obtained. Overall results obtained show the possibility of using a linear array of 32/128 1D line sensors based on the amorphous silicon technology to render 3D profiles of objects. The system and setup presented allows 3D rendering at high speeds and at high frame rates. The minimum detail or gap that can be detected by the sensor system is approximately 350 μm when using this current setup. It is also possible to render an object in 3D within a scanning angle range of 15º to 85º and identify its real height as a function of the scanning angle and the image displacement distance on the sensor. Simple and not so simple objects, such as a rubber and a plastic fork, can be rendered in 3D properly and accurately also at high resolution, using this sensor and system platform. The nip structure sensor system can detect primary and even derived colors of objects by a proper adjustment of the integration time of the system and by combining white, red, green and blue (RGB) light sources. A mean colorimetric error of 25.7 was obtained. It is also possible to detect the movement of micrometer objects using the 32 PSD sensor system. This kind of setup offers the possibility to detect if a micro object is moving, what are its dimensions and what is its position in two dimensions, even at high speeds. Results show a non-linearity of about 3% and a spatial resolution of < 2µm.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Gas sensing systems based on low-cost chemical sensor arrays are gaining interest for the analysis of multicomponent gas mixtures. These sensors show different problems, e.g., nonlinearities and slow time-response, which can be partially solved by digital signal processing. Our approach is based on building a nonlinear inverse dynamic system. Results for different identification techniques, including artificial neural networks and Wiener series, are compared in terms of measurement accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

At high magnetic field strengths (≥ 3T), the radiofrequency wavelength used in MRI is of the same order of magnitude of (or smaller than) the typical sample size, making transmit magnetic field (B1+) inhomogeneities more prominent. Methods such as radiofrequency-shimming and transmit SENSE have been proposed to mitigate these undesirable effects. A prerequisite for such approaches is an accurate and rapid characterization of the B1+ field in the organ of interest. In this work, a new phase-sensitive three-dimensional B1+-mapping technique is introduced that allows the acquisition of a 64 × 64 × 8 B1+-map in ≈ 20 s, yielding an accurate mapping of the relative B1+ with a 10-fold dynamic range (0.2-2 times the nominal B1+). Moreover, the predominant use of low flip angle excitations in the presented sequence minimizes specific absorption rate, which is an important asset for in vivo B1+-shimming procedures at high magnetic fields. The proposed methodology was validated in phantom experiments and demonstrated good results in phantom and human B1+-shimming using an 8-channel transmit-receive array.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The need for high performance, high precision, and energy saving in rotating machinery demands an alternative solution to traditional bearings. Because of the contactless operation principle, the rotating machines employing active magnetic bearings (AMBs) provide many advantages over the traditional ones. The advantages such as contamination-free operation, low maintenance costs, high rotational speeds, low parasitic losses, programmable stiffness and damping, and vibration insulation come at expense of high cost, and complex technical solution. All these properties make the use of AMBs appropriate primarily for specific and highly demanding applications. High performance and high precision control requires model-based control methods and accurate models of the flexible rotor. In turn, complex models lead to high-order controllers and feature considerable computational burden. Fortunately, in the last few years the advancements in signal processing devices provide new perspective on the real-time control of AMBs. The design and the real-time digital implementation of the high-order LQ controllers, which focus on fast execution times, are the subjects of this work. In particular, the control design and implementation in the field programmable gate array (FPGA) circuits are investigated. The optimal design is guided by the physical constraints of the system for selecting the optimal weighting matrices. The plant model is complemented by augmenting appropriate disturbance models. The compensation of the force-field nonlinearities is proposed for decreasing the uncertainty of the actuator. A disturbance-observer-based unbalance compensation for canceling the magnetic force vibrations or vibrations in the measured positions is presented. The theoretical studies are verified by the practical experiments utilizing a custom-built laboratory test rig. The test rig uses a prototyping control platform developed in the scope of this work. To sum up, the work makes a step in the direction of an embedded single-chip FPGA-based controller of AMBs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Digit speech recognition is important in many applications such as automatic data entry, PIN entry, voice dialing telephone, automated banking system, etc. This paper presents speaker independent speech recognition system for Malayalam digits. The system employs Mel frequency cepstrum coefficient (MFCC) as feature for signal processing and Hidden Markov model (HMM) for recognition. The system is trained with 21 male and female voices in the age group of 20 to 40 years and there was 98.5% word recognition accuracy (94.8% sentence recognition accuracy) on a test set of continuous digit recognition task.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modeling nonlinear systems using Volterra series is a century old method but practical realizations were hampered by inadequate hardware to handle the increased computational complexity stemming from its use. But interest is renewed recently, in designing and implementing filters which can model much of the polynomial nonlinearities inherent in practical systems. The key advantage in resorting to Volterra power series for this purpose is that nonlinear filters so designed can be made to work in parallel with the existing LTI systems, yielding improved performance. This paper describes the inclusion of a quadratic predictor (with nonlinearity order 2) with a linear predictor in an analog source coding system. Analog coding schemes generally ignore the source generation mechanisms but focuses on high fidelity reconstruction at the receiver. The widely used method of differential pnlse code modulation (DPCM) for speech transmission uses a linear predictor to estimate the next possible value of the input speech signal. But this linear system do not account for the inherent nonlinearities in speech signals arising out of multiple reflections in the vocal tract. So a quadratic predictor is designed and implemented in parallel with the linear predictor to yield improved mean square error performance. The augmented speech coder is tested on speech signals transmitted over an additive white gaussian noise (AWGN) channel.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses the implementation details of a child friendly, good quality, English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use with little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based, which use standard speech databases available in neutral adult voices.Here reduced memory is achieved by the concatenation of phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. Sufficient degree of customization and generalization catering to the needs of the child user had been included through the provision for vocabulary and voice selection to suit the requisites of the child. Prosody had also been incorporated. This inexpensive TTS systemwas implemented inMATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child friendly. This can be used not only as an interesting language learning aid for the normal child but it also serves as a speech aid to the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes certain findings of intonation and intensity study of emotive speech with the minimal use of signal processing algorithms. This study was based on six basic emotions and the neutral, elicited from 1660 English utterances obtained from the speech recordings of six Indian women. The correctness of the emotional content was verified through perceptual listening tests. Marked similarity was noted among pitch contours of like-worded, positive valence emotions, though no such similarity was observed among the four negative valence emotional expressions. The intensity patterns were also studied. The results of the study were validated using arbitrary television recordings for four emotions. The findings are useful to technical researchers, social psychologists and to the common man interested in the dynamics of vocal expression of emotions