94 resultados para spectral shift


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Secondary tasks such as cell phone calls or interaction with automated speech dialog systems (SDSs) increase the driver’s cognitive load as well as the probability of driving errors. This study analyzes speech production variations due to cognitive load and emotional state of drivers in real driving conditions. Speech samples were acquired from 24 female and 17 male subjects (approximately 8.5 h of data) while talking to a co-driver and communicating with two automated call centers, with emotional states (neutral, negative) and the number of necessary SDS query repetitions also labeled. A consistent shift in a number of speech production parameters (pitch, first format center frequency, spectral center of gravity, spectral energy spread, and duration of voiced segments) was observed when comparing SDS interaction against co-driver interaction; further increases were observed when considering negative emotion segments and the number of requested SDS query repetitions. A mel frequency cepstral coefficient based Gaussian mixture classifier trained on 10 male and 10 female sessions provided 91% accuracy in the open test set task of distinguishing co-driver interactions from SDS interactions, suggesting—together with the acoustic analysis—that it is possible to monitor the level of driver distraction directly from their speech.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Natural iowaite, magnesium–ferric oxychloride mineral having light green color originating from Australia has been characterized by EPR, optical, IR, and Raman spectroscopy. The optical spectrum exhibits a number of electronic bands due to both Fe(III) and Mn(II) ions in iowaite. From EPR studies, the g values are calculated for Fe(III) and g and A values for Mn(II). EPR and optical absorption studies confirm that Fe(III) and Mn(II) are in distorted octahedral geometry. The bands that appear both in NIR and Raman spectra are due to the overtones and combinations of water and carbonate molecules. Thus EPR, optical, and Raman spectroscopy have proven most useful for the study of the chemistry of natural iowaite and chemical changes in the mineral.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A voglite mineral sample of Volrite Canyon #1 mine, Frey Point, White Canyon Mine District, San Juan County, Utah, USA is used in the present study. An EPR study on powdered sample confirms the presence of Mn(II) and Cu(II). Optical absorption spectral results are due to Cu(II) which is in distorted octahedron. NIR results are indicating the presence of water fundamentals.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

NIR and IR spectroscopy has been applied for detection of chemical species and the nature of hydrogen bonding in arsenate complexes. The structure and spectral properties of copper(II) arsenate minerals chalcophyllite and chenevixite are compared with copper(II) sulphate minerals devilline, chalcoalumite and caledonite. Split NIR bands in the electronic spectrum of two ranges 11700-8500 cm-1 and 8500-7200 cm-1 confirm distortion of octahedral symmetry for Cu(II) in the arsenate complexes. The observed bands with maxima at 9860 and 7750 cm-1 are assigned to Cu(II) transitions 2B1g ® 2B2g and 2B1g ® 2A1g. Overlapping bands in the NIR region 4500-4000 cm-1 is the effect of multi anions OH-, (AsO4)3- and (SO4)2-. The observation of broad and diffuse bands in the range 3700-2900 cm-1 confirms strong hydrogen bonding in chalcophyllite relative to chenevixite. The position of the water bending vibrations indicates the water is strongly hydrogen bonded in the mineral structure. The strong absorption feature centred at 1644 cm-1 in chalcophyllite indicates water is strongly hydrogen bonded in the mineral structure. The H2O-bending vibrations shift to low wavenumbers in chenevixite and an additional band observed at 1390 cm-1 is related to carbonate impurity. The characterisation of IR spectra by ν3 antisymmetric stretching vibrations of (SO4)2- and (AsO4)3 ions near 1100 and 800 cm-1 respectively is the result of isomorphic substitution for arsenate by sulphate in both the minerals of chalcophyllite and chenevixite.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Robust image hashing seeks to transform a given input image into a shorter hashed version using a key-dependent non-invertible transform. These image hashes can be used for watermarking, image integrity authentication or image indexing for fast retrieval. This paper introduces a new method of generating image hashes based on extracting Higher Order Spectral features from the Radon projection of an input image. The feature extraction process is non-invertible, non-linear and different hashes can be produced from the same image through the use of random permutations of the input. We show that the transform is robust to typical image transformations such as JPEG compression, noise, scaling, rotation, smoothing and cropping. We evaluate our system using a verification-style framework based on calculating false match, false non-match likelihoods using the publicly available Uncompressed Colour Image database (UCID) of 1320 images. We also compare our results to Swaminathan’s Fourier-Mellin based hashing method with at least 1% EER improvement under noise, scaling and sharpening.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these productcode vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of lowrate spectral compression methods on the task of automatic speaker identification. The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis deals with the problem of the instantaneous frequency (IF) estimation of sinusoidal signals. This topic plays significant role in signal processing and communications. Depending on the type of the signal, two major approaches are considered. For IF estimation of single-tone or digitally-modulated sinusoidal signals (like frequency shift keying signals) the approach of digital phase-locked loops (DPLLs) is considered, and this is Part-I of this thesis. For FM signals the approach of time-frequency analysis is considered, and this is Part-II of the thesis. In part-I we have utilized sinusoidal DPLLs with non-uniform sampling scheme as this type is widely used in communication systems. The digital tanlock loop (DTL) has introduced significant advantages over other existing DPLLs. In the last 10 years many efforts have been made to improve DTL performance. However, this loop and all of its modifications utilizes Hilbert transformer (HT) to produce a signal-independent 90-degree phase-shifted version of the input signal. Hilbert transformer can be realized approximately using a finite impulse response (FIR) digital filter. This realization introduces further complexity in the loop in addition to approximations and frequency limitations on the input signal. We have tried to avoid practical difficulties associated with the conventional tanlock scheme while keeping its advantages. A time-delay is utilized in the tanlock scheme of DTL to produce a signal-dependent phase shift. This gave rise to the time-delay digital tanlock loop (TDTL). Fixed point theorems are used to analyze the behavior of the new loop. As such TDTL combines the two major approaches in DPLLs: the non-linear approach of sinusoidal DPLL based on fixed point analysis, and the linear tanlock approach based on the arctan phase detection. TDTL preserves the main advantages of the DTL despite its reduced structure. An application of TDTL in FSK demodulation is also considered. This idea of replacing HT by a time-delay may be of interest in other signal processing systems. Hence we have analyzed and compared the behaviors of the HT and the time-delay in the presence of additive Gaussian noise. Based on the above analysis, the behavior of the first and second-order TDTLs has been analyzed in additive Gaussian noise. Since DPLLs need time for locking, they are normally not efficient in tracking the continuously changing frequencies of non-stationary signals, i.e. signals with time-varying spectra. Nonstationary signals are of importance in synthetic and real life applications. An example is the frequency-modulated (FM) signals widely used in communication systems. Part-II of this thesis is dedicated for the IF estimation of non-stationary signals. For such signals the classical spectral techniques break down, due to the time-varying nature of their spectra, and more advanced techniques should be utilized. For the purpose of instantaneous frequency estimation of non-stationary signals there are two major approaches: parametric and non-parametric. We chose the non-parametric approach which is based on time-frequency analysis. This approach is computationally less expensive and more effective in dealing with multicomponent signals, which are the main aim of this part of the thesis. A time-frequency distribution (TFD) of a signal is a two-dimensional transformation of the signal to the time-frequency domain. Multicomponent signals can be identified by multiple energy peaks in the time-frequency domain. Many real life and synthetic signals are of multicomponent nature and there is little in the literature concerning IF estimation of such signals. This is why we have concentrated on multicomponent signals in Part-H. An adaptive algorithm for IF estimation using the quadratic time-frequency distributions has been analyzed. A class of time-frequency distributions that are more suitable for this purpose has been proposed. The kernels of this class are time-only or one-dimensional, rather than the time-lag (two-dimensional) kernels. Hence this class has been named as the T -class. If the parameters of these TFDs are properly chosen, they are more efficient than the existing fixed-kernel TFDs in terms of resolution (energy concentration around the IF) and artifacts reduction. The T-distributions has been used in the IF adaptive algorithm and proved to be efficient in tracking rapidly changing frequencies. They also enables direct amplitude estimation for the components of a multicomponent

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The concept of radar was developed for the estimation of the distance (range) and velocity of a target from a receiver. The distance measurement is obtained by measuring the time taken for the transmitted signal to propagate to the target and return to the receiver. The target's velocity is determined by measuring the Doppler induced frequency shift of the returned signal caused by the rate of change of the time- delay from the target. As researchers further developed conventional radar systems it become apparent that additional information was contained in the backscattered signal and that this information could in fact be used to describe the shape of the target itself. It is due to the fact that a target can be considered to be a collection of individual point scatterers, each of which has its own velocity and time- delay. DelayDoppler parameter estimation of each of these point scatterers thus corresponds to a mapping of the target's range and cross range, thus producing an image of the target. Much research has been done in this area since the early radar imaging work of the 1960s. At present there are two main categories into which radar imaging falls. The first of these is related to the case where the backscattered signal is considered to be deterministic. The second is related to the case where the backscattered signal is of a stochastic nature. In both cases the information which describes the target's scattering function is extracted by the use of the ambiguity function, a function which correlates the backscattered signal in time and frequency with the transmitted signal. In practical situations, it is often necessary to have the transmitter and the receiver of the radar system sited at different locations. The problem in these situations is 'that a reference signal must then be present in order to calculate the ambiguity function. This causes an additional problem in that detailed phase information about the transmitted signal is then required at the receiver. It is this latter problem which has led to the investigation of radar imaging using time- frequency distributions. As will be shown in this thesis, the phase information about the transmitted signal can be extracted from the backscattered signal using time- frequency distributions. The principle aim of this thesis was in the development, and subsequent discussion into the theory of radar imaging, using time- frequency distributions. Consideration is first given to the case where the target is diffuse, ie. where the backscattered signal has temporal stationarity and a spatially white power spectral density. The complementary situation is also investigated, ie. where the target is no longer diffuse, but some degree of correlation exists between the time- frequency points. Computer simulations are presented to demonstrate the concepts and theories developed in the thesis. For the proposed radar system to be practically realisable, both the time- frequency distributions and the associated algorithms developed must be able to be implemented in a timely manner. For this reason an optical architecture is proposed. This architecture is specifically designed to obtain the required time and frequency resolution when using laser radar imaging. The complex light amplitude distributions produced by this architecture have been computer simulated using an optical compiler.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The use of appropriate features to characterize an output class or object is critical for all classification problems. This paper evaluates the capability of several spectral and texture features for object-based vegetation classification at the species level using airborne high resolution multispectral imagery. Image-objects as the basic classification unit were generated through image segmentation. Statistical moments extracted from original spectral bands and vegetation index image are used as feature descriptors for image objects (i.e. tree crowns). Several state-of-art texture descriptors such as Gray-Level Co-Occurrence Matrix (GLCM), Local Binary Patterns (LBP) and its extensions are also extracted for comparison purpose. Support Vector Machine (SVM) is employed for classification in the object-feature space. The experimental results showed that incorporating spectral vegetation indices can improve the classification accuracy and obtained better results than in original spectral bands, and using moments of Ratio Vegetation Index obtained the highest average classification accuracy in our experiment. The experiments also indicate that the spectral moment features also outperform or can at least compare with the state-of-art texture descriptors in terms of classification accuracy.