997 results for audio processing


Relevance:

20.00%

Abstract:

The present study explored whether semantic and motor systems are functionally interwoven via the use of a dual-task paradigm. According to embodied language accounts that propose an automatic and necessary involvement of the motor system in conceptual processing, concurrent processing of hand-related information should interfere more with hand movements than processing of unrelated body-part (i.e., foot, mouth) information. Across three experiments, 100 right-handed participants performed left- or right-hand tapping movements while repeatedly reading action words related to different body-parts, or different body-part names, in both aloud and silent conditions. Concurrent reading of single words related to specific body-parts, or the same words embedded in sentences differing in syntactic and phonological complexity (to manipulate context-relevant processing), and reading while viewing videos of the actions and body-parts described by the target words (to elicit visuomotor associations) all interfered with right-hand but not left-hand tapping rate. However, this motor interference was not affected differentially by hand-related stimuli. Thus, the results provide no support for proposals that body-part specific resources in cortical motor systems are shared between overt manual movements and meaning-related processing of words related to the hand.

Relevance:

20.00%

Abstract:

Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.

Relevance:

20.00%

Abstract:

Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.

Relevance:

20.00%

Abstract:

Automated digital recordings are useful for large-scale temporal and spatial environmental monitoring. An important research effort has been the automated classification of calling bird species. In this paper we examine a related task: retrieving, from a database of audio recordings, birdcalls that are similar to a user-supplied query call. Such a retrieval task can sometimes be more useful than an automated classifier. We compare three approaches to similarity-based birdcall retrieval using spectral ridge features and two kinds of gradient features, structure tensor and the histogram of oriented gradients. The retrieval accuracy of our spectral ridge method is 94% compared to 82% for the structure tensor method and 90% for the histogram of gradients method. Additionally, this approach potentially offers a more compact representation and is more computationally efficient.

Relevance:

20.00%

Abstract:

Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over a large spatial and temporal scale. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, then a ridge detection method is applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor including temporal entropy, frequency bin entropy and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that our proposed feature set achieves a retrieval success rate of 0.71, which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).
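The first steps of this pipeline (spectrogram, then per-region entropy descriptors) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not define its entropy measures, so the normalised Shannon entropy used here, the frame length, the hop size and every function name are assumptions. Ridge detection, the direction histogram and indexing are left out.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Spectrogram as a list of frames; each frame holds the magnitudes of
    bins 0..frame_len//2 of a Hann-windowed (naive) DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def normalized_entropy(values):
    """Shannon entropy of a non-negative vector treated as a distribution,
    scaled to [0, 1]. Assumed definition, not taken from the paper."""
    total = sum(values)
    if total <= 0 or len(values) < 2:
        return 0.0
    probs = [v / total for v in values if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(values))


def region_entropies(spec):
    """Return (temporal entropy, frequency-bin entropy) of a spectrogram region,
    computed from the energy per frame and the energy per bin."""
    energy_per_frame = [sum(m * m for m in frame) for frame in spec]
    energy_per_bin = [sum(frame[k] ** 2 for frame in spec)
                      for k in range(len(spec[0]))]
    return normalized_entropy(energy_per_frame), normalized_entropy(energy_per_bin)


# Synthetic stand-in for a call: a steady 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = stft_magnitude(tone)
temporal_h, frequency_h = region_entropies(spec)
```

For a steady tone the energy is spread evenly over time (temporal entropy near 1) but concentrated in a few frequency bins (low frequency-bin entropy), which is the kind of contrast such descriptors are meant to capture.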

Relevance:

20.00%

Abstract:

Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of the spectrogram bounded by frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step to improve the detection of spectral ridges. Compared to an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.

Relevance:

20.00%

Abstract:

We have prepared p-n junction organic photovoltaic cells using an all-solution processing method with poly(3-hexylthiophene) (P3HT) as the donor and phenyl-C61-butyric acid methyl ester (PCBM) as the acceptor. Interdigitated donor/acceptor interface morphology was observed in the device processed with the lowest boiling point solvent for PCBM used in this study. The influences of different solvents on donor/acceptor morphology and respective device performance were investigated simultaneously. The best device obtained had characteristically rough interface morphology with a peak-to-valley value of ∼15 nm. The device displayed a power conversion efficiency of 1.78%, an open circuit voltage (V_oc) of 0.44 V, a short circuit current density (J_sc) of 9.4 mA/cm² and a fill factor of 43%.
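The reported device figures can be cross-checked with the standard relation PCE = V_oc × J_sc × FF / P_in. The sketch below assumes an incident power of 100 mW/cm², the usual AM1.5G test condition; the abstract does not state the illumination, so that value and the function name are assumptions.

```python
def power_conversion_efficiency(v_oc_volts, j_sc_ma_per_cm2, fill_factor,
                                p_in_mw_per_cm2=100.0):
    """Return the power conversion efficiency in percent.

    V (volts) times J (mA/cm^2) gives mW/cm^2, so the maximum output power
    density is directly comparable with p_in_mw_per_cm2 (assumed 100 mW/cm^2).
    """
    p_max = v_oc_volts * j_sc_ma_per_cm2 * fill_factor
    return 100.0 * p_max / p_in_mw_per_cm2


pce = power_conversion_efficiency(0.44, 9.4, 0.43)
print(f"{pce:.2f} %")  # prints 1.78 %
```

Under that assumption the three quoted parameters reproduce the stated 1.78% efficiency.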

Relevance:

20.00%

Abstract:

Objective: Self-report measures are typically used to assess the effectiveness of road safety advertisements. However, psychophysiological measures of persuasive processing (i.e., skin conductance response [SCR]) and objective driving measures of persuasive outcomes (i.e., in-vehicle GPS devices) may provide further insights into the effectiveness of these advertisements. This study aimed to explore the persuasive processing and outcomes of two anti-speeding advertisements by incorporating both self-report and objective measures of speeding behaviour. In addition, this study aimed to compare the findings derived from these different measurement approaches. Methods: Young drivers (N = 20, mean age = 21.01 years) viewed either a positive or negative emotion-based anti-speeding television advertisement. Whilst viewing the advertisement, SCR activity was measured to assess ad-evoked arousal responses. The RoadScout® GPS device was then installed into participants’ vehicles for one week to measure on-road speed-related driving behaviour. Self-report measures assessed persuasive processing (emotional and arousal responses) and actual driving behaviour. Results: There was general correspondence between the self-report measures of arousal and the SCR and between the self-report measure of actual driving behaviour and the objective driving data (as assessed via the GPS devices). Conclusions: This study provides insights into how psychophysiological and GPS devices could be used as objective measures in conjunction with self-report measures to further understand the persuasive processes and outcomes of emotion-based anti-speeding advertisements.

Relevance:

20.00%

Abstract:

This research investigated differences and associations in performance in number processing and executive function for children attending primary school in a large Australian metropolitan city. In a cross-sectional study, 25 children in the first full-time year of school (Prep; mean age = 5.5 years) and 21 children in Year 3 (mean age = 8.5 years) completed three number processing tasks and three executive function tasks. Year 3 children consistently outperformed the Prep year children on measures of accuracy and reaction time on the tasks of number comparison, calculation, shifting, and inhibition, but not on number line estimation. The components of executive function (shifting, inhibition, and working memory) showed different patterns of correlation with performance on number processing tasks across the early years of school. Findings could be used to enhance teachers’ understanding of the role of the cognitive processes employed by children in numeracy learning, and so inform teachers’ classroom practices.

Relevance:

20.00%

Abstract:

Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environmental studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate the desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experimental results show that syllable features achieve an average classification accuracy of 90.5%, which outperforms Mel-frequency cepstral coefficient features (79.0%).
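The final classification step can be illustrated with a small k-nearest neighbor sketch. It is not the authors' code: the principal component analysis stage is omitted and replaced here by simple z-score scaling, only three of the five syllable features are used, and the species labels and feature values are invented for illustration.

```python
import math
from collections import Counter


def standardize(rows):
    """Z-score each feature column; return the scaled rows plus the column
    means and standard deviations needed to scale new queries."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]
    return scaled, means, stds


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query
    (Euclidean distance)."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical syllable features:
# (duration in s, dominant frequency in Hz, oscillation rate in Hz)
features = [
    (0.10, 2500, 40), (0.12, 2600, 42), (0.11, 2450, 38),
    (0.40, 900, 15), (0.38, 950, 14), (0.42, 880, 16),
]
labels = ["species_a"] * 3 + ["species_b"] * 3

train_x, means, stds = standardize(features)
raw_query = (0.39, 920, 15)
query = [(x - m) / s for x, m, s in zip(raw_query, means, stds)]
predicted = knn_predict(train_x, labels, query, k=3)
```

Scaling matters here because the raw features differ by orders of magnitude; without it the dominant frequency would swamp the other two in the distance calculation.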

Relevance:

20.00%

Abstract:

The development of microstructure in 304L stainless steel during industrial hot-forming operations, including press forging (mean strain rate of 0.15 s⁻¹), rolling/extrusion (2-5 s⁻¹), and hammer forging (100 s⁻¹) at different temperatures in the range 600-1200 °C, was studied with a view to validating the predictions of the processing map. The results have shown that excellent correlation exists between the regimes exhibited by the map and the product microstructures. 304L stainless steel exhibits instability bands when hammer forged at temperatures below 1100 °C, rolled/extruded below 1000 °C, or press forged below 800 °C. All of these conditions must be avoided in mechanical processing of the material. On the other hand, ideally, the material may be rolled, extruded, or press forged at 1200 °C to obtain a defect-free microstructure.

Relevance:

20.00%

Abstract:

The hot deformation behavior of hot isostatically pressed (HIPd) P/M IN-100 superalloy has been studied in the temperature range 1000-1200 °C and strain rate range 0.0003-10 s⁻¹ using hot compression testing. A processing map has been developed on the basis of these data and using the principles of dynamic materials modelling. The map exhibited three domains: one at 1050 °C and 0.01 s⁻¹, with a peak efficiency of power dissipation of approximately 32%, the second at 1150 °C and 10 s⁻¹, with a peak efficiency of approximately 36%, and the third at 1200 °C and 0.1 s⁻¹, with a similar efficiency. On the basis of optical and electron microscopic observations, the first domain was interpreted to represent dynamic recovery of the γ phase, the second domain represents dynamic recrystallization (DRX) of γ in the presence of softer γ′, while the third domain represents DRX of the γ phase only. The γ′ phase is stable up to 1150 °C and deforms below this temperature; the chunky γ′ accumulates dislocations, which at larger strains cause cracking of this phase. At temperatures lower than 1080 °C and strain rates higher than 0.1 s⁻¹, the material exhibits flow instability, manifested in the form of adiabatic shear bands. The material may be subjected to mechanical processing without cracking or instabilities at 1200 °C and 0.1 s⁻¹, which are the conditions for DRX of the γ phase.
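In the dynamic materials model, the efficiency of power dissipation plotted on a processing map is commonly written as η = 2m/(m + 1), where m is the strain rate sensitivity of the flow stress. The sketch below shows that relation and its inverse. The two-point finite-difference estimate of m and the stress values are illustrative assumptions (the abstract does not describe how m was evaluated, and maps are usually built from fitted curves), as are the function names.

```python
import math


def strain_rate_sensitivity(stress_1, rate_1, stress_2, rate_2):
    """Two-point estimate of m = d(ln stress) / d(ln strain rate)
    at constant temperature and strain."""
    return ((math.log(stress_2) - math.log(stress_1))
            / (math.log(rate_2) - math.log(rate_1)))


def dissipation_efficiency(m):
    """Efficiency of power dissipation, eta = 2m / (m + 1)."""
    return 2.0 * m / (m + 1.0)


def sensitivity_from_efficiency(eta):
    """Invert eta = 2m / (m + 1) to recover m."""
    return eta / (2.0 - eta)


# Hypothetical flow stresses (MPa) at two strain rates (1/s), same temperature.
m_example = strain_rate_sensitivity(100.0, 0.01, 150.0, 0.1)
eta_example = dissipation_efficiency(m_example)

# Strain rate sensitivities implied by the peak efficiencies quoted above.
m_at_32_percent = sensitivity_from_efficiency(0.32)
m_at_36_percent = sensitivity_from_efficiency(0.36)
```

Read this way, the quoted peak efficiencies of about 32% and 36% correspond to strain rate sensitivities of roughly 0.19 and 0.22.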

Relevance:

20.00%

Abstract:

An overview of the synthesis of materials under microwave irradiation is presented, based on recently performed work. A variety of reactions such as direct combination, carbothermal reduction, carbidation and nitridation are described. Examples of microwave preparation of glasses are also presented. The main advantages of microwave methods, namely speed, cleanliness and reduced reaction temperature, are emphasized. The example of ZrO2-CeO2 ceramics is used to show the extraordinarily fast and effective sintering that occurs under microwave irradiation.

Relevance:

20.00%

Abstract:

Power dissipation maps have been generated in the temperature range of 900 °C to 1150 °C and strain rate range of 10⁻³ to 10 s⁻¹ for a cast aluminide alloy Ti-24Al-20Nb using the dynamic materials model. The results define two distinct regimes of temperature and strain rate in which the efficiency of power dissipation is maximum. The first region, centered around 975 °C/0.1 s⁻¹, is shown to correspond to dynamic recrystallization of the α₂ phase and the second, centered around 1150 °C/0.001 s⁻¹, corresponds to dynamic recovery and superplastic deformation of the β phase. Thermal activation analysis using the power law creep equation yielded apparent activation energies of 854 and 627 kJ/mol for the first and second regimes, respectively. Reanalyzing the data by alternate methods yielded activation energies in the range of 170 to 220 kJ/mol and 220 to 270 kJ/mol for the first and second regimes, respectively. Cross slip was shown to constitute the activation barrier in both cases. Two distinct regimes of processing instability, one at high strain rates and the other at low strain rates in the lower temperature regions, have been identified, within which shear bands are formed.
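One common form of the power law creep equation is strain rate = A · σⁿ · exp(−Q/RT), from which the apparent activation energy at constant strain rate follows as Q = n · R · d(ln σ)/d(1/T). The sketch below is a round-trip check of that relation on synthetic data. The constants A and n, the temperatures and the chosen Q of 200 kJ/mol are illustrative assumptions, not values or procedures taken from the paper.

```python
import math

R = 8.314  # universal gas constant, J/(mol K)


def flow_stress(strain_rate, temp_k, a, n, q):
    """Solve strain_rate = a * stress**n * exp(-q / (R * T)) for stress."""
    return (strain_rate / a * math.exp(q / (R * temp_k))) ** (1.0 / n)


def apparent_activation_energy(stress_1, temp_1, stress_2, temp_2, n):
    """Q = n * R * d(ln stress)/d(1/T), estimated from two temperatures
    at the same strain rate. Temperatures in kelvin, Q in J/mol."""
    slope = ((math.log(stress_2) - math.log(stress_1))
             / (1.0 / temp_2 - 1.0 / temp_1))
    return n * R * slope


# Synthetic material: all constants below are hypothetical.
A, N_EXP, Q_TRUE = 1.0, 4.0, 200e3
rate = 0.1  # 1/s
t1, t2 = 1200.0, 1300.0  # K
s1 = flow_stress(rate, t1, A, N_EXP, Q_TRUE)
s2 = flow_stress(rate, t2, A, N_EXP, Q_TRUE)
q_estimate = apparent_activation_energy(s1, t1, s2, t2, N_EXP)
```

Because Q scales directly with the stress exponent n, the result is sensitive to how n is determined, which is one reason different analysis methods can yield very different apparent activation energies from the same data.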