264 results for Oscillators, Audio-frequency
Abstract:
Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.
Abstract:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.
Abstract:
Automated digital recordings are useful for large-scale temporal and spatial environmental monitoring. An important research effort has been the automated classification of calling bird species. In this paper we examine a related task, retrieval of birdcalls from a database of audio recordings, similar to a user supplied query call. Such a retrieval task can sometimes be more useful than an automated classifier. We compare three approaches to similarity-based birdcall retrieval using spectral ridge features and two kinds of gradient features, structure tensor and the histogram of oriented gradients. The retrieval accuracy of our spectral ridge method is 94% compared to 82% for the structure tensor method and 90% for the histogram of gradients method. Additionally, this approach potentially offers a more compact representation and is more computationally efficient.
Abstract:
Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over a large spatial and temporal scale. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using the short-time Fourier transform, then a ridge detection method is applied to the spectrogram to detect points of interest. Based on the detected points, a new region representation is explored for describing various bird vocalisations, and a local descriptor including temporal entropy, frequency bin entropy and a histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experimental results show that our proposed feature set can achieve 0.71 in terms of retrieval success rate, which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).
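As a rough illustration of the first step described above and of one descriptor component, the sketch below builds a magnitude spectrogram with a short-time Fourier transform and computes a normalised Shannon entropy over per-frame energy. The frame length, hop size, Hann window, test signals and this particular entropy definition are assumptions made for illustration; the abstract does not specify them, and ridge detection and indexing are not shown.

```python
import numpy as np

def spectrogram(x, frame_len=256, hop=128):
    """Magnitude STFT, shape (frequency bins, frames). Parameters are assumed."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def normalised_entropy(values):
    """Shannon entropy of non-negative values, scaled to [0, 1].
    Assumes more than one value and a non-zero total."""
    p = values / values.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum() / np.log2(len(values)))

# Synthetic test signals (not data from the paper): a steady tone and a short burst.
fs = 16_000
n = np.arange(8_000)
tone = np.sin(2 * np.pi * 2_000 * n / fs)
burst = np.where((n >= 2_000) & (n < 2_500), tone, 0.0)

spec_tone = spectrogram(tone)
spec_burst = spectrogram(burst)

# Entropy of energy per frame: high when energy is spread evenly over time,
# low when it is concentrated in a few frames.
tone_entropy = normalised_entropy((spec_tone ** 2).sum(axis=0))
burst_entropy = normalised_entropy((spec_burst ** 2).sum(axis=0))
```

With these definitions the steady tone gives an entropy close to 1, while the short burst gives a clearly lower value, which is the kind of contrast a temporal entropy feature is meant to capture.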
Abstract:
Piezoelectric ultrasound transducers are commonly used to convert mechanical energy to electrical energy and vice versa. The transducer performance is highly affected by the frequency at which it is excited. If the excitation frequency and the main resonant frequency match, transducers can deliver maximum power. However, the main resonant frequency changes during real-time operation, resulting in low power conversion. To achieve the maximum possible power conversion, the transducer should be excited at its resonant frequency estimated in real time. This paper proposes a method that first estimates the resonant frequency of the transducer and then tunes the excitation frequency accordingly in real time. The measurements showed a significant difference between the offline and real-time resonant frequencies. It was also shown that the maximum power was achieved at the resonant frequency estimated in real time rather than at the one measured offline.
Abstract:
Objective: Ankylosing spondylitis (AS) is a highly heritable common inflammatory arthritis that targets the spine and sacroiliac joints of the pelvis, causing pain and stiffness and leading eventually to joint fusion. Although previous studies have shown a strong association of IL23R with AS in white Europeans, similar studies in East Asian populations have shown no association with common variants of IL23R, suggesting either that IL23R variants have no role or that rare genetic variants contribute. The present study was undertaken to screen IL23R to identify rare variants associated with AS in Han Chinese. Methods: A 170-kb region containing IL23R and its flanking regions was sequenced in 50 patients with AS and 50 ethnically matched healthy control subjects from a Han Chinese population. In addition, the 30-kb region of peak association in white Europeans was sequenced in 650 patients with AS and 1,300 healthy controls. Validation genotyping was undertaken in 846 patients with AS and 1,308 healthy controls. Results: We identified 1,047 variants, of which 729 were not found in the dbSNP genomic build 130. Several potentially functional rare variants in IL23R were identified, including one nonsynonymous single-nucleotide polymorphism (nsSNP), Gly149Arg (position 67421184 GA on chromosome 1). Validation genotyping showed that the Gly149Arg variant was associated with AS (odds ratio 0.61, P = 0.0054). Conclusion: This is the first study to implicate rare IL23R variants in the pathogenesis of AS. The results identified a low-frequency nsSNP with predicted loss-of-function effects that was protectively associated with AS in Han Chinese, suggesting that decreased function of the interleukin-23 (IL-23) receptor protects against AS. These findings further support the notion that IL-23 signaling has an important role in the pathogenesis of AS.
Abstract:
The paper presents an improved Phase-Locked Loop (PLL) for measuring the fundamental frequency and selective harmonic content of a distorted signal. This information can be used by grid interfaced devices and harmonic compensators. The single-phase structure is based on the Synchronous Reference Frame (SRF) PLL. The proposed PLL needs only a limited number of harmonic stages by incorporating Moving Average Filters (MAF) for eliminating the undesired harmonic content at each stage. The frequency dependency of MAF in effective filtering of undesired harmonics is also dealt with by a proposed method for adaptation to frequency variations of input signal. The method is suitable for high sampling rates and a wide frequency measurement range. Furthermore, an extended model of this structure is proposed which includes the response to both the frequency and phase angle variations. The proposed algorithm is simulated and verified using Hardware-in-the-Loop (HIL) testing.
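The frequency dependency of the moving average filter mentioned above can be illustrated with a small sketch. A MAF whose window spans exactly one fundamental period nulls every integer harmonic, but leaves residual ripple once the input frequency drifts away from the value the window was sized for. The sampling rate, nominal frequency, harmonic orders and amplitudes below are assumed test values, and the function names are hypothetical; this is not the paper's PLL or its adaptation method.

```python
import numpy as np

def moving_average(x, window):
    """Average each run of `window` consecutive samples."""
    return np.convolve(x, np.ones(window) / window, mode="valid")

def residual_ripple(f_actual, f_nominal=50.0, fs=10_000, dc=1.5):
    """Peak deviation from the DC level after a MAF sized for f_nominal."""
    window = int(round(fs / f_nominal))       # one nominal period in samples
    t = np.arange(2_000) / fs
    # Assumed test signal: DC term plus 2nd and 6th harmonics of f_actual.
    x = (dc
         + 0.8 * np.sin(2 * np.pi * 2 * f_actual * t)
         + 0.3 * np.sin(2 * np.pi * 6 * f_actual * t + 0.4))
    return float(np.max(np.abs(moving_average(x, window) - dc)))

matched = residual_ripple(50.0)      # window equals one true period
mismatched = residual_ripple(52.0)   # input frequency has drifted by 2 Hz
```

In this sketch `matched` is zero up to floating-point error, while `mismatched` is clearly non-zero, which is why a fixed-window MAF needs some form of adaptation when the input frequency varies.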
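Abstract:
To study the impact of wind power integration on low-frequency oscillations in interconnected systems, the change in the damping characteristics of a two-area interconnected system before and after wind turbine integration is qualitatively analysed on the basis of a complete model of the doubly fed induction generator (DFIG) wind turbine. The impact of wind farm integration on the small-signal stability and low-frequency oscillation characteristics of the interconnected system is analysed from several angles, including the transmission distance of the grid-connected DFIG units, the integrated capacity, the tie-line power transfer of the interconnected system, and whether power system stabilizers are installed. Two power systems, each comprising two areas, are then used as examples for systematic computation, analysis and comparison. The results show that, for interconnected power systems with DFIG wind turbines connected, the effects of transmission distance, output level and tie-line power transfer on the low-frequency oscillation modes differ markedly in both trend and magnitude across operating modes, so these factors need to be considered together when planning, designing and operating the grid connection of wind farms.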
Abstract:
The aim of the study was to examine differences in total body water (TBW) measured using single-frequency (SF) and multi-frequency (MF) modes of bioelectrical impedance spectroscopy (BIS) in children and adults measured in different postures using the deuterium (2H) dilution technique as the reference. Twenty-three boys and 26 adult males underwent assessment of TBW using the dilution technique and BIS measured in supine and standing positions using two frequencies of the SF mode (50 kHz and 100 kHz) and the MF mode. While TBW estimated from the MF mode was comparable, extra-cellular fluid (ECF) and intra-cellular fluid (ICF) values differed significantly (p < 0.01) between the different postures in both groups. In addition, while estimated TBW in adult males using the MF mode was significantly (p < 0.01) greater than the result from the dilution technique, TBW estimated using the SF mode and prediction equation was significantly (p < 0.01) lower in boys. Measurement posture may not affect estimation of TBW in boys and adult males; however, body fluid shifts may still occur. In addition, technical factors, including selection of the prediction equation, may be important when TBW is estimated from measured impedance.
Abstract:
Pattern recognition is a promising approach for the identification of structural damage using measured dynamic data. Much of the research on pattern recognition has employed artificial neural networks (ANNs) and genetic algorithms as systematic ways of matching pattern features. The selection of a damage-sensitive and noise-insensitive pattern feature is important for all structural damage identification methods. Accordingly, a neural networks-based damage detection method using frequency response function (FRF) data is presented in this paper. This method can effectively consider uncertainties of measured data from which training patterns are generated. The proposed method reduces the dimension of the initial FRF data and transforms it into new damage indices and employs an ANN method for the actual damage localization and quantification using recognized damage patterns from the algorithm. In civil engineering applications, the measurement of dynamic response under field conditions always contains noise components from environmental factors. In order to evaluate the performance of the proposed strategy with noise-polluted data, noise-contaminated measurements are also introduced to the proposed algorithm. ANNs with optimal architecture give minimum training and testing errors and provide precise damage detection results. In order to maximize damage detection results, the optimal architecture of ANN is identified by defining the number of hidden layers and the number of neurons per hidden layer by a trial and error method. In real testing, the number of measurement points and the measurement locations to obtain the structure response are critical for damage detection. Therefore, optimal sensor placement to improve damage identification is also investigated herein. A finite element model of a two-storey framed structure is used to train the neural network. It shows accurate performance and gives low error with simulated and noise-contaminated data for single and multiple damage cases. As a result, the proposed method can be used for structural health monitoring and damage detection, particularly for cases where the measurement data is very large. Furthermore, it is suggested that an optimal ANN architecture can detect damage occurrence with good accuracy and can provide damage quantification with reasonable accuracy under varying levels of damage.
Abstract:
Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environment studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including: syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experiment results show that syllable features can achieve an average classification accuracy of 90.5% which outperforms Mel-frequency cepstral coefficients features (79.0%).
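The final classification step described above can be sketched as a plain k-nearest neighbor vote. The feature values and species labels below are synthetic and made up purely for illustration, `knn_predict` is a hypothetical helper, and the segmentation, spectral peak tracking and principal component analysis stages of the paper are not reproduced here.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Synthetic, made-up features: (syllable duration in s, dominant frequency in kHz).
train_x = np.array([[0.10, 2.0], [0.12, 2.2], [0.11, 1.9],
                    [0.40, 4.0], [0.42, 4.1], [0.38, 3.8]])
train_y = ["species_a"] * 3 + ["species_b"] * 3

prediction_short = knn_predict(train_x, train_y, np.array([0.13, 2.1]))
prediction_long = knn_predict(train_x, train_y, np.array([0.39, 3.9]))
```

In practice, features on very different scales would usually be standardised before computing distances; that step is omitted here to keep the sketch short.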
Abstract:
Frogs have received increasing attention because they are effective indicators of environmental change. Therefore, it is important to monitor and assess frogs. With the development of sensor techniques, large volumes of audio data (including frog calls) have been collected and need to be analysed. After the audio data are transformed into a spectrogram representation using the short-time Fourier transform, visual inspection of this representation motivates the use of image processing techniques for analysing audio data. An acoustic event detection (AED) method is first applied to the spectrograms to detect acoustic events, from which ridges are then extracted. Three feature sets, Mel-frequency cepstral coefficients (MFCCs), the AED feature set and the ridge feature set, are then used for frog call classification with a support vector machine classifier. Fifteen frog species widely distributed in Queensland, Australia, are selected to evaluate the proposed method. The experimental results show that the ridge feature set achieves an average classification accuracy of 74.73%, which outperforms the MFCCs (38.99%) and the AED feature set (67.78%).
Abstract:
This research has made contributions to the area of spoken term detection (STD), defined as the process of finding all occurrences of a specified search term in a large collection of speech segments. The use of visual information in the form of lip movements of the speaker in addition to audio, the topic of the speech segments, and the expected frequency of words in the target speech domain is proposed. Using this complementary information improves STD performance, enabling efficient search for key words in large collections of multimedia documents.
Abstract:
The extended recruitment season for short-lived species such as prawns biases the estimation of growth parameters from length-frequency data when conventional methods are used. We propose a simple method for overcoming this bias given a time series of length-frequency data. The difficulties arising from extended recruitment are eliminated by predicting the growth of the succeeding samples and the length increments of the recruits in previous samples. This method requires that some maximum size at recruitment can be specified. The advantages of this multiple length-frequency method are: it is simple to use; it requires only three parameters; no specific distributions need to be assumed; and the actual seasonal recruitment pattern does not have to be specified. We illustrate the new method with length-frequency data on the tiger prawn Penaeus esculentus from the north-western Gulf of Carpentaria, Australia.
Abstract:
We consider estimation of mortality rates and growth parameters from length-frequency data of a fish stock and derive the underlying length distribution of the population and the catch when there is individual variability in the von Bertalanffy growth parameter L-infinity. The model is flexible enough to accommodate 1) any recruitment pattern as a function of both time and length, 2) length-specific selectivity, and 3) varying fishing effort over time. The maximum likelihood method gives consistent estimates, provided the underlying distribution for individual variation in growth is correctly specified. Simulation results indicate that our method is reasonably robust to violations in the assumptions. The method is applied to tiger prawn data (Penaeus semisulcatus) to obtain estimates of natural and fishing mortality.
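For reference, the von Bertalanffy growth curve mentioned above is commonly written as follows, where L(t) is length at age t, L_∞ is the asymptotic length, K is the growth-rate coefficient and t_0 is the theoretical age at zero length. The symbols K and t_0 are standard notation assumed here; the abstract itself names only L-infinity, which the model allows to vary between individuals.

```latex
L(t) = L_\infty \left(1 - e^{-K (t - t_0)}\right)
```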