467 resultados para Signal processing


Relevância:

60.00% 60.00%

Publicador:

Resumo:

The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of back- ground noise, collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition datasets such as Switchboard, Mixer and the speaker recognition evaluation (SRE) datasets provided by NIST. By allowing common, clean, speech corpora to be mixed with a wide variety of noise conditions, environmental reverberant responses, and signal-to-noise ratios, this protocol provides a solid basis for the development, evaluation and benchmarking of robust speaker recognition algorithms, and is freely available to download alongside the QUT-NOISE database. In this work, we use the QUT-NOISE-SRE protocol to evaluate a state-of-the-art PLDA i-vector speaker recognition system, demonstrating the importance of designing voice-activity-detection front-ends specifically for speaker recognition, rather than aiming for perfect coherence with the true speech/non-speech boundaries.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Meta-analyses estimate a statistical effect size for a test or an analysis by combining results from multiple studies without necessarily having access to each individual study's raw data. Multi-site meta-analysis is crucial for imaging genetics, as single sites rarely have a sample size large enough to pick up effects of single genetic variants associated with brain measures. However, if raw data can be shared, combining data in a "mega-analysis" is thought to improve power and precision in estimating global effects. As part of an ENIGMA-DTI investigation, we use fractional anisotropy (FA) maps from 5 studies (total N=2, 203 subjects, aged 9-85) to estimate heritability. We combine the studies through meta-and mega-analyses as well as a mixture of the two - combining some cohorts with mega-analysis and meta-analyzing the results with those of the remaining sites. A combination of mega-and meta-approaches may boost power compared to meta-analysis alone.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Bioacoustic monitoring has become a significant research topic for species diversity conservation. Due to the development of sensing techniques, acoustic sensors are widely deployed in the field to record animal sounds over a large spatial and temporal scale. With large volumes of collected audio data, it is essential to develop semi-automatic or automatic techniques to analyse the data. This can help ecologists make decisions on how to protect and promote the species diversity. This paper presents generic features to characterize a range of bird species for vocalisation retrieval. In the implementation, audio recordings are first converted to spectrograms using short-time Fourier transform, then a ridge detection method is applied to the spectrogram for detecting points of interest. Based on the detected points, a new region representation are explored for describing various bird vocalisations and a local descriptor including temporal entropy, frequency bin entropy and histogram of counts of four ridge directions is calculated for each sub-region. To speed up the retrieval process, indexing is carried out and the retrieved results are ranked according to similarity scores. The experiment results show that our proposed feature set can achieve 0.71 in term of retrieval success rate which outperforms spectral ridge features alone (0.55) and Mel frequency cepstral coefficients (0.36).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Purpose To develop a signal processing paradigm for extracting ERG responses to temporal sinusoidal modulation with contrasts ranging from below perceptual threshold to suprathreshold contrasts. To estimate the magnitude of intrinsic noise in ERG signals at different stimulus contrasts. Methods Photopic test stimuli were generated using a 4-primary Maxwellian view optical system. The 4-primary lights were sinusoidally temporally modulated in-phase (36 Hz; 2.5 - 50% Michelson). The stimuli were presented in 1 s epochs separated by a 1 ms blank interval and repeated 160 times (160.16 s duration) during the recording of the continuous flicker ERG from the right eye using DTL fiber electrodes. After artefact rejection, the ERG signal was extracted using Fourier methods in each of the 1 s epochs where a stimulus was presented. The signal processing allows for computation of the intrinsic noise distribution in addition to the signal to noise (SNR) ratio. Results We provide the initial report that the ERG intrinsic noise distribution is independent of stimulus contrast whereas SNR decreases linearly with decreasing contrast until the noise limit at ~2.5%. The 1ms blank intervals between epochs de-correlated the ERG signal at the line frequency (50 Hz) and thus increased the SNR of the averaged response. We confirm that response amplitude increases linearly with stimulus contrast. The phase response shows a shallow positive relationship with stimulus contrast. Conclusions This new technique will enable recording of intrinsic noise in ERG signals above and below perceptual visual threshold and is suitable for measurement of continuous rod and cone ERGs across a range of temporal frequencies, and post-receptoral processing in the primary retinogeniculate pathways at low stimulus contrasts. The intrinsic noise distribution may have application as a biomarker for detecting changes in disease progression or treatment efficacy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, the results of the time dispersion parameters obtained from a set of channel measurements conducted in various environments that are typical of multiuser Infostation application scenarios are presented. The measurement procedure takes into account the practical scenarios typical of the positions and movements of the users in the particular Infostation network. To provide one with the knowledge of how much data can be downloaded by users over a given time and mobile speed, data transfer analysis for multiband orthogonal frequency division multiplexing (MB-OFDM) is presented. As expected, the rough estimate of simultaneous data transfer in a multiuser Infostation scenario indicates dependency of the percentage of download on the data size, number and speed of the users, and the elapse time.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Rolling-element bearing failures are the most frequent problems in rotating machinery, which can be catastrophic and cause major downtime. Hence, providing advance failure warning and precise fault detection in such components are pivotal and cost-effective. The vast majority of past research has focused on signal processing and spectral analysis for fault diagnostics in rotating components. In this study, a data mining approach using a machine learning technique called anomaly detection (AD) is presented. This method employs classification techniques to discriminate between defect examples. Two features, kurtosis and Non-Gaussianity Score (NGS), are extracted to develop anomaly detection algorithms. The performance of the developed algorithms was examined through real data from a test to failure bearing. Finally, the application of anomaly detection is compared with one of the popular methods called Support Vector Machine (SVM) to investigate the sensitivity and accuracy of this approach and its ability to detect the anomalies in early stages.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents a system to analyze long field recordings with low signal-to-noise ratio (SNR) for bio-acoustic monitoring. A method based on spectral peak track, Shannon entropy, harmonic structure and oscillation structure is proposed to automatically detect anuran (frog) calling activity. Gaussian mixture model (GMM) is introduced for modelling those features. Four anuran species widespread in Queensland, Australia, are selected to evaluate the proposed system. A visualization method based on extracted indices is employed for detection of anuran calling activity which achieves high accuracy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Acoustic classification of anurans (frogs) has received increasing attention for its promising application in biological and environment studies. In this study, a novel feature extraction method for frog call classification is presented based on the analysis of spectrograms. The frog calls are first automatically segmented into syllables. Then, spectral peak tracks are extracted to separate desired signal (frog calls) from background noise. The spectral peak tracks are used to extract various syllable features, including: syllable duration, dominant frequency, oscillation rate, frequency modulation, and energy modulation. Finally, a k-nearest neighbor classifier is used for classifying frog calls based on the results of principal component analysis. The experiment results show that syllable features can achieve an average classification accuracy of 90.5% which outperforms Mel-frequency cepstral coefficients features (79.0%).

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Over past few decades, frog species have been experiencing dramatic decline around the world. The reason for this decline includes habitat loss, invasive species, climate change and so on. To better know the status of frog species, classifying frogs has become increasingly important. In this study, acoustic features are investigated for multi-level classication of Australian frogs: family, genus and species, including three families, eleven genera and eighty ve species which are collected from Queensland, Australia. For each frog species, six instances are selected from which ten acoustic features are calculated. Then, the multicollinearity between ten features are studied for selecting non-correlated features for subsequent analysis. A decision tree (DT) classier is used to visually and explicitly determine which acoustic features are relatively important for classifying family, which for genus, and which for species. Finally, a weighted support vector machines (SVMs) classier is used for the multi- level classication with three most important acoustic features respectively. Our experiment results indicate that using different acoustic feature sets can successfully classify frogs at different levels and the average classication accuracy can be up to 85.6%, 86.1% and 56.2% for family, genus and species respectively.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Video surveillance infrastructure has been widely installed in public places for security purposes. However, live video feeds are typically monitored by human staff, making the detection of important events as they occur difficult. As such, an expert system that can automatically detect events of interest in surveillance footage is highly desirable. Although a number of approaches have been proposed, they have significant limitations: supervised approaches, which can detect a specific event, ideally require a large number of samples with the event spatially and temporally localised; while unsupervised approaches, which do not require this demanding annotation, can only detect whether an event is abnormal and not specific event types. To overcome these problems, we formulate a weakly-supervised approach using Kullback-Leibler (KL) divergence to detect rare events. The proposed approach leverages the sparse nature of the target events to its advantage, and we show that this data imbalance guarantees the existence of a decision boundary to separate samples that contain the target event from those that do not. This trait, combined with the coarse annotation used by weakly supervised learning (that only indicates approximately when an event occurs), greatly reduces the annotation burden while retaining the ability to detect specific events. Furthermore, the proposed classifier requires only a decision threshold, simplifying its use compared to other weakly supervised approaches. We show that the proposed approach outperforms state-of-the-art methods on a popular real-world traffic surveillance dataset, while preserving real time performance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Environmental changes have put great pressure on biological systems leading to the rapid decline of biodiversity. To monitor this change and protect biodiversity, animal vocalizations have been widely explored by the aid of deploying acoustic sensors in the field. Consequently, large volumes of acoustic data are collected. However, traditional manual methods that require ecologists to physically visit sites to collect biodiversity data are both costly and time consuming. Therefore it is essential to develop new semi-automated and automated methods to identify species in automated audio recordings. In this study, a novel feature extraction method based on wavelet packet decomposition is proposed for frog call classification. After syllable segmentation, the advertisement call of each frog syllable is represented by a spectral peak track, from which track duration, dominant frequency and oscillation rate are calculated. Then, a k-means clustering algorithm is applied to the dominant frequency, and the centroids of clustering results are used to generate the frequency scale for wavelet packet decomposition (WPD). Next, a new feature set named adaptive frequency scaled wavelet packet decomposition sub-band cepstral coefficients is extracted by performing WPD on the windowed frog calls. Furthermore, the statistics of all feature vectors over each windowed signal are calculated for producing the final feature set. Finally, two well-known classifiers, a k-nearest neighbour classifier and a support vector machine classifier, are used for classification. In our experiments, we use two different datasets from Queensland, Australia (18 frog species from commercial recordings and field recordings of 8 frog species from James Cook University recordings). The weighted classification accuracy with our proposed method is 99.5% and 97.4% for 18 frog species and 8 frog species respectively, which outperforms all other comparable methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification plays a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however the common multi-class SVM approaches applied suffer from limitations including having easily confused classes and been computationally inefficient. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We proposed constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every new created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to popular the `one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrates comparable performance over the baseline bag-of-features method.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, 3D scene understanding etc. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real world applications due to a lack of contextual information and models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular, but simple bag-of-words approach. In our framework, first multiple instance SVM (mi-SVM) is used to identify positive features for each action category and the k-means algorithm is used to generate a codebook. Then locality-constrained linear coding is used to encode the features into the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets with varying complexity demonstrate significant performance improvement over the base-line bag-of-feature method.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Hydraulic instabilities represent a critical problem for Francis and Kaplan turbines, reducing their useful life due to increase of fatigue on the components and cavitation phenomena. Whereas an exhaustive list of publications on computational fluid-dynamic models of hydraulic instability is available, the possibility of applying diagnostic techniques based on vibration measurements has not been investigated sufficiently, also because the appropriate sensors seldom equip hydro turbine units. The aim of this study is to fill this knowledge gap and to exploit fully, for this purpose, the potentiality of combining cyclostationary analysis tools, able to describe complex dynamics such as those of fluid-structure interactions, with order tracking procedures, allowing domain transformations and consequently the separation of synchronous and non-synchronous components. This paper will focus on experimental data obtained on a full-scale Kaplan turbine unit, operating in a real power plant, tackling the issues of adapting such diagnostic tools for the analysis of hydraulic instabilities and proposing techniques and methodologies for a highly automated condition monitoring system. 2015 Elsevier Ltd.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Cyclostationary analysis has proven effective in identifying signal components for diagnostic purposes. A key descriptor in this framework is the cyclic power spectrum, traditionally estimated by the averaged cyclic periodogram and the smoothed cyclic periodogram. A lengthy debate about the best estimator finally found a solution in a cornerstone work by Antoni, who proposed a unified form for the two families, thus allowing a detailed statistical study of their properties. Since then, the focus of cyclostationary research has shifted towards algorithms, in terms of computational efficiency and simplicity of implementation. Traditional algorithms have proven computationally inefficient and the sophisticated "cyclostationary" definition of these estimators slowed their spread in the industry. The only attempt to increase the computational efficiency of cyclostationary estimators is represented by the cyclic modulation spectrum. This indicator exploits the relationship between cyclostationarity and envelope analysis. The link with envelope analysis allows a leap in computational efficiency and provides a "way in" for the understanding by industrial engineers. However, the new estimator lies outside the unified form described above and an unbiased version of the indicator has not been proposed. This paper will therefore extend the analysis of envelope-based estimators of the cyclic spectrum, proposing a new approach to include them in the unified form of cyclostationary estimators. This will enable the definition of a new envelope-based algorithm and the detailed analysis of the properties of the cyclic modulation spectrum. The computational efficiency of envelope-based algorithms will be also discussed quantitatively for the first time in comparison with the averaged cyclic periodogram. Finally, the algorithms will be validated with numerical and experimental examples.