952 results for 080106 Image Processing


Abstract:

Acoustic recordings of the environment provide an effective means to monitor bird species diversity. To facilitate exploration of acoustic recordings, we describe a content-based birdcall retrieval algorithm. A query birdcall is a region of a spectrogram bounded in frequency and time. Retrieval depends on a similarity measure derived from the orientation and distribution of spectral ridges. The spectral ridge detection method caters for a broad range of birdcall structures. In this paper, we extend previous work by incorporating a spectrogram scaling step to improve the detection of spectral ridges. Compared with an existing approach based on MFCC features, our feature representation achieves better retrieval performance for multiple bird species in noisy recordings.

Abstract:

Frog species have been declining worldwide at unprecedented rates in the past decades. There are many reasons for this decline, including pollution, habitat loss, and invasive species [1]. To preserve, protect, and restore frog biodiversity, it is important to monitor and assess frog species. This paper proposes a novel method that uses image processing techniques to analyse Australian frog vocalisations. An FFT is applied to audio data to produce a spectrogram. Acoustic events are then detected and isolated into corresponding segments through image processing techniques applied to the spectrogram. For each segment, spectral peak tracks are extracted from selected seeds, and a region growing technique is used to obtain the contour of each frog vocalisation. Based on the spectral peak tracks and the contour of each frog vocalisation, six feature sets are extracted. Principal component analysis reduces each feature set to six principal components, which are tested for classification performance with a k-nearest neighbour classifier. The experiment tests the proposed classification method on fourteen frog species that are widely distributed throughout Queensland, Australia. The experimental results show that the best average classification accuracy for the fourteen frog species reaches up to 87%.
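The final classification step named in this abstract, a k-nearest neighbour vote over reduced feature vectors, can be illustrated with a minimal sketch. The function name `knn_predict`, the two-dimensional feature vectors, and the species labels are all hypothetical and chosen for illustration; the abstract does not state the value of k or the distance metric, so Euclidean distance and k = 3 are assumptions.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors."""
    # Euclidean distance from the query to every training vector (assumed metric).
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the k closest neighbours and count how often each label occurs.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors, standing in for principal components per call.
train_features = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1),
                  (0.9, 1.0), (1.0, 0.8), (0.8, 0.9)]
train_labels = ["species_a"] * 3 + ["species_b"] * 3

print(knn_predict(train_features, train_labels, (0.15, 0.15)))
print(knn_predict(train_features, train_labels, (0.95, 0.90)))
```

With two well-separated clusters, each query is assigned the label of the cluster it falls in; real feature vectors would have six components per feature set, as the abstract describes.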

Abstract:

Frogs have received increasing attention because they are effective indicators of environmental change. It is therefore important to monitor and assess frogs. With the development of sensor techniques, large volumes of audio data (including frog calls) have been collected and need to be analysed. After the audio data are transformed into a spectrogram representation using the short-time Fourier transform, visual inspection of this representation motivates the use of image processing techniques for analysing audio data. An acoustic event detection (AED) method is first applied to the spectrograms to detect acoustic events, from which ridges are then extracted. Three feature sets, Mel-frequency cepstral coefficients (MFCCs), the AED feature set, and the ridge feature set, are then used for frog call classification with a support vector machine classifier. Fifteen frog species widely distributed in Queensland, Australia, are selected to evaluate the proposed method. The experimental results show that the ridge feature set achieves an average classification accuracy of 74.73%, which outperforms the MFCCs (38.99%) and the AED feature set (67.78%).
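The spectrogram step mentioned here can be sketched as a short-time Fourier transform over overlapping, windowed frames. This is a minimal illustration only: the function name `spectrogram`, the frame length, the hop size, and the Hann window are assumptions not stated in the abstract, and a direct DFT is used in place of an FFT library to keep the example self-contained.

```python
import cmath
import math

def spectrogram(samples, frame_len=64, hop=32):
    """Return one list of bin magnitudes (bins 0..frame_len//2) per frame."""
    # Hann window (an assumed choice) to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT, O(N^2) per frame; a practical system would use an FFT.
        magnitudes = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Synthetic tone with 8 cycles per 64-sample frame, so energy sits in bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
print(len(spec), peak_bins)
```

Each row of `spec` is one time frame and each column one frequency bin, which is the image-like layout that makes event detection and ridge extraction by image processing possible.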

Abstract:

Summary: Generalized Procrustes analysis and thin plate splines were employed to create an average 3D shape template of the proximal femur that was warped to the size and shape of a single 2D radiographic image of a subject. Mean absolute depth errors are comparable with previous approaches utilising multiple 2D input projections.

Introduction: Several approaches have been adopted to derive volumetric density (g cm⁻³) from a conventional 2D representation of areal bone mineral density (BMD, g cm⁻²). Such approaches have generally aimed at deriving an average depth across the areal projection rather than creating a formal 3D shape of the bone.

Methods: Generalized Procrustes analysis and thin plate splines were employed to create an average 3D shape template of the proximal femur that was subsequently warped to suit the size and shape of a single 2D radiographic image of a subject. CT scans of excised human femora, 18 and 24 scanned at pixel resolutions of 1.08 mm and 0.674 mm, respectively, were equally split into training (used to create the 3D shape template) and test cohorts.

Results: The mean absolute depth errors of 3.4 mm and 1.73 mm, respectively, for the two CT pixel sizes are comparable with previous approaches based upon multiple 2D input projections.

Conclusions: This technique has the potential to derive volumetric density from BMD and to facilitate 3D finite element analysis for prediction of the mechanical integrity of the proximal femur. It may further be applied to other anatomical bone sites such as the distal radius and lumbar spine.
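To give a feel for how generalized Procrustes analysis produces an average shape, the sketch below aligns 2D landmark sets by removing translation, scale, and rotation, then averages them iteratively. It is a deliberate simplification and not the authors' method: the study works with 3D shapes and thin plate splines, whereas this example uses 2D landmarks stored as complex numbers, and all function names and landmark values are hypothetical.

```python
import cmath

def normalise(shape):
    """Centre a shape (list of complex landmarks) and scale it to unit size."""
    centroid = sum(shape) / len(shape)
    centred = [z - centroid for z in shape]
    size = sum(abs(z) ** 2 for z in centred) ** 0.5
    return [z / size for z in centred]

def align(shape, reference):
    """Rotate a normalised shape onto a normalised reference (least squares)."""
    # The optimal rotation is the unit phase of sum(reference * conj(shape)).
    inner = sum(r * s.conjugate() for r, s in zip(reference, shape))
    rotation = inner / abs(inner)
    return [rotation * s for s in shape]

def generalized_procrustes(shapes, iterations=10):
    """Iteratively align all shapes to their evolving mean shape."""
    aligned = [normalise(s) for s in shapes]
    mean = aligned[0]  # start from the first shape as the reference
    for _ in range(iterations):
        aligned = [align(s, mean) for s in aligned]
        mean = normalise([sum(points) / len(aligned) for points in zip(*aligned)])
    return mean, aligned

# Hypothetical landmarks: a triangle and a rotated, scaled, shifted copy of it.
base = [0 + 0j, 2 + 0j, 1 + 3j]
moved = [2.5 * cmath.exp(0.7j) * z + (4 - 1j) for z in base]
mean, aligned = generalized_procrustes([base, moved])
print(max(abs(a - b) for a, b in zip(aligned[0], aligned[1])))
```

Because the second shape differs from the first only by a similarity transform, the two aligned shapes coincide up to floating-point error and the mean equals either of them; with real anatomical data the residual differences after alignment are the shape variation of interest.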

Abstract:

Automatic recognition of semantic information, such as salient objects and mid-level concepts, from images remains a challenging task. Since real-world objects tend to exist in a context within their environment, computer vision researchers have increasingly incorporated contextual information to improve object recognition. In this paper, we present a method to build a visual contextual ontology from salient object descriptions for image annotation. The ontologies include not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm based on ontology relations and probabilistic inference is also proposed. Unlike most existing work, we specifically explore how to combine ontology representation, contextual knowledge, and probabilistic inference. The experiments show that image annotation results are improved on the LabelMe dataset.

Abstract:

We describe the design and evaluation of a platform for networks of cameras in low-bandwidth, low-power sensor networks. In our work to date, we have investigated two different DSP hardware/software platforms for undertaking the tasks of compression and object detection and tracking. We compare the relative merits of each of the hardware and software platforms in terms of both performance and energy consumption. Finally, we discuss what we believe are the ongoing research questions for image processing in wireless sensor networks (WSNs).