313 resultados para biomimetic pattern recognition
Resumo:
Faunal vocalisations are vital indicators for environmental change and faunal vocalisation analysis can provide information for answering ecological questions. Therefore, automated species recognition in environmental recordings has become a critical research area. This thesis presents an automated species recognition approach named Timed and Probabilistic Automata. A small lexicon for describing animal calls is defined, six algorithms for acoustic component detection are developed, and a series of species recognisers are built and evaluated.The presented automated species recognition approach yields significant improvement on the analysis performance over a real world dataset, and may be transferred to commercial software in the future.
Resumo:
A novel shape recognition algorithm was developed to autonomously classify the Northern Pacific Sea Star (Asterias amurenis) from benthic images that were collected by the Starbug AUV during 6km of transects in the Derwent estuary. Despite the effects of scattering, attenuation, soft focus and motion blur within the underwater images, an optimal joint classification rate of 77.5% and misclassification rate of 13.5% was achieved. The performance of algorithm was largely attributed to its ability to recognise locally deformed sea star shapes that were created during the segmentation of the distorted images.
Resumo:
Acoustics is a rich source of environmental information that can reflect the ecological dynamics. To deal with the escalating acoustic data, a variety of automated classification techniques have been used for acoustic patterns or scene recognition, including urban soundscapes such as streets and restaurants; and natural soundscapes such as raining and thundering. It is common to classify acoustic patterns under the assumption that a single type of soundscapes present in an audio clip. This assumption is reasonable for some carefully selected audios. However, only few experiments have been focused on classifying simultaneous acoustic patterns in long-duration recordings. This paper proposes a binary relevance based multi-label classification approach to recognise simultaneous acoustic patterns in one-minute audio clips. By utilising acoustic indices as global features and multilayer perceptron as a base classifier, we achieve good classification performance on in-the-field data. Compared with single-label classification, multi-label classification approach provides more detailed information about the distributions of various acoustic patterns in long-duration recordings. These results will merit further biodiversity investigations, such as bird species surveys.
Resumo:
The problem of determining the script and language of a document image has a number of important applications in the field of document analysis, such as indexing and sorting of large collections of such images, or as a precursor to optical character recognition (OCR). In this paper, we investigate the use of texture as a tool for determining the script of a document image, based on the observation that text has a distinct visual texture. An experimental evaluation of a number of commonly used texture features is conducted on a newly created script database, providing a qualitative measure of which features are most appropriate for this task. Strategies for improving classification results in situations with limited training data and multiple font types are also proposed.
Resumo:
The effectiveness of higher-order spectral (HOS) phase features in speaker recognition is investigated by comparison with Mel Cepstral features on the same speech data. HOS phase features retain phase information from the Fourier spectrum unlikeMel–frequency Cepstral coefficients (MFCC). Gaussian mixture models are constructed from Mel– Cepstral features and HOS features, respectively, for the same data from various speakers in the Switchboard telephone Speech Corpus. Feature clusters, model parameters and classification performance are analyzed. HOS phase features on their own provide a correct identification rate of about 97% on the chosen subset of the corpus. This is the same level of accuracy as provided by MFCCs. Cluster plots and model parameters are compared to show that HOS phase features can provide complementary information to better discriminate between speakers.