967 resultados para 080109 Pattern Recognition and Data Mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Presentation about information modelling and artificial intelligence, semantic structure, cognitive processing and quantum theory.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Vigilance declines when exposed to highly predictable and uneventful tasks. Monotonous tasks provide little cognitive and motor stimulation and contribute to human errors. This paper aims to model and detect vigilance decline in real time through participant’s reaction times during a monotonous task. A lab-based experiment adapting the Sustained Attention to Response Task (SART) is conducted to quantify the effect of monotony on overall performance. Then relevant parameters are used to build a model detecting hypovigilance throughout the experiment. The accuracy of different mathematical models are compared to detect in real-time – minute by minute - the lapses in vigilance during the task. We show that monotonous tasks can lead to an average decline in performance of 45%. Furthermore, vigilance modelling enables to detect vigilance decline through reaction times with an accuracy of 72% and a 29% false alarm rate. Bayesian models are identified as a better model to detect lapses in vigilance as compared to Neural Networks and Generalised Linear Mixed Models. This modelling could be used as a framework to detect vigilance decline of any human performing monotonous tasks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Continuous biometric authentication schemes (CBAS) are built around the biometrics supplied by user behavioural characteristics and continuously check the identity of the user throughout the session. The current literature for CBAS primarily focuses on the accuracy of the system in order to reduce false alarms. However, these attempts do not consider various issues that might affect practicality in real world applications and continuous authentication scenarios. One of the main issues is that the presented CBAS are based on several samples of training data either of both intruder and valid users or only the valid users' profile. This means that historical profiles for either the legitimate users or possible attackers should be available or collected before prediction time. However, in some cases it is impractical to gain the biometric data of the user in advance (before detection time). Another issue is the variability of the behaviour of the user between the registered profile obtained during enrollment, and the profile from the testing phase. The aim of this paper is to identify the limitations in current CBAS in order to make them more practical for real world applications. Also, the paper discusses a new application for CBAS not requiring any training data either from intruders or from valid users.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A hierarchical structure is used to represent the content of the semi-structured documents such as XML and XHTML. The traditional Vector Space Model (VSM) is not sufficient to represent both the structure and the content of such web documents. Hence in this paper, we introduce a novel method of representing the XML documents in Tensor Space Model (TSM) and then utilize it for clustering. Empirical analysis shows that the proposed method is scalable for a real-life dataset as well as the factorized matrices produced from the proposed method helps to improve the quality of clusters due to the enriched document representation with both the structure and the content information.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This document outlines the system submitted by the Speech and Audio Research Laboratory at the Queensland University of Technology (QUT) for the Speaker Identity Verification: Application task of EVALITA 2009. This competitive submission consisted of a score-level fusion of three component systems; a joint-factor analysis GMM system and two SVM systems using GLDS and GMM supervector kernels. Development evaluation and post-submission results are presented in this study, demonstrating the effectiveness of this fused system approach. This study highlights the challenges associated with system calibration from limited development data and that mismatch between training and testing conditions continues to be a major source of error in speaker verification technology.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recommender systems are widely used online to help users find other products, items etc that they may be interested in based on what is known about that user in their profile. Often however user profiles may be short on information and thus it is difficult for a recommender system to make quality recommendations. This problem is known as the cold-start problem. Here we investigate using association rules as a source of information to expand a user profile and thus avoid this problem. Our experiments show that it is possible to use association rules to noticeably improve the performance of a recommender system under the cold-start situation. Furthermore, we also show that the improvement in performance obtained can be achieved while using non-redundant rule sets. This shows that non-redundant rules do not cause a loss of information and are just as informative as a set of association rules that contain redundancy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In automatic facial expression detection, very accurate registration is desired which can be achieved via a deformable model approach where a dense mesh of 60-70 points on the face is used, such as an active appearance model (AAM). However, for applications where manually labeling frames is prohibitive, AAMs do not work well as they do not generalize well to unseen subjects. As such, a more coarse approach is taken for person-independent facial expression detection, where just a couple of key features (such as face and eyes) are tracked using a Viola-Jones type approach. The tracked image is normally post-processed to encode for shift and illumination invariance using a linear bank of filters. Recently, it was shown that this preprocessing step is of no benefit when close to ideal registration has been obtained. In this paper, we present a system based on the Constrained Local Model (CLM) which is a generic or person-independent face alignment algorithm which gains high accuracy. We show these results against the LBP feature extraction on the CK+ and GEMEP datasets.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents an overview of the experiments conducted using Hybrid Clustering of XML documents using Constraints (HCXC) method for the clustering task in the INEX 2009 XML Mining track. This technique utilises frequent subtrees generated from the structure to extract the content for clustering the XML documents. It also presents the experimental study using several data representations such as the structure-only, content-only and using both the structure and the content of XML documents for the purpose of clustering them. Unlike previous years, this year the XML documents were marked up using the Wiki tags and contains categories derived by using the YAGO ontology. This paper also presents the results of studying the effect of these tags on XML clustering using the HCXC method.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes the use of the Bayes Factor as a distance metric for speaker segmentation within a speaker diarization system. The proposed approach uses a pair of constant sized, sliding windows to compute the value of the Bayes Factor between the adjacent windows over the entire audio. Results obtained on the 2002 Rich Transcription Evaluation dataset show an improved segmentation performance compared to previous approaches reported in literature using the Generalized Likelihood Ratio. When applied in a speaker diarization system, this approach results in a 5.1% relative improvement in the overall Diarization Error Rate compared to the baseline.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This technical report is concerned with one aspect of environmental monitoring—the detection and analysis of acoustic events in sound recordings of the environment. Sound recordings offer ecologists the advantage of cheaper and increased sampling but make available so much data that automated analysis becomes essential. The report describes a number of tools for automated analysis of recordings, including noise removal from spectrograms, acoustic event detection, event pattern recognition, spectral peak tracking, syntactic pattern recognition applied to call syllables, and oscillation detection. These algorithms are applied to a number of animal call recognition tasks, chosen because they illustrate quite different modes of analysis: (1) the detection of diffuse events caused by wind and rain, which are frequent contaminants of recordings of the terrestrial environment; (2) the detection of bird and calls; and (3) the preparation of acoustic maps for whole ecosystem analysis. This last task utilises the temporal distribution of events over a daily, monthly or yearly cycle.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present pyktree, an implementation of the K-tree algorithm in the Python programming language. The K-tree algorithm provides highly balanced search trees for vector quantization that scales up to very large data sets. Pyktree is highly modular and well suited for rapid-prototyping of novel distance measures and centroid representations. It is easy to install and provides a python package for library use as well as command line tools.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Performance comparisons between File Signatures and Inverted Files for text retrieval have previously shown several significant shortcomings of file signatures relative to inverted files. The inverted file approach underpins most state-of-the-art search engine algorithms, such as Language and Probabilistic models. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures. Many advances in semantic hashing and dimensionality reduction have been made in recent times, but these were not so far linked to general purpose, signature file based, search engines. This paper introduces a different signature file approach that builds upon and extends these recent advances. We are able to demonstrate significant improvements in the performance of signature file based indexing and retrieval, performance that is comparable to that of state of the art inverted file based systems, including Language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings and positions the file signatures model in the class of Vector Space retrieval models.