967 resultados para 080109 Pattern Recognition and Data Mining


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Random Indexing K-tree is the combination of two algorithms suited for large scale document clustering.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article explores two matrix methods to induce the ``shades of meaning" (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) compute a set of vectors corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that for NMF consistently outperforms SVD for inducing both SoM of general concepts as well as word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and hence relevant to thematic analysis of opinion where nuances of opinion can arise.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a novel Hybrid Clustering approach for XML documents (HCX) that first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. The empirical analysis reveals that the proposed method is scalable and accurate.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A method of improving the security of biometric templates which satisfies desirable properties such as (a) irreversibility of the template, (b) revocability and assignment of a new template to the same biometric input, (c) matching in the secure transformed domain is presented. It makes use of an iterative procedure based on the bispectrum that serves as an irreversible transformation for biometric features because signal phase is discarded each iteration. Unlike the usual hash function, this transformation preserves closeness in the transformed domain for similar biometric inputs. A number of such templates can be generated from the same input. These properties are illustrated using synthetic data and applied to images from the FRGC 3D database with Gabor features. Verification can be successfully performed using these secure templates with an EER of 5.85%

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we discuss our participation to the INEX 2008 Link-the-Wiki track. We utilized a sliding window based algorithm to extract the frequent terms and phrases. Using the extracted phrases and term as descriptive vectors, the anchors and relevant links (both incoming and outgoing) are recognized efficiently.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we classify, review, and experimentally compare major methods that are exploited in the definition, adoption, and utilization of element similarity measures in the context of XML schema matching. We aim at presenting a unified view which is useful when developing a new element similarity measure, when implementing an XML schema matching component, when using an XML schema matching system, and when comparing XML schema matching systems.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Heart rate variability (HRV) refers to the regulation of the sinoatrial node, the natural pacemaker of the heart, by the sympathetic and parasympathetic branches of the autonomic nervous system. Heart rate variability analysis is an important tool to observe the heart's ability to respond to normal regulatory impulses that affect its rhythm. A computer-based intelligent system for analysis of cardiac states is very useful in diagnostics and disease management. Like many bio-signals, HRV signals are nonlinear in nature. Higher order spectral analysis (HOS) is known to be a good tool for the analysis of nonlinear systems and provides good noise immunity. In this work, we studied the HOS of the HRV signals of normal heartbeat and seven classes of arrhythmia. We present some general characteristics for each of these classes of HRV signals in the bispectrum and bicoherence plots. We also extracted features from the HOS and performed an analysis of variance (ANOVA) test. The results are very promising for cardiac arrhythmia classification with a number of features yielding a p-value < 0.02 in the ANOVA test.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Information fusion in biometrics has received considerable attention. The architecture proposed here is based on the sequential integration of multi-instance and multi-sample fusion schemes. This method is analytically shown to improve the performance and allow a controlled trade-off between false alarms and false rejects when the classifier decisions are statistically independent. Equations developed for detection error rates are experimentally evaluated by considering the proposed architecture for text dependent speaker verification using HMM based digit dependent speaker models. The tuning of parameters, n classifiers and m attempts/samples, is investigated and the resultant detection error trade-off performance is evaluated on individual digits. Results show that performance improvement can be achieved even for weaker classifiers (FRR-19.6%, FAR-16.7%). The architectures investigated apply to speaker verification from spoken digit strings such as credit card numbers in telephone or VOIP or internet based applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Gabor representations have been widely used in facial analysis (face recognition, face detection and facial expression detection) due to their biological relevance and computational properties. Two popular Gabor representations used in literature are: 1) Log-Gabor and 2) Gabor energy filters. Even though these representations are somewhat similar, they also have distinct differences as the Log-Gabor filters mimic the simple cells in the visual cortex while the Gabor energy filters emulate the complex cells, which causes subtle differences in the responses. In this paper, we analyze the difference between these two Gabor representations and quantify these differences on the task of facial action unit (AU) detection. In our experiments conducted on the Cohn-Kanade dataset, we report an average area underneath the ROC curve (A`) of 92.60% across 17 AUs for the Gabor energy filters, while the Log-Gabor representation achieved an average A` of 96.11%. This result suggests that small spatial differences that the Log-Gabor filters pick up on are more useful for AU detection than the differences in contours and edges that the Gabor energy filters extract.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This technical report is concerned with one aspect of environmental monitoring—the detection and analysis of acoustic events in sound recordings of the environment. Sound recordings offer ecologists the potential advantages of cheaper and increased sampling. An acoustic event detection algorithm is introduced that outputs a compact rectangular marquee description of each event. It can disentangle superimposed events, which are a common occurrence during morning and evening choruses. Next, three uses to which acoustic event detection can be put are illustrated. These tasks have been selected because they illustrate quite different modes of analysis: (1) the detection of diffuse events caused by wind and rain, which are a frequent contaminant of recordings of the terrestrial environment; (2) the detection of bird calls using the spatial distribution of their component events; and (3) the preparation of acoustic maps for whole ecosystem analysis. This last task utilises the temporal distribution of events over a daily, monthly or yearly cycle.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This technical report is concerned with one aspect of environmental monitoring—the detection and analysis of acoustic events in sound recordings of the environment. Sound recordings offer ecologists the potential advantages of cheaper and increased sampling. An acoustic event detection algorithm is introduced that outputs a compact rectangular marquee description of each event. It can disentangle superimposed events, which are a common occurrence during morning and evening choruses. Next, three uses to which acoustic event detection can be put are illustrated. These tasks have been selected because they illustrate quite different modes of analysis: (1) the detection of diffuse events caused by wind and rain, which are a frequent contaminant of recordings of the terrestrial environment; (2) the detection of bird calls using the spatial distribution of their component events; and (3) the preparation of acoustic maps for whole ecosystem analysis. This last task utilises the temporal distribution of events over a daily, monthly or yearly cycle.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection. The FOM is a popular measure of sTD accuracy, making it an ideal candiate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phe classifier to produce enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a non-linear gradient descent algorithm. Substantial FOM improvements of 11% relative are achieved on held-out evaluation data, demonstrating the generalisability of the approach.