921 resultados para Pattern Recognition, Visual
Resumo:
Chapter in Book Proceedings with Peer Review First Iberian Conference, IbPRIA 2003, Puerto de Andratx, Mallorca, Spain, JUne 4-6, 2003. Proceedings
Resumo:
Chapter in Book Proceedings with Peer Review First Iberian Conference, IbPRIA 2003, Puerto de Andratx, Mallorca, Spain, JUne 4-6, 2003. Proceedings
Resumo:
Electrocardiographic (ECG) signals are emerging as a recent trend in the field of biometrics. In this paper, we propose a novel ECG biometric system that combines clustering and classification methodologies. Our approach is based on dominant-set clustering, and provides a framework for outlier removal and template selection. It enhances the typical workflows, by making them better suited to new ECG acquisition paradigms that use fingers or hand palms, which lead to signals with lower signal to noise ratio, and more prone to noise artifacts. Preliminary results show the potential of the approach, helping to further validate the highly usable setups and ECG signals as a complementary biometric modality.
Resumo:
A classical application of biosignal analysis has been the psychophysiological detection of deception, also known as the polygraph test, which is currently a part of standard practices of law enforcement agencies and several other institutions worldwide. Although its validity is far from gathering consensus, the underlying psychophysiological principles are still an interesting add-on for more informal applications. In this paper we present an experimental off-the-person hardware setup, propose a set of feature extraction criteria and provide a comparison of two classification approaches, targeting the detection of deception in the context of a role-playing interactive multimedia environment. Our work is primarily targeted at recreational use in the context of a science exhibition, where the main goal is to present basic concepts related with knowledge discovery, biosignal analysis and psychophysiology in an educational way, using techniques that are simple enough to be understood by children of different ages. Nonetheless, this setting will also allow us to build a significant data corpus, annotated with ground-truth information, and collected with non-intrusive sensors, enabling more advanced research on the topic. Experimental results have shown interesting findings and provided useful guidelines for future work. Pattern Recognition
Resumo:
Many learning problems require handling high dimensional datasets with a relatively small number of instances. Learning algorithms are thus confronted with the curse of dimensionality, and need to address it in order to be effective. Examples of these types of data include the bag-of-words representation in text classification problems and gene expression data for tumor detection/classification. Usually, among the high number of features characterizing the instances, many may be irrelevant (or even detrimental) for the learning tasks. It is thus clear that there is a need for adequate techniques for feature representation, reduction, and selection, to improve both the classification accuracy and the memory requirements. In this paper, we propose combined unsupervised feature discretization and feature selection techniques, suitable for medium and high-dimensional datasets. The experimental results on several standard datasets, with both sparse and dense features, show the efficiency of the proposed techniques as well as improvements over previous related techniques.
Resumo:
Feature selection is a central problem in machine learning and pattern recognition. On large datasets (in terms of dimension and/or number of instances), using search-based or wrapper techniques can be cornputationally prohibitive. Moreover, many filter methods based on relevance/redundancy assessment also take a prohibitively long time on high-dimensional. datasets. In this paper, we propose efficient unsupervised and supervised feature selection/ranking filters for high-dimensional datasets. These methods use low-complexity relevance and redundancy criteria, applicable to supervised, semi-supervised, and unsupervised learning, being able to act as pre-processors for computationally intensive methods to focus their attention on smaller subsets of promising features. The experimental results, with up to 10(5) features, show the time efficiency of our methods, with lower generalization error than state-of-the-art techniques, while being dramatically simpler and faster.
Resumo:
A procura de padrões nos dados de modo a formar grupos é conhecida como aglomeração de dados ou clustering, sendo uma das tarefas mais realizadas em mineração de dados e reconhecimento de padrões. Nesta dissertação é abordado o conceito de entropia e são usados algoritmos com critérios entrópicos para fazer clustering em dados biomédicos. O uso da entropia para efetuar clustering é relativamente recente e surge numa tentativa da utilização da capacidade que a entropia possui de extrair da distribuição dos dados informação de ordem superior, para usá-la como o critério na formação de grupos (clusters) ou então para complementar/melhorar algoritmos existentes, numa busca de obtenção de melhores resultados. Alguns trabalhos envolvendo o uso de algoritmos baseados em critérios entrópicos demonstraram resultados positivos na análise de dados reais. Neste trabalho, exploraram-se alguns algoritmos baseados em critérios entrópicos e a sua aplicabilidade a dados biomédicos, numa tentativa de avaliar a adequação destes algoritmos a este tipo de dados. Os resultados dos algoritmos testados são comparados com os obtidos por outros algoritmos mais “convencionais" como o k-médias, os algoritmos de spectral clustering e um algoritmo baseado em densidade.
Resumo:
Na atualidade, está a emergir um novo paradigma de interação, designado por Natural User Interface (NUI) para reconhecimento de gestos produzidos com o corpo do utilizador. O dispositivo de interação Microsoft Kinect foi inicialmente concebido para controlo de videojogos, para a consola Xbox360. Este dispositivo demonstra ser uma aposta viável para explorar outras áreas, como a do apoio ao processo de ensino e de aprendizagem para crianças do ensino básico. O protótipo desenvolvido visa definir um modo de interação baseado no desenho de letras no ar, e realizar a interpretação dos símbolos desenhados, usando os reconhecedores de padrões Kernel Discriminant Analysis (KDA), Support Vector Machines (SVM) e $N. O desenvolvimento deste projeto baseou-se no estudo dos diferentes dispositivos NUI disponíveis no mercado, bibliotecas de desenvolvimento NUI para este tipo de dispositivos e algoritmos de reconhecimento de padrões. Com base nos dois elementos iniciais, foi possível obter uma visão mais concreta de qual o hardware e software disponíveis indicados à persecução do objetivo pretendido. O reconhecimento de padrões constitui um tema bastante extenso e complexo, de modo que foi necessária a seleção de um conjunto limitado deste tipo de algoritmos, realizando os respetivos testes por forma a determinar qual o que melhor se adequava ao objetivo pretendido. Aplicando as mesmas condições aos três algoritmos de reconhecimento de padrões permitiu avaliar as suas capacidades e determinar o $N como o que apresentou maior eficácia no reconhecimento. Por último, tentou-se averiguar a viabilidade do protótipo desenvolvido, tendo sido testado num universo de elementos de duas faixas etárias para determinar a capacidade de adaptação e aprendizagem destes dois grupos. Neste estudo, constatou-se um melhor desempenho inicial ao modo de interação do grupo de idade mais avançada. Contudo, o grupo mais jovem foi revelando uma evolutiva capacidade de adaptação a este modo de interação melhorando progressivamente os resultados.
Resumo:
Dissertação apresentada como requisito parcial para obtenção do grau de Mestre em Estatística e Gestão de Informação
Resumo:
In the present paper we assess the performance of information-theoretic inspired risks functionals in multilayer perceptrons with reference to the two most popular ones, Mean Square Error and Cross-Entropy. The information-theoretic inspired risks, recently proposed, are: HS and HR2 are, respectively, the Shannon and quadratic Rényi entropies of the error; ZED is a risk reflecting the error density at zero errors; EXP is a generalized exponential risk, able to mimic a wide variety of risk functionals, including the information-thoeretic ones. The experiments were carried out with multilayer perceptrons on 35 public real-world datasets. All experiments were performed according to the same protocol. The statistical tests applied to the experimental results showed that the ubiquitous mean square error was the less interesting risk functional to be used by multilayer perceptrons. Namely, mean square error never achieved a significantly better classification performance than competing risks. Cross-entropy and EXP were the risks found by several tests to be significantly better than their competitors. Counts of significantly better and worse risks have also shown the usefulness of HS and HR2 for some datasets.
Resumo:
Feature discretization (FD) techniques often yield adequate and compact representations of the data, suitable for machine learning and pattern recognition problems. These representations usually decrease the training time, yielding higher classification accuracy while allowing for humans to better understand and visualize the data, as compared to the use of the original features. This paper proposes two new FD techniques. The first one is based on the well-known Linde-Buzo-Gray quantization algorithm, coupled with a relevance criterion, being able perform unsupervised, supervised, or semi-supervised discretization. The second technique works in supervised mode, being based on the maximization of the mutual information between each discrete feature and the class label. Our experimental results on standard benchmark datasets show that these techniques scale up to high-dimensional data, attaining in many cases better accuracy than existing unsupervised and supervised FD approaches, while using fewer discretization intervals.
Resumo:
In machine learning and pattern recognition tasks, the use of feature discretization techniques may have several advantages. The discretized features may hold enough information for the learning task at hand, while ignoring minor fluctuations that are irrelevant or harmful for that task. The discretized features have more compact representations that may yield both better accuracy and lower training time, as compared to the use of the original features. However, in many cases, mainly with medium and high-dimensional data, the large number of features usually implies that there is some redundancy among them. Thus, we may further apply feature selection (FS) techniques on the discrete data, keeping the most relevant features, while discarding the irrelevant and redundant ones. In this paper, we propose relevance and redundancy criteria for supervised feature selection techniques on discrete data. These criteria are applied to the bin-class histograms of the discrete features. The experimental results, on public benchmark data, show that the proposed criteria can achieve better accuracy than widely used relevance and redundancy criteria, such as mutual information and the Fisher ratio.
Resumo:
This paper addresses the estimation of surfaces from a set of 3D points using the unified framework described in [1]. This framework proposes the use of competitive learning for curve estimation, i.e., a set of points is defined on a deformable curve and they all compete to represent the available data. This paper extends the use of the unified framework to surface estimation. It o shown that competitive learning performes better than snakes, improving the model performance in the presence of concavities and allowing to desciminate close surfaces. The proposed model is evaluated in this paper using syntheticdata and medical images (MRI and ultrasound images).
Resumo:
The Evidence Accumulation Clustering (EAC) paradigm is a clustering ensemble method which derives a consensus partition from a collection of base clusterings obtained using different algorithms. It collects from the partitions in the ensemble a set of pairwise observations about the co-occurrence of objects in a same cluster and it uses these co-occurrence statistics to derive a similarity matrix, referred to as co-association matrix. The Probabilistic Evidence Accumulation for Clustering Ensembles (PEACE) algorithm is a principled approach for the extraction of a consensus clustering from the observations encoded in the co-association matrix based on a probabilistic model for the co-association matrix parameterized by the unknown assignments of objects to clusters. In this paper we extend the PEACE algorithm by deriving a consensus solution according to a MAP approach with Dirichlet priors defined for the unknown probabilistic cluster assignments. In particular, we study the positive regularization effect of Dirichlet priors on the final consensus solution with both synthetic and real benchmark data.
Resumo:
In hyperspectral imagery a pixel typically consists mixture of spectral signatures of reference substances, also called endmembers. Linear spectral mixture analysis, or linear unmixing, aims at estimating the number of endmembers, their spectral signatures, and their abundance fractions. This paper proposes a framework for hyperpsectral unmixing. A blind method (SISAL) is used for the estimation of the unknown endmember signature and their abundance fractions. This method solve a non-convex problem by a sequence of augmented Lagrangian optimizations, where the positivity constraints, forcing the spectral vectors to belong to the convex hull of the endmember signatures, are replaced by soft constraints. The proposed framework simultaneously estimates the number of endmembers present in the hyperspectral image by an algorithm based on the minimum description length (MDL) principle. Experimental results on both synthetic and real hyperspectral data demonstrate the effectiveness of the proposed algorithm.