25 resultados para hemoglobin variant
Resumo:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicate the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, referred to official statistics, shows its usefulness.
Resumo:
Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the above referred criteria, is the speed of execution, which is especially relevant when dealing with large data sets.
Resumo:
Dissertação para obtenção do grau de Mestre em Engenharia Electrotécnica - Ramo de Energia
Resumo:
Mestrado em Radiações Aplicadas às Tecnologias da Saúde - Ramo de especialização: Imagem por Ressonância Magnética
Resumo:
Mestrado em Controlo de Gestão e dos Negócios
Resumo:
Mestrado em Controlo de Gestão e dos Negócios
Resumo:
Relatório Final de Estágio apresentado à Escola Superior de Dança com vista à obtenção do Grau de Mestre em Ensino de Dança.
Resumo:
Finding the structure of a confined liquid crystal is a difficult task since both the density and order parameter profiles are nonuniform. Starting from a microscopic model and density-functional theory, one has to either (i) solve a nonlinear, integral Euler-Lagrange equation, or (ii) perform a direct multidimensional free energy minimization. The traditional implementations of both approaches are computationally expensive and plagued with convergence problems. Here, as an alternative, we introduce an unsupervised variant of the multilayer perceptron (MLP) artificial neural network for minimizing the free energy of a fluid of hard nonspherical particles confined between planar substrates of variable penetrability. We then test our algorithm by comparing its results for the structure (density-orientation profiles) and equilibrium free energy with those obtained by standard iterative solution of the Euler-Lagrange equations and with Monte Carlo simulation results. Very good agreement is found and the MLP method proves competitively fast, flexible, and refinable. Furthermore, it can be readily generalized to the richer experimental patterned-substrate geometries that are now experimentally realizable but very problematic to conventional theoretical treatments.
Resumo:
Trabalho de projeto realizado para obtenção do grau de Mestre em Engenharia Informática e de Computadores
Resumo:
Thesis to obtain the Master Degree in Electronics and Telecommunications Engineering