9 results for Mixed Type Variables Clustering
in Reposit
Abstract:
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables that are not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a number of clusters that both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. It is noteworthy that each mixture model is fitted to the data by maximum likelihood, excluding the external variables, which are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion. © 2014 Springer-Verlag Berlin Heidelberg.
Abstract:
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables that are not directly involved in clustering the data. An approach is proposed in the model-based clustering context to select a number of clusters that both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated with the external variables. It is noteworthy that each mixture model is fitted to the data by maximum likelihood, excluding the external variables, which are used only to select a relevant mixture model. Numerical experiments illustrate the promising behaviour of the derived criterion.
Abstract:
We currently live in a society characterized by information, by audience segmentation, and by that audience's growing need for experiences. In this society it is increasingly difficult for brands to position themselves in the market, which is why new advertising strategies are needed to communicate with consumers. Against this backdrop, Guerrilla Marketing can be a differentiating and effective tool, since it sets out to develop solutions tailored to audiences by implementing actions that are unexpected, bold, impactful and, above all, experience-generating. In this context, this work seeks to deepen knowledge of this new type of advertising communication by identifying its most used techniques and tactics and by studying its impact on the general public, understood as the virality of the action, measured in Plays, and consumer feedback, measured in Likes and Dislikes. Given the limited scientific research on this topic and the scarce literature available, this dissertation takes the form of a sequential mixed-methods exploratory study, based on a qualitative analysis followed by a quantitative analysis of 150 cases published online and available in the Ads of the World database. With the most common techniques and tactics identified, the results of the empirical work suggest a dependence between techniques or tactics and both the virality of the action and consumer feedback.
Abstract:
This paper addresses the problem of short-term hydro scheduling (STHS), particularly for a head-dependent hydro chain. We propose a novel mixed-integer nonlinear programming (MINLP) approach that treats hydroelectric power generation as a nonlinear function of water discharge and of the head. As a new contribution relative to earlier studies, we model the on-off behavior of the hydro plants using integer variables, in order to avoid water discharges in forbidden areas. Thus, an enhanced STHS is provided thanks to the more realistic modeling presented in this paper. Our approach has been applied successfully to solve a test case based on one of the Portuguese cascaded hydro systems, with a negligible computational time requirement.
Abstract:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicates the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, drawn from official statistics, shows its usefulness.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number from a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The results illustrate the capacity of the proposed algorithm to attain the true number of clusters while outperforming BIC and ICL in speed, which is especially relevant when dealing with large data sets.
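For orientation, the baseline that this abstract compares against (fit a set of candidate mixture models, then choose the number of clusters by BIC) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' MML-based algorithm: the function names, the smoothing constant `eps`, the random initialization, and the toy data are all hypothetical.

```python
import math
import random


def n_free_params(k, n_levels):
    """Free parameters: k-1 mixing weights plus k*(L_j - 1) per feature."""
    return (k - 1) + k * sum(L - 1 for L in n_levels)


def fit_em(data, n_levels, k, n_iter=200, seed=0, eps=1e-9):
    """EM for a k-component mixture of independent categorical features.

    data: list of rows of integer category codes (0 .. L_j - 1).
    Returns (weights, theta, loglik); loglik is the value computed in the
    last E-step. eps is an assumed smoothing constant that avoids log(0).
    """
    rng = random.Random(seed)
    n, d = len(data), len(n_levels)
    weights = [1.0 / k] * k
    # theta[c][j][v] = P(feature j = v | component c), random start
    theta = []
    for _ in range(k):
        comp = []
        for L in n_levels:
            raw = [rng.random() + 0.5 for _ in range(L)]
            total = sum(raw)
            comp.append([r / total for r in raw])
        theta.append(comp)

    loglik = float("-inf")
    for _ in range(n_iter):
        # E-step: responsibilities via a numerically stable log-sum-exp
        resp = []
        loglik = 0.0
        for row in data:
            logs = [
                math.log(weights[c])
                + sum(math.log(theta[c][j][row[j]]) for j in range(d))
                for c in range(k)
            ]
            top = max(logs)
            lse = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += lse
            resp.append([math.exp(v - lse) for v in logs])
        # M-step: smoothed weighted frequencies
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = (n_c + eps) / (n + k * eps)
            for j, L in enumerate(n_levels):
                counts = [eps] * L
                for row, r in zip(data, resp):
                    counts[row[j]] += r[c]
                total = sum(counts)
                theta[c][j] = [v / total for v in counts]
    return weights, theta, loglik


def bic(loglik, k, n_levels, n):
    """Schwarz criterion; lower is better in this sign convention."""
    return -2.0 * loglik + n_free_params(k, n_levels) * math.log(n)


def select_k_by_bic(data, n_levels, candidates):
    """Fit every candidate k separately and return (best_k, scores)."""
    scores = {}
    for k in candidates:
        _, _, loglik = fit_em(data, n_levels, k)
        scores[k] = bic(loglik, k, n_levels, len(data))
    return min(scores, key=scores.get), scores
```

Calling `select_k_by_bic(data, n_levels, [1, 2, 3])` fits one model per candidate, which is the repeated work that a single integrated estimation-and-selection algorithm is meant to avoid.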
Abstract:
Biosignal analysis has become widespread, extending beyond its typical use in clinical settings. Electrocardiography (ECG) plays a central role in patient monitoring as a diagnostic tool in today's medicine and as an emerging biometric trait. In this paper we adopt a consensus clustering approach for the unsupervised analysis of ECG-based biometric records. This type of analysis highlights natural groups within the population under investigation, which can be correlated with ground truth information in order to gain more insight into the data. Preliminary results are promising, since meaningful clusters are extracted from the population under analysis. © 2014 EURASIP.
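The abstract does not say which consensus method was used. One common scheme, evidence accumulation over a co-association matrix, can be sketched as follows. The function names, the 0.5 threshold, and the connected-components extraction step are assumptions made for illustration only.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster.

    partitions: list of label lists, one per base clustering, all of the
    same length. Label values need not agree across partitions.
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_labels(coassoc, threshold=0.5):
    """Link items whose co-association exceeds the threshold, then label
    the connected components in order of first appearance."""
    n = len(coassoc)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels
```

For example, `consensus_labels(co_association([[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]))` groups the first two items together and the last two together, because each of those pairs co-occurs in more than half of the base partitions.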
Abstract:
An improved class of Boussinesq systems of arbitrary order, using a wave surface elevation and velocity potential formulation, is derived. Dissipative effects and wave generation due to a time-varying seabed are included. Thus, high-order source functions are considered. To reduce the system order while maintaining some dispersive characteristics of the higher-order models, an extra $O(\mu^{2n+2})$ term ($n \in \mathbb{N}$) is included in the velocity potential expansion. We introduce a nonlocal continuous/discontinuous Galerkin FEM with inner penalty terms to calculate the numerical solutions of the improved fourth-order models. The discretization of the spatial variables is made using continuous P2 Lagrange elements. A predictor-corrector scheme with an initialization given by an explicit Runge-Kutta method is also used for the time-variable integration. Moreover, a CFL-type condition is deduced for the linear problem with a constant bathymetry. To demonstrate the applicability of the model, we considered several test cases. Improved stability is achieved.
Abstract:
This work provides an assessment of layerwise mixed models using a least-squares formulation for the coupled electromechanical static analysis of multilayered plates. In agreement with three-dimensional (3D) exact solutions, due to compatibility and equilibrium conditions at the layer interfaces, certain mechanical and electrical variables must fulfill interlaminar $C^0$ continuity, namely: displacements, in-plane strains, transverse stresses, electric potential, in-plane electric field components and transverse electric displacement (if no potential is imposed between layers). Hence, two layerwise mixed least-squares models are investigated here, with two different sets of chosen independent variables: Model A, developed earlier, fulfills a priori the interlaminar $C^0$ continuity of all the aforementioned variables, taken as independent variables; Model B, newly developed here, reduces the number of independent variables but still fulfills a priori the interlaminar $C^0$ continuity of displacements, transverse stresses, electric potential and transverse electric displacement, taken as independent variables. The predictive capabilities of both models are assessed by comparison with 3D exact solutions, considering multilayered piezoelectric composite plates of different aspect ratios, under an applied transverse load or surface potential. It is shown that both models are able to provide an accurate quasi-3D description of the static electromechanical analysis of multilayered plates for all aspect ratios.