825 resultados para means clustering
Resumo:
This article is is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Attribution-NonCommercial (CC BY-NC) license lets others remix, tweak, and build upon work non-commercially, and although the new works must also acknowledge & be non-commercial.
Resumo:
TPM Vol. 21, No. 4, December 2014, 435-447 – Special Issue © 2014 Cises.
Resumo:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Resumo:
Com a crescente geração, armazenamento e disseminação da informação nos últimos anos, o anterior problema de falta de informação transformou-se num problema de extracção do conhecimento útil a partir da informação disponível. As representações visuais da informação abstracta têm sido utilizadas para auxiliar a interpretação os dados e para revelar padrões de outra forma escondidos. A visualização de informação procura aumentar a cognição humana aproveitando as capacidades visuais humanas, de forma a tornar perceptível a informação abstracta, fornecendo os meios necessários para que um humano possa absorver quantidades crescentes de informação, com as suas capacidades de percepção. O objectivo das técnicas de agrupamento de dados consiste na divisão de um conjunto de dados em vários grupos, em que dados semelhantes são colocados no mesmo grupo e dados dissemelhantes em grupos diferentes. Mais especificamente, o agrupamento de dados com restrições tem o intuito de incorporar conhecimento a priori no processo de agrupamento de dados, com o objectivo de aumentar a qualidade do agrupamento de dados e, simultaneamente, encontrar soluções apropriadas a tarefas e interesses específicos. Nesta dissertação é estudado a abordagem de Agrupamento de Dados Visual Interactivo que permite ao utilizador, através da interacção com uma representação visual da informação, incorporar o seu conhecimento prévio acerca do domínio de dados, de forma a influenciar o agrupamento resultante para satisfazer os seus objectivos. Esta abordagem combina e estende técnicas de visualização interactiva de informação, desenho de grafos de forças direccionadas e agrupamento de dados com restrições. Com o propósito de avaliar o desempenho de diferentes estratégias de interacção com o utilizador, são efectuados estudos comparativos utilizando conjuntos de dados sintéticos e reais.
Resumo:
Although Mobility is a trendy and an important keyword in education matters, it has been a knowledge tool since the beginning of times, namely the Classical Antiquity, when students were moving from place to place following the masters. Over the time, different types of academic mobility can be found and this tool has been taken both by the education and business sector as almost a compulsory process since the world has gone global. Mobility is, of course, not an end but a means. And as far as academic mobility is concerned it is above all a means to get knowledge, being it theoretical or practical. But why does it still make sense to move from one place to another to get knowledge if never as before we have heaps of information and experiences available around us, either through personal contacts, in books, journals, newspapers or online? With this paper we intend to discuss the purpose of international mobility in the global world of the 21st century as a means to the development of world citizens able to live, work and learn in different and unfamiliar contexts. Based on our own experience as International Coordinator in a Higher Education Institution (HEI) over the last 8 years, on the latest research on academic mobility and still on studies on employability we will show how and why academic mobility can develop skills either in students or in other academic staff that are hardly possible to build in a classroom, or in a non-mobile academic or professional experience and that are highly valued by employers and society in general.
Resumo:
Clustering analysis is a useful tool to detect and monitor disease patterns and, consequently, to contribute for an effective population disease management. Portugal has the highest incidence of tuberculosis in the European Union (in 2012, 21.6 cases per 100.000 inhabitants), although it has been decreasing consistently. Two critical PTB (Pulmonary Tuberculosis) areas, metropolitan Oporto and metropolitan Lisbon regions, were previously identified through spatial and space-time clustering for PTB incidence rate and risk factors. Identifying clusters of temporal trends can further elucidate policy makers about municipalities showing a faster or a slower TB control improvement.
Resumo:
Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicate the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, referred to official statistics, shows its usefulness.
Resumo:
Research on cluster analysis for categorical data continues to develop, new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters is done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length criterion (MML) to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the (Figueiredo and Jain, 2002) approach. The novelty of the approach rests on the integration of the model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of Bayesian Information Criterion (BIC) (Schwarz, 1978) and Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of cluster while outperforming BIC and ICL since it is faster, which is especially relevant when dealing with large data sets.
Resumo:
In data clustering, the problem of selecting the subset of most relevant features from the data has been an active research topic. Feature selection for clustering is a challenging task due to the absence of class labels for guiding the search for relevant features. Most methods proposed for this goal are focused on numerical data. In this work, we propose an approach for clustering and selecting categorical features simultaneously. We assume that the data originate from a finite mixture of multinomial distributions and implement an integrated expectation-maximization (EM) algorithm that estimates all the parameters of the model and selects the subset of relevant features simultaneously. The results obtained on synthetic data illustrate the performance of the proposed approach. An application to real data, referred to official statistics, shows its usefulness.
Resumo:
Although power-line communication (PLC) is not a new technology, its use to support communication with timing requirements is still the focus of ongoing research. Recently, a new infrastructure was presented, intended for communication using power lines from a central location to geographically dispersed nodes using inexpensive devices. This new infrastructure uses a two-level hierarchical power-line system, together with an IP-based network. Within this infrastructure, in order to provide end-toend communication through the two levels of the powerline system, it is necessary to fully understand the behaviour of the underlying network layers. The masterslave behaviour of the PLC MAC, together with the inherent dynamic topology of power-line networks are important issues that must be fully characterised. Therefore, in this paper we present a simulation model which is being used to study and characterise the behaviour of power-line communication.
Resumo:
This paper focus on a demand response model analysis in a smart grid context considering a contingency scenario. A fuzzy clustering technique is applied on the developed demand response model and an analysis is performed for the contingency scenario. Model considerations and architecture are described. The demand response developed model aims to support consumers decisions regarding their consumption needs and possible economic benefits.
Resumo:
This paper analyses earthquake data in the perspective of dynamical systems and its Pseudo Phase Plane representation. The seismic data is collected from the Bulletin of the International Seismological Centre. The geological events are characterised by their magnitude and geographical location and described by means of time series of sequences of Dirac impulses. Fifty groups of data series are considered, according to the Flinn-Engdahl seismic regions of Earth. For each region, Pearson’s correlation coefficient is used to find the optimal time delay for reconstructing the Pseudo Phase Plane. The Pseudo Phase Plane plots are then analysed and characterised.
Resumo:
This paper reports on the analysis of tidal breathing patterns measured during noninvasive forced oscillation lung function tests in six individual groups. The three adult groups were healthy, with prediagnosed chronic obstructive pulmonary disease, and with prediagnosed kyphoscoliosis, respectively. The three children groups were healthy, with prediagnosed asthma, and with prediagnosed cystic fibrosis, respectively. The analysis is applied to the pressure–volume curves and the pseudophaseplane loop by means of the box-counting method, which gives a measure of the area within each loop. The objective was to verify if there exists a link between the area of the loops, power-law patterns, and alterations in the respiratory structure with disease. We obtained statistically significant variations between the data sets corresponding to the six groups of patients, showing also the existence of power-law patterns. Our findings support the idea that the respiratory system changes with disease in terms of airway geometry and tissue parameters, leading, in turn, to variations in the fractal dimension of the respiratory tree and its dynamics.
Resumo:
The goal of this study is to analyze the dynamical properties of financial data series from nineteen worldwide stock market indices (SMI) during the period 1995–2009. SMI reveal a complex behavior that can be explored since it is available a considerable volume of data. In this paper is applied the window Fourier transform and methods of fractional calculus. The results reveal classification patterns typical of fractional order systems.
Resumo:
Biosignals analysis has become widespread, upstaging their typical use in clinical settings. Electrocardiography (ECG) plays a central role in patient monitoring as a diagnosis tool in today's medicine and as an emerging biometric trait. In this paper we adopt a consensus clustering approach for the unsupervised analysis of an ECG-based biometric records. This type of analysis highlights natural groups within the population under investigation, which can be correlated with ground truth information in order to gain more insights about the data. Preliminary results are promising, for meaningful clusters are extracted from the population under analysis. © 2014 EURASIP.