15 resultados para Hierarchical Clustering Model
em Instituto Politécnico do Porto, Portugal
Resumo:
With the electricity market liberalization, distribution and retail companies are looking for better market strategies based on adequate information upon the consumption patterns of its electricity customers. In this environment all consumers are free to choose their electricity supplier. A fair insight on the customer´s behaviour will permit the definition of specific contract aspects based on the different consumption patterns. In this paper Data Mining (DM) techniques are applied to electricity consumption data from a utility client’s database. To form the different customer´s classes, and find a set of representative consumption patterns, we have used the Two-Step algorithm which is a hierarchical clustering algorithm. Each consumer class will be represented by its load profile resulting from the clustering operation. Next, to characterize each consumer class a classification model will be constructed with the C5.0 classification algorithm.
Resumo:
This paper describes a methodology that was developed for the classification of Medium Voltage (MV) electricity customers. Starting from a sample of data bases, resulting from a monitoring campaign, Data Mining (DM) techniques are used in order to discover a set of a MV consumer typical load profile and, therefore, to extract knowledge regarding to the electric energy consumption patterns. In first stage, it was applied several hierarchical clustering algorithms and compared the clustering performance among them using adequacy measures. In second stage, a classification model was developed in order to allow classifying new consumers in one of the obtained clusters that had resulted from the previously process. Finally, the interpretation of the discovered knowledge are presented and discussed.
Resumo:
This paper deals with the establishment of a characterization methodology of electric power profiles of medium voltage (MV) consumers. The characterization is supported on the data base knowledge discovery process (KDD). Data Mining techniques are used with the purpose of obtaining typical load profiles of MV customers and specific knowledge of their customers’ consumption habits. In order to form the different customers’ classes and to find a set of representative consumption patterns, a hierarchical clustering algorithm and a clustering ensemble combination approach (WEACS) are used. Taking into account the typical consumption profile of the class to which the customers belong, new tariff options were defined and new energy coefficients prices were proposed. Finally, and with the results obtained, the consequences that these will have in the interaction between customer and electric power suppliers are analyzed.
Resumo:
This paper presents an integrated system that helps both retail companies and electricity consumers on the definition of the best retail contracts and tariffs. This integrated system is composed by a Decision Support System (DSS) based on a Consumer Characterization Framework (CCF). The CCF is based on data mining techniques, applied to obtain useful knowledge about electricity consumers from large amounts of consumption data. This knowledge is acquired following an innovative and systematic approach able to identify different consumers’ classes, represented by a load profile, and its characterization using decision trees. The framework generates inputs to use in the knowledge base and in the database of the DSS. The rule sets derived from the decision trees are integrated in the knowledge base of the DSS. The load profiles together with the information about contracts and electricity prices form the database of the DSS. This DSS is able to perform the classification of different consumers, present its load profile and test different electricity tariffs and contracts. The final outputs of the DSS are a comparative economic analysis between different contracts and advice about the most economic contract to each consumer class. The presentation of the DSS is completed with an application example using a real data base of consumers from the Portuguese distribution company.
Resumo:
This paper analyses forest fires in the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information of events for Portugal, during the period from 1980 up to 2012. The data is analysed in an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed on the map forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns.
Resumo:
This paper analyses forest fires in the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information of events for Portugal, during the period from 1980 up to 2012. The data is analysed in an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed on the map forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns.
Resumo:
This paper studies forest fires from the perspective of dynamical systems. Burnt area, precipitation and atmospheric temperatures are interpreted as state variables of a complex system and the correlations between them are investigated by means of different mathematical tools. First, we use mutual information to reveal potential relationships in the data. Second, we adopt the state space portrait to characterize the system’s behavior. Third, we compare the annual state space curves and we apply clustering and visualization tools to unveil long-range patterns. We use forest fire data for Portugal, covering the years 1980–2003. The territory is divided into two regions (North and South), characterized by different climates and vegetation. The adopted methodology represents a new viewpoint in the context of forest fires, shedding light on a complex phenomenon that needs to be better understood in order to mitigate its devastating consequences, at both economical and environmental levels.
Resumo:
This paper studies the statistical distributions of worldwide earthquakes from year 1963 up to year 2012. A Cartesian grid, dividing Earth into geographic regions, is considered. Entropy and the Jensen–Shannon divergence are used to analyze and compare real-world data. Hierarchical clustering and multi-dimensional scaling techniques are adopted for data visualization. Entropy-based indices have the advantage of leading to a single parameter expressing the relationships between the seismic data. Classical and generalized (fractional) entropy and Jensen–Shannon divergence are tested. The generalized measures lead to a clear identification of patterns embedded in the data and contribute to better understand earthquake distributions.
Resumo:
Proceeding of the 3rd International Conference on Fractional Systems and Signals, at Ghent, Belgium
Resumo:
In this paper we study several natural and man-made complex phenomena in the perspective of dynamical systems. For each class of phenomena, the system outputs are time-series records obtained in identical conditions. The time-series are viewed as manifestations of the system behavior and are processed for analyzing the system dynamics. First, we use the Fourier transform to process the data and we approximate the amplitude spectra by means of power law functions. We interpret the power law parameters as a phenomenological signature of the system dynamics. Second, we adopt the techniques of non-hierarchical clustering and multidimensional scaling to visualize hidden relationships between the complex phenomena. Third, we propose a vector field based analogy to interpret the patterns unveiled by the PL parameters.
Resumo:
The last 40 years of the world economy are analyzed by means of computer visualization methods. Multidimensional scaling and the hierarchical clustering tree techniques are used. The current Western downturn in favor of Asian partners may still be reversed in the coming decades.
Resumo:
Composition is a practice of key importance in software engineering. When real-time applications are composed it is necessary that their timing properties (such as meeting the deadlines) are guaranteed. The composition is performed by establishing an interface between the application and the physical platform. Such an interface does typically contain information about the amount of computing capacity needed by the application. In multiprocessor platforms, the interface should also present information about the degree of parallelism. Recently there have been quite a few interface proposals. However, they are either too complex to be handled or too pessimistic.In this paper we propose the Generalized Multiprocessor Periodic Resource model (GMPR) that is strictly superior to the MPR model without requiring a too detailed description. We describe a method to generate the interface from the application specification. All these methods have been implemented in Matlab routines that are publicly available.
Resumo:
The problem of selecting suppliers/partners is a crucial and important part in the process of decision making for companies that intend to perform competitively in their area of activity. The selection of supplier/partner is a time and resource-consuming task that involves data collection and a careful analysis of the factors that can positively or negatively influence the choice. Nevertheless it is a critical process that affects significantly the operational performance of each company. In this work, there were identified five broad selection criteria: Quality, Financial, Synergies, Cost, and Production System. Within these criteria, it was also included five sub-criteria. After the identification criteria, a survey was elaborated and companies were contacted in order to understand which factors have more weight in their decisions to choose the partners. Interpreted the results and processed the data, it was adopted a model of linear weighting to reflect the importance of each factor. The model has a hierarchical structure and can be applied with the Analytic Hierarchy Process (AHP) method or Value Analysis. The goal of the paper it's to supply a selection reference model that can represent an orientation/pattern for a decision making on the suppliers/partners selection process
Resumo:
Epidemiologic studies have reported an inverse association between dairy product consumption and cardiometabolic risk factors in adults, but this relation is relatively unexplored in adolescents. We hypothesized that a higher dairy product intake is associated with lower cardiometabolic risk factor clustering in adolescents. To test this hypothesis, a cross-sectional study was conducted with 494 adolescents aged 15 to 18 years from the Azorean Archipelago, Portugal. We measured fasting glucose, insulin, total cholesterol, high-density lipoprotein cholesterol, triglycerides, systolic blood pressure, body fat, and cardiorespiratory fitness. We also calculated homeostatic model assessment and total cholesterol/high-density lipoprotein cholesterol ratio. For each one of these variables, a z score was computed using age and sex. A cardiometabolic risk score (CMRS) was constructed by summing up the z scores of all individual risk factors. High risk was considered to exist when an individual had at least 1 SD from this score. Diet was evaluated using a food frequency questionnaire, and the intake of total dairy (included milk, yogurt, and cheese), milk, yogurt, and cheese was categorized as low (equal to or below the median of the total sample) or “appropriate” (above the median of the total sample).The association between dairy product intake and CMRS was evaluated using separate logistic regression, and the results were adjusted for confounders. Adolescents with high milk intake had lower CMRS, compared with those with low intake (10.6% vs 18.1%, P = .018). Adolescents with appropriate milk intake were less likely to have high CMRS than those with low milk intake (odds ratio, 0.531; 95% confidence interval, 0.302-0.931). No association was found between CMRS and total dairy, yogurt, and cheese intake. Only milk intake seems to be inversely related to CMRS in adolescents.
Resumo:
O objetivo desta dissertação foi estudar um conjunto de empresas cotadas na bolsa de valores de Lisboa, para identificar aquelas que têm um comportamento semelhante ao longo do tempo. Para isso utilizamos algoritmos de Clustering tais como K-Means, PAM, Modelos hierárquicos, Funny e C-Means tanto com a distância euclidiana como com a distância de Manhattan. Para selecionar o melhor número de clusters identificado por cada um dos algoritmos testados, recorremos a alguns índices de avaliação/validação de clusters como o Davies Bouldin e Calinski-Harabasz entre outros.