16 resultados para means clustering
em Instituto Politécnico do Porto, Portugal
Resumo:
In recent decades, all over the world, competition in the electric power sector has deeply changed the way this sector’s agents play their roles. In most countries, electric process deregulation was conducted in stages, beginning with the clients of higher voltage levels and with larger electricity consumption, and later extended to all electrical consumers. The sector liberalization and the operation of competitive electricity markets were expected to lower prices and improve quality of service, leading to greater consumer satisfaction. Transmission and distribution remain noncompetitive business areas, due to the large infrastructure investments required. However, the industry has yet to clearly establish the best business model for transmission in a competitive environment. After generation, the electricity needs to be delivered to the electrical system nodes where demand requires it, taking into consideration transmission constraints and electrical losses. If the amount of power flowing through a certain line is close to or surpasses the safety limits, then cheap but distant generation might have to be replaced by more expensive closer generation to reduce the exceeded power flows. In a congested area, the optimal price of electricity rises to the marginal cost of the local generation or to the level needed to ration demand to the amount of available electricity. Even without congestion, some power will be lost in the transmission system through heat dissipation, so prices reflect that it is more expensive to supply electricity at the far end of a heavily loaded line than close to an electric power generation. Locational marginal pricing (LMP), resulting from bidding competition, represents electrical and economical values at nodes or in areas that may provide economical indicator signals to the market agents. This article proposes a data-mining-based methodology that helps characterize zonal prices in real power transmission networks. To test our methodology, we used an LMP database from the California Independent System Operator for 2009 to identify economical zones. (CAISO is a nonprofit public benefit corporation charged with operating the majority of California’s high-voltage wholesale power grid.) To group the buses into typical classes that represent a set of buses with the approximate LMP value, we used two-step and k-means clustering algorithms. By analyzing the various LMP components, our goal was to extract knowledge to support the ISO in investment and network-expansion planning.
Resumo:
A procura de padrões nos dados de modo a formar grupos é conhecida como aglomeração de dados ou clustering, sendo uma das tarefas mais realizadas em mineração de dados e reconhecimento de padrões. Nesta dissertação é abordado o conceito de entropia e são usados algoritmos com critérios entrópicos para fazer clustering em dados biomédicos. O uso da entropia para efetuar clustering é relativamente recente e surge numa tentativa da utilização da capacidade que a entropia possui de extrair da distribuição dos dados informação de ordem superior, para usá-la como o critério na formação de grupos (clusters) ou então para complementar/melhorar algoritmos existentes, numa busca de obtenção de melhores resultados. Alguns trabalhos envolvendo o uso de algoritmos baseados em critérios entrópicos demonstraram resultados positivos na análise de dados reais. Neste trabalho, exploraram-se alguns algoritmos baseados em critérios entrópicos e a sua aplicabilidade a dados biomédicos, numa tentativa de avaliar a adequação destes algoritmos a este tipo de dados. Os resultados dos algoritmos testados são comparados com os obtidos por outros algoritmos mais “convencionais" como o k-médias, os algoritmos de spectral clustering e um algoritmo baseado em densidade.
Resumo:
O objetivo desta dissertação foi estudar um conjunto de empresas cotadas na bolsa de valores de Lisboa, para identificar aquelas que têm um comportamento semelhante ao longo do tempo. Para isso utilizamos algoritmos de Clustering tais como K-Means, PAM, Modelos hierárquicos, Funny e C-Means tanto com a distância euclidiana como com a distância de Manhattan. Para selecionar o melhor número de clusters identificado por cada um dos algoritmos testados, recorremos a alguns índices de avaliação/validação de clusters como o Davies Bouldin e Calinski-Harabasz entre outros.
Resumo:
With the electricity market liberalization, the distribution and retail companies are looking for better market strategies based on adequate information upon the consumption patterns of its electricity consumers. A fair insight on the consumers’ behavior will permit the definition of specific contract aspects based on the different consumption patterns. In order to form the different consumers’ classes, and find a set of representative consumption patterns we use electricity consumption data from a utility client’s database and two approaches: Two-step clustering algorithm and the WEACS approach based on evidence accumulation (EAC) for combining partitions in a clustering ensemble. While EAC uses a voting mechanism to produce a co-association matrix based on the pairwise associations obtained from N partitions and where each partition has equal weight in the combination process, the WEACS approach uses subsampling and weights differently the partitions. As a complementary step to the WEACS approach, we combine the partitions obtained in the WEACS approach with the ALL clustering ensemble construction method and we use the Ward Link algorithm to obtain the final data partition. The characterization of the obtained consumers’ clusters was performed using the C5.0 classification algorithm. Experiment results showed that the WEACS approach leads to better results than many other clustering approaches.
Resumo:
The present research paper presents five different clustering methods to identify typical load profiles of medium voltage (MV) electricity consumers. These methods are intended to be used in a smart grid environment to extract useful knowledge about customer’s behaviour. The obtained knowledge can be used to support a decision tool, not only for utilities but also for consumers. Load profiles can be used by the utilities to identify the aspects that cause system load peaks and enable the development of specific contracts with their customers. The framework presented throughout the paper consists in several steps, namely the pre-processing data phase, clustering algorithms application and the evaluation of the quality of the partition, which is supported by cluster validity indices. The process ends with the analysis of the discovered knowledge. To validate the proposed framework, a case study with a real database of 208 MV consumers is used.
Resumo:
The growing importance and influence of new resources connected to the power systems has caused many changes in their operation. Environmental policies and several well know advantages have been made renewable based energy resources largely disseminated. These resources, including Distributed Generation (DG), are being connected to lower voltage levels where Demand Response (DR) must be considered too. These changes increase the complexity of the system operation due to both new operational constraints and amounts of data to be processed. Virtual Power Players (VPP) are entities able to manage these resources. Addressing these issues, this paper proposes a methodology to support VPP actions when these act as a Curtailment Service Provider (CSP) that provides DR capacity to a DR program declared by the Independent System Operator (ISO) or by the VPP itself. The amount of DR capacity that the CSP can assure is determined using data mining techniques applied to a database which is obtained for a large set of operation scenarios. The paper includes a case study based on 27,000 scenarios considering a diversity of distributed resources in a 33 bus distribution network.
Resumo:
A methodology based on data mining techniques to support the analysis of zonal prices in real transmission networks is proposed in this paper. The mentioned methodology uses clustering algorithms to group the buses in typical classes that include a set of buses with similar LMP values. Two different clustering algorithms have been used to determine the LMP clusters: the two-step and K-means algorithms. In order to evaluate the quality of the partition as well as the best performance algorithm adequacy measurements indices are used. The paper includes a case study using a Locational Marginal Prices (LMP) data base from the California ISO (CAISO) in order to identify zonal prices.
Resumo:
This paper analyses earthquake data in the perspective of dynamical systems and its Pseudo Phase Plane representation. The seismic data is collected from the Bulletin of the International Seismological Centre. The geological events are characterised by their magnitude and geographical location and described by means of time series of sequences of Dirac impulses. Fifty groups of data series are considered, according to the Flinn-Engdahl seismic regions of Earth. For each region, Pearson’s correlation coefficient is used to find the optimal time delay for reconstructing the Pseudo Phase Plane. The Pseudo Phase Plane plots are then analysed and characterised.
Resumo:
This paper reports on the analysis of tidal breathing patterns measured during noninvasive forced oscillation lung function tests in six individual groups. The three adult groups were healthy, with prediagnosed chronic obstructive pulmonary disease, and with prediagnosed kyphoscoliosis, respectively. The three children groups were healthy, with prediagnosed asthma, and with prediagnosed cystic fibrosis, respectively. The analysis is applied to the pressure–volume curves and the pseudophaseplane loop by means of the box-counting method, which gives a measure of the area within each loop. The objective was to verify if there exists a link between the area of the loops, power-law patterns, and alterations in the respiratory structure with disease. We obtained statistically significant variations between the data sets corresponding to the six groups of patients, showing also the existence of power-law patterns. Our findings support the idea that the respiratory system changes with disease in terms of airway geometry and tissue parameters, leading, in turn, to variations in the fractal dimension of the respiratory tree and its dynamics.
Resumo:
The goal of this study is to analyze the dynamical properties of financial data series from nineteen worldwide stock market indices (SMI) during the period 1995–2009. SMI reveal a complex behavior that can be explored since it is available a considerable volume of data. In this paper is applied the window Fourier transform and methods of fractional calculus. The results reveal classification patterns typical of fractional order systems.
Resumo:
Fractional order modeling of biological systems has received significant interest in the research community. Since the fractal geometry is characterized by a recurrent structure, the self-similar branching arrangement of the airways makes the respiratory system an ideal candidate for the application of fractional calculus theory. To demonstrate the link between the recurrence of the respiratory tree and the appearance of a fractional-order model, we develop an anatomically consistent representation of the respiratory system. This model is capable of simulating the mechanical properties of the lungs and we compare the model output with in vivo measurements of the respiratory input impedance collected in 20 healthy subjects. This paper provides further proof of the underlying fractal geometry of the human lungs, and the consequent appearance of constant-phase behavior in the total respiratory impedance.
Resumo:
This paper reports on the analysis of tidal breathing patterns measured during noninvasive forced oscillation lung function tests in six individual groups. The three adult groups were healthy, with prediagnosed chronic obstructive pulmonary disease, and with prediagnosed kyphoscoliosis, respectively. The three children groups were healthy, with prediagnosed asthma, and with prediagnosed cystic fibrosis, respectively. The analysis is applied to the pressure-volume curves and the pseudophase-plane loop by means of the box-counting method, which gives a measure of the area within each loop. The objective was to verify if there exists a link between the area of the loops, power-law patterns, and alterations in the respiratory structure with disease. We obtained statistically significant variations between the data sets corresponding to the six groups of patients, showing also the existence of power-law patterns. Our findings support the idea that the respiratory system changes with disease in terms of airway geometry and tissue parameters, leading, in turn, to variations in the fractal dimension of the respiratory tree and its dynamics.
Resumo:
Forest fires dynamics is often characterized by the absence of a characteristic length-scale, long range correlations in space and time, and long memory, which are features also associated with fractional order systems. In this paper a public domain forest fires catalogue, containing information of events for Portugal, covering the period from 1980 up to 2012, is tackled. The events are modelled as time series of Dirac impulses with amplitude proportional to the burnt area. The time series are viewed as the system output and are interpreted as a manifestation of the system dynamics. In the first phase we use the pseudo phase plane (PPP) technique to describe forest fires dynamics. In the second phase we use multidimensional scaling (MDS) visualization tools. The PPP allows the representation of forest fires dynamics in two-dimensional space, by taking time series representative of the phenomena. The MDS approach generates maps where objects that are perceived to be similar to each other are placed on the map forming clusters. The results are analysed in order to extract relationships among the data and to better understand forest fires behaviour.
Resumo:
In recent years, vehicular cloud computing (VCC) has emerged as a new technology which is being used in wide range of applications in the area of multimedia-based healthcare applications. In VCC, vehicles act as the intelligent machines which can be used to collect and transfer the healthcare data to the local, or global sites for storage, and computation purposes, as vehicles are having comparatively limited storage and computation power for handling the multimedia files. However, due to the dynamic changes in topology, and lack of centralized monitoring points, this information can be altered, or misused. These security breaches can result in disastrous consequences such as-loss of life or financial frauds. Therefore, to address these issues, a learning automata-assisted distributive intrusion detection system is designed based on clustering. Although there exist a number of applications where the proposed scheme can be applied but, we have taken multimedia-based healthcare application for illustration of the proposed scheme. In the proposed scheme, learning automata (LA) are assumed to be stationed on the vehicles which take clustering decisions intelligently and select one of the members of the group as a cluster-head. The cluster-heads then assist in efficient storage and dissemination of information through a cloud-based infrastructure. To secure the proposed scheme from malicious activities, standard cryptographic technique is used in which the auotmaton learns from the environment and takes adaptive decisions for identification of any malicious activity in the network. A reward and penalty is given by the stochastic environment where an automaton performs its actions so that it updates its action probability vector after getting the reinforcement signal from the environment. The proposed scheme was evaluated using extensive simulations on ns-2 with SUMO. The results obtained indicate that the proposed scheme yields an improvement of 10 % in detection rate of malicious nodes when compared with the existing schemes.
Resumo:
In this paper we analyze the behavior of tornado time-series in the U.S. from the perspective of dynamical systems. A tornado is a violently rotating column of air extending from a cumulonimbus cloud down to the ground. Such phenomena reveal features that are well described by power law functions and unveil characteristics found in systems with long range memory effects. Tornado time series are viewed as the output of a complex system and are interpreted as a manifestation of its dynamics. Tornadoes are modeled as sequences of Dirac impulses with amplitude proportional to the events size. First, a collection of time series involving 64 years is analyzed in the frequency domain by means of the Fourier transform. The amplitude spectra are approximated by power law functions and their parameters are read as an underlying signature of the system dynamics. Second, it is adopted the concept of circular time and the collective behavior of tornadoes analyzed. Clustering techniques are then adopted to identify and visualize the emerging patterns.