873 resultados para constrained clustering
Resumo:
Approximate execution is a viable technique for environments with energy constraints, provided that applications are given the mechanisms to produce outputs of the highest possible quality within the available energy budget. This paper introduces a framework for energy-constrained execution with controlled and graceful quality loss. A simple programming model allows developers to structure the computation in different tasks, and to express the relative importance of these tasks for the quality of the end result. For non-significant tasks, the developer can also supply less costly, approximate versions. The target energy consumption for a given execution is specified when the application is launched. A significance-aware runtime system employs an application-specific analytical energy model to decide how many cores to use for the execution, the operating frequency for these cores, as well as the degree of task approximation, so as to maximize the quality of the output while meeting the user-specified energy constraints. Evaluation on a dual-socket 16-core Intel platform using 9 benchmark kernels shows that the proposed framework picks the optimal configuration with high accuracy. Also, a comparison with loop perforation (a well-known compile-time approximation technique), shows that the proposed framework results in significantly higher quality for the same energy budget.
Resumo:
In this brief, a hybrid filter algorithm is developed to deal with the state estimation (SE) problem for power systems by taking into account the impact from the phasor measurement units (PMUs). Our aim is to include PMU measurements when designing the dynamic state estimators for power systems with traditional measurements. Also, as data dropouts inevitably occur in the transmission channels of traditional measurements from the meters to the control center, the missing measurement phenomenon is also tackled in the state estimator design. In the framework of extended Kalman filter (EKF) algorithm, the PMU measurements are treated as inequality constraints on the states with the aid of the statistical criterion, and then the addressed SE problem becomes a constrained optimization one based on the probability-maximization method. The resulting constrained optimization problem is then solved using the particle swarm optimization algorithm together with the penalty function approach. The proposed algorithm is applied to estimate the states of the power systems with both traditional and PMU measurements in the presence of probabilistic data missing phenomenon. Extensive simulations are carried out on the IEEE 14-bus test system and it is shown that the proposed algorithm gives much improved estimation performances over the traditional EKF method.
Resumo:
We consider a convex problem of Semi-Infinite Programming (SIP) with multidimensional index set. In study of this problem we apply the approach suggested in [20] for convex SIP problems with one-dimensional index sets and based on the notions of immobile indices and their immobility orders. For the problem under consideration we formulate optimality conditions that are explicit and have the form of criterion. We compare this criterion with other known optimality conditions for SIP and show its efficiency in the convex case.
Resumo:
Nos últimos anos temos vindo a assistir a uma mudança na forma como a informação é disponibilizada online. O surgimento da web para todos possibilitou a fácil edição, disponibilização e partilha da informação gerando um considerável aumento da mesma. Rapidamente surgiram sistemas que permitem a coleção e partilha dessa informação, que para além de possibilitarem a coleção dos recursos também permitem que os utilizadores a descrevam utilizando tags ou comentários. A organização automática dessa informação é um dos maiores desafios no contexto da web atual. Apesar de existirem vários algoritmos de clustering, o compromisso entre a eficácia (formação de grupos que fazem sentido) e a eficiência (execução em tempo aceitável) é difícil de encontrar. Neste sentido, esta investigação tem por problemática aferir se um sistema de agrupamento automático de documentos, melhora a sua eficácia quando se integra um sistema de classificação social. Analisámos e discutimos dois métodos baseados no algoritmo k-means para o clustering de documentos e que possibilitam a integração do tagging social nesse processo. O primeiro permite a integração das tags diretamente no Vector Space Model e o segundo propõe a integração das tags para a seleção das sementes iniciais. O primeiro método permite que as tags sejam pesadas em função da sua ocorrência no documento através do parâmetro Social Slider. Este método foi criado tendo por base um modelo de predição que sugere que, quando se utiliza a similaridade dos cossenos, documentos que partilham tags ficam mais próximos enquanto que, no caso de não partilharem, ficam mais distantes. O segundo método deu origem a um algoritmo que denominamos k-C. Este para além de permitir a seleção inicial das sementes através de uma rede de tags também altera a forma como os novos centróides em cada iteração são calculados. A alteração ao cálculo dos centróides teve em consideração uma reflexão sobre a utilização da distância euclidiana e similaridade dos cossenos no algoritmo de clustering k-means. No contexto da avaliação dos algoritmos foram propostos dois algoritmos, o algoritmo da “Ground truth automática” e o algoritmo MCI. O primeiro permite a deteção da estrutura dos dados, caso seja desconhecida, e o segundo é uma medida de avaliação interna baseada na similaridade dos cossenos entre o documento mais próximo de cada documento. A análise de resultados preliminares sugere que a utilização do primeiro método de integração das tags no VSM tem mais impacto no algoritmo k-means do que no algoritmo k-C. Além disso, os resultados obtidos evidenciam que não existe correlação entre a escolha do parâmetro SS e a qualidade dos clusters. Neste sentido, os restantes testes foram conduzidos utilizando apenas o algoritmo k-C (sem integração de tags no VSM), sendo que os resultados obtidos indicam que a utilização deste algoritmo tende a gerar clusters mais eficazes.
Resumo:
Systems equipped with multiple antennas at the transmitter and at the receiver, known as MIMO (Multiple Input Multiple Output) systems, offer higher capacities, allowing an efficient exploitation of the available spectrum and/or the employment of more demanding applications. It is well known that the radio channel is characterized by multipath propagation, a phenomenon deemed problematic and whose mitigation has been achieved through techniques such as diversity, beamforming or adaptive antennas. By exploring conveniently the spatial domain MIMO systems turn the characteristics of the multipath channel into an advantage and allow creating multiple parallel and independent virtual channels. However, the achievable benefits are constrained by the propagation channel’s characteristics, which may not always be ideal. This work focuses on the characterization of the MIMO radio channel. It begins with the presentation of the fundamental results from information theory that triggered the interest on these systems, including the discussion of some of their potential benefits and a review of the existing channel models for MIMO systems. The characterization of the MIMO channel developed in this work is based on experimental measurements of the double-directional channel. The measurement system is based on a vector network analyzer and a two-dimensional positioning platform, both controlled by a computer, allowing the measurement of the channel’s frequency response at the locations of a synthetic array. Data is then processed using the SAGE (Space-Alternating Expectation-Maximization) algorithm to obtain the parameters (delay, direction of arrival and complex amplitude) of the channel’s most relevant multipath components. Afterwards, using a clustering algorithm these data are grouped into clusters. Finally, statistical information is extracted allowing the characterization of the channel’s multipath components. The information about the multipath characteristics of the channel, induced by existing scatterers in the propagation scenario, enables the characterization of MIMO channel and thus to evaluate its performance. The method was finally validated using MIMO measurements.
Resumo:
Clustering and Disjoint Principal Component Analysis (CDP CA) is a constrained principal component analysis recently proposed for clustering of objects and partitioning of variables, simultaneously, which we have implemented in R language. In this paper, we deal in detail with the alternating least-squares algorithm for CDPCA and highlight its algebraic features for constructing both interpretable principal components and clusters of objects. Two applications are given to illustrate the capabilities of this new methodology.
Resumo:
Background Clustering of lifestyle risk behaviours is very important in predicting premature mortality. Understanding the extent to which risk behaviours are clustered in deprived communities is vital to most effectively target public health interventions. Methods We examined co-occurrence and associations between risk behaviours (smoking, alcohol consumption, poor diet, low physical activity and high sedentary time) reported by adults living in deprived London neighbourhoods. Associations between sociodemographic characteristics and clustered risk behaviours were examined. Latent class analysis was used to identify underlying clustering of behaviours. Results Over 90% of respondents reported at least one risk behaviour. Reporting specific risk behaviours predicted reporting of further risk behaviours. Latent class analyses revealed four underlying classes. Membership of a maximal risk behaviour class was more likely for young, white males who were unable to work. Conclusions Compared with recent national level analysis, there was a weaker relationship between education and clustering of behaviours and a very high prevalence of clustering of risk behaviours in those unable to work. Young, white men who report difficulty managing on income were at high risk of reporting multiple risk behaviours. These groups may be an important target for interventions to reduce premature mortality caused by multiple risk behaviours.
Resumo:
Thesis (Ph.D.)--University of Washington, 2015
Resumo:
With the electricity market liberalization, the distribution and retail companies are looking for better market strategies based on adequate information upon the consumption patterns of its electricity consumers. A fair insight on the consumers’ behavior will permit the definition of specific contract aspects based on the different consumption patterns. In order to form the different consumers’ classes, and find a set of representative consumption patterns we use electricity consumption data from a utility client’s database and two approaches: Two-step clustering algorithm and the WEACS approach based on evidence accumulation (EAC) for combining partitions in a clustering ensemble. While EAC uses a voting mechanism to produce a co-association matrix based on the pairwise associations obtained from N partitions and where each partition has equal weight in the combination process, the WEACS approach uses subsampling and weights differently the partitions. As a complementary step to the WEACS approach, we combine the partitions obtained in the WEACS approach with the ALL clustering ensemble construction method and we use the Ward Link algorithm to obtain the final data partition. The characterization of the obtained consumers’ clusters was performed using the C5.0 classification algorithm. Experiment results showed that the WEACS approach leads to better results than many other clustering approaches.
Resumo:
The present research paper presents five different clustering methods to identify typical load profiles of medium voltage (MV) electricity consumers. These methods are intended to be used in a smart grid environment to extract useful knowledge about customer’s behaviour. The obtained knowledge can be used to support a decision tool, not only for utilities but also for consumers. Load profiles can be used by the utilities to identify the aspects that cause system load peaks and enable the development of specific contracts with their customers. The framework presented throughout the paper consists in several steps, namely the pre-processing data phase, clustering algorithms application and the evaluation of the quality of the partition, which is supported by cluster validity indices. The process ends with the analysis of the discovered knowledge. To validate the proposed framework, a case study with a real database of 208 MV consumers is used.
Resumo:
Distribution systems are the first volunteers experiencing the benefits of smart grids. The smart grid concept impacts the internal legislation and standards in grid-connected and isolated distribution systems. Demand side management, the main feature of smart grids, acquires clear meaning in low voltage distribution systems. In these networks, various coordination procedures are required between domestic, commercial and industrial consumers, producers and the system operator. Obviously, the technical basis for bidirectional communication is the prerequisite of developing such a coordination procedure. The main coordination is required when the operator tries to dispatch the producers according to their own preferences without neglecting its inherent responsibility. Maintenance decisions are first determined by generating companies, and then the operator has to check and probably modify them for final approval. In this paper the generation scheduling from the viewpoint of a distribution system operator (DSO) is formulated. The traditional task of the DSO is securing network reliability and quality. The effectiveness of the proposed method is assessed by applying it to a 6-bus and 9-bus distribution system.
Resumo:
The growing importance and influence of new resources connected to the power systems has caused many changes in their operation. Environmental policies and several well know advantages have been made renewable based energy resources largely disseminated. These resources, including Distributed Generation (DG), are being connected to lower voltage levels where Demand Response (DR) must be considered too. These changes increase the complexity of the system operation due to both new operational constraints and amounts of data to be processed. Virtual Power Players (VPP) are entities able to manage these resources. Addressing these issues, this paper proposes a methodology to support VPP actions when these act as a Curtailment Service Provider (CSP) that provides DR capacity to a DR program declared by the Independent System Operator (ISO) or by the VPP itself. The amount of DR capacity that the CSP can assure is determined using data mining techniques applied to a database which is obtained for a large set of operation scenarios. The paper includes a case study based on 27,000 scenarios considering a diversity of distributed resources in a 33 bus distribution network.
Resumo:
In real optimization problems, usually the analytical expression of the objective function is not known, nor its derivatives, or they are complex. In these cases it becomes essential to use optimization methods where the calculation of the derivatives, or the verification of their existence, is not necessary: the Direct Search Methods or Derivative-free Methods are one solution. When the problem has constraints, penalty functions are often used. Unfortunately the choice of the penalty parameters is, frequently, very difficult, because most strategies for choosing it are heuristics strategies. As an alternative to penalty function appeared the filter methods. A filter algorithm introduces a function that aggregates the constrained violations and constructs a biobjective problem. In this problem the step is accepted if it either reduces the objective function or the constrained violation. This implies that the filter methods are less parameter dependent than a penalty function. In this work, we present a new direct search method, based on simplex methods, for general constrained optimization that combines the features of the simplex method and filter methods. This method does not compute or approximate any derivatives, penalty constants or Lagrange multipliers. The basic idea of simplex filter algorithm is to construct an initial simplex and use the simplex to drive the search. We illustrate the behavior of our algorithm through some examples. The proposed methods were implemented in Java.
Resumo:
Research on cluster analysis for categorical data continues to develop, new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters is done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length criterion (MML) to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the (Figueiredo and Jain, 2002) approach. The novelty of the approach rests on the integration of the model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of Bayesian Information Criterion (BIC) (Schwarz, 1978) and Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of cluster while outperforming BIC and ICL since it is faster, which is especially relevant when dealing with large data sets.