830 results for Labeling hierarchical clustering
Abstract:
This study examines managers’ perceptions of Knowledge Management (KM) prior to the implementation of KM systems in a global insurance company and investigates whether hierarchical structures are conducive to KM. Mixed methods are used, combining large-scale surveying with a case study that uses content analysis to organize the data into themes that provide the basis for arguments. Evidence suggests that managers strongly align their perception of KM with communication. Despite a multi-layered, hierarchical structure and a strong middle-management presence, organizational structure was not viewed as an issue. These factors are usually barriers to communication and organizational flexibility, yet managers believe that they may not inhibit KM from becoming fully embedded. This evidence is contradicted by the results of a global KM study in which silos, stovepipes and hierarchical structures were commonly cited as barriers. This contributes to the understanding of managerial misconceptions of knowledge as opposed to communication, and of how organizations effectively share knowledge.
Abstract:
With the liberalization of the electricity market, distribution and retail companies are looking for better market strategies based on adequate information about the consumption patterns of their electricity consumers. A fair insight into consumers’ behavior will permit the definition of specific contract aspects based on the different consumption patterns. In order to form the different consumer classes and find a set of representative consumption patterns, we use electricity consumption data from a utility’s client database and two approaches: the Two-step clustering algorithm and the WEACS approach, which is based on evidence accumulation clustering (EAC) for combining partitions in a clustering ensemble. While EAC uses a voting mechanism to produce a co-association matrix based on the pairwise associations obtained from N partitions, with each partition having equal weight in the combination process, the WEACS approach uses subsampling and weights the partitions differently. As a complementary step to the WEACS approach, we combine the partitions obtained in the WEACS approach with the ALL clustering ensemble construction method and use the Ward Link algorithm to obtain the final data partition. The resulting consumer clusters were characterized using the C5.0 classification algorithm. Experimental results showed that the WEACS approach leads to better results than many other clustering approaches.
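The voting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `co_association` and the optional `weights` argument are hypothetical, and the abstract does not say how WEACS derives its partition weights, so the weights below are placeholders. With no weights the function reproduces the equal-vote EAC case; the final Ward Link step on the resulting matrix is not shown.

```python
from itertools import combinations


def co_association(partitions, weights=None):
    """Build an n x n co-association matrix from several label vectors.

    Entry (i, j) is the (weighted) fraction of partitions in which
    objects i and j were placed in the same cluster.
    """
    n = len(partitions[0])
    if weights is None:
        # Plain EAC: every partition casts an equal vote.
        weights = [1.0] * len(partitions)
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                matrix[i][j] += w / total
                matrix[j][i] += w / total
    for i in range(n):
        matrix[i][i] = 1.0  # an object always co-occurs with itself
    return matrix


# Three toy partitions of four objects (made-up labels, for illustration only).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
equal_votes = co_association(partitions)
# Hypothetical weights standing in for whatever quality score a weighted scheme uses.
weighted_votes = co_association(partitions, weights=[2.0, 1.0, 1.0])
```

Cluster label values need not match across partitions, because only the "same cluster or not" relation for each pair enters the matrix.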
Abstract:
This paper presents five different clustering methods to identify typical load profiles of medium voltage (MV) electricity consumers. These methods are intended to be used in a smart grid environment to extract useful knowledge about customers’ behaviour. The knowledge obtained can be used to support a decision tool, not only for utilities but also for consumers. Load profiles can be used by utilities to identify the aspects that cause system load peaks and to enable the development of specific contracts with their customers. The framework presented throughout the paper consists of several steps, namely data pre-processing, application of the clustering algorithms, and evaluation of the quality of the partition, which is supported by cluster validity indices. The process ends with the analysis of the discovered knowledge. To validate the proposed framework, a case study with a real database of 208 MV consumers is used.
Abstract:
The growing importance and influence of new resources connected to power systems has caused many changes in their operation. Environmental policies and several well-known advantages have led to the wide dissemination of renewable-based energy resources. These resources, including Distributed Generation (DG), are being connected at lower voltage levels, where Demand Response (DR) must also be considered. These changes increase the complexity of system operation because of both new operational constraints and the amount of data to be processed. Virtual Power Players (VPP) are entities able to manage these resources. Addressing these issues, this paper proposes a methodology to support VPP actions when the VPP acts as a Curtailment Service Provider (CSP) that provides DR capacity to a DR program declared by the Independent System Operator (ISO) or by the VPP itself. The amount of DR capacity that the CSP can assure is determined using data mining techniques applied to a database obtained for a large set of operation scenarios. The paper includes a case study based on 27,000 scenarios considering a diversity of distributed resources in a 33-bus distribution network.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number from a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The results illustrate the capacity of the proposed algorithm to attain the true number of clusters while outperforming BIC and ICL in speed, which is especially relevant when dealing with large data sets.
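For contrast with the integrated approach, the conventional baseline mentioned in this abstract (choosing the number of clusters from pre-estimated candidate models with BIC) can be sketched as follows. This is only an illustration: the function names are hypothetical, the parameter count assumes the categorical variables are independent within each cluster (an assumption not stated in the abstract), and the log-likelihood values are made-up numbers rather than results from the paper.

```python
import math


def n_free_parameters(n_clusters, categories_per_variable):
    """Free parameters of a multinomial mixture, assuming the variables
    are independent within each cluster (assumption for this sketch)."""
    mixing = n_clusters - 1  # mixing proportions sum to one
    per_cluster = sum(c - 1 for c in categories_per_variable)
    return mixing + n_clusters * per_cluster


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion (Schwarz); lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(candidates, categories_per_variable, n_obs):
    """candidates maps a number of clusters to its maximized log-likelihood."""
    scores = {
        g: bic(ll, n_free_parameters(g, categories_per_variable), n_obs)
        for g, ll in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Made-up log-likelihoods for models fitted separately with 1, 2 and 3 clusters.
candidates = {1: -520.0, 2: -480.0, 3: -478.0}
best_g, scores = select_by_bic(candidates, categories_per_variable=[3, 2], n_obs=200)
```

Each candidate model has to be estimated in full before it can be scored, which is the cost the abstract says the single-algorithm approach avoids.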
Abstract:
In this paper we present a possible design for a passive RFID tag antenna on a paper substrate, to be integrated into bottle labels. Considering the application scenario, we verified and determined the permittivity and dissipation factor of the materials in order to simulate all the possible sources that would influence the antenna performance. The measured results showed a maximum reading range of 1.45 m, even though the efficiency obtained with the antenna integrated into the bottle was only 3%. © 2014 IEEE.
Abstract:
In the present paper we focus on the performance of clustering algorithms, using indices of paired agreement to measure the accordance between clusters and an a priori known structure. We specifically propose a method to correct all the indices considered for agreement by chance - the adjusted indices are meant to provide a realistic measure of clustering performance. The proposed method enables the correction of virtually any index - overcoming previous limitations known in the literature - and provides very precise results. We use simulated datasets under diverse scenarios and discuss the pertinence of our proposal, which is particularly relevant when poorly separated clusters are considered. Finally, we compare the performance of the EM and K-Means algorithms within each of the simulated scenarios and conclude that EM generally yields the best results.
Abstract:
The search for patterns in data in order to form groups is known as data clustering, and is one of the most commonly performed tasks in data mining and pattern recognition. This dissertation addresses the concept of entropy and uses algorithms with entropic criteria to cluster biomedical data. The use of entropy for clustering is relatively recent and arises from an attempt to exploit the ability of entropy to extract higher-order information from the data distribution, using it as the criterion for forming groups (clusters) or to complement or improve existing algorithms in pursuit of better results. Some studies involving algorithms based on entropic criteria have shown positive results in the analysis of real data. In this work, several algorithms based on entropic criteria were explored, together with their applicability to biomedical data, in an attempt to assess the suitability of these algorithms for this type of data. The results of the tested algorithms are compared with those obtained by other, more "conventional" algorithms such as k-means, spectral clustering algorithms and a density-based algorithm.
Abstract:
In the present paper we compare clustering solutions using indices of paired agreement. We propose a new method - IADJUST - to correct indices of paired agreement, excluding agreement by chance. This new method overcomes previous limitations known in the literature, as it permits the correction of any index. We illustrate its use in external clustering validation, to measure the accordance between clusters and an a priori known structure. The adjusted indices are intended to provide a realistic measure of clustering performance that excludes agreement by chance with ground truth. We use simulated data sets under a range of scenarios - considering diverse numbers of clusters, cluster overlaps and balances - to discuss the pertinence and the precision of our proposal. Precision is established by comparison with the analytical approach to correction; specific indices that can be corrected analytically are used for this purpose. The pertinence of the proposed correction is discussed through a detailed comparison between the performance of two classical clustering approaches, namely the Expectation-Maximization (EM) and K-Means (KM) algorithms. Eight indices of paired agreement are studied and new corrected indices are obtained.
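The idea of correcting an index of paired agreement for chance can be sketched generically. The abstract does not describe how IADJUST works internally, so the code below should not be read as that method: it is one common construction, assumed here for illustration, in which the expected value of an index under chance is estimated by randomly permuting one labeling, and the corrected value is (index - expected) / (maximum - expected). All function names are hypothetical.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Rand index: share of object pairs on which two partitions agree
    (both place the pair together, or both place it apart)."""
    agree = 0
    pairs = 0
    for i, j in combinations(range(len(a)), 2):
        same_a = a[i] == a[j]
        same_b = b[i] == b[j]
        agree += same_a == same_b
        pairs += 1
    return agree / pairs


def chance_corrected(index, truth, found, n_sim=2000, seed=0, max_value=1.0):
    """Correct any pair-agreement index for chance by simulation.

    The expected value under chance is estimated by shuffling the
    cluster labels, which keeps the cluster sizes fixed.
    """
    rng = random.Random(seed)
    shuffled = list(found)
    total = 0.0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        total += index(truth, shuffled)
    expected = total / n_sim
    return (index(truth, found) - expected) / (max_value - expected)


# Toy labels (illustrative only): a known structure and a poor clustering of it.
truth = [0, 0, 0, 1, 1, 1]
found = [0, 1, 0, 1, 0, 1]
raw = rand_index(truth, found)
corrected = chance_corrected(rand_index, truth, found)
```

Because the correction takes the index as a function argument, the same routine applies to any other pair-agreement index with a known maximum, which mirrors the generality the abstract claims without reproducing its procedure.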
Abstract:
In recent years, vehicular cloud computing (VCC) has emerged as a new technology that is being used in a wide range of multimedia-based healthcare applications. In VCC, vehicles act as intelligent machines that can be used to collect healthcare data and transfer it to local or global sites for storage and computation, since vehicles have comparatively limited storage and computation power for handling multimedia files. However, because of dynamic changes in topology and the lack of centralized monitoring points, this information can be altered or misused. These security breaches can have disastrous consequences, such as loss of life or financial fraud. To address these issues, a learning automata-assisted distributed intrusion detection system based on clustering is designed. Although the proposed scheme can be applied in a number of applications, we have taken a multimedia-based healthcare application to illustrate it. In the proposed scheme, learning automata (LA) are assumed to be stationed on the vehicles; they take clustering decisions intelligently and select one of the members of the group as a cluster-head. The cluster-heads then assist in the efficient storage and dissemination of information through a cloud-based infrastructure. To secure the proposed scheme against malicious activities, a standard cryptographic technique is used in which the automaton learns from the environment and takes adaptive decisions to identify any malicious activity in the network. The stochastic environment in which an automaton performs its actions gives it a reward or a penalty, and the automaton updates its action probability vector after receiving this reinforcement signal. The proposed scheme was evaluated using extensive simulations on ns-2 with SUMO.
The results obtained indicate that the proposed scheme yields an improvement of 10% in the detection rate of malicious nodes when compared with existing schemes.
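The reward-and-penalty update of an action probability vector mentioned in this abstract can be sketched with the classical linear reward-penalty rule. The abstract does not say which update scheme the authors use, so this rule, the function name and the step sizes `a` and `b` are assumptions chosen for illustration.

```python
def update_probabilities(probs, action, rewarded, a=0.1, b=0.05):
    """Linear reward-penalty update of an action probability vector.

    probs    : current probabilities, one per action (sums to 1)
    action   : index of the action the automaton just performed
    rewarded : True if the environment returned a reward, False for a penalty
    a, b     : reward and penalty step sizes (illustrative values)
    """
    r = len(probs)
    updated = []
    for j, p in enumerate(probs):
        if rewarded:
            # Move probability mass towards the rewarded action.
            new_p = p + a * (1.0 - p) if j == action else (1.0 - a) * p
        else:
            # Move probability mass away from the penalized action.
            new_p = (1.0 - b) * p if j == action else b / (r - 1) + (1.0 - b) * p
        updated.append(new_p)
    return updated


# Four equally likely actions; action 0 is tried and then rewarded or penalized.
start = [0.25, 0.25, 0.25, 0.25]
after_reward = update_probabilities(start, action=0, rewarded=True)
after_penalty = update_probabilities(start, action=0, rewarded=False)
```

Both branches keep the vector summing to one, so it remains a valid probability distribution after every reinforcement signal.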
Abstract:
Dissertation presented as a partial requirement for obtaining the degree of Master in Statistics and Information Management
Abstract:
This study focuses on the implementation of several pair trading strategies across three emerging markets, with the objective of comparing the results obtained from the different strategies and assessing whether pair trading benefits from a more volatile environment. The results show that there are indeed higher potential profits arising from emerging markets. However, the higher excess return is partially offset by higher transaction costs, which are a determining factor in the profitability of pair trading strategies. In addition, a new clustering approach based on Principal Component Analysis was tested as an alternative to the more standard clustering by Industry Groups. The new clustering approach delivers promising results, consistently reducing volatility to a greater extent than the Industry Group approach, with no significant harm to the excess returns.
Abstract:
Objective: Nutritional labeling systems are considered a tool to fight obesity, since they aim to contribute to more informed food choices and to help consumers choose healthier nutrition options, thereby contributing to a decrease in the obesity rate. This study analyzes the effect of different types of labeling systems on parents’ purchasing decisions for their children for a specific product: breakfast cereals. More precisely, it examines how labels affect parents’ perception of the healthiness of cereals and whether nutritional information has an effect on intended purchases for their children. Participants and methods: We conducted a study with 135 Portuguese parents of children aged 4 to 12 years. Parents answered a questionnaire with one of three hypothetical cereal menus. The menus differed only in their nutritional labeling technique: no labels (control group), reference intake labels or traffic light labels. In addition, we conducted 20 face-to-face interviews with a different group of parents in order to perform a recall task. Findings: This paper provides no evidence to suggest that energy labeling or traffic light labeling systems alone were successful in helping parents make healthy purchases of cereals for their children. Therefore, supplementary policies are needed to encourage the consumption of healthier food and help fight obesity.
Abstract:
Résumé: Technical advances in mass spectrometry (MS) have contributed to the recent development of proteomics. This technique can currently detect, identify and quantify thousands of proteins. However, it is not yet powerful enough to provide a complete analysis of the proteome modifications correlated with biological phenomena. Our objective was to develop a new strategy for the specific detection and quantification of proteome variations, based on measuring protein synthesis rather than total protein quantity. To this end, we wanted to combine pulsed labeling of proteins with stable isotopes with an MS acquisition method based on precursor ion scanning (PIS), in order to specifically detect the proteins that had incorporated the isotopes and to estimate their abundance relative to the unlabeled proteins. Such an approach can identify the proteins with the highest synthesis rates in a given period of time, including proteins whose expression increases specifically following a particular event. We first tested different labeled amino acids in combination with specific PIS methods. These tests allowed the specific detection of labeled proteins. However, because of the instrumental limitations of the mass spectrometer used for the PIS methods, the sensitivity of this approach proved to be lower than that of an untargeted analysis performed on a more recent instrument (Chapter 2.1). Nevertheless, for the differential analysis of two culture media conditioned by human cancer cells, we used metabolic labeling to distinguish the proteins of cellular origin from the unlabeled serum proteins present in the culture media (Chapter 2.2).
In parallel, we developed a new quantification method named IBIS, which uses pairs of stable isotopes of amino acids capable of producing specific ions that can be used for relative quantification. The IBIS method was applied to the analysis of two cancer cell lines that were fully, but differentially, labeled with pairs of amino acids (Chapter 2.3). Then, in accordance with the initial objective of this thesis, we used a pulsed variant of IBIS to detect proteome modifications in HeLa cells infected with the human Herpes Simplex virus-1 (Chapter 2.4). This virus represses the synthesis of host cell proteins in order to exploit their translation mechanism for the massive production of virions. As expected, high synthesis rates were measured for the detected viral proteins, attesting to their high level of expression. We also identified a number of human proteins whose synthesis-to-degradation ratio (S/D) was modified by the viral infection, which may give indications of the strategies used by viruses to hijack the cellular machinery. In conclusion, we have shown in this work that metabolic labeling can be used in an unconventional way to study little-explored dimensions of proteomics.
Summary: In recent years major technical advancements have greatly supported the development of mass spectrometry (MS)-based proteomics. Currently, this technique can efficiently detect, identify and quantify thousands of proteins. However, it is not yet sufficiently powerful to provide a comprehensive analysis of the proteome changes correlated with biological phenomena. The aim of our project was the development of a new strategy for the specific detection and quantification of proteome variations based on measurements of protein synthesis rather than total protein amounts.
The rationale for this approach was that changes in protein synthesis more closely reflect dynamic cellular responses than changes in total protein concentrations. Our starting idea was to couple "pulsed" stable-isotope labeling of proteins with a specific MS acquisition method based on precursor ion scan (PIS), to specifically detect proteins that incorporated the label and to simultaneously estimate their abundance relative to the unlabeled protein isoform. Such an approach could highlight proteins with the highest synthesis rate in a given time frame, including proteins specifically up-regulated by a given biological stimulus. As a first step, we tested different isotope-labeled amino acids in combination with dedicated PIS methods and showed that this leads to specific detection of labeled proteins. Sensitivity, however, turned out to be lower than that of an untargeted analysis run on a more recent instrument, due to MS hardware limitations (Chapter 2.1). We next used metabolic labeling to distinguish the proteins of cellular origin from a high background of unlabeled (serum) proteins, for the differential analysis of two serum-containing culture media conditioned by labeled human cancer cells (Chapter 2.2). As a parallel project we developed a new quantification method (named ISIS), which uses pairs of stable-isotope labeled amino acids able to produce specific reporter ions, which can be used for relative quantification. The ISIS method was applied to the analysis of two fully, yet differentially, labeled cancer cell lines, as described in Chapter 2.3. Next, in line with the original purpose of this thesis, we used a "pulsed" variant of ISIS to detect proteome changes in HeLa cells after infection with human Herpes Simplex Virus-1 (Chapter 2.4). This virus is known to repress the synthesis of host cell proteins to exploit the translation machinery for the massive production of virions.
As expected, high synthesis rates were measured for the detected viral proteins, confirming their up-regulation. Moreover, we identified a number of human proteins whose synthesis/degradation ratio (S/D) was affected by the viral infection and which could provide clues on the strategies used by the virus to hijack the cellular machinery. Overall, in this work, we showed that metabolic labeling can be employed in alternative ways to investigate poorly explored dimensions in proteomics.