19 results for Semi-supervised clustering
at Instituto Politécnico do Porto, Portugal
Abstract:
Medium voltage (MV) load diagrams were defined following the knowledge discovery in databases process. Clustering techniques were used to help agents in the electric power retail markets obtain specific knowledge of their customers’ consumption habits. Each customer class resulting from the clustering operation is represented by its load diagram. The Two-step clustering algorithm and the WEACS approach, which is based on evidence accumulation clustering (EAC), were applied to electricity consumption data from a utility client’s database in order to form the customer classes and to find a set of representative consumption patterns. WEACS is a clustering ensemble combination approach that uses subsampling and weights the partitions differently in the co-association matrix. As a complementary step to the WEACS approach, all the final data partitions produced by the different variations of the method are combined, and the Ward Link algorithm is used to obtain the final data partition. Experimental results showed that the WEACS approach led to better accuracy than many other clustering approaches. In this paper, the WEACS approach separates the customer population better than the Two-step clustering algorithm.
Abstract:
With the liberalization of the electricity market, distribution and retail companies are looking for better market strategies based on adequate information about the consumption patterns of their electricity consumers. A fair insight into consumers’ behavior will permit the definition of specific contract aspects based on the different consumption patterns. In order to form the different consumer classes and find a set of representative consumption patterns, we use electricity consumption data from a utility client’s database and two approaches: the Two-step clustering algorithm and the WEACS approach, which is based on evidence accumulation clustering (EAC) for combining partitions in a clustering ensemble. EAC uses a voting mechanism to produce a co-association matrix from the pairwise associations obtained from N partitions, with each partition having equal weight in the combination process. The WEACS approach, in contrast, uses subsampling and weights the partitions differently. As a complementary step to the WEACS approach, we combine the partitions obtained in the WEACS approach with the ALL clustering ensemble construction method, and we use the Ward Link algorithm to obtain the final data partition. The resulting consumer clusters were characterized using the C5.0 classification algorithm. Experimental results showed that the WEACS approach leads to better results than many other clustering approaches.
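The voting mechanism described above can be illustrated with a minimal sketch. It assumes every partition labels every sample (so subsampling is not modelled), and the `weights` argument is a hypothetical stand-in for whatever weighting WEACS derives, which the abstract does not specify. The function name `co_association` and the toy partitions are illustrative only.

```python
from itertools import combinations

def co_association(partitions, weights=None):
    """Build a co-association matrix from several base clusterings.

    partitions: list of label lists, one per base clustering, all of length n.
    weights: optional per-partition weights (hypothetical input); equal
             weights correspond to the plain voting scheme.
    """
    n = len(partitions[0])
    if weights is None:
        weights = [1.0] * len(partitions)
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        # Each partition "votes" for every pair it places in the same cluster.
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                matrix[i][j] += w / total
                matrix[j][i] += w / total
    for i in range(n):
        matrix[i][i] = 1.0  # a sample always co-occurs with itself
    return matrix

# Three toy partitions of four samples (illustrative data only).
partitions = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
equal = co_association(partitions)
weighted = co_association(partitions, weights=[2.0, 1.0, 1.0])
```

Each entry is the (weighted) fraction of partitions that place the two samples in the same cluster. One common way to use such a matrix, assumed here rather than taken from the abstract, is to treat one minus each entry as a dissimilarity and pass it to a hierarchical algorithm.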
Abstract:
This research paper presents five different clustering methods to identify typical load profiles of medium voltage (MV) electricity consumers. These methods are intended to be used in a smart grid environment to extract useful knowledge about customers’ behaviour. The knowledge obtained can support a decision tool, not only for utilities but also for consumers. Utilities can use load profiles to identify the aspects that cause system load peaks and to develop specific contracts with their customers. The framework presented in the paper consists of several steps, namely data pre-processing, application of the clustering algorithms, and evaluation of the quality of the partition, which is supported by cluster validity indices. The process ends with the analysis of the discovered knowledge. To validate the proposed framework, a case study with a real database of 208 MV consumers is used.
Abstract:
The growing importance and influence of new resources connected to power systems has caused many changes in their operation. Environmental policies and several well-known advantages have led to the wide dissemination of renewable-based energy resources. These resources, including Distributed Generation (DG), are being connected at lower voltage levels, where Demand Response (DR) must be considered too. These changes increase the complexity of system operation due to both new operational constraints and the amount of data to be processed. Virtual Power Players (VPP) are entities able to manage these resources. Addressing these issues, this paper proposes a methodology to support VPP actions when a VPP acts as a Curtailment Service Provider (CSP) that provides DR capacity to a DR program declared by the Independent System Operator (ISO) or by the VPP itself. The amount of DR capacity that the CSP can assure is determined using data mining techniques applied to a database obtained for a large set of operation scenarios. The paper includes a case study based on 27,000 scenarios considering a diversity of distributed resources in a 33-bus distribution network.
Abstract:
Introduction: At Centro Hospitalar de São João, EPE, the Pyxis® semi-automatic system for replenishing levelled medication stocks was introduced in 2008 and is currently implemented in 16 clinical services. Given the growth in the implementation of this automated system at the institution, this work aims to describe how medication is prepared for replenishment in the Pyxis® semi-automatic system by evaluating the number of medication units replenished daily and by day of the week. Material and Methods: A retrospective longitudinal study was carried out, analysing all services with Pyxis® implemented, using the daily replenishment records of the different clinical services over a period of 41 consecutive days. In a second phase, the data were summarised in tables in Microsoft Office Excel®, and the corresponding charts were then built for analysis. Results: The results, represented graphically, show that Monday is the day of the week with the highest number of medication replenishments, and that the services with the highest total number of replenishments are the General ICU (UCI Geral), the Neurocritical ICU (UCI Neurocríticos), Cardiothoracic Surgery (Cirurgia Cardiotorácica) and UCIPU. Discussion / Conclusions: The results revealed an overload of medication references and replenished units on Mondays, reaching, in many services, numbers of replenished units twice the service's average (e.g. UCI Neurocríticos). However, despite the short analysis period, the data seem to show that replenishing on Sundays speeds up the Pyxis® replenishment process on Mondays.
Abstract:
Master's in Electrical and Computer Engineering
Abstract:
Master's in Informatics Engineering
Abstract:
The World Wide Web was envisioned as a network of interlinked hypertext documents creating an information space where humans and machines could communicate. However, the information in the traditional Web was, and still is, stored in an unstructured form, which means that only humans can consume it conveniently. Consequently, searching for information on the syntactic Web is a task performed mainly by humans, and it is not always easy to accomplish. In this context, it became essential to evolve towards a more structured and more meaningful Web, where information is given well-defined meaning so that humans and machines can cooperate. This Web is usually referred to as the Semantic Web. Moreover, the Semantic Web is fully achievable only if data from different sources are linked, thereby creating a repository of Linked Open Data (LOD). With the emergence of a new Web of Linked (Open) Data (i.e. the Semantic Web), new opportunities and challenges arose. Question Answering (QA) over semantic information is currently an active research area that tries to take advantage of Semantic Web technologies to improve the task of answering questions. The main goal of the World Search project is to explore the Semantic Web to create mechanisms that support users of specific application domains in answering complex questions based on data from different repositories. However, the assessment of the state of the art shows that existing applications do not support users in answering complex questions. Accordingly, the work described in this document focuses on studying and developing methodologies and processes that help users find exact, correct answers to complex questions that cannot be answered using traditional systems.
This includes: (i) overcoming users' difficulty in visualising the schema underlying the knowledge repositories; (ii) bridging the gap between the natural language expressed by users and the (formal) language understood by the repositories; (iii) processing and returning relevant information that appropriately answers users' questions. To this end, a set of functionalities considered necessary to support the user in answering complex questions is identified. A formal description of these functionalities is also provided. The proposal is materialised in a prototype that implements the functionalities described. The experiments carried out with the prototype demonstrate that users do benefit from the functionalities presented: ▪ they allow users to navigate efficiently over the information repositories; ▪ the gap between the conceptualisations of the different stakeholders is minimised; ▪ users are able to answer complex questions that they could not answer with traditional systems. In summary, this document presents a proposal that demonstrably allows complex questions to be answered over semi-structured repositories in a user-driven way.
Abstract:
Background and aim: Cardiorespiratory fitness (CRF) and diet have been implicated as significant factors in the prevention of cardio-metabolic diseases. This study aimed to assess the impact of the combined associations of CRF and adherence to the Southern European Atlantic Diet (SEADiet) on the clustering of metabolic risk factors in adolescents. Methods and Results: A cross-sectional school-based study was conducted on 468 adolescents aged 15-18 from the Azorean Islands, Portugal. We measured fasting glucose, insulin, total cholesterol (TC), HDL-cholesterol, triglycerides, systolic blood pressure, waist circumference and height. HOMA, TC/HDL-C ratio and waist-to-height ratio were calculated. For each of these variables, a Z-score was computed by age and sex. A metabolic risk score (MRS) was constructed by summing the Z-scores of all individual risk factors. High risk was considered when the individual had 1 SD of this score. CRF was measured with the 20 m Shuttle Run Test. Adherence to SEADiet was assessed with a semi-quantitative food frequency questionnaire. Logistic regression showed that, after adjusting for potential confounders, unfit adolescents with low adherence to SEADiet had the highest odds of having MRS (OR = 9.4; 95% CI: 2.6-33.3), followed by the unfit ones with high adherence to the SEADiet (OR = 6.6; 95% CI: 1.9-22.5), when compared to those who were fit and had higher adherence to SEADiet.
Abstract:
The search for patterns in data in order to form groups is known as data clustering, and it is one of the most frequently performed tasks in data mining and pattern recognition. This dissertation addresses the concept of entropy and uses algorithms with entropic criteria to cluster biomedical data. The use of entropy for clustering is relatively recent and arises from an attempt to exploit entropy's ability to extract higher-order information from the data distribution, using it either as the criterion for forming groups (clusters) or to complement or improve existing algorithms in search of better results. Some studies involving algorithms based on entropic criteria have shown positive results in the analysis of real data. In this work, several algorithms based on entropic criteria were explored, along with their applicability to biomedical data, in an attempt to assess how well suited these algorithms are to this type of data. The results of the tested algorithms are compared with those obtained by other, more “conventional” algorithms such as k-means, spectral clustering algorithms and a density-based algorithm.
Abstract:
Master's in Electrical and Computer Engineering - Specialization Area of Autonomous Systems
Abstract:
Epidemiologic studies have reported an inverse association between dairy product consumption and cardiometabolic risk factors in adults, but this relation is relatively unexplored in adolescents. We hypothesized that a higher dairy product intake is associated with lower cardiometabolic risk factor clustering in adolescents. To test this hypothesis, a cross-sectional study was conducted with 494 adolescents aged 15 to 18 years from the Azorean Archipelago, Portugal. We measured fasting glucose, insulin, total cholesterol, high-density lipoprotein cholesterol, triglycerides, systolic blood pressure, body fat, and cardiorespiratory fitness. We also calculated homeostatic model assessment and total cholesterol/high-density lipoprotein cholesterol ratio. For each one of these variables, a z score was computed using age and sex. A cardiometabolic risk score (CMRS) was constructed by summing up the z scores of all individual risk factors. High risk was considered to exist when an individual had at least 1 SD from this score. Diet was evaluated using a food frequency questionnaire, and the intake of total dairy (including milk, yogurt, and cheese), milk, yogurt, and cheese was categorized as low (equal to or below the median of the total sample) or “appropriate” (above the median of the total sample). The association between dairy product intake and CMRS was evaluated using separate logistic regressions, and the results were adjusted for confounders. Adolescents with high milk intake had lower CMRS, compared with those with low intake (10.6% vs 18.1%, P = .018). Adolescents with appropriate milk intake were less likely to have high CMRS than those with low milk intake (odds ratio, 0.531; 95% confidence interval, 0.302-0.931). No association was found between CMRS and total dairy, yogurt, and cheese intake. Only milk intake seems to be inversely related to CMRS in adolescents.
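The construction of the risk score described above can be sketched as follows. This is a simplified illustration: it treats all participants as one age and sex stratum (the study computes z scores using age and sex), it sums z scores as given because the abstract does not state a sign convention for protective variables, and it reads "at least 1 SD" as a score at or above the mean plus one standard deviation of the score distribution. All names and the toy measurements are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardise a list of measurements to mean 0 and SD 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def risk_scores(risk_factors):
    """Sum the z scores of every risk factor for each participant.

    risk_factors: dict mapping a factor name to a list of measurements,
                  one per participant, in the same participant order.
    """
    columns = [z_scores(values) for values in risk_factors.values()]
    n = len(columns[0])
    return [sum(column[i] for column in columns) for i in range(n)]

def high_risk_flags(scores):
    """Flag participants whose score is at least 1 SD above the mean (assumed reading)."""
    threshold = mean(scores) + pstdev(scores)
    return [score >= threshold for score in scores]

# Toy measurements for five participants (illustrative values, not study data).
factors = {
    "glucose": [80, 85, 90, 95, 120],
    "triglycerides": [60, 70, 80, 90, 150],
}
scores = risk_scores(factors)
flags = high_risk_flags(scores)
```

In this toy data only the fifth participant, who is high on both factors, ends up flagged as high risk.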
Abstract:
In recent years, vehicular cloud computing (VCC) has emerged as a new technology that is used in a wide range of multimedia-based healthcare applications. In VCC, vehicles act as intelligent machines that collect healthcare data and transfer it to local or global sites for storage and computation, because vehicles have comparatively limited storage and computation power for handling multimedia files. However, due to dynamic changes in topology and the lack of centralized monitoring points, this information can be altered or misused. These security breaches can have disastrous consequences, such as loss of life or financial fraud. To address these issues, a learning automata-assisted distributive intrusion detection system based on clustering is designed. Although the proposed scheme can be applied in a number of applications, we use a multimedia-based healthcare application to illustrate it. In the proposed scheme, learning automata (LA) are assumed to be stationed on the vehicles; they take clustering decisions intelligently and select one of the members of the group as a cluster-head. The cluster-heads then assist in efficient storage and dissemination of information through a cloud-based infrastructure. To secure the proposed scheme from malicious activities, a standard cryptographic technique is used in which the automaton learns from the environment and takes adaptive decisions to identify any malicious activity in the network. The stochastic environment in which an automaton performs its actions gives it a reward or a penalty, and the automaton updates its action probability vector after receiving this reinforcement signal. The proposed scheme was evaluated using extensive simulations on ns-2 with SUMO.
The results obtained indicate that the proposed scheme yields an improvement of 10% in the detection rate of malicious nodes when compared with existing schemes.
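The reward and penalty update of an action probability vector can be illustrated with a short sketch. The abstract does not say which update rule the authors use; the example below assumes the textbook linear reward-penalty scheme, and the function name `update`, the learning rates `a` and `b`, and the three-action setup are all hypothetical.

```python
def update(probabilities, chosen, rewarded, a=0.1, b=0.05):
    """Return a new action probability vector after one reinforcement signal.

    Assumed rule (linear reward-penalty, not confirmed by the abstract):
      reward:  the chosen action gains probability, the others shrink by (1 - a).
      penalty: the chosen action shrinks by (1 - b), the others share the loss.
    """
    r = len(probabilities)
    updated = []
    for j, p in enumerate(probabilities):
        if rewarded:
            new_p = p + a * (1.0 - p) if j == chosen else (1.0 - a) * p
        else:
            new_p = (1.0 - b) * p if j == chosen else b / (r - 1) + (1.0 - b) * p
        updated.append(new_p)
    return updated

# An automaton with three equally likely actions (illustrative setup).
p = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
after_reward = update(p, chosen=0, rewarded=True)
after_penalty = update(p, chosen=0, rewarded=False)
```

Under this rule the vector stays a valid probability distribution after every update, and repeated rewards concentrate probability on the action the environment favours.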
Abstract:
13th IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2015). 21 to 23, Oct, 2015, Session W1-A: Multiprocessing and Multicore Architectures. Porto, Portugal.
Abstract:
Paper/Poster presented in Work in Progress Session, 28th GI/ITG International Conference on Architecture of Computing Systems (ARCS 2015). 24 to 26, Mar, 2015. Porto, Portugal.