907 results for Data clustering
Abstract:
Decision support methods and computer-aided diagnostic systems are being applied to more and more areas of medicine. In breast cancer research, much work has aimed to reduce false positives when such systems are used as a second reader. This study presents a set of data mining techniques applied to build a decision support system for breast cancer diagnosis. The method is intended to assist clinical practice in identifying mammographic findings such as microcalcifications, masses and normal tissue, in order to avoid misdiagnosis. A reliable database was used, with 410 images from about 115 patients, in which radiologists had previously labelled the findings as microcalcifications, masses or normal tissue. Two feature extraction techniques were used: the gray level co-occurrence matrix and the gray level run length matrix. For classification, several scenarios were considered, corresponding to different lesion patterns, and several classifiers were compared to find the best performer in each scenario. The classifiers were Naïve Bayes, Support Vector Machines, k-Nearest Neighbors and Decision Trees (J48 and Random Forests). In distinguishing mammographic findings, the results showed high positive predictive values (PPV) and very good accuracy. Results are also reported for the classification of breast density and the BI-RADS® scale. The best predictive method for all tested groups was the Random Forest classifier, and the best performance was achieved in distinguishing microcalcifications. The conclusions drawn from the tested scenarios offer a new perspective on breast cancer diagnosis using data mining techniques.
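As an illustration of the first feature extraction technique named above, the following minimal sketch builds a gray level co-occurrence matrix for a tiny image and derives one texture feature (contrast) from it. The toy image, the horizontal offset and the function names are assumptions made for this example; the abstract does not state which offsets, gray-level quantization or features the study used.

```python
def glcm(image, dx=1, dy=0, levels=None):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    if levels is None:
        levels = max(max(row) for row in image) + 1
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: squared gray-level difference weighted by pair frequency."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# Hypothetical 4x4 image with gray levels 0..3 (not data from the study).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

matrix = glcm(image)            # horizontal neighbour, distance 1
feature = contrast(matrix)
```

In a pipeline like the one described, features of this kind would be computed per image region and passed to the classifiers; the sketch stops at the feature itself.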
Abstract:
This PhD thesis sets out to demonstrate two propositions: that the co-morbidity of four prevalent conditions, hypertension, diabetes, ischaemic heart disease and asthma, is an important issue in General Practice and Family Medicine, and that its study has several implications for the way care is provided and organised and for the teaching and learning of the discipline. The document has four parts: 1) justification of the topic and aims of the dissertation; 2) a systematic review of the literature published between 1992 and 2002; 3) two descriptive and exploratory research studies on the same study population, the first entitled "Comorbidity of four chronic diseases and its relation with socio demographic factors" and the second "Differences between patients among GPs at local and regional level"; 4) conclusions and implications of the results for clinical practice management, services, the teaching of General Practice and the further development of research in this area.
The first study aims to describe the prevalence of co-morbidity among the four index diseases; to check whether the duration of the first disease is related to the time elapsed until the onset of the second and third diseases; to determine the co-morbidity associated with the four diseases; to identify possible clusters of diseases; and to check whether co-morbidity is related to social and demographic factors. The second study examines whether co-morbidity differs at local level, by physician, and by Health Sub-Region.
The population consists of patients with at least one of the four index chronic diseases on the lists of 12 family physicians working in urban, suburban and rural health centres in the districts of Lisbon and Beja. Data were collected from medical records over one year (2003). The socio-demographic variables studied were sex, age, ethnicity/race, education, professional status, marital status, family type, family functionality and housing conditions. Co-morbidity is defined as the presence of two or more diseases and is studied through the number of co-existing diseases. Disease duration is defined as the number of years between the year of diagnosis and 2003. Chronic health problems are classified according to ICPC2. Comparisons used the Mann-Whitney and Friedman tests, tests of homogeneity and analysis of residuals. Hierarchical cluster analysis was used to determine the grouping of diseases, and categorical regression and correspondence analysis were used to study the relation between socio-demographic characteristics and co-morbidity.
In total, 3998 patients were identified, with a mean age of 64.3 years (SD = 15.70). There is a significant positive correlation (r = 0.350, p ≈ 0) between "years with the first disease" and "patient age" across all individuals (men r = 0.129, women r = 0.231). Co-morbidity among the four index diseases is present in one third of the population. The most prevalent associations are hypertension with diabetes (14.03%) and hypertension with ischaemic heart disease (6.25%). There is a marked positive correlation between the duration of the first disease, when it is hypertension or diabetes, and the interval until the onset of the second and third diseases. A total of 18,655 chronic health problems were identified, corresponding to 244 ICPC2 codes, with a mean of 5.94 problems per patient (SD = 3.04). Age, professional activity, family functionality and education were the variables that contributed most to differentiating individuals with respect to co-morbidity. Significant differences in co-morbidity were found between physicians (χ² = 1165.368, p ≈ 0) and between patient groups by Health Sub-Region (χ² = 157.108, p ≈ 0). The mean number of problems is 6.45 in the Lisbon group and 5.35 in Beja.
The study has several consequences for professionals, for services, for teaching and for the pursuit of knowledge in this area. To manage care efficiently, General Practitioners are called on to act as managers of complexity and as coordinators, working in an organisational model supported by team collaboration with other specialists. Health services, in turn, have to develop care assessment measures that include co-morbidity as a risk measure. The social context of chronicity and co-morbidity should be included in the General Practice curriculum. The dissertation ends by describing the impact of the study on the participating physicians and an agenda for further research in this area, based on a community of practice.
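The two working definitions in this abstract (co-morbidity as the presence of two or more diseases, counted by the number of co-existing diseases, and disease duration as the years between the year of diagnosis and 2003) can be expressed as a short sketch. The record layout, the disease labels used as dictionary keys and the function names are hypothetical; the thesis abstract does not describe how the data were stored.

```python
# The four index diseases named in the abstract (labels are this sketch's choice).
INDEX_DISEASES = {"hypertension", "diabetes", "ischaemic heart disease", "asthma"}
REFERENCE_YEAR = 2003  # year used in the abstract to compute disease duration


def index_disease_count(diagnoses):
    """Number of index diseases in a {disease: year_of_diagnosis} record."""
    return sum(1 for disease in diagnoses if disease in INDEX_DISEASES)


def has_comorbidity(diagnoses):
    """Co-morbidity as defined in the abstract: two or more co-existing diseases."""
    return index_disease_count(diagnoses) >= 2


def duration_years(year_of_diagnosis):
    """Years elapsed between the year of diagnosis and the reference year."""
    return REFERENCE_YEAR - year_of_diagnosis


# Hypothetical patient record, not taken from the study data.
patient = {"hypertension": 1990, "diabetes": 1998}
count = index_disease_count(patient)
comorbid = has_comorbidity(patient)
first_disease_duration = duration_years(min(patient.values()))
```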
Abstract:
This document presents a tool that automatically gathers data provided by real energy markets, generates scenarios, and captures and improves market players’ profiles and strategies. It does so through knowledge discovery in databases, supported by artificial intelligence techniques, data mining algorithms and machine learning methods. The tool can generate scenarios with different dimensions and characteristics, representing real and adapted markets and their participating entities. The scenarios generator module enhances the MASCEM (Multi-Agent Simulator of Competitive Electricity Markets) simulator, making it a more effective decision-support tool. The module enables researchers and electricity market participants to analyze data, create scenarios based on real data and experiment with them. Applying knowledge discovery techniques to real data also improves MASCEM agents’ profiles and strategies, resulting in a better representation of real market players’ behavior. This work aims to improve the understanding of electricity markets and of the interactions among the entities involved through adequate multi-agent simulation.
Abstract:
The study of electricity market operation has gained importance in recent years as a result of the new challenges produced by the restructuring process. A large amount of information about electricity markets is now available, because market operators publish data on market proposals and transactions after a period of confidentiality. These data can serve as a source of knowledge for defining realistic scenarios, which are essential for understanding and forecasting electricity market behaviour. Developing tools able to extract, transform, store and dynamically update data is therefore of great importance for better understanding electricity markets and the behaviour of the entities involved. This paper presents an adaptable tool capable of downloading, parsing and storing data from market operators’ websites, ensuring that the stored data are constantly updated and reliable.
Abstract:
Electricity markets worldwide have undergone profound transformations. Examples include the privatization of previously nationally owned systems, the deregulation of privately owned systems that were regulated, and the strong interconnection of national systems [1, 2]. Competitive environments such as electricity markets generally require good decision-support tools to assist players in their decisions. Relevant research is under way in this field, notably on player modeling and simulation, strategic bidding and decision support.
Abstract:
The study of electricity market operation has gained importance in recent years as a result of the new challenges produced by restructuring. A large amount of information about electricity markets is now available, because market operators publish data on market proposals and transactions after a period of confidentiality. These data can serve as a source of knowledge for defining realistic scenarios, which are essential for understanding and forecasting electricity market behaviour. Developing tools able to extract, transform, store and dynamically update data is of great importance for better understanding electricity markets and the behaviour of the entities involved. In this paper we present an adaptable tool capable of downloading, parsing and storing data from market operators’ websites, ensuring that the stored data are up to date and reliable.
Abstract:
Electric power networks, and distribution networks in particular, have undergone several changes in recent years as power system operation moves toward the implementation of smart grids. Several approaches to operating resources have been introduced, such as demand response, which make use of the new capabilities of smart grids. In the early stages of smart grid implementation, only small amounts of data, chiefly consumption data, are generated. The methodology proposed in this paper uses methods for evaluating the performance of demand response consumers to determine the expected consumption of a given consumer. Potential commercial losses are then identified using monthly historical consumption data. Real consumption data are used in the case study to demonstrate the application of the proposed method.
Abstract:
Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Abstract:
Electricity markets worldwide have been evolving toward regional and even continental scales. One of the main reasons is the aim of using renewable-based generation efficiently in places where it exceeds local needs. A reference case of this evolution is the European Electricity Market, in which countries are interconnected and several regional markets have been created, each grouping several countries and supporting transactions of huge amounts of electrical energy. The continuous transformation of electricity markets over the years creates the need for simulation platforms that help operators, regulators and market players understand and deal with this complex environment. This paper demonstrates the advantage of using real electricity market data to create realistic simulation scenarios, which allow the study of the impacts and implications that electricity market transformations will have for the participating countries. A case study using MASCEM (Multi-Agent System for Competitive Electricity Markets) is presented, with a scenario based on real data that simulates the European Electricity Market environment and compares its performance under several different market mechanisms.
Abstract:
This paper presents the Realistic Scenarios Generator (RealScen), a tool that processes data from real electricity markets to generate realistic scenarios that enable the modeling of electricity market players’ characteristics and strategic behavior. The proposed tool provides significant advantages to the decision making process in an electricity market environment, especially when coupled with a multi-agent electricity markets simulator. The generation of realistic scenarios is performed using mechanisms for intelligent data analysis, which are based on artificial intelligence and data mining algorithms. These techniques allow the study of realistic scenarios, adapted to the existing markets, and improve the representation of market entities as software agents, enabling a detailed modeling of their profiles and strategies. This work contributes significantly to the understanding of the interactions between the entities acting in electricity markets by increasing the capability and realism of market simulations.
Abstract:
A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems.
Abstract:
Dissertation presented to obtain the Ph.D. degree in Bioinformatics.
Abstract:
Epidemiologic studies have reported an inverse association between dairy product consumption and cardiometabolic risk factors in adults, but this relation is relatively unexplored in adolescents. We hypothesized that a higher dairy product intake is associated with lower cardiometabolic risk factor clustering in adolescents. To test this hypothesis, a cross-sectional study was conducted with 494 adolescents aged 15 to 18 years from the Azorean Archipelago, Portugal. We measured fasting glucose, insulin, total cholesterol, high-density lipoprotein cholesterol, triglycerides, systolic blood pressure, body fat, and cardiorespiratory fitness. We also calculated the homeostatic model assessment and the total cholesterol/high-density lipoprotein cholesterol ratio. For each of these variables, a z score was computed using age and sex. A cardiometabolic risk score (CMRS) was constructed by summing the z scores of all individual risk factors. High risk was considered to exist when an individual scored at least 1 SD on this score. Diet was evaluated using a food frequency questionnaire, and the intake of total dairy (including milk, yogurt, and cheese), milk, yogurt, and cheese was categorized as low (equal to or below the median of the total sample) or “appropriate” (above the median of the total sample). The association between dairy product intake and CMRS was evaluated using separate logistic regressions, and the results were adjusted for confounders. High CMRS was less frequent among adolescents with high milk intake than among those with low intake (10.6% vs 18.1%, P = .018). Adolescents with appropriate milk intake were less likely to have high CMRS than those with low milk intake (odds ratio, 0.531; 95% confidence interval, 0.302-0.931). No association was found between CMRS and total dairy, yogurt, or cheese intake. Only milk intake seems to be inversely related to CMRS in adolescents.
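The construction of the risk score described above (z scores computed using age and sex, summed into one score, with a 1 SD threshold for high risk) can be sketched as follows. Several points are assumptions of this sketch and not statements from the abstract: z scores are computed within each (age, sex) group, the threshold is read as 1 SD above the sample mean of the summed score, protective variables are not sign-inverted, and the records, variable names and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def z_scores_by_group(records, variable):
    """Standardize one variable within each (age, sex) group (assumed grouping)."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["age"], rec["sex"])].append(rec[variable])
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}
    scores = []
    for rec in records:
        group_mean, group_sd = stats[(rec["age"], rec["sex"])]
        # A group with no spread gets z = 0 to avoid dividing by zero.
        scores.append((rec[variable] - group_mean) / group_sd if group_sd > 0 else 0.0)
    return scores


def risk_scores(records, variables):
    """Sum the z scores of all listed risk factors for each individual."""
    per_variable = [z_scores_by_group(records, v) for v in variables]
    return [sum(zs) for zs in zip(*per_variable)]


def high_risk_flags(scores):
    """Flag individuals at least 1 SD above the mean score (assumed reading)."""
    cutoff = mean(scores) + pstdev(scores)
    return [score >= cutoff for score in scores]


# Hypothetical records for one age/sex group; values are illustrative only.
records = [
    {"age": 16, "sex": "F", "glucose": 80, "triglycerides": 60},
    {"age": 16, "sex": "F", "glucose": 90, "triglycerides": 70},
    {"age": 16, "sex": "F", "glucose": 100, "triglycerides": 80},
    {"age": 16, "sex": "F", "glucose": 130, "triglycerides": 110},
]

scores = risk_scores(records, ["glucose", "triglycerides"])
flags = high_risk_flags(scores)
```

A fuller version would invert variables where higher values are favourable (for example cardiorespiratory fitness) before summing; the abstract does not say how the study handled this.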
Abstract:
Thesis submitted to Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfilment of the requirements for the degree of Master in Computer Science