906 results for Decision tree
Abstract:
Background: The genome-wide identification of both morbid genes, i.e., those genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach that could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the discovery of causal relationships between genes and diseases, as well as the determination of the druggability of gene products. Results: In this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%. It was then used to assign morbidity and druggability scores to genes not known to be morbid or druggable, and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability and, among the rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors for morbidity and druggability, respectively. Conclusions: We were able to demonstrate that network topological features, along with tissue expression profile and subcellular localization, can reliably predict human morbid and druggable genes on a genome-wide scale.
Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.
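The two figures quoted for each gene class ("correctly recovered" and "precision") correspond to recall and precision. As a minimal sketch, assuming binary labels where 1 marks a known morbid (or druggable) gene, they could be computed as follows. The function name and the toy labels are illustrative assumptions, not the authors' code or data.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (hypothetical): 1 = known morbid gene, 0 = other gene.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Recall answers "what fraction of the known positive genes were recovered", while precision answers "what fraction of the genes flagged as positive really are positive".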
Abstract:
Oil spills cause great damage to coastal habitats, especially when rapid and suitable response measures are not taken. Establishing high-priority areas is fundamental for the operation of response teams. In this context, and considering the need to keep all geographical information up to date for emergency use, the present study proposes employing a decision tree coupled with a knowledge-based approach using GIS to assign oil sensitivity indices to Brazilian coastal habitats. The modelled system works based on rules set by the official standards of the Brazilian federal environmental agency. We tested it on one of the littoral regions of Brazil where transportation of petroleum is most intense: the coast of the municipalities of São Sebastião and Caraguatatuba in the northern littoral of São Paulo state, Brazil. The system automatically ranked the littoral sensitivity index of the study area habitats according to geographical conditions during summer and winter, since the index ranks of some habitats varied between these seasons because of sediment alterations. The results illustrate the great potential of the proposed system for generating ESI maps and for aiding response teams during emergency operations. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
This paper describes an investigation of the hybrid PSO/ACO algorithm for automatically classifying well drilling operation stages. The feasibility of the method is demonstrated by applying it to a real mud-logging dataset. The results are compared with bio-inspired methods and with rule induction and decision tree algorithms for data mining. © 2009 Springer Berlin Heidelberg.
Abstract:
Protein-protein interactions (PPIs) are essential for understanding the function of biological systems and have been characterized using a vast array of experimental techniques. These techniques detect only a small proportion of all PPIs and are labor intensive and time consuming. Therefore, the development of computational methods capable of predicting PPIs accelerates the pace of discovery of new interactions. This paper reports a machine learning-based prediction model, the Universal In Silico Predictor of Protein-Protein Interactions (UNISPPI), which is a decision tree model that can reliably predict PPIs for all species (including proteins from parasite-host associations) using only 20 combinations of amino acid frequencies from interacting and non-interacting proteins as learning features. UNISPPI was able to correctly classify 79.4% and 72.6% of experimentally supported interactions and non-interacting protein pairs, respectively, from an independent test set. Moreover, UNISPPI suggests that the frequencies of the amino acids asparagine, cysteine and isoleucine are important features for distinguishing between interacting and non-interacting protein pairs. We envisage that UNISPPI can be a useful tool for prioritizing interactions for experimental validation. © 2013 Valente et al.
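To make the idea of amino acid frequencies as learning features concrete, here is a minimal sketch that computes the relative frequency of each of the 20 standard amino acids in a sequence. The abstract does not say how the frequencies of the two proteins in a pair are combined into its 20 features, so the concatenation in `pair_features` is a placeholder assumption, not the UNISPPI encoding. All names and the toy sequences are hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(sequence):
    """Relative frequency of each standard amino acid, in AMINO_ACIDS order."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


def pair_features(seq_a, seq_b):
    """Assumed encoding: concatenate both frequency vectors (40 values)."""
    return aa_frequencies(seq_a) + aa_frequencies(seq_b)


# Toy sequences, not real proteins.
features = pair_features("MNCIINNC", "ACDEFGHIK")
```

A feature vector of this kind, together with an interacting or non-interacting label, is the sort of input a decision tree learner could be trained on.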
Abstract:
Breast cancer is the most common cancer among women. In CAD systems, several studies have investigated the wavelet transform as a multiresolution analysis tool for texture analysis, whose output can be used as input to a classifier. For classification, the polynomial classifier has been used because it provides a single model for the optimal separation of classes and takes this model as the solution to the problem. In this paper, a system is proposed for texture analysis and classification of lesions in mammographic images. Multiresolution analysis features were extracted from the region of interest of a given image. These features were computed based on three different wavelet functions: Daubechies 8, Symlet 8 and bi-orthogonal 3.7. For classification, we used the polynomial classification algorithm to label the mammogram images as normal or abnormal. We also made a comparison with other artificial intelligence algorithms (Decision Tree, SVM, K-NN). A Receiver Operating Characteristic (ROC) curve is used to evaluate the performance of the proposed system. Our system is evaluated using 360 digitized mammograms from the DDSM database, and the results show that the algorithm has an area under the ROC curve Az of 0.98 ± 0.03. The polynomial classifier performed better than the other classification algorithms. © 2013 Elsevier Ltd. All rights reserved.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
This work compares the C4.5 and MLP (Multilayer Perceptron) algorithms applied to dynamic security assessment (DSA) and to preventive control design, with a focus on the transient stability of electric power systems (EPSs). C4.5 is a decision tree (DT) algorithm, and the MLP is a member of the family of artificial neural networks (ANNs). Both algorithms provide solutions to the DSA problem in real time, quickly identifying when an EPS is subject to a critical disturbance (a short circuit, for example) that may lead to transient instability. In addition, the knowledge obtained from both techniques, in the form of rules, can be used in preventive control design to restore the security of the EPS against critical disturbances. Using a database built from exhaustive time-domain simulations, some specific critical disturbances are taken as examples to compare the C4.5 and MLP algorithms applied to DSA and to the support of preventive actions. The comparative study is tested on the "New England" power system. In the case studies, the database is generated with the PSTv3 program ("Power System Toolbox"). The DTs and ANNs are trained and tested using the RapidMiner program. The results show that the C4.5 and MLP algorithms are promising for DSA applications and for preventive control design.
Abstract:
The techniques used for static security assessment in electric power systems depend on running a large number of load flow cases for various system topologies and operating conditions. In real-time operating environments this practice is difficult to carry out, especially in large systems, where running all the required load flow cases demands considerable time and computational effort even with currently available resources. Data mining techniques such as decision trees have been used in recent years and have achieved good results in static and dynamic security assessment of electric power systems. This work presents a methodology for real-time static security assessment of electric power systems using decision trees, in which an extensive labeled database describing the system state under various operating conditions was generated from offline load flow simulations run with the Anarede software (CEPEL). This database was used to induce the decision trees, yielding a fast and accurate prediction model that classifies the system state (secure or insecure) for real-time application. The methodology reduces the computational load in the online environment, since processing a decision tree requires only checking a few if-then logical statements, that is, a small number of numerical tests at the binary nodes to determine the attribute value that satisfies the rules. The number of these tests equals the number of hierarchical levels of the decision tree, which is normally small. With this simple computational processing, static security assessment can be performed in a fraction of the time required by the fastest traditional methods.
To validate the methodology, a case study was carried out on a real power system in which, for each contingency classified as insecure, a corrective control action is executed based on the decision tree's information about the critical attribute that most affects security. The results showed the methodology to be an important tool for real-time static security assessment in a system operation center.
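The claim that online evaluation needs only a handful of if-then tests, bounded by the depth of the tree, can be illustrated with a small sketch. The tree below is hand-written for illustration only: the attribute names (`bus_voltage_pu`, `line_loading_pct`), the thresholds and the dictionary layout are hypothetical assumptions, not values induced from the Anarede simulations described in the abstract.

```python
def classify(node, sample):
    """Walk a binary decision tree; return (label, number of tests performed)."""
    tests = 0
    while "label" not in node:  # internal node: one numerical test per level
        tests += 1
        go_left = sample[node["attribute"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"], tests


# Hypothetical two-level tree for a secure/insecure operating state.
tree = {
    "attribute": "bus_voltage_pu",
    "threshold": 0.95,
    "left": {"label": "insecure"},  # voltage too low
    "right": {
        "attribute": "line_loading_pct",
        "threshold": 100.0,
        "left": {"label": "secure"},
        "right": {"label": "insecure"},  # overloaded line
    },
}

state, n_tests = classify(tree, {"bus_voltage_pu": 1.0, "line_loading_pct": 80.0})
```

Whatever the sample, the loop runs at most once per level of the tree, which is why classification is so much cheaper than rerunning load flow cases. The attribute tested on the path to an "insecure" leaf is also the kind of information the abstract describes using to choose a corrective action.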
Abstract:
This dissertation presents a set of developments, applications and services to support real-time operation and preventive control, with the aim of ensuring the static and dynamic security of electric power systems. The data mining technique known as the decision tree was used both to classify the operating state of the system and to provide guidelines for the control actions needed to prevent operating voltage degradation and transient instability. Preliminary tests were carried out using the operational history of the SCADA/SAGE system of the Eletrobrás Eletronorte Regional Operation Center of Pará. The results fully validated the set (prototype) of applications and services and indicate great potential for application in the real-time operating environment.
Abstract:
Prostate cancer is a serious public health problem accounting for up to 30% of clinical tumors in men. The diagnosis of this disease is made with clinical, laboratory and radiological exams, which may indicate the need for transrectal biopsy. Prostate biopsies are carefully evaluated by pathologists in an attempt to determine the most appropriate course of action. This paper presents a set of techniques for identifying and quantifying regions of interest in prostatic images. Analyses were performed using multi-scale lacunarity and distinct classification methods: decision tree, support vector machine and polynomial classifier. The performance evaluation measures were based on the area under the receiver operating characteristic curve (AUC). The most appropriate region for distinguishing the different tissues (normal, hyperplastic and neoplastic) was defined: the corresponding lacunarity values and a rule model were obtained considering combinations commonly explored by specialists in clinical practice. The best discriminative values (AUC) were 0.906, 0.891 and 0.859 for neoplastic versus normal, neoplastic versus hyperplastic, and hyperplastic versus normal groups, respectively. The proposed protocol offers the advantage of making the findings comprehensible to pathologists. (C) 2014 Elsevier Ltd. All rights reserved.
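For readers unfamiliar with the AUC values reported above, the following minimal sketch computes the AUC for a two-group comparison as the probability that a randomly chosen sample from the first group scores higher than one from the second, with ties counting one half. This is the standard rank-based interpretation of AUC, not necessarily the procedure the authors used, and the scores are made-up numbers.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores for two tissue groups.
neoplastic_scores = [0.9, 0.8, 0.4]
normal_scores = [0.5, 0.3]
value = auc(neoplastic_scores, normal_scores)
```

An AUC of 1.0 means the two groups are perfectly separated by the score, while 0.5 means the score is no better than chance.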
Abstract:
Hazard analysis and critical control points (HACCP) is one of the main tools currently used to ensure the safety, quality and integrity of foods. The aim of this study was to develop and implement the HACCP program in the processing of pasteurized grade A milk. Checklists were used to assess the level of the prerequisite programs and the sanitary classification of the dairy industry, and the results were used as references for the development of the HACCP system. A "decision tree" protocol was used to identify the critical control points (CCPs). No physical or chemical CCPs were identified, whereas pasteurization and packaging were considered biological CCPs. For these CCPs, the limits for prevention, monitoring needs, corrective actions, critical limits and verification procedures were established. The prerequisite program was essential for the establishment of the system. The implementation of HACCP for the processing of grade A pasteurized milk was efficient in controlling the biological hazards and enabled the product to comply with the legislation specifications and achieve safety.
Abstract:
The aim of this work is to discriminate vegetation classes through remote sensing images from the CBERS-2 satellite, acquired in the winter and summer seasons in the Campos Gerais region, Paraná State, Brazil. The vegetation cover of the region comprises different kinds of vegetation: summer and winter crops, reforestation areas, natural areas and pasture. Supervised classification techniques, namely the Maximum Likelihood Classifier (MLC) and Decision Tree, were evaluated using a set of image attributes composed of the bands of the CCD sensor (1, 2, 3, 4), vegetation indices (CTVI, DVI, GEMI, NDVI, SR, SAVI, TVI), mixture models (soil, shadow, vegetation) and the first two principal components. Classification accuracy was evaluated using the classification error matrix and the kappa coefficient. A high level of discrimination was adopted when defining the classes, in order to allow the separation of different kinds of winter and summer crops. The classification accuracy of the decision tree was 94.5% and the kappa coefficient was 0.9389 for scene 157/128. For scene 158/127, the values were 88% and 0.8667, respectively. The classification accuracy of MLC was 84.86% and the kappa coefficient was 0.8099 for scene 157/128. For scene 158/127, the values were 77.90% and 0.7476, respectively. The results showed that the Decision Tree classifier performed better than MLC, especially for the classes related to cultivated crops, which supports using the Decision Tree classifier for vegetation cover mapping that includes different kinds of crops.
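The two evaluation measures named above, overall accuracy and the kappa coefficient, are both derived from the classification error (confusion) matrix. A minimal sketch, assuming rows are reference classes and columns are predicted classes, is given below. The 2x2 matrix is a made-up example and not data from the study.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square error matrix."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(k)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class error matrix (rows: reference, columns: predicted).
example = [[45, 5],
           [10, 40]]
acc, kappa = accuracy_and_kappa(example)
```

Kappa discounts the agreement expected by chance, which is why it is always lower than the raw accuracy reported alongside it in the abstract.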
Abstract:
Graduate Program in Animal Science (Pós-graduação em Zootecnia) - FCAV