6 resultados para Decision tree
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Resumo:
This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternate heuristics to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.
Resumo:
Background: This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, specially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general-use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible-receptor.
Resumo:
Decision tree induction algorithms represent one of the most popular techniques for dealing with classification problems. However, traditional decision-tree induction algorithms implement a greedy approach for node splitting that is inherently susceptible to local optima convergence. Evolutionary algorithms can avoid the problems associated with a greedy search and have been successfully employed to the induction of decision trees. Previously, we proposed a lexicographic multi-objective genetic algorithm for decision-tree induction, named LEGAL-Tree. In this work, we propose extending this approach substantially, particularly w.r.t. two important evolutionary aspects: the initialization of the population and the fitness function. We carry out a comprehensive set of experiments to validate our extended algorithm. The experimental results suggest that it is able to outperform both traditional algorithms for decision-tree induction and another evolutionary algorithm in a variety of application domains.
Resumo:
Background Cost-effectiveness studies have been increasingly part of decision processes for incorporating new vaccines into the Brazilian National Immunisation Program. This study aimed to evaluate the cost-effectiveness of 10-valent pneumococcal conjugate vaccine (PCV10) in the universal childhood immunisation programme in Brazil. Methods A decision-tree analytical model based on the ProVac Initiative pneumococcus model was used, following 25 successive cohorts from birth until 5 years of age. Two strategies were compared: (1) status quo and (2) universal childhood immunisation programme with PCV10. Epidemiological and cost estimates for pneumococcal disease were based on National Health Information Systems and literature. A 'top-down' costing approach was employed. Costs are reported in 2004 Brazilian reals. Costs and benefits were discounted at 3%. Results 25 years after implementing the PCV10 immunisation programme, 10 226 deaths, 360 657 disability-adjusted life years (DALYs), 433 808 hospitalisations and 5 117 109 outpatient visits would be avoided. The cost of the immunisation programme would be R$10 674 478 765, and the expected savings on direct medical costs and family costs would be R$1 036 958 639 and R$209 919 404, respectively. This resulted in an incremental cost-effectiveness ratio of R$778 145/death avoided and R$22 066/DALY avoided from the society perspective. Conclusion The PCV10 universal infant immunisation programme is a cost-effective intervention (1-3 GDP per capita/DALY avoided). Owing to the uncertain burden of disease data, as well as unclear long-term vaccine effects, surveillance systems to monitor the long-term effects of this programme will be essential.
Resumo:
Hierarchical multi-label classification is a complex classification task where the classes involved in the problem are hierarchically structured and each example may simultaneously belong to more than one class in each hierarchical level. In this paper, we extend our previous works, where we investigated a new local-based classification method that incrementally trains a multi-layer perceptron for each level of the classification hierarchy. Predictions made by a neural network in a given level are used as inputs to the neural network responsible for the prediction in the next level. We compare the proposed method with one state-of-the-art decision-tree induction method and two decision-tree induction methods, using several hierarchical multi-label classification datasets. We perform a thorough experimental analysis, showing that our method obtains competitive results to a robust global method regarding both precision and recall evaluation measures.
Resumo:
Background: Tuberculosis (TB) remains a public health issue worldwide. The lack of specific clinical symptoms to diagnose TB makes the correct decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a predictive model for predicting pulmonary TB in hospitalized patients in a high prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Methods: Cross sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, positive and negative predictive values were used to evaluate the performance of model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. Results: We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. Conclusions: The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offer an alternative for decision making in whether to isolate patients with clinical suspicion of TB in tertiary health facilities in countries with limited resources.