55 results for Decision tree method
in Reposit
Abstract:
Background: Leptospirosis is an important zoonotic disease associated with poor urban areas of developing countries; early diagnosis and prompt treatment may prevent disease. Although rodents are considered the main reservoirs of leptospirosis, dogs may develop the disease, become asymptomatic carriers, and serve as sentinels for disease epidemiology. Geographical Information Systems (GIS) combined with spatial analysis techniques allow the disease to be mapped and health risk factors to be identified and assessed. In addition to GIS and spatial analysis, the decision tree, a data mining technique, has great potential for finding patterns in the behavior of the variables that determine the occurrence of leptospirosis. The objective of the present study was to apply GIS and data mining (decision tree) to evaluate the risk factors for canine leptospirosis in an area of Curitiba, PR.
Materials, Methods & Results: The present study was performed in Vila Pantanal, a poor urban community in the city of Curitiba. A total of 287 dog blood samples were randomly obtained house by house during a two-day sampling in January 2010. In addition, a questionnaire was administered to owners at the time of sampling. The geographical coordinates of each tested dog's household were obtained using a Global Positioning System (GPS) receiver to map the spatial distribution of dogs reagent and non-reagent for leptospirosis. For the decision tree, risk factors included the results of the microagglutination test (MAT) on dog serum, previous disease in the household, contact with rats or other dogs, dog breed, outdoor access, feeding, trash around the house or backyard, proximity to open sewers, and flooding. A total of 189 samples (about 2/3 of all samples) were randomly selected for the training file and the resulting decision rules. The remaining 98 samples were used for the testing file. Seroprevalence showed a spatial distribution that covered the whole Pantanal area, without clustering of reagent animals. Regarding data mining, of the 189 samples used in the decision tree, 165 (87.3%) were correctly classified, giving a Kappa index of 0.413. Of the 159 non-reagent samples, 154 (96.8%) were correctly classified and only 5 (3.2%) were misclassified. On the other hand, only 11 (36.7%) of the reagent samples were correctly classified, with 19 (63.3%) misclassified.
Discussion: The spatial distribution, which covered the whole Pantanal area, showed that all animals in the area are at risk of infection by Leptospira spp. Although most samples were classified correctly by the decision tree, seropositive animals proved difficult to separate, with only 36.7% of those samples classified correctly. This may be because seronegative animals outnumber seropositive ones, given the differences in the pattern of variable behavior. Data mining helped to evaluate the most important risk factors for leptospirosis in a poor urban community of Curitiba. The variables selected by the decision tree reflected the factors important for the occurrence of the disease (lack of sewerage, presence of rats and rubbish, and dogs with free access to the street). The analyses showed the multifactorial character of the epidemiology of canine leptospirosis.
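The counts reported for the 189 training samples are enough to rebuild the 2x2 confusion matrix and check the stated accuracy and Kappa index. The following is a minimal sketch, not the authors' code; the function name and the choice of "reagent" as the positive class are assumptions made for illustration.

```python
# Confusion-matrix counts taken from the abstract (training file, n = 189).
# Assumption: "positive" means a reagent (seropositive) dog.
tn, fp = 154, 5   # non-reagent dogs: correctly / wrongly classified
fn, tp = 19, 11   # reagent dogs: missed / correctly classified


def accuracy_and_kappa(tn, fp, fn, tp):
    """Return (observed agreement, Cohen's kappa) for a 2x2 confusion matrix."""
    n = tn + fp + fn + tp
    observed = (tn + tp) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tn + fp) * (tn + fn) + (fn + tp) * (fp + tp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


accuracy, kappa = accuracy_and_kappa(tn, fp, fn, tp)
sensitivity = tp / (tp + fn)  # share of reagent dogs correctly classified

print(f"accuracy={accuracy:.3f} kappa={kappa:.3f} sensitivity={sensitivity:.3f}")
```

The recomputed accuracy (about 0.873) and Kappa (about 0.41) agree with the reported 87.3% and 0.413. The low sensitivity shows why a high overall accuracy can coexist with only moderate agreement when one class dominates.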
Abstract:
Foliar diagnosis is a method for assessing the nutritional status of agricultural crops. It helps in understanding soil fertility and in rationalizing fertilizer application according to economic and environmental criteria. The study aimed to use land relief as a criterion to assist in interpreting the spatial variability of the nutrient content of citrus leaves. The leaves were collected at regular intervals of 50 m, totaling 332 sampling points. Data were analyzed by descriptive statistics, geostatistics and decision tree induction. With the aid of a digital elevation model (DEM) and the planialtimetric profile, the area was divided into three different land-relief forms and sub-slopes. The highest leaf nutrient values were observed at the top (concave area) and in the half-slope and lower-slope segments. Leaf nutrient contents showed high correlation (above 0.5) with the altitude of the study area. Geostatistics and decision tree induction show that relief is the variable with the greatest potential for interpreting maps of the spatial variability of citrus leaf nutrients.
Abstract:
This paper discusses, within the prevailing Brazilian situation, the possibility of applying the 'Causal Tree' (CT) method to the investigation of occupational accidents by safety personnel in the public health services and workers' unions. The method was developed in France during the 1970s for use by plant safety personnel. The authors used this method in Botucatu, State of Sao Paulo, Brazil, to investigate 40 serious occupational accidents that occurred in industrial plants during the second half of 1993 and had been registered by Social Security. In these cases, the predominance of situations in which the lack of safety measures was identified by inspection indicates that, in most instances, the use of CT is unnecessary. However, the authors discuss its use by safety personnel from the public health services and workers' unions to investigate certain accidents, in order to contribute to the knowledge base and help overcome the culturally based blame which, in Brazil, has turned the victim into the person responsible for the accident.
Abstract:
This paper describes an investigation of the hybrid PSO/ACO algorithm for automatically classifying the stages of well drilling operations. The method's feasibility is demonstrated by applying it to a real mud-logging dataset. The results are compared with those of bio-inspired methods and of rule induction and decision tree algorithms for data mining. © 2009 Springer Berlin Heidelberg.
Abstract:
We are investigating the combination of wavelets and decision trees to detect ships and other maritime surveillance targets in medium-resolution SAR images. Wavelets have inherent advantages for extracting image descriptors, while decision trees are able to handle different data sources. In addition, our work aims to consider oceanic features such as ship wakes and ocean spills. In this incipient work, Haar and Cohen-Daubechies-Feauveau 9/7 wavelets extract detailed descriptors from targets and ocean features, which are fed together with other statistical parameters and wavelets into an oblique decision tree. © 2011 Springer-Verlag.
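To illustrate how a wavelet turns pixel values into descriptors, here is a minimal sketch of one level of the Haar transform on a single row of intensities. It assumes a 1-D, even-length input and uses the energy of the detail coefficients as a hypothetical descriptor. The abstract does not describe the authors' actual feature extraction, and the CDF 9/7 wavelet is not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approximation = [(a + b) / scale for a, b in pairs]
    detail = [(a - b) / scale for a, b in pairs]
    return approximation, detail


def detail_energy(detail):
    """Sum of squared detail coefficients (hypothetical texture/edge descriptor)."""
    return sum(d * d for d in detail)


# Hypothetical pixel rows: flat background versus a row with one bright pixel.
flat_row = [4, 4, 4, 4]
target_row = [4, 4, 4, 40]

_, flat_detail = haar_step(flat_row)
_, target_detail = haar_step(target_row)
print(detail_energy(flat_detail), detail_energy(target_detail))
```

A flat row yields zero detail energy, while a sharp intensity change yields a large value, which is the kind of contrast a classifier such as a decision tree can then split on.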
Abstract:
The identification of tree species is a key step in sustainable management plans for forest resources, as well as in several other applications based on such surveys. However, currently available techniques depend on the presence of tree structures such as flowers, fruits, and leaves, which limits the identification process to certain periods of the year. Therefore, this article introduces a study on the application of statistical parameters to the texture classification of tree trunk images. To that end, 540 samples from five Brazilian native deciduous species were acquired, and measures of entropy, uniformity, smoothness, asymmetry (third moment), mean, and standard deviation were obtained from their textures. Using a decision tree, a biometric species identification system was constructed, achieving an average precision of 0.84 for species classification, with 0.83 accuracy and 0.79 agreement. Thus, the texture present in trunk images can represent an important advance in tree identification, since it can overcome the limitations of current techniques.
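The six texture measures named above can all be computed from the gray-level histogram of an image patch. The sketch below assumes the common histogram-based definitions (normalized-variance smoothness, sum of squared probabilities for uniformity, base-2 entropy). The abstract does not state which formulas the authors used, so the function and its `levels` parameter are illustrative only.

```python
import math
from collections import Counter


def texture_descriptors(pixels, levels=256):
    """Histogram-based texture statistics for a flat list of gray levels.

    Assumed definitions (not confirmed by the abstract):
      smoothness = 1 - 1 / (1 + variance / (levels - 1)**2)
      uniformity = sum of squared gray-level probabilities
      entropy    = -sum(p * log2(p))
    """
    n = len(pixels)
    probabilities = {z: c / n for z, c in Counter(pixels).items()}
    mean = sum(z * p for z, p in probabilities.items())
    variance = sum((z - mean) ** 2 * p for z, p in probabilities.items())
    third_moment = sum((z - mean) ** 3 * p for z, p in probabilities.items())
    normalized_variance = variance / (levels - 1) ** 2
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "smoothness": 1 - 1 / (1 + normalized_variance),
        "third_moment": third_moment,
        "uniformity": sum(p * p for p in probabilities.values()),
        "entropy": -sum(p * math.log2(p) for p in probabilities.values()),
    }


# Hypothetical 16-pixel patches: a constant one and a two-level one.
constant_patch = [100] * 16
two_level_patch = [0] * 8 + [255] * 8

print(texture_descriptors(constant_patch))
print(texture_descriptors(two_level_patch))
```

Each patch becomes a six-value feature vector, which is the form of input a decision tree classifier would split on.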
Abstract:
Layer mortality due to heat stress is an important economic loss for the producer. The aim of this study was to determine the mortality pattern of layers reared in the region of Bastos, SP, Brazil, according to external environment and bird age. A data mining technique was applied to monthly mortality records of hens in production from 135 poultry houses, covering January 2004 to August 2008. The external environment was characterized by maximum and minimum temperatures, obtained monthly at the CATI meteorological station in the city of Tupa, SP, Brazil. Mortality was classified as normal (<= 1.2%) or high (> 1.2%), considering the mortality limits mentioned in the literature. The data mining technique produced a decision tree with nine levels and 23 leaves, with an overall accuracy of 62.6%. The hit rate was 64.1% for the High class and 59.9% for the Normal class. The decision tree revealed a pattern in the mortality data, generating a model for estimating mortality based on the thermal environment and bird age.
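The class labels and the two kinds of rate quoted above can be made concrete with a short sketch. The 1.2% threshold comes from the abstract. The sample records, the predicted labels, and the helper names are hypothetical and only show how per-class hit rates differ from overall accuracy.

```python
THRESHOLD_PCT = 1.2  # monthly mortality limit stated in the abstract


def mortality_class(monthly_mortality_pct):
    """Label a monthly mortality value as 'High' (> 1.2%) or 'Normal' (<= 1.2%)."""
    return "High" if monthly_mortality_pct > THRESHOLD_PCT else "Normal"


def hit_rates(actual, predicted):
    """Return (per-class hit rate, overall accuracy) for two label lists."""
    per_class = {}
    for label in sorted(set(actual)):
        indices = [i for i, a in enumerate(actual) if a == label]
        per_class[label] = sum(predicted[i] == label for i in indices) / len(indices)
    overall = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
    return per_class, overall


# Hypothetical monthly records (percent mortality) and hypothetical model output.
rates = [0.8, 1.2, 1.3, 2.5]
actual = [mortality_class(r) for r in rates]
predicted = ["Normal", "High", "High", "Normal"]

per_class, overall = hit_rates(actual, predicted)
print(actual, per_class, overall)
```

The hit rate for a class counts only the records that truly belong to that class, so the overall accuracy falls between the two class rates, weighted by how many records each class contains.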
Abstract:
Nitroaromatic compounds such as nifuroxazide are used against many human enteropathogenic bacterial infections without increasing the plasmid-mediated antibiotic resistance of the aerobic Gram-negative intestinal Enterobacteriaceae. For these reasons, these compounds have been synthesized using the rational approach of Topliss' decision tree. Generally, this approach makes it possible to obtain the most active derivative of a series in a few steps. These compounds were tested against Mycobacterium tuberculosis in vitro and the most active of the series was identified. A new lead for potential tuberculostatic activity has been predicted and will be used in further QSAR studies. (C) 2002 Elsevier B.V. Ltd. All rights reserved.
Abstract:
CONTEXT AND OBJECTIVE: Pregnancies complicated by diabetes are associated with increased maternal and neonatal complications. Hospital costs increase according to the care provided. The objective was to calculate the cost-benefit and the social rate of return of hospitalization compared with outpatient care for pregnant women with diabetes or mild hyperglycemia. STUDY DESIGN: Prospective, observational, quantitative study conducted at a university hospital, including all pregnant women with pregestational or gestational diabetes, or with mild hyperglycemia, who did not develop clinical complications during pregnancy and who delivered at Hospital das Clínicas, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (HC-FMB-Unesp). METHODS: Thirty pregnant women treated with diet were followed as outpatients, and 20 treated with diet and insulin were managed with short, frequent hospitalizations. Direct costs (personnel, materials and tests) and indirect costs (overheads) were obtained from data in the medical records and in the hospital's absorption costing system, and the cost-benefit was then calculated. RESULTS: Successful treatment of diabetic pregnant women avoided expenditure of US$ 1,517.97 and US$ 1,127.43 for hospitalized and outpatient patients, respectively. The cost-benefit of inpatient care was US$ 143,719.16 and that of outpatient care US$ 253,267.22, with social rates of return of 1.87 and 5.35, respectively. CONCLUSION: The decision tree analysis confirms that successful treatment eliminates hospital costs. The cost-benefit ratio indicated that outpatient treatment is economically more advantageous than hospitalization. The social rate of return of both treatments was greater than 1, indicating that both types of care for diabetic pregnant women have a positive benefit.
Abstract:
Two typical occupational accidents that occurred in a large company are presented. They were investigated with the Causal Tree Method (Método de Árvore de Causas, ADC), which makes it possible to identify the role played by managerial and work-organization factors in triggering these events. The cases presented reveal the involvement, in the genesis of the accidents, of factors such as the temporary and improvised assignment of workers to functions and workstations, the execution of tasks left to the workers' initiative and discretion, the lack of tools and materials appropriate for the tasks, and failures in the flow of information, among others. Also analysed are the indications for use of the method, its potential for prevention, and the implications of difficulties in application, of the need for training and refresher courses, and of the considerable time required to investigate each accident.
Abstract:
The identification of genes essential for survival is important for the understanding of the minimal requirements for cellular life and for drug design. As experimental studies with the purpose of building a catalog of essential genes for a given organism are time-consuming and laborious, a computational approach which could predict gene essentiality with high accuracy would be of great value. We present here a novel computational approach, called NTPGE (Network Topology-based Prediction of Gene Essentiality), that relies on the network topology features of a gene to estimate its essentiality. The first step of NTPGE is to construct the integrated molecular network for a given organism comprising protein physical, metabolic and transcriptional regulation interactions. The second step consists in training a decision-tree-based machine-learning algorithm on known essential and non-essential genes of the organism of interest, considering as learning attributes the network topology information for each of these genes. Finally, the decision-tree classifier generated is applied to the set of genes of this organism to estimate essentiality for each gene. We applied the NTPGE approach for discovering the essential genes in Escherichia coli and then assessed its performance. (C) 2007 Elsevier B.V. All rights reserved.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Background: The genome-wide identification of both morbid genes, i.e., those genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach which could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the discovery of causal relationships between genes and diseases, as well as the determination of the druggability of gene products.
Results: In this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%. It was then used to assign morbidity and druggability scores to genes not known to be morbid or druggable, and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability. Among these rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors for morbidity and druggability, respectively.
Conclusions: We were able to demonstrate that network topological features, along with tissue expression profile and subcellular localization, can reliably predict human morbid and druggable genes on a genome-wide scale. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.
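The share of known genes "correctly recovered" is the classifier's recall, and the abstract reports it next to precision. One way to compare the two tasks on a single scale is the F1 score, the harmonic mean of precision and recall. The abstract does not report F1, so the values below are derived here purely as an illustration from the four published percentages.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recall ("correctly recovered") and precision as reported in the abstract.
reported = {
    "morbid": {"recall": 0.65, "precision": 0.66},
    "druggable": {"recall": 0.78, "precision": 0.75},
}

# Derived quantity, not stated in the source.
f1 = {task: f1_score(m["precision"], m["recall"]) for task, m in reported.items()}

for task, value in f1.items():
    print(f"{task}: F1 = {value:.3f}")
```

On this derived measure the druggable-gene task scores higher (about 0.76) than the morbid-gene task (about 0.65), in line with the individual precision and recall figures.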
Abstract:
Oil spills cause great damage to coastal habitats, especially when rapid and suitable response measures are not taken. Establishing high-priority areas is fundamental to the operation of response teams. In this context, and considering the need to keep all geographical information up to date for emergency use, the present study proposes employing a decision tree coupled with a knowledge-based approach using GIS to assign oil sensitivity indices to Brazilian coastal habitats. The modelled system works on rules set by the official standards of the Brazilian federal environmental agency. We tested it on one of the littoral regions of Brazil where petroleum transportation is most intense: the coast of the municipalities of São Sebastião and Caraguatatuba, on the northern littoral of São Paulo state, Brazil. The system automatically ranked the littoral sensitivity index of the study area habitats according to geographical conditions during summer and winter, since the index ranks of some habitats varied between these seasons because of sediment alterations. The results illustrate the great potential of the proposed system for generating ESI maps and aiding response teams during emergency operations. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
The Brazilian Ministry of Labour has been attempting to modify the norms used to analyse industrial accidents in the country. For this purpose, in 1994 it tried to make compulsory the use of the causal tree approach to accident analysis, an approach developed in France during the 1970s, without having previously determined whether it is suitable for use under the industrial safety conditions that prevail in most Brazilian firms. In addition, opposition from Brazilian employers has blocked the proposed changes to the norms. The present study employed anthropotechnology to analyse the experimental application of the causal tree method to work-related accidents in industrial firms in the region of Botucatu, São Paulo. Three work-related accidents were examined in three industrial firms representative of local, national and multinational companies. On the basis of the accidents analysed in this study, the rationale for the use of the causal tree method in Brazil can be summarized for each type of firm as follows: the method is redundant if there is a predominance of the type of risk whose elimination or neutralization requires adoption of conventional industrial safety measures (firm representative of local enterprises); the method is worthwhile if the company's specific technical risks have already largely been eliminated (firm representative of national enterprises); and the method is particularly appropriate if the firm has a good safety record and the causes of accidents are primarily related to industrial organization and management (multinational enterprise).