Digital Library

949 results for decision tree

O relevo na interpretação da variabilidade espacial dos teores de nutrientes em folha de citros [Relief in the interpretation of the spatial variability of nutrient contents in citrus leaves]

Abstract:

Foliar diagnosis is a method for assessing the nutritional status of agricultural crops; it aids the understanding of soil fertility and the rational application of fertilizers, taking economic and environmental criteria into account. This study aimed to use relief as a criterion to assist in interpreting the spatial variability of nutrient contents in citrus leaves. Leaves were collected at regular intervals of 50 m, totaling 332 sampling points. Data were analyzed by descriptive statistics, geostatistics and decision tree induction. With the aid of a digital elevation model (DEM) and the planialtimetric profile, the area was divided into three different relief forms and sub-slopes. The highest leaf nutrient values were observed at the top (concave area) and in the mid-slope and lower-slope segments. Leaf nutrient contents showed high correlation (above 0.5) with the altitude of the study area. Geostatistics and decision tree induction show that relief is the variable with the greatest potential for interpreting maps of the spatial variability of nutrients in citrus leaves.

Sistema híbrido para detecção e diagnóstico de falhas em sistemas dinâmicos [Hybrid system for fault detection and diagnosis in dynamic systems]

Abstract:

Industries are becoming increasingly rigorous about safety, whether to avoid financial losses from accidents and low productivity or to protect the environment. Major accidents around the world involving aircraft and industrial processes (nuclear, petrochemical and so on) motivated us to invest in systems for fault detection and diagnosis (FDD). FDD systems can prevent faults by assisting operators in the maintenance and replacement of defective equipment. Issues involving fault detection, isolation, diagnosis and fault-tolerant control are currently gaining strength in academic and industrial settings. On this basis, this work discusses the importance of techniques that can assist in the development of FDD systems and proposes a hybrid method for FDD in dynamic systems. We present a brief history to contextualize the techniques used in working environments. Fault detection in the proposed system is based on state observers in conjunction with other statistical techniques. The main idea is to use the observer itself, which also serves as analytical redundancy, to generate a residual. This residual is used for FDD. A signature database assists in identifying system faults: based on signatures derived from trend analysis of the residual signal and its difference, the faults are classified purely with a decision tree. The FDD system is tested and validated on two plants: a simulated coupled-tank plant and a didactic plant with industrial instrumentation. The results of these tests are discussed.
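The pipeline described in this abstract (observer residual, trend signature, decision tree) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the first-order plant, the observer gain, the tolerance, the two-element signature and the fault labels are all hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical first-order plant x[k+1] = A*x[k] + B*u[k] with y[k] = x[k],
# and a hypothetical observer gain. None of these values come from the thesis.
A, B, GAIN = 0.9, 0.1, 0.5


def simulate_output(n: int, bias: float = 0.0,
                    fault_at: Optional[int] = None) -> List[float]:
    """Simulate n measurements with u = 1 and an optional sensor bias fault."""
    x, ys = 0.0, []
    for k in range(n):
        faulty = fault_at is not None and k >= fault_at
        ys.append(x + (bias if faulty else 0.0))
        x = A * x + B * 1.0
    return ys


def observer_residuals(ys: List[float]) -> List[float]:
    """Run a Luenberger-style observer and return r[k] = y[k] - x_hat[k]."""
    x_hat, residuals = 0.0, []
    for y in ys:
        r = y - x_hat
        residuals.append(r)
        x_hat = A * x_hat + B * 1.0 + GAIN * r  # correct the estimate with r
    return residuals


def trend_signature(residuals: List[float], tol: float = 0.05) -> Tuple[int, int]:
    """Signature = (sign of the last residual, sign of its last difference)."""
    def sign(v: float) -> int:
        return 0 if abs(v) <= tol else (1 if v > 0 else -1)
    return sign(residuals[-1]), sign(residuals[-1] - residuals[-2])


def classify_fault(signature: Tuple[int, int]) -> str:
    """Hand-written decision tree over the signature (labels are hypothetical)."""
    level, slope = signature
    if level == 0:
        return "no fault"
    if slope == 0:
        return "constant bias"
    return "drifting fault"
```

For example, `classify_fault(trend_signature(observer_residuals(simulate_output(60, bias=0.5, fault_at=20))))` walks a biased measurement series through all three stages. A real signature database would hold many more patterns than the two splits used here.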

Aplicação de técnicas de aprendizado de máquina no reconhecimento de classes estruturais de proteínas [Application of machine learning techniques to the recognition of protein structural classes]

Abstract:

Classifying proteins into structural classes, which concerns the inference of patterns in their 3D conformation, is currently one of the most important open problems in Molecular Biology. The main reason is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory. This problem has therefore drawn the attention of many researchers in Bioinformatics. Given the large gap between the number of known protein sequences and the number of three-dimensional structures determined experimentally, the demand for automated techniques for the structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential for dealing with this problem. In this work, ML techniques are used for the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods were chosen because they represent different learning paradigms and have been widely used in the Bioinformatics literature. To improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work suffers from class imbalance, artificial class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to minimize this problem. To evaluate the ML methods, a cross-validation procedure is applied, in which the accuracy of the classifiers is measured as the mean classification error rate on independent test sets. These means are compared pairwise with a hypothesis test to evaluate whether there is a statistically significant difference between them.
Among the individual classifiers, Support Vector Machine presented the best accuracy. The multi-classification systems (homogeneous and heterogeneous) showed, in general, superior or similar performance compared with the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as meta-classifier. The Voting method, despite its simplicity, proved adequate for solving the problem presented in this work. The class-balancing techniques, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique proved to be the most appropriate.
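Two of the simpler building blocks named in this abstract, Voting and Random Undersampling, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the fixed random seed and the choice to shrink every class to the size of the smallest one are assumptions, not details reported by the author.

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def random_undersample(samples: List, labels: List[Hashable],
                       seed: int = 0) -> Tuple[List, List[Hashable]]:
    """Randomly drop samples so that every class matches the smallest class."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = min(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        for sample in rng.sample(items, target):
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

Here `majority_vote(["alpha", "beta", "alpha"])` combines the outputs of three hypothetical base classifiers for one protein, and `random_undersample` would be applied to the training folds only, so that the test folds keep the original class distribution.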

Potential tuberculostatic agents. Topliss application on benzoic acid [(5-nitro-thiophen-2-yl)-methylene]-hydrazide series

Abstract:

Nitroaromatic compounds such as nifuroxazide are used against many human enteropathogenic bacterial infections without causing an increase in the plasmid-mediated antibiotic resistance of the aerobic Gram-negative intestinal Enterobacteriaceae. For these reasons, these compounds have been synthesized using the rational approach of Topliss' decision tree. Generally, this approach allows us to obtain the most active derivative from the series in a few steps. These compounds were tested against Mycobacterium tuberculosis in vitro and the most active of the series was identified. A new lead for potential tuberculostatic activity has been predicted and will be used in further QSAR studies. (C) 2002 Elsevier B.V. Ltd. All rights reserved.

Uma Análise de métodos de distribuição de atributos em comitês de classificadores [An analysis of attribute distribution methods in classifier ensembles]

Abstract:

The goal of research in artificial intelligence is to enable computers to perform functions that humans carry out using knowledge and reasoning. This work was developed in the area of machine learning, the branch of artificial intelligence concerned with the design and development of algorithms and techniques that allow computers to learn. The objective of this work is to analyze a feature selection method for ensemble systems. The proposed method follows the filter approach to feature selection: it uses variance and Spearman correlation to rank the features, and reward and punishment strategies to measure the importance of each feature for identifying the classes. For each ensemble, several different configurations were used, ranging from non-hybrid (homogeneous) to hybrid (heterogeneous) ensemble structures. They were submitted to five combination methods (voting, sum, weighted sum, multilayer perceptron and naïve Bayes), which were applied to six distinct databases (real and artificial). The classifiers applied in the experiments were k-nearest neighbor, multilayer perceptron, naïve Bayes and decision tree. Finally, the performance of the ensembles was compared under three conditions: with no feature selection, with the original filter-approach feature selection method, and with the proposed method. For this comparison a statistical test was applied, which demonstrated a significant improvement in the precision of the ensembles.
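As a rough illustration of a filter-style ranking based on Spearman correlation, the following Python sketch orders features by the absolute rank correlation between each feature and a numeric class label. It is a sketch under stated assumptions: the abstract does not say how variance, Spearman correlation and the reward and punishment strategies are combined, so only the correlation step is shown, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_features(features: Dict[str, List[float]],
                  target: List[float]) -> List[str]:
    """Order feature names by |Spearman correlation| with the target, descending."""
    return sorted(features,
                  key=lambda name: abs(spearman(features[name], target)),
                  reverse=True)
```

In an ensemble setting, each base classifier could then receive a different slice of the ranked list, which is one plausible way to distribute attributes among ensemble members.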

Cost-benefit of hospitalization compared with outpatient care for pregnant women with pregestational and gestational diabetes or with mild hyperglycemia, in Brazil

Abstract:

CONTEXT AND OBJECTIVE: Pregnancies complicated by diabetes are associated with increased maternal and neonatal complications. Hospital costs increase according to the care provided. The objective was to calculate the cost-benefit and the social profitability ratio of hospitalization compared with outpatient care for pregnant women with diabetes or mild hyperglycemia. STUDY DESIGN: Prospective, observational, quantitative study conducted at a university hospital, including all pregnant women with pregestational and gestational diabetes or mild hyperglycemia who did not develop clinical complications during pregnancy and who delivered at the Hospital das Clínicas, Faculdade de Medicina de Botucatu, Universidade Estadual Paulista (HC-FMB-Unesp). METHODS: Thirty pregnant women treated with diet were followed up as outpatients, and 20 treated with diet and insulin were managed through frequent short hospitalizations. Direct costs (personnel, materials and tests) and indirect costs (general expenses) were obtained from data in the medical records and in the hospital's absorption costing system, and the cost-benefit was then calculated. RESULTS: Successful treatment of the diabetic pregnant women avoided expenditure of US$ 1,517.97 and US$ 1,127.43 for hospitalized and outpatient patients, respectively. The cost-benefit of hospitalized care was US$ 143,719.16 and that of outpatient care was US$ 253,267.22, with social profitability of 1.87 and 5.35, respectively. CONCLUSION: The decision tree analysis confirms that successful treatment eliminates hospital costs. The cost-benefit ratio indicated that outpatient treatment is economically more advantageous than hospitalization. The social profitability of both treatments was greater than 1, indicating that both types of care for diabetic pregnant women have a positive benefit.

In silico network topology-based prediction of gene essentiality

Abstract:

The identification of genes essential for survival is important for the understanding of the minimal requirements for cellular life and for drug design. As experimental studies with the purpose of building a catalog of essential genes for a given organism are time-consuming and laborious, a computational approach which could predict gene essentiality with high accuracy would be of great value. We present here a novel computational approach, called NTPGE (Network Topology-based Prediction of Gene Essentiality), that relies on the network topology features of a gene to estimate its essentiality. The first step of NTPGE is to construct the integrated molecular network for a given organism comprising protein physical, metabolic and transcriptional regulation interactions. The second step consists in training a decision-tree-based machine-learning algorithm on known essential and non-essential genes of the organism of interest, considering as learning attributes the network topology information for each of these genes. Finally, the decision-tree classifier generated is applied to the set of genes of this organism to estimate essentiality for each gene. We applied the NTPGE approach for discovering the essential genes in Escherichia coli and then assessed its performance. (C) 2007 Elsevier B.V. All rights reserved.
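The first two NTPGE steps turn an interaction network into per-gene learning attributes. The abstract does not list which topology features are used, so the Python sketch below assumes two common ones, node degree and local clustering coefficient, computed from an undirected edge list; the function name and the feature choice are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def topology_features(edges: Iterable[Tuple[str, str]]) -> Dict[str, Tuple[int, float]]:
    """Map each node to (degree, local clustering coefficient)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    features = {}
    for node, nbrs in neighbours.items():
        k = len(nbrs)
        # Count edges that connect two neighbours of this node.
        links = sum(1 for u in nbrs for v in nbrs
                    if u < v and v in neighbours[u])
        clustering = 2.0 * links / (k * (k - 1)) if k > 1 else 0.0
        features[node] = (k, clustering)
    return features
```

Each resulting feature tuple, paired with a known essential or non-essential label, would form one training row for a decision tree learner.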

Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

A machine learning approach for genome-wide prediction of morbid and druggable human genes based on systems-level data

Abstract:

Background: The genome-wide identification of both morbid genes, i.e., those genes whose mutations cause hereditary human diseases, and druggable genes, i.e., genes coding for proteins whose modulation by small molecules elicits phenotypic effects, requires experimental approaches that are time-consuming and laborious. Thus, a computational approach which could accurately predict such genes on a genome-wide scale would be invaluable for accelerating the pace of discovery of causal relationships between genes and diseases as well as the determination of druggability of gene products. Results: In this paper we propose a machine learning-based computational approach to predict morbid and druggable genes on a genome-wide scale. For this purpose, we constructed a decision tree-based meta-classifier and trained it on datasets containing, for each morbid and druggable gene, network topological features, tissue expression profile and subcellular localization data as learning attributes. This meta-classifier correctly recovered 65% of known morbid genes with a precision of 66% and correctly recovered 78% of known druggable genes with a precision of 75%. It was then used to assign morbidity and druggability scores to genes not known to be morbid and druggable and we showed a good match between these scores and literature data. Finally, we generated decision trees by training the J48 algorithm on the morbidity and druggability datasets to discover cellular rules for morbidity and druggability and, among the rules, we found that the number of regulating transcription factors and plasma membrane localization are the most important factors to morbidity and druggability, respectively. Conclusions: We were able to demonstrate that network topological features along with tissue expression profile and subcellular localization can reliably predict human morbid and druggable genes on a genome-wide scale.
Moreover, by constructing decision trees based on these data, we could discover cellular rules governing morbidity and druggability.
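The "recovered" and "precision" figures quoted in this abstract correspond to the standard recall and precision measures. The short Python sketch below shows how they are computed from confusion counts. The counts are hypothetical, chosen only so that the result lands near the reported 65% and 66% for morbid genes; they are not the paper's data.

```python
def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of known positives that the classifier recovered."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted positives that are actually positive."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only (not taken from the paper).
TP, FN, FP = 65, 35, 33
morbid_recall = recall(TP, FN)        # 65 of 100 known morbid genes found
morbid_precision = precision(TP, FP)  # 65 of 98 predicted morbid genes correct
```

Reporting both numbers matters because a classifier can trivially reach 100% recall by labelling every gene as morbid, at the cost of very low precision.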

Modelling an expert GIS system based on knowledge to evaluate oil spill environmental sensitivity

Abstract:

Oil spills cause great damage to coastal habitats, especially when rapid and suitable response measures are not taken. Establishing high-priority areas is fundamental for the operation of response teams. In this context, and considering the need to keep all geographical information up to date for emergency use, the present study proposes employing a decision tree coupled with a knowledge-based approach using GIS to assign oil sensitivity indices to Brazilian coastal habitats. The modelled system works on rules set by the official standards of the Brazilian federal environmental agency. We tested it on one of the littoral regions of Brazil where transportation of petroleum is most intense: the coast of the municipalities of São Sebastião and Caraguatatuba in the northern littoral of São Paulo state, Brazil. The system automatically ranked the littoral sensitivity index of the study area habitats according to geographical conditions during summer and winter, since the index ranks of some habitats varied between these seasons because of sediment alterations. The results illustrate the great potential of the proposed system for generating ESI maps and for aiding response teams during emergency operations. (C) 2009 Elsevier Ltd. All rights reserved.

Classification of petroleum well drilling operations with a hybrid particle swarm/ant colony algorithm

Abstract:

This paper describes an investigation of the hybrid PSO/ACO algorithm for automatically classifying the stages of well drilling operations. The feasibility of the method is demonstrated by applying it to a real mud-logging dataset. The results are compared with bio-inspired methods and with rule induction and decision tree algorithms for data mining. © 2009 Springer Berlin Heidelberg.

The Development of a Universal In Silico Predictor of Protein-Protein Interactions

Abstract:

Protein-protein interactions (PPIs) are essential for understanding the function of biological systems and have been characterized using a vast array of experimental techniques. These techniques detect only a small proportion of all PPIs and are labor-intensive and time-consuming. Therefore, the development of computational methods capable of predicting PPIs accelerates the pace of discovery of new interactions. This paper reports a machine learning-based prediction model, the Universal In Silico Predictor of Protein-Protein Interactions (UNISPPI), which is a decision tree model that can reliably predict PPIs for all species (including proteins from parasite-host associations) using only 20 combinations of amino acid frequencies from interacting and non-interacting proteins as learning features. UNISPPI was able to correctly classify 79.4% and 72.6% of experimentally supported interactions and non-interacting protein pairs, respectively, from an independent test set. Moreover, UNISPPI suggests that the frequencies of the amino acids asparagine, cysteine and isoleucine are important features for distinguishing between interacting and non-interacting protein pairs. We envisage that UNISPPI can be a useful tool for prioritizing interactions for experimental validation. © 2013 Valente et al.
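Amino acid frequencies of the kind used as learning features here are straightforward to compute from a sequence. The Python sketch below is illustrative only: the abstract does not define the "20 combinations", so simply concatenating the two proteins' frequency vectors in `pair_features` is an assumption, as are the function names.

```python
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def aa_frequencies(sequence: str) -> Dict[str, float]:
    """Relative frequency of each standard amino acid in one protein sequence."""
    residues = [c for c in sequence.upper() if c in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: residues.count(aa) / len(residues) for aa in AMINO_ACIDS}


def pair_features(seq_a: str, seq_b: str) -> List[float]:
    """Feature vector for a protein pair: both frequency vectors, concatenated.

    Concatenation is an assumption for illustration; the paper's actual
    combination of the two proteins' frequencies is not given in the abstract.
    """
    fa, fb = aa_frequencies(seq_a), aa_frequencies(seq_b)
    return [fa[aa] for aa in AMINO_ACIDS] + [fb[aa] for aa in AMINO_ACIDS]
```

Because the features depend only on sequence composition, the same code applies to proteins from any species, which is consistent with the cross-species use described in the abstract.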

Classification of masses in mammographic image using wavelet domain features and polynomial classifier

Abstract:

Breast cancer is the most common cancer among women. In CAD systems, several studies have investigated the wavelet transform as a multiresolution analysis tool for texture analysis, with the resulting features serving as inputs to a classifier. For classification, the polynomial classifier has been used because it provides a single model for the optimal separation of classes, which is taken as the solution of the problem. In this paper, a system is proposed for texture analysis and classification of lesions in mammographic images. Multiresolution analysis features were extracted from the region of interest of a given image. These features were computed using three different wavelet functions: Daubechies 8, Symlet 8 and bi-orthogonal 3.7. For classification, we used the polynomial classification algorithm to label the mammogram images as normal or abnormal. We also made a comparison with other artificial intelligence algorithms (Decision Tree, SVM, K-NN). A Receiver Operating Characteristic (ROC) curve is used to evaluate the performance of the proposed system. Our system was evaluated using 360 digitized mammograms from the DDSM database, and the results show that the algorithm achieves an area under the ROC curve (Az) of 0.98 ± 0.03. The polynomial classifier performed better than the other classification algorithms. © 2013 Elsevier Ltd. All rights reserved.
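The area under the ROC curve reported here can be read as the probability that a randomly chosen abnormal image receives a higher score than a randomly chosen normal one. The Python sketch below computes it by direct pair counting. This is a generic illustration of the metric, not the authors' evaluation code, and the function name is hypothetical.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    correct = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos_scores) * len(neg_scores))
```

Pair counting costs O(n·m) comparisons, which is fine for a few hundred images; a rank-based formulation would be preferable for much larger test sets.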

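Análise da suscetibilidade a movimentos de massa no município de Peruíbe - SP, com o apoio de um sistema integrador de informações georeferenciadas [Analysis of susceptibility to mass movements in the municipality of Peruíbe, SP, supported by a system for integrating georeferenced information]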

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Classificação de cobertura do solo utilizando árvores de decisão e sensoriamento remoto [Land cover classification using decision trees and remote sensing]

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
