43 results for Classification Rules
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical value for the weed-crop competitiveness, using as input categorical variables for the total density of weeds and the corresponding proportions of narrow-leaved and broad-leaved weeds. The inferred weed-crop competitiveness, along with three other categorical variables extracted from estimated maps of weed seed production and weed coverage, is then used as input for a second Bayesian network classifier that infers categorical values for the risk of infestation. Weed biomass and yield loss data samples are used to learn, in a supervised fashion, the probabilistic relationships among the nodes of the first and second Bayesian classifiers, respectively. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naive Bayes classifier. The inference system focuses on knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed. (C) 2009 Elsevier Ltd. All rights reserved.
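One way to picture the translation of a Bayesian classifier into classification rules is to enumerate every combination of categorical inputs and record the most probable class for each. The sketch below does this for a naive Bayes classifier. The function name `extract_rules`, the variable names, and all probabilities are hypothetical and chosen only for illustration; they are not taken from the paper, whose extraction procedure may differ.

```python
from itertools import product

def extract_rules(priors, likelihoods, domains):
    """Enumerate every input combination and emit the most probable class as a rule.

    priors:      {class: P(class)}
    likelihoods: {feature: {class: {value: P(value | class)}}}
    domains:     {feature: [possible values]}
    """
    features = list(domains)
    rules = []
    for combo in product(*(domains[f] for f in features)):
        scores = {}
        for cls, prior in priors.items():
            score = prior
            for feature, value in zip(features, combo):
                score *= likelihoods[feature][cls][value]
            scores[cls] = score
        total = sum(scores.values())
        best = max(scores, key=scores.get)
        # Store the condition, the winning class, and its normalized posterior.
        rules.append((dict(zip(features, combo)), best, scores[best] / total))
    return rules

# Hypothetical numbers, for illustration only (not taken from the paper).
priors = {"low": 0.6, "high": 0.4}
likelihoods = {
    "density": {"low": {"sparse": 0.8, "dense": 0.2},
                "high": {"sparse": 0.3, "dense": 0.7}},
    "broad_leaved": {"low": {"few": 0.7, "many": 0.3},
                     "high": {"few": 0.4, "many": 0.6}},
}
domains = {"density": ["sparse", "dense"], "broad_leaved": ["few", "many"]}

rules = extract_rules(priors, likelihoods, domains)
for condition, cls, posterior in rules:
    clause = " AND ".join(f"{k} = {v}" for k, v in condition.items())
    print(f"IF {clause} THEN competitiveness = {cls} (p = {posterior:.2f})")
```

Exhaustive enumeration is only practical when the number of categorical inputs is small, as it is in this toy setting.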
Abstract:
There is an increasing interest in the application of Evolutionary Algorithms (EAs) to induce classification rules. This hybrid approach can benefit areas where classical methods for rule induction have not been very successful. One example is the induction of classification rules in imbalanced domains. Imbalanced data occur when one or more classes heavily outnumber other classes. Frequently, classical machine learning (ML) classifiers are not able to learn in the presence of imbalanced data sets, inducing classification models that always predict the most numerous classes. In this work, we propose a novel hybrid approach to deal with this problem. We create several balanced data sets with all minority class cases and a random sample of majority class cases. These balanced data sets are fed to classical ML systems that produce rule sets. The rule sets are combined creating a pool of rules and an EA is used to build a classifier from this pool of rules. This hybrid approach has some advantages over undersampling, since it reduces the amount of discarded information, and some advantages over oversampling, since it avoids overfitting. The proposed approach was experimentally analysed and the experimental results show an improvement in the classification performance measured as the area under the receiver operating characteristics (ROC) curve.
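The data preparation step described above (several balanced data sets, each holding all minority cases plus a random sample of majority cases) can be sketched as follows. The function name `make_balanced_subsets` is hypothetical, and drawing exactly as many majority cases as there are minority cases is an assumption; the abstract does not state the sample size. Rule induction and the evolutionary combination step are not shown.

```python
import random

def make_balanced_subsets(minority, majority, n_subsets, seed=0):
    """Build n_subsets data sets, each with all minority cases and an
    equally sized random sample (without replacement) of majority cases."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        # Assumes len(majority) >= len(minority); rng.sample raises otherwise.
        sampled_majority = rng.sample(majority, len(minority))
        subsets.append(list(minority) + sampled_majority)
    return subsets

# Toy data: 5 minority cases and 50 majority cases, tagged by class.
minority = [("m", i) for i in range(5)]
majority = [("M", i) for i in range(50)]
subsets = make_balanced_subsets(minority, majority, n_subsets=4)
```

Because each subset draws a different majority sample, more of the majority class is seen overall than with a single undersampled data set.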
Abstract:
The reduced availability of native timber species and its effects on the economy, together with the strengthening of environmental preservation concepts, has created the need to develop viable alternatives for the rational use of reforestation species. One option is the visual classification of timber pieces. Authors working in this line of research verified the suitability of the visual classification rules of the Southern Pine Inspection Bureau (SPIB) of the USA for Brazilian Pinus timber and presented a proposal to standardize the visual classification process for this timber. In this classification, the aspects with the greatest influence are the presence of knots, the slope of grain relative to the axis of the piece, and the density of growth rings. This research therefore presents an experimental study, consisting of the visual classification and determination of the tensile strength of 85 pieces of Pinus spp, and a theoretical study, which proposed an equation to determine the mean tensile strength of structural pieces as a function of the visual classification. This work made it possible to observe the influence of knots and growth rings on the tensile strength of the pieces analyzed.
Abstract:
Saving our science from ourselves: the plight of biological classification. Biological classification (nomenclature, taxonomy, and systematics) is being sold short. The desire for new technologies and for faster and cheaper taxonomic descriptions, identifications, and revisions is symptomatic of a lack of appreciation and understanding of classification. The problem of gadget-driven science, a lack of best practice, and the inability to accept classification as a descriptive and empirical science are discussed. The worst-case scenario is a future in which classifications are purely artificial and uninformative.
Abstract:
Due to the imprecise nature of biological experiments, biological data is often characterized by the presence of redundant and noisy data. This may be due to errors that occurred during data collection, such as contamination of laboratory samples. This is the case for gene expression data, where the equipment and tools currently used frequently produce noisy biological data. Machine Learning algorithms have been successfully used in gene expression data analysis. Although many Machine Learning algorithms can deal with noise, detecting and removing noisy instances from the training data set can help the induction of the target hypothesis. This paper evaluates the use of distance-based pre-processing techniques for noise detection in gene expression data classification problems. The evaluation analyzes the effectiveness of the investigated techniques in removing noisy data, measured by the accuracy obtained by different Machine Learning classifiers over the pre-processed data.
Abstract:
PURPOSE: The main goal of this study was to develop and compare two different techniques for classification of specific types of corneal shapes when Zernike coefficients are used as inputs. A feed-forward artificial Neural Network (NN) and discriminant analysis (DA) techniques were used. METHODS: The inputs both for the NN and DA were the first 15 standard Zernike coefficients for 80 previously classified corneal elevation data files from an Eyesys System 2000 Videokeratograph (VK), installed at the Departamento de Oftalmologia of the Escola Paulista de Medicina, São Paulo. The NN had 5 output neurons which were associated with 5 typical corneal shapes: keratoconus, with-the-rule astigmatism, against-the-rule astigmatism, "regular" or "normal" shape and post-PRK. RESULTS: The NN and DA responses were statistically analyzed in terms of precision ([true positive+true negative]/total number of cases). Mean overall results for all cases for the NN and DA techniques were, respectively, 94% and 84.8%. CONCLUSION: Although we used a relatively small database, results obtained in the present study indicate that Zernike polynomials as descriptors of corneal shape may be a reliable parameter as input data for diagnostic automation of VK maps, using either NN or DA.
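The measure quoted above, (true positives + true negatives) / total number of cases, can be computed per corneal shape by treating that shape as the positive class and all others as negative. This one-vs-rest reading is an assumption, as are the function name `class_score` and the toy labels; the sketch does not use the study's data.

```python
def class_score(true_labels, predicted_labels, target):
    """Return (TP + TN) / total, treating `target` as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    tn = sum(1 for t, p in pairs if t != target and p != target)
    return (tp + tn) / len(pairs)

# Hypothetical labels for five cases (illustration only).
true_labels = ["keratoconus", "normal", "normal", "post-PRK", "keratoconus"]
predicted = ["keratoconus", "normal", "keratoconus", "post-PRK", "keratoconus"]

score = class_score(true_labels, predicted, "keratoconus")
print(f"keratoconus: {score:.2f}")
```

In this toy example, two true positives and two true negatives out of five cases give a score of 0.80 for keratoconus.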
Abstract:
We present a molecular phylogenetic analysis of caenophidian (advanced) snakes using sequences from two mitochondrial genes (12S and 16S rRNA) and one nuclear (c-mos) gene (1681 total base pairs), and with 131 terminal taxa sampled from throughout all major caenophidian lineages but focussing on Neotropical xenodontines. Direct optimization parsimony analysis resulted in a well-resolved phylogenetic tree, which corroborates some clades identified in previous analyses and suggests new hypotheses for the composition and relationships of others. The major salient points of our analysis are: (1) placement of Acrochordus, Xenodermatids, and Pareatids as successive outgroups to all remaining caenophidians (including viperids, elapids, atractaspidids, and all other "colubrid" groups); (2) within the latter group, viperids and homalopsids are successive sister clades to all remaining snakes; (3) the following monophyletic clades within crown group caenophidians: Afro-Asian psammophiids (including Mimophis from Madagascar), Elapidae (including hydrophiines but excluding Homoroselaps), Pseudoxyrhophiinae, Colubrinae, Natricinae, Dipsadinae, and Xenodontinae. Homoroselaps is associated with atractaspidids. Our analysis suggests some taxonomic changes within xenodontines, including new taxonomy for Alsophis elegans, Liophis amarali, and further taxonomic changes within Xenodontini and the West Indian radiation of xenodontines. Based on our molecular analysis, we present a revised classification for caenophidians and provide morphological diagnoses for many of the included clades; we also highlight groups where much more work is needed. We name as new two higher taxonomic clades within Caenophidia, one new subfamily within Dipsadidae, and, within Xenodontinae, five new tribes, six new genera, and two resurrected genera. We synonymize Xenoxybelis and Pseudablabes with Philodryas; Erythrolamprus with Liophis; and Lystrophis and Waglerophis with Xenodon.
Abstract:
This paper describes a new food classification that groups foodstuffs according to the extent and purpose of the industrial processing applied to them. Three main groups are defined: unprocessed or minimally processed foods (group 1), processed culinary and food industry ingredients (group 2), and ultra-processed food products (group 3). The use of this classification is illustrated by applying it to data collected in the Brazilian Household Budget Survey, which was conducted in 2002/2003 on a probabilistic sample of 48,470 Brazilian households. The average daily food availability was 1,792 kcal/person, of which 42.5% came from group 1 (mostly rice, beans, meat, and milk), 37.5% from group 2 (mostly vegetable oils, sugar, and flours), and 20% from group 3 (mostly breads, biscuits, sweets, soft drinks, and sausages). The share of group 3 foods increased with income and represented almost one third of all calories in higher income households. The impact of the replacement of group 1 foods and group 2 ingredients by group 3 products on the overall quality of the diet, eating patterns, and health is discussed.
Abstract:
Mortality analysis has been widely used in public health, and the underlying cause of death is a much-studied variable. In most countries, physicians are required to complete the death certificate (DC), informing the authorities of the occurrence of the event, the characteristics of the deceased, and the causes of death. When two or more diagnoses appear in the statement of causes of death, the question of selecting the underlying cause arises. The standards for physicians completing the causes of death on DCs, and the rules for selecting the underlying cause when more than one cause is stated, are defined by the WHO with a view to international comparability. The objective of this work is to evaluate whether applying the International Classification Rules for the underlying cause allows the real underlying cause to be selected, even if it was stated incorrectly by the physician. The material comes from the "Estudo sobre a mortalidade de mulheres em idade fértil" (Study on the mortality of women of reproductive age), in which 1,315 cases met the inclusion criteria. For each death, an investigation was carried out through household interviews and consultation of hospital records and similar documents. Trained and calibrated physicians completed a new DC after reading all the information and selected the "true" underlying cause of death. This was compared with the underlying cause from the original DC, obtained by means of the International Rules. Among the DCs, 1,192 (90.6%) agreed with the true underlying cause obtained after the investigation. It was concluded that the International Rules allow the real underlying cause to be selected, even when the physician completes the DC inadequately.
Abstract:
This work proposes a new approach using a committee machine of artificial neural networks to classify masses found in mammograms as benign or malignant. Three shape factors, three edge-sharpness measures, and 14 texture measures are used for the classification of 20 regions of interest (ROIs) related to malignant tumors and 37 ROIs related to benign masses. A group of multilayer perceptrons (MLPs) is employed as a committee machine of neural network classifiers. The classification results are reached by combining the responses of the individual classifiers. Experiments involving changes in the learning algorithm of the committee machine are conducted. The classification accuracy is evaluated using the area A(z) under the receiver operating characteristics (ROC) curve. The A(z) result for the committee machine is compared with the A(z) results obtained using MLPs and single-layer perceptrons (SLPs), as well as a linear discriminant analysis (LDA) classifier. Tests are carried out using the Student's t-distribution. The committee machine classifier outperforms the MLP, SLP, and LDA classifiers in the following cases: with the shape measure of spiculation index, the A(z) values of the four methods are, in order, 0.93, 0.84, 0.75, and 0.76; and with the edge-sharpness measure of acutance, the values are 0.79, 0.70, 0.69, and 0.74. Although the features with which improvement is obtained with the committee machines are not the same as those that provided the maximal value of A(z) (A(z) = 0.99 with some shape features, with or without the committee machine), they correspond to features that are not critically dependent on the accuracy of the boundaries of the masses, which is an important result. (c) 2008 SPIE and IS&T.
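A minimal sketch of the two ideas in this abstract: combining the responses of several classifiers, and scoring the result by the area under the ROC curve. Averaging the member outputs is an assumed combination rule (the abstract only says the responses are combined), and the rank-based empirical area used here may differ from the estimator behind the reported A(z) values. The function names and all scores are hypothetical.

```python
def committee_score(member_outputs):
    """Average the per-case outputs of several classifiers (one list per member)."""
    n_members = len(member_outputs)
    return [sum(case) / n_members for case in zip(*member_outputs)]

def roc_auc(scores, labels):
    """Empirical area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive case scores higher; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outputs of three classifiers on five ROIs (1 = malignant).
labels = [1, 1, 0, 0, 0]
member_outputs = [
    [0.9, 0.4, 0.5, 0.2, 0.1],
    [0.7, 0.6, 0.3, 0.4, 0.2],
    [0.8, 0.7, 0.6, 0.1, 0.3],
]

combined = committee_score(member_outputs)
print("first member alone:", roc_auc(member_outputs[0], labels))
print("committee average: ", roc_auc(combined, labels))
```

The toy scores are arranged so that the first member misranks one pair and the average does not; this illustrates the mechanics only and says nothing about the paper's results.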
Abstract:
Aims. In this work, we describe the pipeline for the fast supervised classification of light curves observed by the CoRoT exoplanet CCDs. We present the classification results obtained for the first four measured fields, which represent one year of in-orbit operation. Methods. The basis of the adopted supervised classification methodology has been described in detail in a previous paper, as has its application to the OGLE database. Here, we present the modifications of the algorithms and of the training set to optimize the performance when applied to the CoRoT data. Results. Classification results are presented for the observed fields IRa01, SRc01, LRc01, and LRa01 of the CoRoT mission. Statistics on the number of variables and the number of objects per class are given, and typical light curves of high-probability candidates are shown. We also report on new stellar variability types discovered in the CoRoT data. The full classification results are publicly available.
Abstract:
We use QCD sum rules (QCDSR) to calculate the width of the radiative decay of the meson $X(3872)$, assumed to be a mixture between charmonium and exotic molecular $[c\bar{q}][q\bar{c}]$ states with $J^{PC} = 1^{++}$. We find that in a small range for the values of the mixing angle, $5^\circ \le \theta \le 13^\circ$, we get the branching ratio $\Gamma(X \to J/\psi\,\gamma)/\Gamma(X \to J/\psi\,\pi^+\pi^-) = 0.19 \pm 0.13$, which is in agreement with the experimental value. This result is compatible with the analysis of the mass and decay width of the mode $J/\psi(n\pi)$ performed in the same approach.
Abstract:
We evaluate the mass of the $B_{s0}$ scalar meson and the coupling constant in the $B_{s0}BK$ vertex in the framework of QCD sum rules. We consider the $B_{s0}$ as a tetraquark state to evaluate its mass. We get $m_{B_{s0}} = (5.85 \pm 0.13)$ GeV, which is in agreement, considering the uncertainties, with predictions supposing it to be a $b\bar{s}$ state or a $B\bar{K}$ bound state with $J^{P} = 0^{+}$. To evaluate the $g_{B_{s0}BK}$ coupling, we use the three-point correlation functions of the vertex, considering $B_{s0}$ as a normal $b\bar{s}$ state. The obtained coupling constant is $g_{B_{s0}BK} = (16.3 \pm 3.2)$ GeV. This number is in agreement with the light-cone QCD sum rules calculation. We have also compared the decay width of the $B_{s0} \to BK$ process considering the $B_{s0}$ to be a $b\bar{s}$ state and a $BK$ molecular state. The width obtained for the $BK$ molecular state is twice as large as the width obtained for the $b\bar{s}$ state. Therefore, we conclude that with knowledge of the mass and the decay width of the $B_{s0}$ meson, one can discriminate between the different theoretical proposals for its structure.
Abstract:
We use QCD sum rules to test the nature of the meson $X(3872)$, assumed to be a mixture between charmonium and exotic molecular $[c\bar{q}][q\bar{c}]$ states with $J^{PC} = 1^{++}$. We find that there is only a small range for the values of the mixing angle $\theta$ that can provide simultaneously good agreement with the experimental value of the mass and the decay width, and this range is $5^\circ \le \theta \le 13^\circ$. In this range we get $m_X = (3.77 \pm 0.18)$ GeV and $\Gamma(X \to J/\psi\,\pi^+\pi^-) = (9.3 \pm 6.9)$ MeV, which are compatible, within the errors, with the experimental values. We therefore conclude that the $X(3872)$ is approximately 97% a charmonium state with a 3% admixture of $\sim 88\%$ $D^0 D^{*0}$ molecule and $\sim 12\%$ $D^+ D^{*-}$ molecule.
Abstract:
We investigate the widths of the recently observed charmonium-like resonances $X(3872)$, $Z(4430)$, and $Z_2(4250)$ using QCD sum rules. Extending previous analyses that regard these states as diquark-antiquark states or molecules of D mesons, we introduce the Breit-Wigner function in the pole term. We find that introducing the width increases the mass in the small Borel window region. Using the operator-product expansion up to dimension 8, we find that the sum rules based on interpolating currents with molecular components give a stable Borel curve from which both the masses and widths of these resonances can be well obtained. Thus the QCD sum rule approach strongly favors the molecular description of these states.