977 results for Classification tree


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Cultivating temperate-climate fruit plants in tropical or subtropical regions can be a good income alternative for producers. However, because little information exists about cultivating these plants, producers use techniques imported from other producing areas, or even a combination of practices used for other fruit plants, notably foliar spray fertilization with micronutrients that lacks an appropriate scientific basis. In this context, the objective of this study was to verify the effect of foliar spray fertilization with B and Zn on the productivity and fruit quality of Japanese pear trees. The experiment was conducted from 2004 to 2005 in Ilha Solteira, in northwestern São Paulo State, Brazil. According to the Köppen classification, the climate is tropical wet and dry (Aw). The 'Okusankichi' cultivar, grafted on Pyrus communis L. rootstock, was used, with doses of 110 g ha⁻¹ of B and 250 g ha⁻¹ of Zn in each application. The treatments were: T1. water; T2. boric acid; T3. zinc sulfate; T4. T2 + T3; T5. boric acid + urea + citric acid + EDTA; T6. zinc sulfate + urea + citric acid + EDTA; T7. T5 + T6; T8. boric acid + urea + citric acid + EDTA + sodium molybdate + sulfur + calcium chloride; T9. zinc sulfate + urea + citric acid + EDTA + Fe sulfate + Mn sulfate + Mg sulfate; and T10. T8 + T9. A randomized block design was used and the means were compared by the Tukey test. In the first crop, the mixture of boric acid with chelating agents was efficient in supplying B to the plants, and zinc sulfate plus chelating agents was efficient in increasing Zn leaf content. However, productivity and fruit quality were not influenced by the foliar spray of B and Zn.
In the second crop, the leaf content of B and Zn and the productivity were not influenced by the foliar spray; boric acid and zinc sulfate, with or without chelating agents, increased the total soluble solids content, and boric acid, with or without chelating agents, increased the total titratable acidity.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Forty-two White Leghorn laying hens of the commercial Cuban hybrid L-33 were used for eight weeks during the laying peak (36 to 43 weeks of age) to assess the substitution of corn by cassava root meal (Manihot esculenta Crantz) and of crude soybean oil by crude African palm tree oil (Elaeis guineensis J.) in the diets of laying hens. Analysis of variance was conducted according to a simple classification design, with three treatments and 14 repetitions (a cage with a hen). The treatments consisted of three diets (1- corn meal + soybean oil; 2- 25 % cassava meal + African palm tree oil; 3- 53 % cassava meal + African palm tree oil), with 15.71 % CP, 3.83 % Ca and 0.36 % available P. Viability was 100 % in all treatments. No differences were found for laying (92.21, 92.09 and 91.59 %), which surpassed the potential of this hybrid during the laying peak (90 %), conversion (118 g feedstuff/egg in the three treatments), egg mass produced (3066, 3114 and 3071 g/bird) and mass conversion (1.99, 1.95 and 1.98 feed consumed/egg mass). The pigmentation of the egg yolk decreased as the level of cassava meal in the diets increased (6, 4 and 3 on Roche's scale), as did the cost of the feed consumed in 56 d per hen (2.56, 2.15 and 1.83 USD/bird). The study determined that corn meal can be totally replaced by cassava meal, and soybean oil by African palm tree oil, in the diets of laying hens during the laying peak, with a positive economic effect and without damaging the productive performance of the birds.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Breast cancer is the most common cancer among women. In computer-aided diagnosis (CAD) systems, several studies have investigated the wavelet transform as a multiresolution analysis tool for texture analysis, whose outputs can be used as inputs to a classifier. For classification, the polynomial classifier has been used because it provides a single model for the optimal separation of classes and treats that model as the solution of the problem. In this paper, a system is proposed for texture analysis and classification of lesions in mammographic images. Multiresolution analysis features were extracted from the region of interest of a given image. These features were computed based on three different wavelet functions: Daubechies 8, Symlet 8 and bi-orthogonal 3.7. For classification, we used the polynomial classification algorithm to label the mammogram images as normal or abnormal. We also made a comparison with other artificial intelligence algorithms (Decision Tree, SVM, K-NN). A Receiver Operating Characteristic (ROC) curve is used to evaluate the performance of the proposed system. Our system is evaluated using 360 digitized mammograms from the DDSM database, and the results show that the algorithm has an area under the ROC curve (Az) of 0.98 ± 0.03. The polynomial classifier performed better than the other classification algorithms. © 2013 Elsevier Ltd. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The identification of tree species is a key step for sustainable management plans of forest resources, as well as for several other applications that are based on such surveys. However, the currently available techniques depend on the presence of tree structures, such as flowers, fruits, and leaves, which limits the identification process to certain periods of the year. Therefore, this article introduces a study on the application of statistical parameters for texture classification of tree trunk images. For that, 540 samples from five Brazilian native deciduous species were acquired, and measures of entropy, uniformity, smoothness, asymmetry (third moment), mean, and standard deviation were obtained from the presented textures. Using a decision tree, a biometric species identification system was constructed that achieved an average precision of 0.84 for species classification, with 0.83 accuracy and 0.79 agreement. Thus, the use of texture in trunk images can represent an important advance in tree identification, since it can overcome the limitations of the current techniques.
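The six texture measures named in this abstract can be sketched as first-order statistics of a gray-level histogram. This is a minimal illustration, not the authors' pipeline: it assumes the common textbook definitions (smoothness as 1 - 1/(1 + normalized variance), uniformity as the sum of squared probabilities, entropy in bits), and the function name `texture_statistics` and the `levels` parameter are hypothetical.

```python
import math
from collections import Counter


def texture_statistics(pixels, levels=256):
    """First-order texture descriptors from a flat list of gray levels.

    Assumed definitions (not taken from the abstract):
    smoothness = 1 - 1 / (1 + variance / (levels - 1)^2),
    uniformity = sum of p^2, entropy = -sum of p * log2(p).
    """
    n = len(pixels)
    # Normalized histogram: probability of each gray level that occurs.
    probs = {g: c / n for g, c in Counter(pixels).items()}
    mean = sum(g * p for g, p in probs.items())
    variance = sum((g - mean) ** 2 * p for g, p in probs.items())
    third_moment = sum((g - mean) ** 3 * p for g, p in probs.items())
    normalized_variance = variance / (levels - 1) ** 2
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "smoothness": 1 - 1 / (1 + normalized_variance),
        "third_moment": third_moment,
        "uniformity": sum(p * p for p in probs.values()),
        "entropy": -sum(p * math.log2(p) for p in probs.values()),
    }


if __name__ == "__main__":
    # A patch that is half black and half white.
    print(texture_statistics([0] * 8 + [255] * 8))
```

The resulting six-value vector per image is the kind of feature input a decision tree classifier would then be trained on.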

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternative heuristic to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: Smear negative pulmonary tuberculosis (SNPT) accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods: The study enrolled 551 patients with clinical-radiological suspicion of SNPT in Rio de Janeiro, Brazil. The original data were divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operating characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results: It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion: The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources.
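For readers unfamiliar with the two evaluation measures reported in this abstract, the sketch below shows how sensitivity and specificity follow from binary predictions on a validation sample. The function name and the toy labels are hypothetical and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = disease present.

    sensitivity = TP / (TP + FN): share of true cases the model flags.
    specificity = TN / (TN + FP): share of non-cases the model clears.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Illustrative labels only: 3 true cases, 4 non-cases.
    truth = [1, 1, 1, 0, 0, 0, 0]
    predicted = [1, 1, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(truth, predicted))
```

A screening tool typically trades one measure against the other by moving the decision threshold of the score, which is why the abstract reports ranges for both.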

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Hierarchical multi-label classification is a complex classification task where the classes involved in the problem are hierarchically structured and each example may simultaneously belong to more than one class in each hierarchical level. In this paper, we extend our previous works, where we investigated a new local-based classification method that incrementally trains a multi-layer perceptron for each level of the classification hierarchy. Predictions made by a neural network in a given level are used as inputs to the neural network responsible for the prediction in the next level. We compare the proposed method with one state-of-the-art decision-tree induction method and two decision-tree induction methods, using several hierarchical multi-label classification datasets. We perform a thorough experimental analysis, showing that our method obtains competitive results to a robust global method regarding both precision and recall evaluation measures.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Machine learning comprises a series of techniques for automatic extraction of meaningful information from large collections of noisy data. In many real-world applications, data is naturally represented in structured form. Since traditional methods in machine learning deal with vectorial information, they require an a priori form of preprocessing. Among all the learning techniques for dealing with structured data, kernel methods are recognized to have a strong theoretical background and to be effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree-structured data, two issues become relevant: kernels for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too little information for making correct predictions on unseen data. In fact, it tends to produce a discriminating function that behaves like the nearest neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernels, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required in both the learning and classification phases. Such complexity can sometimes prevent the application of kernels in scenarios involving large amounts of data. This thesis proposes three contributions for resolving the above issues of kernels for trees. A first contribution aims at creating kernel functions which adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions.
Specifically, we propose to encode the input trees by an algorithm able to project the data onto a lower dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. A convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution is devoted to reducing the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Directed Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements.
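As a rough illustration of the subtree kernel mentioned in this abstract, the sketch below counts pairs of identical complete subtrees (a node together with all of its descendants) shared by two ordered, labelled trees. It is a naive version under stated assumptions, not the thesis's method: trees are encoded as nested tuples `(label, child, child, ...)`, and the function names are hypothetical.

```python
from collections import Counter


def subtree_counts(tree):
    """Count every complete subtree of an ordered tree.

    A tree is a nested tuple (label, child1, child2, ...); a leaf is (label,).
    Tuples are hashable, so each subtree serves as its own dictionary key.
    """
    counts = Counter()

    def visit(node):
        counts[node] += 1
        for child in node[1:]:
            visit(child)

    visit(tree)
    return counts


def subtree_kernel(tree_a, tree_b):
    """Number of pairs of identical complete subtrees shared by the two trees."""
    counts_a = subtree_counts(tree_a)
    counts_b = subtree_counts(tree_b)
    return sum(c * counts_b[s] for s, c in counts_a.items() if s in counts_b)


if __name__ == "__main__":
    t1 = ("a", ("b",), ("c",))
    t2 = ("d", ("a", ("b",), ("c",)), ("b",))
    t3 = ("x", ("y",), ("z",))  # same shape as t1, disjoint labels
    print(subtree_kernel(t1, t2))  # shared: a(b,c) once, b twice, c once
    print(subtree_kernel(t1, t3))  # no shared labels, so the kernel is zero
```

The second call shows the sparsity problem in miniature: two trees with the same shape but labels drawn from a large vocabulary get a similarity of zero, because only exact matches are counted.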

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Current methods to characterize mesenchymal stem cells (MSCs) are limited to CD marker expression, plastic adherence and their ability to differentiate into adipogenic, osteogenic and chondrogenic precursors. It seems evident that stem cells undergoing differentiation should differ in many aspects, such as morphology and possibly also behaviour; however, such a correlation has not yet been exploited for fate prediction of MSCs. Primary human MSCs from bone marrow were expanded and pelleted to form high-density cultures and were then randomly divided into four groups to differentiate into adipogenic, osteogenic, chondrogenic and myogenic progenitor cells. The cells were expanded as heterogeneous and tracked with time-lapse microscopy to record cell shape, using phase-contrast microscopy. The cells were segmented using a custom-made image-processing pipeline. Seven morphological features were extracted for each of the segmented cells. Statistical analysis was performed on the seven-dimensional feature vectors, using a tree-like classification method. Differentiation of cells was monitored with key marker genes and histology. Cells in differentiation media were expressing the key genes for each of the three pathways after 21 days, i.e. adipogenic, osteogenic and chondrogenic, which was also confirmed by histological staining. Time-lapse microscopy data were obtained and contained new evidence that two cell shape features, eccentricity and filopodia (= 'fingers'), are highly informative for distinguishing myogenic differentiation from all others. However, no robust classifiers could be identified for the other cell differentiation paths. The results suggest that non-invasive automated time-lapse microscopy could potentially be used to predict the stem cell fate of hMSCs for clinical application, based on morphology at earlier time-points.
The classification is challenged by cell density, proliferation and possible unknown donor-specific factors, which affect the performance of morphology-based approaches. Copyright © 2012 John Wiley & Sons, Ltd.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

To deliver sample estimates with the probability foundation needed to generalize from the sample data subset to the whole target population being sampled, probability sampling strategies are required to satisfy three necessary but not sufficient conditions: (i) All inclusion probabilities must be greater than zero in the target population to be sampled. If some sampling units have an inclusion probability of zero, then a map accuracy assessment does not represent the entire target region depicted in the map to be assessed. (ii) The inclusion probabilities must be: (a) knowable for nonsampled units and (b) known for those units selected in the sample: since the inclusion probability determines the weight attached to each sampling unit in the accuracy estimation formulas, if the inclusion probabilities are unknown, so are the estimation weights. This original work presents a novel (to the best of these authors' knowledge, the first) probability sampling protocol for quality assessment and comparison of thematic maps generated from spaceborne/airborne Very High Resolution (VHR) images, where: (I) an original Categorical Variable Pair Similarity Index (CVPSI, proposed in two different formulations) is estimated as a fuzzy degree of match between a reference and a test semantic vocabulary, which may not coincide, and (II) both symbolic pixel-based thematic quality indicators (TQIs) and sub-symbolic object-based spatial quality indicators (SQIs) are estimated with a degree of uncertainty in measurement in compliance with the well-known Quality Assurance Framework for Earth Observation (QA4EO) guidelines. Like a decision tree, any protocol (guidelines for best practice) comprises a set of rules, equivalent to structural knowledge, and an order of presentation of the rule set, known as procedural knowledge. The combination of these two levels of knowledge makes an original protocol worth more than the sum of its parts.
The several degrees of novelty of the proposed probability sampling protocol are highlighted in this paper, at the levels of understanding of both structural and procedural knowledge, in comparison with related multi-disciplinary works selected from the existing literature. In the experimental session the proposed protocol is tested for accuracy validation of preliminary classification maps automatically generated by the Satellite Image Automatic Mapper™ (SIAM™) software product from two WorldView-2 images and one QuickBird-2 image provided by DigitalGlobe for testing purposes. In these experiments, collected TQIs and SQIs are statistically valid, statistically significant, consistent across maps and in agreement with theoretical expectations, visual (qualitative) evidence and quantitative quality indexes of operativeness (OQIs) claimed for SIAM™ by related papers. As a subsidiary conclusion, the statistically consistent and statistically significant accuracy validation of the SIAM™ pre-classification maps proposed in this contribution, together with OQIs claimed for SIAM™ by related works, makes the operational (automatic, accurate, near real-time, robust, scalable) SIAM™ software product eligible for opening up new inter-disciplinary research and market opportunities in accordance with the visionary goal of the Global Earth Observation System of Systems (GEOSS) initiative and the QA4EO international guidelines.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Macromolecular transport systems in bacteria currently are classified by function and sequence comparisons into five basic types. In this classification system, type II and type IV secretion systems both possess members of a superfamily of genes for putative NTP hydrolase (NTPase) proteins that are strikingly similar in structure, function, and sequence. These include VirB11, TrbB, TraG, GspE, PilB, PilT, and ComG1. The predicted protein product of tadA, a recently discovered gene required for tenacious adherence of Actinobacillus actinomycetemcomitans, also has significant sequence similarity to members of this superfamily and to several unclassified and uncharacterized gene products of both Archaea and Bacteria. To understand the relationship of tadA and tadA-like genes to those encoding the putative NTPases of type II/IV secretion, we used a phylogenetic approach to obtain a genealogy of 148 NTPase genes and reconstruct a scenario of gene superfamily evolution. In this phylogeny, clear distinctions can be made between type II and type IV families and their constituent subfamilies. In addition, the subgroup containing tadA constitutes a novel and extremely widespread subfamily of the family encompassing all putative NTPases of type IV secretion systems. We report diagnostic amino acid residue positions for each major monophyletic family and subfamily in the phylogenetic tree, and we propose an easy method for precisely classifying and naming putative NTPase genes based on phylogeny. This molecular key-based method can be applied to other gene superfamilies and represents a valuable tool for genome analysis.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are 2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
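To make the idea of an entropy built from co-occurrences concrete, the sketch below handles only the simplest case, order 2: it tallies the unordered pairs of marks found within a given distance of each other and takes the Shannon entropy of that distribution. This is an assumed simplification for illustration; the paper's actual measure and its higher-order versions are not specified in the abstract, and the function name and `radius` parameter are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_entropy(points, radius):
    """Shannon entropy (bits) of mark pairs co-occurring within `radius`.

    points: list of (x, y, mark) tuples from a marked point pattern.
    Pairs are unordered, so ("a", "b") and ("b", "a") are the same category.
    Returns 0.0 when no pair of points lies within the radius.
    """
    pair_counts = Counter()
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            pair_counts[tuple(sorted((m1, m2)))] += 1
    total = sum(pair_counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in pair_counts.values())


if __name__ == "__main__":
    pattern = [(0, 0, "a"), (1, 0, "b"), (0, 1, "a"), (10, 10, "b")]
    for r in (0.5, 1.2, 1.5):
        print(r, cooccurrence_entropy(pattern, r))
```

Evaluating the entropy over a range of radii gives a simple exploratory profile of how mixed the marks are at each spatial scale.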

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aims of the project were twofold: 1) To investigate classification procedures for remotely sensed digital data, in order to develop modifications to existing algorithms and propose novel classification procedures; and 2) To investigate and develop algorithms for contextual enhancement of classified imagery in order to increase classification accuracy. The following classifiers were examined: box, decision tree, minimum distance, maximum likelihood. In addition to these, the following algorithms were developed during the course of the research: deviant distance, look-up table and an automated decision tree classifier using expert systems technology. Clustering techniques for unsupervised classification were also investigated. Contextual enhancements investigated were: mode filters, small area replacement and Wharton's CONAN algorithm. Additionally, methods for noise and edge based declassification and contextual reclassification, non-probabilistic relaxation and relaxation based on Markov chain theory were developed. The advantages of per-field classifiers and Geographical Information Systems were investigated. The conclusions presented suggest suitable combinations of classifier and contextual enhancement, given user accuracy requirements and time constraints. These were then tested for validity using a different data set. A brief examination of the utility of the recommended contextual algorithms for reducing the effects of data noise was also carried out.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper, we present syllable-based duration modelling in the context of a prosody model for Standard Yorùbá (SY) text-to-speech (TTS) synthesis applications. Our prosody model is conceptualised around a modular holistic framework. This framework is implemented using the Relational Tree (R-Tree) techniques. An important feature of our R-Tree framework is its flexibility in that it facilitates the independent implementation of the different dimensions of prosody, i.e. duration, intonation, and intensity, using different techniques and their subsequent integration. We applied the Fuzzy Decision Tree (FDT) technique to model the duration dimension. In order to evaluate the effectiveness of FDT in duration modelling, we have also developed a Classification And Regression Tree (CART) based duration model using the same speech data. Each of these models was integrated into our R-Tree based prosody model. We performed both quantitative (i.e. Root Mean Square Error (RMSE) and Correlation (Corr)) and qualitative (i.e. intelligibility and naturalness) evaluations on the two duration models. The results show that CART models the training data more accurately than FDT. The FDT model, however, shows a better ability to extrapolate from the training data since it achieved a better accuracy for the test data set. Our qualitative evaluation results show that our FDT model produces synthesised speech that is perceived to be more natural than our CART model. In addition, we also observed that the expressiveness of FDT is much better than that of CART. That is because the representation in FDT is not restricted to a set of piece-wise or discrete constant approximation. We, therefore, conclude that the FDT approach is a practical approach for duration modelling in SY TTS applications. © 2006 Elsevier Ltd. All rights reserved.
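The two quantitative measures used in this abstract, RMSE and correlation, can be computed as in the sketch below. It assumes Pearson's correlation coefficient (the abstract says only "Correlation"), and the syllable durations in the example are made-up values in milliseconds, not data from the paper.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_corr(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical syllable durations in ms: measured vs. model output.
    measured = [100.0, 120.0, 140.0]
    modelled = [110.0, 130.0, 150.0]
    print(rmse(measured, modelled), pearson_corr(measured, modelled))
```

The example is chosen so the two measures disagree: a constant 10 ms offset leaves the correlation perfect while the RMSE is nonzero, which is one reason both are reported, and why they are computed separately on training and test data.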