974 resultados para Classification Tree Pruning


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Breast cancer is the most common cancer among women. In CAD systems, several studies have investigated the use of wavelet transform as a multiresolution analysis tool for texture analysis and could be interpreted as inputs to a classifier. In classification, polynomial classifier has been used due to the advantages of providing only one model for optimal separation of classes and to consider this as the solution of the problem. In this paper, a system is proposed for texture analysis and classification of lesions in mammographic images. Multiresolution analysis features were extracted from the region of interest of a given image. These features were computed based on three different wavelet functions, Daubechies 8, Symlet 8 and bi-orthogonal 3.7. For classification, we used the polynomial classification algorithm to define the mammogram images as normal or abnormal. We also made a comparison with other artificial intelligence algorithms (Decision Tree, SVM, K-NN). A Receiver Operating Characteristics (ROC) curve is used to evaluate the performance of the proposed system. Our system is evaluated using 360 digitized mammograms from DDSM database and the result shows that the algorithm has an area under the ROC curve Az of 0.98 ± 0.03. The performance of the polynomial classifier has proved to be better in comparison to other classification algorithms. © 2013 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The identification of tree species is a key step for sustainable management plans of forest resources, as well as for several other applications that are based on such surveys. However, the present available techniques are dependent on the presence of tree structures, such as flowers, fruits, and leaves, limiting the identification process to certain periods of the year Therefore, this article introduces a study on the application of statistical parameters for texture classification of tree trunk images. For that, 540 samples from five Brazilian native deciduous species were acquired and measures of entropy, uniformity, smoothness, asymmetry (third moment), mean, and standard deviation were obtained from the presented textures. Using a decision tree, a biometric species identification system was constructed and resulted to a 0.84 average precision rate for species classification with 0.83accuracy and 0.79 agreement. Thus, it can be considered that the use of texture presented in trunk images can represent an important advance in tree identification, since the limitations of the current techniques can be overcome.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The supply of cold hours needed to the dormancy breaking of shoots is the limiting factor for the cultivation of temperate climate fruit trees in warmer regions. In subtropical conditions, it is necessary to use chemical products to promote uniform sprouting. This research aimed at evaluating the effect of garlic extract and hydrogen cyanamide in sprouting, growth, production and production cycle of the fig tree. The experiment was conducted during the production cycles of 2011/12 and 2012/13. We used plants from the cultivar Roxo de Valinhos. Production pruning was made in the months of July/2011 and July/2012, and the following treatments were applied immediately after it: 2% hydrogen cyanamide and garlic extract in 4%, 8% and 12% doses, and a control treatment. Split plots were used as the experimental design, with five repetitions in blocks; each plot consisted of five treatments with hydrogen cyanamide, garlic extract and control; the subplots consisted of two production cycles. The use of hydrogen cyanamide promoted an anticipation of sprouting and the use of hydrogen cyanamide and garlic extract promoted a concentration of the productive period, when compared to the control. The estimated garlic extract dose that promoted the highest production per plant was 3%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The carbohydrates translocation and consequently growth and production of fig tree (Ficuscarica L.) vary according to the different management on cultivation conditions. The aim of this study was to evaluate the changes in the levels and total carbohydrates accumulation together with growth and “Roxo de Valinhos” fig trees production onimplementation of orchards in initial phase, cultivated with and without irrigation. We adopted a factorial arrangement (2 x 7) with four repetitions distributed in installments (with and without irrigation) subdivided in time (collect time). Destructive analyzes were performed at 40, 80, 120, 160, 200, 240 and 280 days after pruning (DAP) and are measured: stem diameter and branch, stem length and branch, number of leaves, internodes and fruit. Subsequently, the plant parts were sectioned to obtain the leaf area, length and roots volume, fresh and dry matter weight. The number, weight and total productivity of fruits were evaluated. The media of all growth attributes and production characteristics were higher in treatments with water irrigation. The total carbohydrate content was higher at 120 and 160 DAP and the carbohydrates accumulation was increasing for most institutions over the plants development, except for the leaves that showed a decrease in the levels at 160 DAP. The fruits showed greater carbohydrates accumulation in relation to the other evaluated organs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The production of sound, clean fruit is unquestionably one of the major problems facing the modern fruit grower. Culture may be neglected and pruning delayed for a time but the omission of sprays for even a single season demonstrates their absolute necessity. This applies equally to the commercial grower and to the farmer or gardener who has only a few trees. Spray materials, equipment, management, schedules, insect pests and orchard diseases are discussed in this 1928 extension circular.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a survey of evolutionary algorithms that are designed for decision-tree induction. In this context, most of the paper focuses on approaches that evolve decision trees as an alternate heuristics to the traditional top-down divide-and-conquer approach. Additionally, we present some alternative methods that make use of evolutionary algorithms to improve particular components of decision-tree classifiers. The paper's original contributions are the following. First, it provides an up-to-date overview that is fully focused on evolutionary algorithms and decision trees and does not concentrate on any specific evolutionary approach. Second, it provides a taxonomy, which addresses works that evolve decision trees and works that design decision-tree components by the use of evolutionary algorithms. Finally, a number of references are provided that describe applications of evolutionary algorithms for decision-tree induction in different domains. At the end of this paper, we address some important issues and open questions that can be the subject of future research.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Abstract Background Smear negative pulmonary tuberculosis (SNPT) accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods The study enrolled 551 patients with clinical-radiological suspicion of SNPT, in Rio de Janeiro, Brazil. The original data was divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operator characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Hierarchical multi-label classification is a complex classification task where the classes involved in the problem are hierarchically structured and each example may simultaneously belong to more than one class in each hierarchical level. In this paper, we extend our previous works, where we investigated a new local-based classification method that incrementally trains a multi-layer perceptron for each level of the classification hierarchy. Predictions made by a neural network in a given level are used as inputs to the neural network responsible for the prediction in the next level. We compare the proposed method with one state-of-the-art decision-tree induction method and two decision-tree induction methods, using several hierarchical multi-label classification datasets. We perform a thorough experimental analysis, showing that our method obtains competitive results to a robust global method regarding both precision and recall evaluation measures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Machine learning comprises a series of techniques for automatic extraction of meaningful information from large collections of noisy data. In many real world applications, data is naturally represented in structured form. Since traditional methods in machine learning deal with vectorial information, they require an a priori form of preprocessing. Among all the learning techniques for dealing with structured data, kernel methods are recognized to have a strong theoretical background and to be effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree structured data two issues become relevant: kernel for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too few information for making correct predictions on unseen data. In fact, it tends to produce a discriminating function behaving as the nearest neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernel, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required both in learning and classification phases. Such a complexity can sometimes prevents the kernel application in scenarios involving large amount of data. This thesis proposes three contributions for resolving the above issues of kernel for trees. A first contribution aims at creating kernel functions which adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions. Specifically, we propose to encode the input trees by an algorithm able to project the data onto a lower dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. Convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution is devoted at reducing the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Direct Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The introduction of dwarfed rootstocks in apple crop has led to a new concept of intensive planting systems with the aim of producing early high yield and with returns of the initial high investment. Although yield is an important aspect to the grower, the consumer has become demanding regards fruit quality and is generally attracted by appearance. To fulfil the consumer’s expectations the grower may need to choose a proper training system along with an ideal pruning technique, which ensure a good light distribution in different parts of the canopy and a marketable fruit quality in terms of size and skin colour. Although these aspects are important, these fruits might not reach the proper ripening stage within the canopy because they are often heterogeneous. To describe the variability present in a tree, a software (PlantToon®), was used to recreate the tree architecture in 3D in the two training systems. The ripening stage of each of the fruits was determined using a non-destructive device (DA-Meter), thus allowing to estimate the fruit ripening variability. This study deals with some of the main parameters that can influence fruit quality and ripening stage within the canopy and orchard management techniques that can ameliorate a ripening fruit homogeneity. Significant differences in fruit quality were found within the canopies due to their position, flowering time and bud wood age. Bi-axis appeared to be suitable for high density planting, even though the fruit quality traits resulted often similar to those obtained with a Slender Spindle, suggesting similar fruit light availability within the canopies. Crop load confirmed to be an important factor that influenced fruit quality as much as the interesting innovative pruning method “Click”, in intensive planting systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Current methods to characterize mesenchymal stem cells (MSCs) are limited to CD marker expression, plastic adherence and their ability to differentiate into adipogenic, osteogenic and chondrogenic precursors. It seems evident that stem cells undergoing differentiation should differ in many aspects, such as morphology and possibly also behaviour; however, such a correlation has not yet been exploited for fate prediction of MSCs. Primary human MSCs from bone marrow were expanded and pelleted to form high-density cultures and were then randomly divided into four groups to differentiate into adipogenic, osteogenic chondrogenic and myogenic progenitor cells. The cells were expanded as heterogeneous and tracked with time-lapse microscopy to record cell shape, using phase-contrast microscopy. The cells were segmented using a custom-made image-processing pipeline. Seven morphological features were extracted for each of the segmented cells. Statistical analysis was performed on the seven-dimensional feature vectors, using a tree-like classification method. Differentiation of cells was monitored with key marker genes and histology. Cells in differentiation media were expressing the key genes for each of the three pathways after 21 days, i.e. adipogenic, osteogenic and chondrogenic, which was also confirmed by histological staining. Time-lapse microscopy data were obtained and contained new evidence that two cell shape features, eccentricity and filopodia (= 'fingers') are highly informative to classify myogenic differentiation from all others. However, no robust classifiers could be identified for the other cell differentiation paths. The results suggest that non-invasive automated time-lapse microscopy could potentially be used to predict the stem cell fate of hMSCs for clinical application, based on morphology for earlier time-points. The classification is challenged by cell density, proliferation and possible unknown donor-specific factors, which affect the performance of morphology-based approaches. Copyright © 2012 John Wiley & Sons, Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To deliver sample estimates provided with the necessary probability foundation to permit generalization from the sample data subset to the whole target population being sampled, probability sampling strategies are required to satisfy three necessary not sufficient conditions: (i) All inclusion probabilities be greater than zero in the target population to be sampled. If some sampling units have an inclusion probability of zero, then a map accuracy assessment does not represent the entire target region depicted in the map to be assessed. (ii) The inclusion probabilities must be: (a) knowable for nonsampled units and (b) known for those units selected in the sample: since the inclusion probability determines the weight attached to each sampling unit in the accuracy estimation formulas, if the inclusion probabilities are unknown, so are the estimation weights. This original work presents a novel (to the best of these authors' knowledge, the first) probability sampling protocol for quality assessment and comparison of thematic maps generated from spaceborne/airborne Very High Resolution (VHR) images, where: (I) an original Categorical Variable Pair Similarity Index (CVPSI, proposed in two different formulations) is estimated as a fuzzy degree of match between a reference and a test semantic vocabulary, which may not coincide, and (II) both symbolic pixel-based thematic quality indicators (TQIs) and sub-symbolic object-based spatial quality indicators (SQIs) are estimated with a degree of uncertainty in measurement in compliance with the well-known Quality Assurance Framework for Earth Observation (QA4EO) guidelines. Like a decision-tree, any protocol (guidelines for best practice) comprises a set of rules, equivalent to structural knowledge, and an order of presentation of the rule set, known as procedural knowledge. The combination of these two levels of knowledge makes an original protocol worth more than the sum of its parts. The several degrees of novelty of the proposed probability sampling protocol are highlighted in this paper, at the levels of understanding of both structural and procedural knowledge, in comparison with related multi-disciplinary works selected from the existing literature. In the experimental session the proposed protocol is tested for accuracy validation of preliminary classification maps automatically generated by the Satellite Image Automatic MapperT (SIAMT) software product from two WorldView-2 images and one QuickBird-2 image provided by DigitalGlobe for testing purposes. In these experiments, collected TQIs and SQIs are statistically valid, statistically significant, consistent across maps and in agreement with theoretical expectations, visual (qualitative) evidence and quantitative quality indexes of operativeness (OQIs) claimed for SIAMT by related papers. As a subsidiary conclusion, the statistically consistent and statistically significant accuracy validation of the SIAMT pre-classification maps proposed in this contribution, together with OQIs claimed for SIAMT by related works, make the operational (automatic, accurate, near real-time, robust, scalable) SIAMT software product eligible for opening up new inter-disciplinary research and market opportunities in accordance with the visionary goal of the Global Earth Observation System of Systems (GEOSS) initiative and the QA4EO international guidelines.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Macromolecular transport systems in bacteria currently are classified by function and sequence comparisons into five basic types. In this classification system, type II and type IV secretion systems both possess members of a superfamily of genes for putative NTP hydrolase (NTPase) proteins that are strikingly similar in structure, function, and sequence. These include VirB11, TrbB, TraG, GspE, PilB, PilT, and ComG1. The predicted protein product of tadA, a recently discovered gene required for tenacious adherence of Actinobacillus actinomycetemcomitans, also has significant sequence similarity to members of this superfamily and to several unclassified and uncharacterized gene products of both Archaea and Bacteria. To understand the relationship of tadA and tadA-like genes to those encoding the putative NTPases of type II/IV secretion, we used a phylogenetic approach to obtain a genealogy of 148 NTPase genes and reconstruct a scenario of gene superfamily evolution. In this phylogeny, clear distinctions can be made between type II and type IV families and their constituent subfamilies. In addition, the subgroup containing tadA constitutes a novel and extremely widespread subfamily of the family encompassing all putative NTPases of type IV secretion systems. We report diagnostic amino acid residue positions for each major monophyletic family and subfamily in the phylogenetic tree, and we propose an easy method for precisely classifying and naming putative NTPase genes based on phylogeny. This molecular key-based method can be applied to other gene superfamilies and represents a valuable tool for genome analysis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Os motores de indução trifásicos são os principais elementos de conversão de energia elétrica em mecânica motriz aplicados em vários setores produtivos. Identificar um defeito no motor em operação pode fornecer, antes que ele falhe, maior segurança no processo de tomada de decisão sobre a manutenção da máquina, redução de custos e aumento de disponibilidade. Nesta tese são apresentas inicialmente uma revisão bibliográfica e a metodologia geral para a reprodução dos defeitos nos motores e a aplicação da técnica de discretização dos sinais de correntes e tensões no domínio do tempo. É também desenvolvido um estudo comparativo entre métodos de classificação de padrões para a identificação de defeitos nestas máquinas, tais como: Naive Bayes, k-Nearest Neighbor, Support Vector Machine (Sequential Minimal Optimization), Rede Neural Artificial (Perceptron Multicamadas), Repeated Incremental Pruning to Produce Error Reduction e C4.5 Decision Tree. Também aplicou-se o conceito de Sistemas Multiagentes (SMA) para suportar a utilização de múltiplos métodos concorrentes de forma distribuída para reconhecimento de padrões de defeitos em rolamentos defeituosos, quebras nas barras da gaiola de esquilo do rotor e curto-circuito entre as bobinas do enrolamento do estator de motores de indução trifásicos. Complementarmente, algumas estratégias para a definição da severidade dos defeitos supracitados em motores foram exploradas, fazendo inclusive uma averiguação da influência do desequilíbrio de tensão na alimentação da máquina para a determinação destas anomalias. Os dados experimentais foram adquiridos por meio de uma bancada experimental em laboratório com motores de potência de 1 e 2 cv acionados diretamente na rede elétrica, operando em várias condições de desequilíbrio das tensões e variações da carga mecânica aplicada ao eixo do motor.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point processes methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but, often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.