351 results for pruning
Abstract:
We introduce and describe the Multiple Gravity Assist problem, a global optimisation problem that is of great interest in the design of spacecraft and their trajectories. We discuss its formalisation and show, for one particular problem instance, the performance of selected state-of-the-art heuristic global optimisation algorithms. A deterministic search space pruning algorithm is then developed and its polynomial time and space complexity derived. The algorithm is shown to achieve search space reductions of more than six orders of magnitude, thus significantly reducing the complexity of the subsequent optimisation.
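To make the idea of deterministic search space pruning concrete, the sketch below grids a toy two-variable search space (launch epoch × time of flight) and discards every cell whose estimated leg cost exceeds a delta-v budget. It is only an illustration of box pruning on a grid, not the algorithm of the paper; the cost model `leg_cost` and the budget are invented for the example.

```python
import numpy as np

# Toy per-leg cost model (assumption): a smooth stand-in for the delta-v required
# by one leg, given launch epoch t0 (days) and time of flight tof (days).
def leg_cost(t0, tof):
    return 3.0 + 2.0 * np.abs(np.sin(t0 / 50.0)) + 1.5 * np.abs(np.cos(tof / 90.0))

DV_BUDGET = 5.0  # assumed per-leg delta-v budget (km/s)

# Grid the box [0, 1000] days x [50, 500] days; a cell survives only if its cost
# is within budget, so infeasible regions are removed deterministically.
t0_grid = np.linspace(0.0, 1000.0, 101)
tof_grid = np.linspace(50.0, 500.0, 46)
T0, TOF = np.meshgrid(t0_grid, tof_grid, indexing="ij")
feasible = leg_cost(T0, TOF) <= DV_BUDGET

kept, total = int(feasible.sum()), feasible.size
print(f"kept {kept}/{total} cells ({100.0 * (1 - kept / total):.1f}% pruned)")
```

Only the surviving cells would then be handed to the heuristic optimisers, which is where the reduction in the complexity of the subsequent optimisation comes from.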
Abstract:
A fast backward elimination algorithm based on a QR decomposition and Givens transformations is introduced to prune radial basis function networks. Nodes are removed sequentially using an increment-of-error-variance criterion. The procedure is terminated using a prediction risk criterion, so as to obtain a model structure with good generalisation properties. The algorithm can be used to post-process radial basis centres selected by a k-means routine and, in this mode, it provides a hybrid supervised centre selection approach.
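A much-simplified sketch of such a backward elimination loop is given below: at each step the centre whose removal increases the residual error variance least is deleted, and elimination stops when a simple prediction-risk estimate (a final-prediction-error style criterion, assumed here) stops improving. For brevity the sketch refits by ordinary least squares at every step instead of updating a QR factorisation with Givens rotations, so it illustrates the selection logic but not the fast implementation described in the abstract.

```python
import numpy as np

def rbf_design(X, centres, width=1.0):
    """Gaussian RBF design matrix with one column per centre."""
    d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def residual_variance(Phi, y):
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    r = y - Phi @ w
    return float(r @ r) / len(y)

def backward_eliminate(X, y, centres, width=1.0):
    idx = list(range(len(centres)))
    n, best_risk = len(y), np.inf
    while len(idx) > 1:
        sigma2 = residual_variance(rbf_design(X, centres[idx], width), y)
        risk = sigma2 * (n + len(idx)) / (n - len(idx))   # FPE-style risk estimate (assumed)
        if risk > best_risk:
            break                                         # generalisation stops improving
        best_risk = risk
        # Remove the centre whose deletion increases the error variance least.
        increases = [residual_variance(rbf_design(X, centres[idx[:j] + idx[j + 1:]], width), y) - sigma2
                     for j in range(len(idx))]
        idx.pop(int(np.argmin(increases)))
    return centres[idx]

# Example: noisy 1-D sine; centres on a grid (a k-means routine could supply them instead).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
centres = np.linspace(-3, 3, 20).reshape(-1, 1)
print("centres kept:", len(backward_eliminate(X, y, centres)))
```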
Abstract:
This paper analyzes the use of linear and neural network models for financial distress classification, with emphasis on the issues of input variable selection and model pruning. A data-driven method for selecting input variables (financial ratios, in this case) is proposed. A case study involving 60 British firms in the period 1997-2000 is used for illustration. It is shown that the Optimal Brain Damage pruning technique can considerably improve the generalization ability of a neural model. Moreover, the set of financial ratios obtained with the proposed selection procedure is shown to be an appropriate alternative to the ratios usually employed by practitioners.
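Optimal Brain Damage ranks each weight by a saliency s_k = ½·h_kk·w_k², where h_kk is the corresponding diagonal element of the Hessian of the training loss, and removes the least salient weights before retraining. The sketch below estimates h_kk with a central finite difference on a toy single-hidden-layer network; the network layout, the data and the pruning fraction are assumptions made purely for illustration.

```python
import numpy as np

HIDDEN = 5  # assumed hidden-layer width of the toy network

def mse_loss(w, X, y):
    """Loss of a tiny one-hidden-layer tanh network whose weights are packed in w."""
    W1 = w[: X.shape[1] * HIDDEN].reshape(X.shape[1], HIDDEN)
    w2 = w[X.shape[1] * HIDDEN:]
    return float(np.mean((np.tanh(X @ W1) @ w2 - y) ** 2))

def obd_saliencies(w, X, y, eps=1e-3):
    """Saliency s_k = 0.5 * h_kk * w_k**2, with h_kk from a central finite difference."""
    base = mse_loss(w, X, y)
    s = np.empty_like(w)
    for k in range(len(w)):
        e = np.zeros_like(w)
        e[k] = eps
        h_kk = (mse_loss(w + e, X, y) - 2.0 * base + mse_loss(w - e, X, y)) / eps ** 2
        s[k] = 0.5 * h_kk * w[k] ** 2
    return s

# Toy data and weights (in practice w would come from a trained network).
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(100)
w = 0.5 * rng.standard_normal(3 * HIDDEN + HIDDEN)

s = obd_saliencies(w, X, y)
prune = np.argsort(s)[: len(w) // 3]   # drop the third of the weights with lowest saliency
w[prune] = 0.0                         # pruned weights are held at zero before retraining
print(f"pruned {len(prune)} of {len(w)} weights")
```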
Abstract:
In a world where data is captured on a large scale, the major challenge for data mining algorithms is to scale up to large datasets. There are two main approaches to inducing classification rules: the divide and conquer approach, also known as top-down induction of decision trees, and the separate and conquer approach. A considerable amount of work has been done on scaling up the divide and conquer approach; however, very little work has been conducted on scaling up the separate and conquer approach. In this work we describe a parallel framework that allows the parallelisation of a certain family of separate and conquer algorithms, the Prism family. Parallelisation helps the Prism family of algorithms to harness additional computing resources in a network of computers, in order to make the induction of classification rules scale better on large datasets. Our framework also incorporates a pre-pruning facility for parallel Prism algorithms.
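The abstract does not detail the framework's architecture, so the sketch below only illustrates the kind of work that parallelises naturally in a Prism-style 'separate and conquer' learner: scoring the candidate attribute-value terms of one specialisation step, by the conditional probability of the target class, across worker processes. The toy dataset and all names are invented for the example.

```python
from concurrent.futures import ProcessPoolExecutor

# Toy categorical dataset: each row is (attribute dict, class label).
DATA = [
    ({"outlook": "sunny",    "wind": "weak"},   "play"),
    ({"outlook": "sunny",    "wind": "strong"}, "stay"),
    ({"outlook": "rain",     "wind": "weak"},   "play"),
    ({"outlook": "rain",     "wind": "strong"}, "stay"),
    ({"outlook": "overcast", "wind": "weak"},   "play"),
    ({"outlook": "overcast", "wind": "strong"}, "play"),
]

def pair_probability(args):
    """P(target class | attribute = value) on the current subset: one Prism-style candidate."""
    attr, value, target, rows = args
    covered = [label for features, label in rows if features.get(attr) == value]
    hits = sum(1 for label in covered if label == target)
    return attr, value, (hits / len(covered)) if covered else 0.0

def best_term(rows, target, used):
    """Pick the attribute-value term with the highest target-class probability, in parallel."""
    candidates = sorted({(a, v) for features, _ in rows for a, v in features.items() if a not in used})
    jobs = [(a, v, target, rows) for a, v in candidates]
    with ProcessPoolExecutor() as pool:           # spread candidate scoring over worker processes
        scored = list(pool.map(pair_probability, jobs))
    return max(scored, key=lambda t: t[2])

if __name__ == "__main__":
    print(best_term(DATA, target="play", used=set()))
```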
Abstract:
The Prism family of algorithms induces modular classification rules which, in contrast to decision tree induction algorithms, do not necessarily fit together into a decision tree structure. Classifiers induced by Prism algorithms achieve accuracy comparable to that of decision trees and in some cases even outperform them. Both kinds of algorithms tend to overfit on large and noisy datasets, which has led to the development of pruning methods. Pruning methods use various metrics to truncate decision trees or to eliminate whole rules or single rule terms from a Prism rule set. For decision trees many pre-pruning and post-pruning methods exist; however, for Prism algorithms only one pre-pruning method, J-pruning, has been developed. Recent work with Prism algorithms examined J-pruning in the context of very large datasets and found that the current method does not use its full potential. This paper revisits the J-pruning method for the Prism family of algorithms, develops a new pruning method, Jmax-pruning, discusses it in theoretical terms and evaluates it empirically.
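In caricature (a reading of the two mechanisms assumed here, not a restatement of the paper), J-pruning stops appending rule terms as soon as the rule's J-value would fall, while Jmax-pruning builds the complete rule and then cuts it back to the prefix with the highest J-value. The toy trace below shows why the former can leave information on the table.

```python
import math

def j_measure(p_x, p_y_given_x, p_y):
    """J-measure of a rule 'IF x THEN y' (base-2 logs): coverage p_x times its information content."""
    def part(a, b):
        return a * math.log2(a / b) if a > 0 else 0.0
    return p_x * (part(p_y_given_x, p_y) + part(1.0 - p_y_given_x, 1.0 - p_y))

# Hypothetical (coverage, accuracy) of one rule after appending terms 1..5,
# with a class prior of 0.5: coverage shrinks while accuracy climbs unevenly.
P_Y = 0.5
stages = [(0.60, 0.62), (0.45, 0.60), (0.30, 0.85), (0.20, 0.88), (0.10, 0.90)]
trace = [j_measure(p_x, p_yx, P_Y) for p_x, p_yx in stages]

def j_prune(js):
    """J-pruning in caricature: stop appending terms once the J-value falls."""
    kept = 1
    for i in range(1, len(js)):
        if js[i] < js[i - 1]:
            break
        kept = i + 1
    return kept

def jmax_prune(js):
    """Jmax-pruning in caricature: build the whole rule, then truncate at the peak J-value."""
    return max(range(len(js)), key=js.__getitem__) + 1

print("J-pruning keeps", j_prune(trace), "terms; Jmax-pruning keeps", jmax_prune(trace))
```

With this hypothetical trace the J-value dips after the second term and peaks after the third, so J-pruning keeps a single term while Jmax-pruning recovers the later peak, which is the sense in which the original method 'does not use its full potential'.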
Abstract:
The Prism family of algorithms induces modular classification rules, in contrast to the Top Down Induction of Decision Trees (TDIDT) approach, which induces classification rules in the intermediate form of a tree structure. Both approaches achieve comparable classification accuracy; however, in some cases Prism outperforms TDIDT. For both approaches pre-pruning facilities have been developed in order to prevent the induced classifiers from overfitting on noisy datasets, by cutting rule terms or whole rules, or by truncating decision trees according to certain metrics. Many pre-pruning mechanisms have been developed for the TDIDT approach, but for the Prism family the only existing pre-pruning facility is J-pruning. J-pruning works not only on Prism algorithms but also on TDIDT. Although it has been shown that J-pruning produces good results, this work points out that J-pruning does not use its full potential. The original J-pruning facility is examined and the use of a new pre-pruning facility, called Jmax-pruning, is proposed and evaluated empirically. A possible pre-pruning facility for TDIDT based on Jmax-pruning is also discussed.
Abstract:
Prism is a modular classification rule generation method based on the ‘separate and conquer’ approach, an alternative to rule induction using decision trees, also known as ‘divide and conquer’. Prism often achieves a level of classification accuracy similar to that of decision trees, but tends to produce a more compact, noise-tolerant set of classification rules. As with other classification rule generation methods, a principal problem arising with Prism is overfitting due to over-specialised rules. In addition, over-specialised rules increase the associated computational complexity. These problems can be addressed by pruning methods. For the Prism method, two pruning algorithms have recently been introduced for reducing overfitting of classification rules: J-pruning and Jmax-pruning. Both algorithms are based on the J-measure, an information-theoretic means of quantifying the theoretical information content of a rule. Jmax-pruning attempts to exploit the J-measure to its full potential, because J-pruning does not actually achieve this and may even lead to underfitting. A series of experiments has shown that Jmax-pruning may outperform J-pruning in reducing overfitting. However, Jmax-pruning is computationally relatively expensive and may also lead to underfitting. This paper reviews the Prism method and the two existing pruning algorithms, and proposes a novel pruning algorithm called Jmid-pruning. Jmid-pruning is based on the J-measure and reduces overfitting to a similar level as the other two algorithms, but is better at avoiding underfitting and unnecessary computational effort. The authors conduct an experimental study of the performance of the Jmid-pruning algorithm in terms of classification accuracy and computational efficiency, and evaluate it comparatively with the J-pruning and Jmax-pruning algorithms.
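For reference, the J-measure on which these pruning algorithms are based has the following standard form, quoted here as background (notation: x is the rule antecedent, y the target class, p(x) the rule's coverage, p(y|x) its accuracy and p(y) the class prior).

```latex
% J-measure of a rule "IF x THEN y": average information content j(Y;X) of the
% consequent given the antecedent, weighted by the rule's coverage p(x).
\[
  J(Y;X) = p(x)\, j(Y;X), \qquad
  j(Y;X) = p(y \mid x)\log_2\frac{p(y \mid x)}{p(y)}
         + \bigl(1 - p(y \mid x)\bigr)\log_2\frac{1 - p(y \mid x)}{1 - p(y)} .
\]
```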
Abstract:
The huanglongbing (HLB) disease of citrus trees, caused by Candidatus Liberibacter asiaticus and Ca. Liberibacter americanus, was first reported in Brazil in March 2004. The presence of the disease has caused serious concern among growers. Pruning experiments were conducted to determine whether removal of symptomatic branches or of the entire canopy (decapitation) would eliminate infected tissues and save HLB-affected trees. Pruning was done in five blocks on a total of 592 3- to 16-year-old 'Valencia', 'Hamlin' or 'Pera' sweet orange trees showing no symptoms or with two levels of symptom severity. Ten decapitated trees per block were caged, and all trees were treated with insecticides to control the psyllid vector, Diaphorina citri. Mottled leaves reappeared on most symptomatic (69.2%) as well as on some asymptomatic (7.6%) pruned trees, regardless of age, variety and pruning procedure. Presence of the pathogen (Ca. Liberibacter americanus) in all symptomatic trees was confirmed by PCR. In general, the greater the symptom severity before pruning, the lower the percentage of trees that remained asymptomatic after pruning.
Abstract:
In this article we describe a feature extraction algorithm for pattern classification based on Bayesian decision boundaries and pruning techniques. The proposed method is capable of optimizing MLP neural classifiers by retaining those neurons in the hidden layer that really contribute to correct classification. We also propose a method that defines a plausible number of neurons in the hidden layer based on stem-and-leaf plots of the training samples. Experimental investigation reveals the efficiency of the proposed method. © 2002 IEEE.
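The abstract does not spell out the pruning criterion, so the sketch below substitutes a generic contribution test: each hidden unit of a trained MLP is silenced in turn and retained only if its removal hurts held-out accuracy. The synthetic dataset, the layer size and the use of scikit-learn's MLPClassifier are assumptions made for illustration, not the Bayesian decision-boundary procedure of the paper.

```python
import copy
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Train a small MLP classifier on synthetic data (a stand-in for the real patterns).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(12,), max_iter=2000, random_state=0).fit(X_tr, y_tr)

baseline = net.score(X_val, y_val)
contribution = []
for j in range(net.coefs_[1].shape[0]):
    masked = copy.deepcopy(net)
    masked.coefs_[1][j, :] = 0.0          # silence hidden unit j's outgoing weights
    contribution.append(baseline - masked.score(X_val, y_val))

# Retain only the hidden units whose removal degrades validation accuracy.
keep = [j for j, c in enumerate(contribution) if c > 0]
print(f"baseline accuracy {baseline:.3f}; {len(keep)} of {len(contribution)} hidden units retained")
```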
Abstract:
The aim of the study was to evaluate production and determine the level of total soluble solids of cherry tomatoes grown under protected cultivation with different plant spacings and types of pruning. The experiment followed a randomized block design in a 2×2 factorial scheme, with two spacings between plants and two types of pruning, and five replications. The cultivar 'Sindy' (De Ruiter) was used. Each experimental plot contained seven plants, and fruits were collected from the five central plants. The seedlings were produced in 128-cell Styrofoam trays and transplanted 33 days after planting, using two spacings between plants (0.3 and 0.5 m) and 1 m between rows. The plants were grown in single- or double-stem form and staked individually. The parameters evaluated were the number of fruit per plant, fresh weight of fruit and the level of total soluble solids expressed in °Brix. There was no evidence of significant interaction between the treatments. For fresh weight of fruit per plant, there was a significant effect when the plants were grown with a spacing of 1 × 0.5 m, with 4.12 kg per plant, compared with 3.00 kg per plant at a spacing of 1 × 0.3 m. With regard to the number of fruit per plant, a significant difference was seen between the two spacings, where the 1 × 0.3 m spacing yielded fewer fruit per plant (188.8) than the 1 × 0.5 m spacing (238.1). For the two types of pruning, there was a significant effect only for the number of fruit per plant, with a mean of 188.4 fruit with one stem and 238.4 with two stems. No significant difference was observed between the treatments for the level of total soluble solids. It is concluded that, for the cultivar 'Sindy' under protected cultivation, production is better with 0.5 m spacing between plants and two stems per plant.
Abstract:
Pattern recognition in large amounts of data has been paramount in the last decade, since it is not straightforward to design interactive, real-time classification systems. Very recently, the Optimum-Path Forest classifier was proposed to overcome such limitations, together with a training set pruning algorithm, which requires a parameter that has so far been set empirically. In this paper, we propose a Harmony Search-based algorithm that can find near-optimal values for this parameter. The experimental results show that our algorithm is able to find proper values for the OPF pruning algorithm parameter. © 2011 IEEE.
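Harmony Search over a single real-valued parameter can be sketched as follows; `pruning_quality` is a placeholder objective standing in for the accuracy of the pruned OPF classifier, and the parameter bounds and Harmony Search settings (memory size, HMCR, PAR, bandwidth) are assumptions.

```python
import random

def pruning_quality(rate):
    """Placeholder objective: stands in for evaluating the OPF pruning parameter `rate`."""
    return -(rate - 0.37) ** 2               # pretend the best value is 0.37

def harmony_search(obj, low, high, memory_size=10, iters=200,
                   hmcr=0.9, par=0.3, bw=0.05, seed=0):
    rng = random.Random(seed)
    memory = [rng.uniform(low, high) for _ in range(memory_size)]
    for _ in range(iters):
        if rng.random() < hmcr:               # memory consideration
            new = rng.choice(memory)
            if rng.random() < par:            # pitch adjustment within the bounds
                new = min(high, max(low, new + rng.uniform(-bw, bw)))
        else:                                 # random selection
            new = rng.uniform(low, high)
        worst = min(range(memory_size), key=lambda i: obj(memory[i]))
        if obj(new) > obj(memory[worst]):     # a better harmony replaces the worst one
            memory[worst] = new
    return max(memory, key=obj)

best = harmony_search(pruning_quality, low=0.0, high=1.0)
print(f"near-optimal pruning parameter: {best:.3f}")
```

Each iteration improvises a new candidate either from the harmony memory (possibly pitch-adjusted) or at random, and keeps it only if it beats the worst stored value; the best value found is then handed to the pruning step.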
Abstract:
The purpose of this study was to evaluate the physical and mechanical properties of particleboard made with pruning wastes from Ipê (Tabebuia serratifolia) and Chapéu-de-Sol (Terminalia catappa) trees. Particleboards were prepared with both wood species, using all the material produced by grinding the pruning wastes. The particleboards had dimensions of 45×45 cm, a thickness of approximately 11.5 mm and an average density of 664 kg/m³. A urea-formaldehyde adhesive was used in the proportion of 12% of the dry particle mass. The particleboards were pressed at a temperature of 130 °C for 10 min. The physical and mechanical properties analyzed were density, moisture content, thickness swelling, percentage of lignin and cellulose, modulus of resilience, modulus of elasticity and tensile strength parallel to the grain, according to the standards NBR 14810 and CS 236-66 (1968). The particleboards were considered to be of medium density. The particle size significantly affected the static bending strength and tensile strength parallel to the grain. Ipê presented better results, demonstrating the potential for the production and use of particleboard made from this species. © The Author(s) 2013.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)