220 results for recursive partitioning algorithm
Abstract:
One of the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been explored mainly in two directions. First, the amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested: two are based on a static partitioning of the data set, and the third incorporates a dynamic load balancing policy.
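As a rough illustration of the static-partitioning idea mentioned above, the following minimal sketch performs one k-Means iteration in which each partition computes partial sums independently and the results are then combined. It is an assumption-laden toy, not the authors' method: the function names (`partial_update`, `kmeans_step`) and the contiguous block partitioning are hypothetical, the partitions are processed sequentially rather than on separate nodes, and the KD-Tree pruning of distance computations is deliberately left out.

```python
from math import dist


def partial_update(chunk, centroids):
    """Assign each point of one partition to its nearest centroid.

    Returns per-centroid coordinate sums and point counts for this partition.
    """
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in chunk:
        # Brute-force nearest centroid; a KD-Tree variant would prune candidates here.
        j = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def kmeans_step(points, centroids, n_parts):
    """One k-Means iteration over a static, contiguous block partitioning."""
    size = -(-len(points) // n_parts)  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    # In a distributed setting each chunk would live on its own node.
    partials = [partial_update(c, centroids) for c in chunks]
    new_centroids = []
    for j, old in enumerate(centroids):
        n = sum(cnt[j] for _, cnt in partials)
        if n == 0:
            new_centroids.append(tuple(old))  # keep an empty cluster's centroid
            continue
        new_centroids.append(tuple(
            sum(s[j][d] for s, _ in partials) / n for d in range(len(old))
        ))
    return new_centroids


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(kmeans_step(points, [(1, 1), (9, 1)], n_parts=2))
```

With brute-force assignment every point costs the same, so equal-sized blocks are balanced. Once tree-based pruning is added, the cost per partition depends on the data, which is the load imbalance the abstract refers to.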
Abstract:
This paper presents a parallel genetic algorithm (GA) for the Steiner Problem in Networks (SPN). Several previous papers have proposed GAs and other metaheuristics to solve the SPN and demonstrated the validity of their approaches. This work differs from them in two main respects: the size and characteristics of the networks used in the experiments, and the aim that motivated it. That aim was to build a basis of comparison for validating deterministic and computationally inexpensive algorithms that can be used in practical engineering applications, such as multicast transmission in the Internet. The large size of our sample networks, in turn, requires a parallel implementation of the Steiner GA that can deal with such large problem instances.
Abstract:
Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.
Abstract:
The paper presents a design for a hardware genetic algorithm which uses a pipeline of systolic arrays. These arrays have been designed using systolic synthesis techniques which involve expressing the algorithm as a set of uniform recurrence relations. The final design divorces the fitness function evaluation from the hardware and can process chromosomes of different lengths, giving the design a generic quality. The paper demonstrates the design methodology by progressively re-writing a simple genetic algorithm, expressed in C code, into a form from which systolic structures can be deduced. This paper extends previous work by introducing a simplification to a previous systolic design for the genetic algorithm. The simplification results in the removal of 2N² + 4N cells and reduces the time complexity by 3N + 1 cycles.
Abstract:
We advocate the use of systolic design techniques to create custom hardware for Custom Computing Machines. We have developed a hardware genetic algorithm based on systolic arrays to illustrate the feasibility of the approach. The architecture is independent of the lengths of chromosomes used and can be scaled in size to accommodate different population sizes. An FPGA prototype design can process 16 million genes per second.
Abstract:
The results from three types of study with broilers, namely nitrogen (N) balance, bioassays and growth experiments, provided the data used herein. Sets of data on N balance and protein accretion (bioassay studies) were used to assess the ability of the monomolecular equation to describe the relationship between (i) N balance and amino acid (AA) intake and (ii) protein accretion and AA intake. The model estimated the levels of isoleucine, lysine, valine, threonine, methionine, total sulphur AAs and tryptophan resulting in zero balance to be 58, 59, 80, 96, 23, 85 and 32 mg/kg live weight (LW)/day, respectively. These estimates show good agreement with those obtained in previous studies. For the growth experiments, four models, specifically re-parameterized for analysing energy balance data, were evaluated for their ability to determine crude protein (CP) intake at maintenance and efficiency of utilization of CP intake for producing gain. They were: a straight line, two equations representing diminishing returns behaviour (monomolecular and rectangular hyperbola) and one equation describing smooth sigmoidal behaviour with a fixed point of inflexion (Gompertz). The estimates of CP requirement for maintenance and efficiency of utilization of CP intake for producing gain varied from 5.4 to 5.9 g/kg LW/day and 0.60 to 0.76, respectively, depending on the models.
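To make the notion of a "zero balance" intake concrete, here is a minimal sketch using one common parameterization of the monomolecular (diminishing returns) curve. The abstract does not state the exact form the authors fitted, so the equation, the parameter names (`y_max`, `y_0`, `k`) and the numbers in the usage lines are assumptions chosen purely for illustration, not values from the study.

```python
from math import exp, log


def monomolecular(x, y_max, y_0, k):
    """Assumed form: y = y_max - (y_max - y_0) * exp(-k * x).

    y_0 is the response at zero intake (negative for N balance),
    y_max is the asymptote and k the rate constant.
    """
    return y_max - (y_max - y_0) * exp(-k * x)


def zero_balance_intake(y_max, y_0, k):
    """Intake x at which the assumed curve crosses y = 0."""
    if not (y_max > 0 and y_0 < 0 and k > 0):
        raise ValueError("expects y_max > 0, y_0 < 0 and k > 0")
    # Solve 0 = y_max - (y_max - y_0) * exp(-k * x) for x.
    return log((y_max - y_0) / y_max) / k


# Hypothetical parameters, for illustration only.
x0 = zero_balance_intake(y_max=2.0, y_0=-2.0, k=0.5)
print(x0, monomolecular(x0, 2.0, -2.0, 0.5))
```

Under this form the maintenance estimate is simply the root of the fitted curve, so it follows from the fitted parameters and needs no separate estimation step.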
Abstract:
Capturing the pattern of structural change is a relevant task in applied demand analysis, as consumer preferences may vary significantly over time. Filtering and smoothing techniques have recently played an increasingly relevant role. A dynamic Almost Ideal Demand System with random walk parameters is estimated in order to detect modifications in consumer habits and preferences, as well as changes in the behavioural response to prices and income. Systemwise estimation, consistent with the underlying constraints from economic theory, is achieved through the EM algorithm. The proposed model is applied to UK aggregate consumption of alcohol and tobacco, using quarterly data from 1963 to 2003. Increased alcohol consumption is explained by a preference shift, addictive behaviour and a lower price elasticity. The dynamic and time-varying specification is consistent with the theoretical requirements imposed at each sample point. (c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Field experiments were conducted over 3 years to assess the effect of a triazole fungicide programme, and additions of strobilurin fungicides to it, on nitrogen uptake, accumulation and partitioning in a range of winter wheat cultivars. Commensurate with delayed senescence, fungicide programmes, particularly when including strobilurins, improved grain yield through improvements in both crop biomass and harvest index, although the relationship with green area duration of the flag leaf (GFLAD) depended on year and in some cases, cultivar. In all years fungicide treatments significantly increased the amount of nitrogen in the above-ground biomass, the amount of nitrogen in the grain and the nitrogen harvest index. All these effects could be linearly related to the fungicide effect on GFLAD. These relationships occasionally interacted with cultivar but there was no evidence that fungicide mode of action affected the relationship between GFLAD and yield of nitrogen in the grain. Fungicide treatments significantly reduced the amount of soil mineral N at harvest and when severe disease had been controlled, the net remobilization of N from the vegetation to the grain after anthesis. Fungicide maintained the filling of grain with both dry matter and nitrogen. The proportionate accumulation of nitrogen in the grain was later than that of dry matter and this difference was greater when fungicide had been applied. Effects of fungicide on grain protein concentration and its relationship with GFLAD were inconsistent over year and cultivar. There were several instances where grain protein concentration was unaffected despite large (1.5 t/ha) increases in grain yield following fungicide use. Dilution of grain protein concentration following fungicide use, when it did occur, was small compared with what would be predicted by adoption of other yield increasing techniques such as the selection of high yielding cultivars (based on currently available cultivars) or by growing wheat in favourable climates.
Abstract:
The primary purpose of this study was to model the partitioning of evapotranspiration in a maize-sunflower intercrop at various canopy covers. The Shuttleworth-Wallace (SW) model was extended for intercropping systems to include both crop transpiration and soil evaporation and to allow interaction between the two. To test the accuracy of the extended SW model, two field experiments of maize-sunflower intercrop were conducted in 1998 and 1999. Plant transpiration and soil evaporation were measured using sap flow gauges and lysimeters, respectively. The mean prediction error (simulated minus measured values) for transpiration was zero (which indicated no overall bias in estimation error), and its accuracy was not affected by the plant growth stages, but simulated transpiration during high measured transpiration rates tended to be slightly underestimated. Overall, the predictions for daily soil evaporation were also accurate. Model estimation errors were probably due to the simplified modelling of soil water content, stomatal resistances and soil heat flux as well as due to the uncertainties in characterising the micrometeorological conditions. The SW model’s prediction of transpiration was most sensitive to parameters most directly related to the canopy characteristics such as the partitioning of captured solar radiation, canopy resistance, and bulk boundary layer resistance.
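The mean prediction error described above (simulated minus measured values, averaged over observations) can be sketched as follows. The function names and the sample values are hypothetical and are not data from the study. The root mean square error is included only to show that a zero mean error indicates no overall bias without implying that individual predictions are exact.

```python
from math import sqrt


def mean_prediction_error(simulated, measured):
    """Average of (simulated - measured); zero means no overall bias."""
    if len(simulated) != len(measured) or not simulated:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(s - m for s, m in zip(simulated, measured)) / len(simulated)


def root_mean_square_error(simulated, measured):
    """Typical magnitude of the individual errors, regardless of sign."""
    if len(simulated) != len(measured) or not simulated:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((s - m) ** 2 for s, m in zip(simulated, measured)) / len(simulated))


# Hypothetical daily transpiration values (mm/day), for illustration only.
simulated = [2.0, 3.0, 4.0]
measured = [2.5, 3.0, 3.5]
print(mean_prediction_error(simulated, measured))   # errors cancel out
print(root_mean_square_error(simulated, measured))  # yet they are not zero
```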
Abstract:
Growth patterns and cropping were evaluated over the season for the everbearing strawberry 'Everest' at a range of temperatures (15-27°C) in two light environments (ambient and 50% shade). The highest yield was recorded for unshaded plants grown at 23°C, but the optimum temperature for vegetative growth was 15°C. With increasing temperature fruit number increased, but fruit weight decreased. Fruit weight was also significantly reduced by shade, and although 'Everest' showed a degree of shade tolerance in vegetative growth, yield was consistently reduced by shade. Shade also reduced the number of crowns developed by the plants over the course of the season, emphasising that crown number was ultimately the limiting factor for yield potential. We conclude that, in contrast to Junebearers which partition more assimilates to fruit at temperatures around 15°C (Le Miere et al., 1998), optimised cropping in the everbearer 'Everest' is achieved at the significantly higher temperature of 23°C. These findings have significance for commercial production, in which protection tends to reduce light levels but increase average temperature throughout the season.