229 resultados para Cluster Counting Algorithm


Relevância:

20.00% 20.00%

Publicador:

Resumo:

One among the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques allow to reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents the results of the application of a parallel Genetic Algorithm (GA) in order to design a Fuzzy Proportional Integral (FPI) controller for active queue management on Internet routers. The Active Queue Management (AQM) policies are those policies of router queue management that allow the detection of network congestion, the notification of such occurrences to the hosts on the network borders, and the adoption of a suitable control policy. Two different parallel implementations of the genetic algorithm are adopted to determine an optimal configuration of the FPI controller parameters. Finally, the results of several experiments carried out on a forty nodes cluster of workstations are presented.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a parallel genetic algorithm to the Steiner Problem in Networks. Several previous papers have proposed the adoption of GAs and others metaheuristics to solve the SPN demonstrating the validity of their approaches. This work differs from them for two main reasons: the dimension and the characteristics of the networks adopted in the experiments and the aim from which it has been originated. The reason that aimed this work was namely to build a comparison term for validating deterministic and computationally inexpensive algorithms which can be used in practical engineering applications, such as the multicast transmission in the Internet. On the other hand, the large dimensions of our sample networks require the adoption of a parallel implementation of the Steiner GA, which is able to deal with such large problem instances.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The paper presents a design for a hardware genetic algorithm which uses a pipeline of systolic arrays. These arrays have been designed using systolic synthesis techniques which involve expressing the algorithm as a set of uniform recurrence relations. The final design divorces the fitness function evaluation from the hardware and can process chromosomes of different lengths, giving the design a generic quality. The paper demonstrates the design methodology by progressively re-writing a simple genetic algorithm, expressed in C code, into a form from which systolic structures can be deduced. This paper extends previous work by introducing a simplification to a previous systolic design for the genetic algorithm. The simplification results in the removal of 2N 2 + 4N cells and reduces the time complexity by 3N + 1 cycles.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We advocate the use of systolic design techniques to create custom hardware for Custom Computing Machines. We have developed a hardware genetic algorithm based on systolic arrays to illustrate the feasibility of the approach. The architecture is independent of the lengths of chromosomes used and can be scaled in size to accommodate different population sizes. An FPGA prototype design can process 16 million genes per second.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While the Cluster spacecraft were located near the high-latitude magnetopause, between 10:10 and 10:40 UT on 16 January 2004, three typical flux transfer event (FTE) signatures were observed. During this interval, simultaneous and conjugated all-sky camera measurements, recorded at Yellow River Station, Svalbard, are available at 630.0 and 557.7nm that show poleward-moving auroral forms (PMAFs), consistent with magnetic reconnection at dayside magnetopause. Simultaneous FTEs seen at the magnetopause mainly move northward, but having duskward (eastward) and tailward velocity components, roughly consistent with the observed direction of motion of the PMAFs in all-sky images. Between the PMAFs meridional keograms, extracted from the all-sky images, show intervals of lower intensity aurora which migrate equatorward just before the PMAFs intensify. This is strong evidence for an equatorward eroding and poleward moving open-closed boundary (OCB) associated with a variable magnetopause reconnection rate under variable IMF conditions. From the durations of the PMAFs we infer that the evolution time of FTEs is 5-11 minutes from its origin on magnetopause to its addition to the polar cap.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

It is known that germin, which is a marker of the onset of growth in germinating wheat, is an oxalate oxidase, and also that germins possess sequence similarity with legumin and vicilin seed storage proteins. These two pieces of information have been combined in order to generate a 3D model of germin based on the structure of vicilin and to examine the model with regard to a potential oxalate oxidase active site. A cluster of three histidine residues has been located within the conserved beta-barrel structure. While there is a relatively low level of overall sequence similarity between the model and the vicilin structures, the conservation of amino acids important in maintaining the scaffold of the beta-barrel lends confidence to the juxtaposition of the histidine residues. The cluster is similar structurally to those found in copper amine oxidase and other proteins, leading to the suggestion that it defines a metal-binding location within the oxalate oxidase active site. It is also proposed that the structural elements involved in intermolecular interactions in vicilins may play a role in oligomer formation in germin/oxalate oxidase.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Capturing the pattern of structural change is a relevant task in applied demand analysis, as consumer preferences may vary significantly over time. Filtering and smoothing techniques have recently played an increasingly relevant role. A dynamic Almost Ideal Demand System with random walk parameters is estimated in order to detect modifications in consumer habits and preferences, as well as changes in the behavioural response to prices and income. Systemwise estimation, consistent with the underlying constraints from economic theory, is achieved through the EM algorithm. The proposed model is applied to UK aggregate consumption of alcohol and tobacco, using quarterly data from 1963 to 2003. Increased alcohol consumption is explained by a preference shift, addictive behaviour and a lower price elasticity. The dynamic and time-varying specification is consistent with the theoretical requirements imposed at each sample point. (c) 2005 Elsevier B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system is targeting to identify all cases having a certain characteristic such as a specific disease (cancer, heart disease, ...), disease related condition (HIV, heroin use, ...) or a specific behavior (driving a car without license). Every case in such a registration system has a certain notification history in that it might have been identified several times (at least once) which can be understood as a particular capture-recapture situation. Typically, cases are left out which have never been listed at any occasion, and it is this frequency one wants to estimate. In this paper modelling is concentrating on the counting distribution, e.g. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models like the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered providing rather flexible modelling tools. Estimation is done using maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 is completing the contribution.