956 resultados para Cluster Counting Algorithm


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Conferência - 16th International Symposium on Wireless Personal Multimedia Communications (WPMC)- Jun 24-27, 2013

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Research on the problem of feature selection for clustering continues to develop. This is a challenging task, mainly due to the absence of class labels to guide the search for relevant features. Categorical feature selection for clustering has rarely been addressed in the literature, with most of the proposed approaches having focused on numerical data. In this work, we propose an approach to simultaneously cluster categorical data and select a subset of relevant features. Our approach is based on a modification of a finite mixture model (of multinomial distributions), where a set of latent variables indicate the relevance of each feature. To estimate the model parameters, we implement a variant of the expectation-maximization algorithm that simultaneously selects the subset of relevant features, using a minimum message length criterion. The proposed approach compares favourably with two baseline methods: a filter based on an entropy measure and a wrapper based on mutual information. The results obtained on synthetic data illustrate the ability of the proposed expectation-maximization method to recover ground truth. An application to real data, referred to official statistics, shows its usefulness.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Research on cluster analysis for categorical data continues to develop, new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters is done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length criterion (MML) to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the (Figueiredo and Jain, 2002) approach. The novelty of the approach rests on the integration of the model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of Bayesian Information Criterion (BIC) (Schwarz, 1978) and Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of cluster while outperforming BIC and ICL since it is faster, which is especially relevant when dealing with large data sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cluster analysis for categorical data has been an active area of research. A well-known problem in this area is the determination of the number of clusters, which is unknown and must be inferred from the data. In order to estimate the number of clusters, one often resorts to information criteria, such as BIC (Bayesian information criterion), MML (minimum message length, proposed by Wallace and Boulton, 1968), and ICL (integrated classification likelihood). In this work, we adopt the approach developed by Figueiredo and Jain (2002) for clustering continuous data. They use an MML criterion to select the number of clusters and a variant of the EM algorithm to estimate the model parameters. This EM variant seamlessly integrates model estimation and selection in a single algorithm. For clustering categorical data, we assume a finite mixture of multinomial distributions and implement a new EM algorithm, following a previous version (Silvestre et al., 2008). Results obtained with synthetic datasets are encouraging. The main advantage of the proposed approach, when compared to the above referred criteria, is the speed of execution, which is especially relevant when dealing with large data sets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In data clustering, the problem of selecting the subset of most relevant features from the data has been an active research topic. Feature selection for clustering is a challenging task due to the absence of class labels for guiding the search for relevant features. Most methods proposed for this goal are focused on numerical data. In this work, we propose an approach for clustering and selecting categorical features simultaneously. We assume that the data originate from a finite mixture of multinomial distributions and implement an integrated expectation-maximization (EM) algorithm that estimates all the parameters of the model and selects the subset of relevant features simultaneously. The results obtained on synthetic data illustrate the performance of the proposed approach. An application to real data, referred to official statistics, shows its usefulness.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Trabalho Final de Mestrado para obtenção do grau de Mestre em Engenharia de Electrónica e Telecomunicações

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objectivo do estudo: comparar o desempenho dos algoritmos Pencil Beam Convolution (PBC) e do Analytical Anisotropic Algorithm (AAA) no planeamento do tratamento de tumores de mama com radioterapia conformacional a 3D.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Firms located within a cluster have access to tacit, complex and specific local knowledge which allow them to develop competitive advantage. However, firms have no equal ability to access and to apply that knowledge, meaning that not all have a similar knowledge absorptive capacity. Using a sample of the largest Portuguese firms within a footwear cluster, this paper examine whether there are significant differences in firm’s absorptive capacity and whether such differences within a cluster are related to firms’ specific characteristics. The results suggest that absorptive capacity is significantly associated with the firms’ characteristics, namely size, export intensity and position within the cluster.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Known algorithms capable of scheduling implicit-deadline sporadic tasks over identical processors at up to 100% utilisation invariably involve numerous preemptions and migrations. To the challenge of devising a scheduling scheme with as few preemptions and migrations as possible, for a given guaranteed utilisation bound, we respond with the algorithm NPS-F. It is configurable with a parameter, trading off guaranteed schedulable utilisation (up to 100%) vs preemptions. For any possible configuration, NPS-F introduces fewer preemptions than any other known algorithm matching its utilisation bound. A clustered variant of the algorithm, for systems made of multicore chips, eliminates (costly) off-chip task migrations, by dividing processors into disjoint clusters, formed by cores on the same chip (with the cluster size being a parameter). Clusters are independently scheduled (each, using non-clustered NPS-F). The utilisation bound is only moderately affected. We also formulate an important extension (applicable to both clustered and non-clustered NPS-F) which optimises the supply of processing time to executing tasks and makes it more granular. This reduces processing capacity requirements for schedulability without increasing preemptions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Consider a single processor and a software system. The software system comprises components and interfaces where each component has an associated interface and each component comprises a set of constrained-deadline sporadic tasks. A scheduling algorithm (called global scheduler) determines at each instant which component is active. The active component uses another scheduling algorithm (called local scheduler) to determine which task is selected for execution on the processor. The interface of a component makes certain information about a component visible to other components; the interfaces of all components are used for schedulability analysis. We address the problem of generating an interface for a component based on the tasks inside the component. We desire to (i) incur only a small loss in schedulability analysis due to the interface and (ii) ensure that the amount of space (counted in bits) of the interface is small; this is because such an interface hides as much details of the component as possible. We present an algorithm for generating such an interface.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Mestrado em Controlo de Gestão e dos Negócios

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The IEEE 802.15.4 standard provides appealing features to simultaneously support real-time and non realtime traffic, but it is only capable of supporting real-time communications from at most seven devices. Additionally, it cannot guarantee delay bounds lower than the superframe duration. Motivated by this problem, in this paper we propose an Explicit Guaranteed time slot Sharing and Allocation scheme (EGSA) for beacon-enabled IEEE 802.15.4 networks. This scheme is capable of providing tighter delay bounds for real-time communications by splitting the Contention Free access Period (CFP) into smaller mini time slots and by means of a new guaranteed bandwidth allocation scheme for a set of devices with periodic messages. At the same the novel bandwidth allocation scheme can maximize the duration of the CFP for non real-time communications. Performance analysis results show that the EGSA scheme works efficiently and outperforms competitor schemes both in terms of guaranteed delay and bandwidth utilization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a 12*(1+|R|/(4m))-speed algorithm for scheduling constrained-deadline sporadic real-time tasks on a multiprocessor comprising m processors where a task may request one of |R| sequentially-reusable shared resources.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Modeling the fundamental performance limits of Wireless Sensor Networks (WSNs) is of paramount importance to understand their behavior under the worst-case conditions and to make the appropriate design choices. This is particular relevant for time-sensitive WSN applications, where the timing behavior of the network protocols (message transmission must respect deadlines) impacts on the correct operation of these applications. In that direction this paper contributes with a methodology based on Network Calculus, which enables quick and efficient worst-case dimensioning of static or even dynamically changing cluster-tree WSNs where the data sink can either be static or mobile. We propose closed-form recurrent expressions for computing the worst-case end-to-end delays, buffering and bandwidth requirements across any source-destination path in a cluster-tree WSN. We show how to apply our methodology to the case of IEEE 802.15.4/ZigBee cluster-tree WSNs. Finally, we demonstrate the validity and analyze the accuracy of our methodology through a comprehensive experimental study using commercially available technology, namely TelosB motes running TinyOS.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cluster scheduling and collision avoidance are crucial issues in large-scale cluster-tree Wireless Sensor Networks (WSNs). The paper presents a methodology that provides a Time Division Cluster Scheduling (TDCS) mechanism based on the cyclic extension of RCPS/TC (Resource Constrained Project Scheduling with Temporal Constraints) problem for a cluster-tree WSN, assuming bounded communication errors. The objective is to meet all end-to-end deadlines of a predefined set of time-bounded data flows while minimizing the energy consumption of the nodes by setting the TDCS period as long as possible. Sinceeach cluster is active only once during the period, the end-to-end delay of a given flow may span over several periods when there are the flows with opposite direction. The scheduling tool enables system designers to efficiently configure all required parameters of the IEEE 802.15.4/ZigBee beaconenabled cluster-tree WSNs in the network design time. The performance evaluation of thescheduling tool shows that the problems with dozens of nodes can be solved while using optimal solvers.