8 resultados para Hybrid clustering algorithm
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Resumo:
We propose simple heuristics for the assembly line worker assignment and balancing problem. This problem typically occurs in assembly lines in sheltered work centers for the disabled. Different from the well-known simple assembly line balancing problem, the task execution times vary according to the assigned worker. We develop a constructive heuristic framework based on task and worker priority rules defining the order in which the tasks and workers should be assigned to the workstations. We present a number of such rules and compare their performance across three possible uses: as a stand-alone method, as an initial solution generator for meta-heuristics, and as a decoder for a hybrid genetic algorithm. Our results show that the heuristics are fast, they obtain good results as a stand-alone method and are efficient when used as a initial solution generator or as a solution decoder within more elaborate approaches.
Resumo:
The ubiquity of time series data across almost all human endeavors has produced a great interest in time series data mining in the last decade. While dozens of classification algorithms have been applied to time series, recent empirical evidence strongly suggests that simple nearest neighbor classification is exceptionally difficult to beat. The choice of distance measure used by the nearest neighbor algorithm is important, and depends on the invariances required by the domain. For example, motion capture data typically requires invariance to warping, and cardiology data requires invariance to the baseline (the mean value). Similarly, recent work suggests that for time series clustering, the choice of clustering algorithm is much less important than the choice of distance measure used.In this work we make a somewhat surprising claim. There is an invariance that the community seems to have missed, complexity invariance. Intuitively, the problem is that in many domains the different classes may have different complexities, and pairs of complex objects, even those which subjectively may seem very similar to the human eye, tend to be further apart under current distance measures than pairs of simple objects. This fact introduces errors in nearest neighbor classification, where some complex objects may be incorrectly assigned to a simpler class. Similarly, for clustering this effect can introduce errors by “suggesting” to the clustering algorithm that subjectively similar, but complex objects belong in a sparser and larger diameter cluster than is truly warranted.We introduce the first complexity-invariant distance measure for time series, and show that it generally produces significant improvements in classification and clustering accuracy. We further show that this improvement does not compromise efficiency, since we can lower bound the measure and use a modification of triangular inequality, thus making use of most existing indexing and data mining algorithms. We evaluate our ideas with the largest and most comprehensive set of time series mining experiments ever attempted in a single work, and show that complexity-invariant distance measures can produce improvements in classification and clustering in the vast majority of cases.
Resumo:
This paper addresses the m-machine no-wait flow shop problem where the set-up time of a job is separated from its processing time. The performance measure considered is the total flowtime. A new hybrid metaheuristic Genetic Algorithm-Cluster Search is proposed to solve the scheduling problem. The performance of the proposed method is evaluated and the results are compared with the best method reported in the literature. Experimental tests show superiority of the new method for the test problems set, regarding the solution quality. (c) 2012 Elsevier Ltd. All rights reserved.
Resumo:
There are some variants of the widely used Fuzzy C-Means (FCM) algorithm that support clustering data distributed across different sites. Those methods have been studied under different names, like collaborative and parallel fuzzy clustering. In this study, we offer some augmentation of the two FCM-based clustering algorithms used to cluster distributed data by arriving at some constructive ways of determining essential parameters of the algorithms (including the number of clusters) and forming a set of systematically structured guidelines such as a selection of the specific algorithm depending on the nature of the data environment and the assumptions being made about the number of clusters. A thorough complexity analysis, including space, time, and communication aspects, is reported. A series of detailed numeric experiments is used to illustrate the main ideas discussed in the study.
Resumo:
The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspects weighting was proposed in the literature. However, SCAD may fail and halt given certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters required to be set by the user. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost-function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.
Resumo:
A power transformer needs continuous monitoring and fast protection as it is a very expensive piece of equipment and an essential element in an electrical power system. The most common protection technique used is the percentage differential logic, which provides discrimination between an internal fault and different operating conditions. Unfortunately, there are some operating conditions of power transformers that can mislead the conventional protection affecting the power system stability negatively. This study proposes the development of a new algorithm to improve the protection performance by using fuzzy logic, artificial neural networks and genetic algorithms. An electrical power system was modelled using Alternative Transients Program software to obtain the operational conditions and fault situations needed to test the algorithm developed, as well as a commercial differential relay. Results show improved reliability, as well as a fast response of the proposed technique when compared with conventional ones.
Resumo:
This paper addressed the problem of water-demand forecasting for real-time operation of water supply systems. The present study was conducted to identify the best fit model using hourly consumption data from the water supply system of Araraquara, Sa approximate to o Paulo, Brazil. Artificial neural networks (ANNs) were used in view of their enhanced capability to match or even improve on the regression model forecasts. The ANNs used were the multilayer perceptron with the back-propagation algorithm (MLP-BP), the dynamic neural network (DAN2), and two hybrid ANNs. The hybrid models used the error produced by the Fourier series forecasting as input to the MLP-BP and DAN2, called ANN-H and DAN2-H, respectively. The tested inputs for the neural network were selected literature and correlation analysis. The results from the hybrid models were promising, DAN2 performing better than the tested MLP-BP models. DAN2-H, identified as the best model, produced a mean absolute error (MAE) of 3.3 L/s and 2.8 L/s for training and test set, respectively, for the prediction of the next hour, which represented about 12% of the average consumption. The best forecasting model for the next 24 hours was again DAN2-H, which outperformed other compared models, and produced a MAE of 3.1 L/s and 3.0 L/s for training and test set respectively, which represented about 12% of average consumption. DOI: 10.1061/(ASCE)WR.1943-5452.0000177. (C) 2012 American Society of Civil Engineers.
Resumo:
This paper discusses the power allocation with fixed rate constraint problem in multi-carrier code division multiple access (MC-CDMA) networks, that has been solved through game theoretic perspective by the use of an iterative water-filling algorithm (IWFA). The problem is analyzed under various interference density configurations, and its reliability is studied in terms of solution existence and uniqueness. Moreover, numerical results reveal the approach shortcoming, thus a new method combining swarm intelligence and IWFA is proposed to make practicable the use of game theoretic approaches in realistic MC-CDMA systems scenarios. The contribution of this paper is twofold: (i) provide a complete analysis for the existence and uniqueness of the game solution, from simple to more realist and complex interference scenarios; (ii) propose a hybrid power allocation optimization method combining swarm intelligence, game theory and IWFA. To corroborate the effectiveness of the proposed method, an outage probability analysis in realistic interference scenarios, and a complexity comparison with the classical IWFA are presented. (C) 2011 Elsevier B.V. All rights reserved.