849 results for constrained clustering
Abstract:
In data mining, efforts have focused on finding methods for efficient and effective cluster analysis in large databases. Active research themes include the scalability of clustering methods, the effectiveness of methods for clustering complex shapes and types of data, high-dimensional clustering techniques, and methods for clustering mixed numerical and categorical data in large databases. One of the most accurate approaches, based on dynamic modeling of cluster similarity, is Chameleon. In this paper we present a modified hierarchical clustering algorithm that uses the main idea of Chameleon; the effectiveness of the suggested approach is demonstrated by experimental results.
Abstract:
The purpose of this paper is to explain the notion of clustering and a concrete clustering method, the agglomerative hierarchical clustering algorithm. It shows how a data mining method such as clustering can be applied to the analysis of stocks traded on the Bulgarian Stock Exchange in order to identify stocks with similar temporal behavior. This problem is solved with the aid of XLMiner™, a data mining tool for Microsoft Excel.
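As an illustration of the agglomerative procedure this abstract names, a minimal sketch follows. It assumes single linkage and Euclidean distance, neither of which the abstract specifies; the function name `agglomerative` and the return figures are hypothetical and are not taken from the paper or from XLMiner.

```python
from itertools import combinations
from math import dist


def agglomerative(points, n_clusters):
    """Merge the two closest clusters (single linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start: one cluster per point

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]  # merge b into a (a < b)
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical weekly returns (%) for four made-up stocks, not real data.
returns = [
    (1.0, 2.0, -1.0),
    (1.1, 1.9, -0.9),
    (-2.0, 0.5, 3.0),
    (-2.1, 0.4, 3.2),
]
groups = agglomerative(returns, 2)
```

Here `groups` holds the indices of the stocks in each cluster; stocks whose return series move together end up in the same group. A different linkage rule (complete, average) would only change the `linkage` helper.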
Abstract:
This paper proposes a constrained nonparametric method of estimating an input distance function. A regression function is estimated via kernel methods without functional form assumptions. To guarantee that the estimated input distance function satisfies its properties, monotonicity constraints are imposed on the regression surface via the constraint weighted bootstrapping method borrowed from the statistics literature. The first, second, and cross partial analytical derivatives of the estimated input distance function are derived, and thus the elasticities measuring input substitutability can be computed from them. The method is then applied to a cross-section of 3,249 Norwegian timber producers.
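To make the kernel regression step concrete, here is a minimal sketch of an unconstrained local-constant (Nadaraya-Watson) estimator with a Gaussian kernel. The choice of estimator, kernel, and bandwidth are assumptions, since the abstract says only "kernel methods"; the monotonicity constraints and the constraint weighted bootstrapping step are not shown, and the data are made up.

```python
from math import exp


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Local-constant kernel estimate of E[y | x = x0] with a Gaussian kernel."""
    # Each observation is weighted by how close its x is to the query point x0.
    weights = [exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Hypothetical toy data, not taken from the paper.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]
fit = [nadaraya_watson(x, xs, ys, bandwidth=1.0) for x in xs]
```

Each fitted value is a weighted average of the observed responses, so no functional form is imposed on the regression surface.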
Abstract:
The re-entrant flow shop scheduling problem (RFSP) is regarded as an NP-hard problem and has attracted the attention of both researchers and industry. Current approaches attempt to minimize the makespan of the RFSP without considering the interdependency between the resource constraints and the re-entrant probability. This paper proposes a multi-level genetic algorithm (GA) that includes the correlated re-entrant possibility and production mode in a multi-level chromosome encoding. A repair operator is incorporated into the multi-level GA to revise infeasible solutions by resolving resource conflicts. The objective is to minimize the makespan, and ANOVA is used to fine-tune the parameter settings of the GA. The experiments show that the proposed approach is more effective at finding near-optimal schedules than the simulated annealing algorithm, for both small and large problems. © 2013 Published by Elsevier Ltd.
Abstract:
A new distance function for comparing arbitrary partitions is proposed. Clustering of image collections and image segmentation provide the objects to be matched. The proposed metric is intended to combine visual features with metadata analysis in order to bridge the semantic gap between low-level visual features and high-level human concepts.
Abstract:
This paper proposes a method for clustering complex systems and processes based on a genetic algorithm. Aspects of its implementation and the construction of the fitness function are considered. A solution to the task of clustering the regions of Ukraine by socio-economic indicators is presented, and a comparative analysis with the results of classical methods is carried out.
Abstract:
In recent years, there has been an increasing interest in learning a distributed representation of word sense. Traditional context clustering based models usually require careful tuning of model parameters, and typically perform worse on infrequent word senses. This paper presents a novel approach which addresses these limitations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural network. The initialized word sense embeddings are used by a context clustering based model to generate the distributed representations of word senses. Our learned representations outperform the publicly available embeddings on 2 out of 4 metrics in the word similarity task, and 6 out of 13 subtasks in the analogical reasoning task.
Abstract:
2000 Mathematics Subject Classification: 90C48, 49N15, 90C25
Abstract:
Ivan Ginchev - The class of functions that are ℓ-stable at a point, defined in [2] and extending the class of C^{1,1} functions, is generalized from scalar to vector functions. Some properties of ℓ-stable vector functions are proved. It is shown that constrained vector optimization problems admit second-order conditions expressed in terms of directional derivatives, which generalizes results from [2] and [5].
Abstract:
2000 Mathematics Subject Classification: 62H30
Abstract:
In recent years, there has been an increasing interest in learning a distributed representation of word sense. Traditional context clustering based models usually require careful tuning of model parameters, and typically perform worse on infrequent word senses. This paper presents a novel approach which addresses these limitations by first initializing the word sense embeddings through learning sentence-level embeddings from WordNet glosses using a convolutional neural network. The initialized word sense embeddings are used by a context clustering based model to generate the distributed representations of word senses. Our learned representations outperform the publicly available embeddings on half of the metrics in the word similarity task and on 6 out of 13 subtasks in the analogical reasoning task, and give the best overall accuracy in the word sense effect classification task, which shows the effectiveness of our proposed distributed distribution learning model.
Abstract:
Using data from the 2004 wave of the Afrobarometer survey, this study examines correlates of household hardship in three countries of sub-Saharan Africa: Tanzania, Zambia, and Zimbabwe. Findings provide partial support for the hypothesized relationship. Specifically, poverty reduction initiatives and informal assistance are associated with reduced hardship while civic engagement is related to an increase in household hardship. We also note that certain demographic characteristics are linked to hardship. Policy and practice implications are suggested. © The Author(s) 2011.
Abstract:
In this paper, we focus on the design of bivariate EDAs for discrete optimization problems and propose a new approach named HSMIEC. Because current EDAs spend much time in the statistical learning process when the relationships among the variables are too complicated, we employ the Selfish Gene theory (SG) in this approach, together with a Mutual Information and Entropy based Cluster (MIEC) model that optimizes the probability distribution of the virtual population. The model uses a hybrid sampling method that considers both clustering accuracy and clustering diversity, and an incremental learning and resampling scheme optimizes the parameters of the correlations among the variables. Experimental results on several benchmark problems demonstrate that HSMIEC often performs better than other EDAs such as BMDA, COMIT, MIMIC and ECGA. © 2009 Elsevier B.V. All rights reserved.
Abstract:
We propose weakly-constrained stream and block codes with tunable pattern-dependent statistics and demonstrate that the block code capacity at large block sizes is close to the prediction obtained from a simple Markov model published earlier. We demonstrate the feasibility of the code by presenting original encoding and decoding algorithms with a complexity log-linear in the block size and with modest table memory requirements. We also show that when such codes are used for mitigation of patterning effects in optical fibre communications, a gain of about 0.5 dB is possible under realistic conditions, at the expense of a small redundancy (10%). © 2006 IEEE.
Abstract:
Driven by cross-strait political and economic forces, Taiwan has become one of the major source markets for the Hong Kong tourism industry since 1987. The major purposes of this study were to investigate (1) the factors influencing travel motivation, (2) the clusters of travel motivations, and (3) the market segmentation of clusters of Taiwanese tourists visiting Hong Kong. Through ten travel agents, self-report surveys were distributed to collect data from 366 Taiwanese travelers. Four push factors and six pull factors were identified as travel motivations through factor analysis. Combined with cluster analysis, five new groups were found. Finally, five clusters possessing unique profiles (location difference, visiting frequency, travel satisfaction, and destination loyalty) were addressed. Suggestions for developing effective marketing strategies to attract Taiwanese tourists to Hong Kong are also provided.