840 resultados para Evolutionary clustering
Resumo:
This paper addresses the m-machine no-wait flow shop problem where the set-up time of a job is separated from its processing time. The performance measure considered is the total flowtime. A new hybrid metaheuristic Genetic Algorithm-Cluster Search is proposed to solve the scheduling problem. The performance of the proposed method is evaluated and the results are compared with the best method reported in the literature. Experimental tests show superiority of the new method for the test problems set, regarding the solution quality. (c) 2012 Elsevier Ltd. All rights reserved.
Resumo:
In a paper the method of complex systems and processes clustering based use of genetic algorithm is offered. The aspects of its realization and shaping of fitness-function are considered. The solution of clustering task of Ukraine areas on socio-economic indexes is represented and comparative analysis with outcomes of classical methods is realized.
Resumo:
One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms under interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms under interest are provided. Examples with computational experiments and statistical analyses are also presented.
Resumo:
The Capacitated Centered Clustering Problem (CCCP) consists of defining a set of p groups with minimum dissimilarity on a network with n points. Demand values are associated with each point and each group has a demand capacity. The problem is well known to be NP-hard and has many practical applications. In this paper, the hybrid method Clustering Search (CS) is implemented to solve the CCCP. This method identifies promising regions of the search space by generating solutions with a metaheuristic, such as Genetic Algorithm, and clustering them into clusters that are then explored further with local search heuristics. Computational results considering instances available in the literature are presented to demonstrate the efficacy of CS. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
In this paper we deal with the problem of boosting the Optimum-Path Forest (OPF) clustering approach using evolutionary-based optimization techniques. As the OPF classifier performs an exhaustive search to find out the size of sample's neighborhood that allows it to reach the minimum graph cut as a quality measure, we compared several optimization techniques that can obtain close graph cut values to the ones obtained by brute force. Experiments in two public datasets in the context of unsupervised network intrusion detection have showed the evolutionary optimization techniques can find suitable values for the neighborhood faster than the exhaustive search. Additionally, we have showed that it is not necessary to employ many agents for such task, since the neighborhood size is defined by discrete values, with constrain the set of possible solution to a few ones.
Resumo:
Creative industries tend to concentrate mainly around large- and medium-sized cities, forming creative local production systems. The text analyses the forces behind clustering of creative industries to provide the first empirical explanation of the determinants of creative employment clustering following a multidisciplinary approach based on cultural and creative economics, evolutionary geography and urban economics. A comparative analysis has been performed for Italy and Spain. The results show different patterns of creative employment clustering in both countries. The small role of historical and cultural endowments, the size of the place, the average size of creative industries, the productive diversity and the concentration of human capital and creative class have been found as common factors of clustering in both countries.
Resumo:
Genetic diversity of contemporary domesticated species is shaped by both natural and human-driven processes. However, until now, little is known about how domestication has imprinted the variation of fruit tree species. In this study, we reconstruct the recent evolutionary history of the domesticated almond tree, Prunus dulcis, around the Mediterranean basin, using a combination of nuclear and chloroplast microsatellites [i.e. simple sequence repeat (SSRs)] to investigate patterns of genetic diversity. Whereas conservative chloroplast SSRs show a widespread haplotype and rare locally distributed variants, nuclear SSRs show a pattern of isolation by distance with clines of diversity from the East to the West of the Mediterranean basin, while Bayesian genetic clustering reveals a substantial longitudinal genetic structure. Both kinds of markers thus support a single domestication event, in the eastern side of the Mediterranean basin. In addition, model-based estimation of the timing of genetic divergence among those clusters is estimated sometime during the Holocene, a result that is compatible with human-mediated dispersal of almond tree out of its centre of origin. Still, the detection of region-specific alleles suggests that gene flow from relictual wild preglacial populations (in North Africa) or from wild counterparts (in the Near East) could account for a fraction of the diversity observed.
Resumo:
Abstract The object of game theory lies in the analysis of situations where different social actors have conflicting requirements and where their individual decisions will all influence the global outcome. In this framework, several games have been invented to capture the essence of various dilemmas encountered in many common important socio-economic situations. Even though these games often succeed in helping us understand human or animal behavior in interactive settings, some experiments have shown that people tend to cooperate with each other in situations for which classical game theory strongly recommends them to do the exact opposite. Several mechanisms have been invoked to try to explain the emergence of this unexpected cooperative attitude. Among them, repeated interaction, reputation, and belonging to a recognizable group have often been mentioned. However, the work of Nowak and May (1992) showed that the simple fact of arranging the players according to a spatial structure and only allowing them to interact with their immediate neighbors is sufficient to sustain a certain amount of cooperation even when the game is played anonymously and without repetition. Nowak and May's study and much of the following work was based on regular structures such as two-dimensional grids. Axelrod et al. (2002) showed that by randomizing the choice of neighbors, i.e. by actually giving up a strictly local geographical structure, cooperation can still emerge, provided that the interaction patterns remain stable in time. This is a first step towards a social network structure. However, following pioneering work by sociologists in the sixties such as that of Milgram (1967), in the last few years it has become apparent that many social and biological interaction networks, and even some technological networks, have particular, and partly unexpected, properties that set them apart from regular or random graphs. Among other things, they usually display broad degree distributions, and show small-world topological structure. Roughly speaking, a small-world graph is a network where any individual is relatively close, in terms of social ties, to any other individual, a property also found in random graphs but not in regular lattices. However, in contrast with random graphs, small-world networks also have a certain amount of local structure, as measured, for instance, by a quantity called the clustering coefficient. In the same vein, many real conflicting situations in economy and sociology are not well described neither by a fixed geographical position of the individuals in a regular lattice, nor by a random graph. Furthermore, it is a known fact that network structure can highly influence dynamical phenomena such as the way diseases spread across a population and ideas or information get transmitted. Therefore, in the last decade, research attention has naturally shifted from random and regular graphs towards better models of social interaction structures. The primary goal of this work is to discover whether or not the underlying graph structure of real social networks could give explanations as to why one finds higher levels of cooperation in populations of human beings or animals than what is prescribed by classical game theory. To meet this objective, I start by thoroughly studying a real scientific coauthorship network and showing how it differs from biological or technological networks using divers statistical measurements. Furthermore, I extract and describe its community structure taking into account the intensity of a collaboration. Finally, I investigate the temporal evolution of the network, from its inception to its state at the time of the study in 2006, suggesting also an effective view of it as opposed to a historical one. Thereafter, I combine evolutionary game theory with several network models along with the studied coauthorship network in order to highlight which specific network properties foster cooperation and shed some light on the various mechanisms responsible for the maintenance of this same cooperation. I point out the fact that, to resist defection, cooperators take advantage, whenever possible, of the degree-heterogeneity of social networks and their underlying community structure. Finally, I show that cooperation level and stability depend not only on the game played, but also on the evolutionary dynamic rules used and the individual payoff calculations. Synopsis Le but de la théorie des jeux réside dans l'analyse de situations dans lesquelles différents acteurs sociaux, avec des objectifs souvent conflictuels, doivent individuellement prendre des décisions qui influenceront toutes le résultat global. Dans ce cadre, plusieurs jeux ont été inventés afin de saisir l'essence de divers dilemmes rencontrés dans d'importantes situations socio-économiques. Bien que ces jeux nous permettent souvent de comprendre le comportement d'êtres humains ou d'animaux en interactions, des expériences ont montré que les individus ont parfois tendance à coopérer dans des situations pour lesquelles la théorie classique des jeux prescrit de faire le contraire. Plusieurs mécanismes ont été invoqués pour tenter d'expliquer l'émergence de ce comportement coopératif inattendu. Parmi ceux-ci, la répétition des interactions, la réputation ou encore l'appartenance à des groupes reconnaissables ont souvent été mentionnés. Toutefois, les travaux de Nowak et May (1992) ont montré que le simple fait de disposer les joueurs selon une structure spatiale en leur permettant d'interagir uniquement avec leurs voisins directs est suffisant pour maintenir un certain niveau de coopération même si le jeu est joué de manière anonyme et sans répétitions. L'étude de Nowak et May, ainsi qu'un nombre substantiel de travaux qui ont suivi, étaient basés sur des structures régulières telles que des grilles à deux dimensions. Axelrod et al. (2002) ont montré qu'en randomisant le choix des voisins, i.e. en abandonnant une localisation géographique stricte, la coopération peut malgré tout émerger, pour autant que les schémas d'interactions restent stables au cours du temps. Ceci est un premier pas en direction d'une structure de réseau social. Toutefois, suite aux travaux précurseurs de sociologues des années soixante, tels que ceux de Milgram (1967), il est devenu clair ces dernières années qu'une grande partie des réseaux d'interactions sociaux et biologiques, et même quelques réseaux technologiques, possèdent des propriétés particulières, et partiellement inattendues, qui les distinguent de graphes réguliers ou aléatoires. Entre autres, ils affichent en général une distribution du degré relativement large ainsi qu'une structure de "petit-monde". Grossièrement parlant, un graphe "petit-monde" est un réseau où tout individu se trouve relativement près de tout autre individu en termes de distance sociale, une propriété également présente dans les graphes aléatoires mais absente des grilles régulières. Par contre, les réseaux "petit-monde" ont, contrairement aux graphes aléatoires, une certaine structure de localité, mesurée par exemple par une quantité appelée le "coefficient de clustering". Dans le même esprit, plusieurs situations réelles de conflit en économie et sociologie ne sont pas bien décrites ni par des positions géographiquement fixes des individus en grilles régulières, ni par des graphes aléatoires. De plus, il est bien connu que la structure même d'un réseau peut passablement influencer des phénomènes dynamiques tels que la manière qu'a une maladie de se répandre à travers une population, ou encore la façon dont des idées ou une information s'y propagent. Ainsi, durant cette dernière décennie, l'attention de la recherche s'est tout naturellement déplacée des graphes aléatoires et réguliers vers de meilleurs modèles de structure d'interactions sociales. L'objectif principal de ce travail est de découvrir si la structure sous-jacente de graphe de vrais réseaux sociaux peut fournir des explications quant aux raisons pour lesquelles on trouve, chez certains groupes d'êtres humains ou d'animaux, des niveaux de coopération supérieurs à ce qui est prescrit par la théorie classique des jeux. Dans l'optique d'atteindre ce but, je commence par étudier un véritable réseau de collaborations scientifiques et, en utilisant diverses mesures statistiques, je mets en évidence la manière dont il diffère de réseaux biologiques ou technologiques. De plus, j'extrais et je décris sa structure de communautés en tenant compte de l'intensité d'une collaboration. Finalement, j'examine l'évolution temporelle du réseau depuis son origine jusqu'à son état en 2006, date à laquelle l'étude a été effectuée, en suggérant également une vue effective du réseau par opposition à une vue historique. Par la suite, je combine la théorie évolutionnaire des jeux avec des réseaux comprenant plusieurs modèles et le réseau de collaboration susmentionné, afin de déterminer les propriétés structurelles utiles à la promotion de la coopération et les mécanismes responsables du maintien de celle-ci. Je mets en évidence le fait que, pour ne pas succomber à la défection, les coopérateurs exploitent dans la mesure du possible l'hétérogénéité des réseaux sociaux en termes de degré ainsi que la structure de communautés sous-jacente de ces mêmes réseaux. Finalement, je montre que le niveau de coopération et sa stabilité dépendent non seulement du jeu joué, mais aussi des règles de la dynamique évolutionnaire utilisées et du calcul du bénéfice d'un individu.
Resumo:
BACKGROUND: Genes involved in arbuscular mycorrhizal (AM) symbiosis have been identified primarily by mutant screens, followed by identification of the mutated genes (forward genetics). In addition, a number of AM-related genes has been identified by their AM-related expression patterns, and their function has subsequently been elucidated by knock-down or knock-out approaches (reverse genetics). However, genes that are members of functionally redundant gene families, or genes that have a vital function and therefore result in lethal mutant phenotypes, are difficult to identify. If such genes are constitutively expressed and therefore escape differential expression analyses, they remain elusive. The goal of this study was to systematically search for AM-related genes with a bioinformatics strategy that is insensitive to these problems. The central element of our approach is based on the fact that many AM-related genes are conserved only among AM-competent species. RESULTS: Our approach involves genome-wide comparisons at the proteome level of AM-competent host species with non-mycorrhizal species. Using a clustering method we first established orthologous/paralogous relationships and subsequently identified protein clusters that contain members only of the AM-competent species. Proteins of these clusters were then analyzed in an extended set of 16 plant species and ranked based on their relatedness among AM-competent monocot and dicot species, relative to non-mycorrhizal species. In addition, we combined the information on the protein-coding sequence with gene expression data and with promoter analysis. As a result we present a list of yet uncharacterized proteins that show a strongly AM-related pattern of sequence conservation, indicating that the respective genes may have been under selection for a function in AM. Among the top candidates are three genes that encode a small family of similar receptor-like kinases that are related to the S-locus receptor kinases involved in sporophytic self-incompatibility. CONCLUSIONS: We present a new systematic strategy of gene discovery based on conservation of the protein-coding sequence that complements classical forward and reverse genetics. This strategy can be applied to diverse other biological phenomena if species with established genome sequences fall into distinguished groups that differ in a defined functional trait of interest.
Resumo:
Naïvement perçu, le processus d’évolution est une succession d’événements de duplication et de mutations graduelles dans le génome qui mènent à des changements dans les fonctions et les interactions du protéome. La famille des hydrolases de guanosine triphosphate (GTPases) similaire à Ras constitue un bon modèle de travail afin de comprendre ce phénomène fondamental, car cette famille de protéines contient un nombre limité d’éléments qui diffèrent en fonctionnalité et en interactions. Globalement, nous désirons comprendre comment les mutations singulières au niveau des GTPases affectent la morphologie des cellules ainsi que leur degré d’impact sur les populations asynchrones. Mon travail de maîtrise vise à classifier de manière significative différents phénotypes de la levure Saccaromyces cerevisiae via l’analyse de plusieurs critères morphologiques de souches exprimant des GTPases mutées et natives. Notre approche à base de microscopie et d’analyses bioinformatique des images DIC (microscopie d’interférence différentielle de contraste) permet de distinguer les phénotypes propres aux cellules natives et aux mutants. L’emploi de cette méthode a permis une détection automatisée et une caractérisation des phénotypes mutants associés à la sur-expression de GTPases constitutivement actives. Les mutants de GTPases constitutivement actifs Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L et Rsr1 G12V ont été analysés avec succès. En effet, l’implémentation de différents algorithmes de partitionnement, permet d’analyser des données qui combinent les mesures morphologiques de population native et mutantes. Nos résultats démontrent que l’algorithme Fuzzy C-Means performe un partitionnement efficace des cellules natives ou mutantes, où les différents types de cellules sont classifiés en fonction de plusieurs facteurs de formes cellulaires obtenus à partir des images DIC. Cette analyse démontre que les mutations Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L et Rsr1 G12V induisent respectivement des phénotypes amorphe, allongé, rond et large qui sont représentés par des vecteurs de facteurs de forme distincts. Ces distinctions sont observées avec différentes proportions (morphologie mutante / morphologie native) dans les populations de mutants. Le développement de nouvelles méthodes automatisées d’analyse morphologique des cellules natives et mutantes s’avère extrêmement utile pour l’étude de la famille des GTPases ainsi que des résidus spécifiques qui dictent leurs fonctions et réseau d’interaction. Nous pouvons maintenant envisager de produire des mutants de GTPases qui inversent leur fonction en ciblant des résidus divergents. La substitution fonctionnelle est ensuite détectée au niveau morphologique grâce à notre nouvelle stratégie quantitative. Ce type d’analyse peut également être transposé à d’autres familles de protéines et contribuer de manière significative au domaine de la biologie évolutive.
Resumo:
In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well known DNA alignement algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaprotebacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.
Resumo:
The evolutionary history of gains and losses of vegetative reproductive propagules (soredia) in Porpidia s.l., a group of lichen-forming ascomycetes, was clarified using Bayesian Markov chain Monte Carlo (MCMC) approaches to monophyly tests and a combined MCMC and maximum likelihood approach to ancestral character state reconstructions. The MCMC framework provided confidence estimates for the reconstructions of relationships and ancestral character states, which formed the basis for tests of evolutionary hypotheses. Monophyly tests rejected all hypotheses that predicted any clustering of reproductive modes in extant taxa. In addition, a nearest-neighbor statistic could not reject the hypothesis that the vegetative reproductive mode is randomly distributed throughout the group. These results show that transitions between presence and absence of the vegetative reproductive mode within Porpidia s.l. occurred several times and independently of each other. Likelihood reconstructions of ancestral character states at selected nodes suggest that - contrary to previous thought - the ancestor to Porpidia s.l. already possessed the vegetative reproductive mode. Furthermore, transition rates are reconstructed asymmetrically with the vegetative reproductive mode being gained at a much lower rate than it is lost. A cautious note has to be added, because a simulation study showed that the ancestral character state reconstructions were highly dependent on taxon sampling. However, our central conclusions, particularly the higher rate of change from vegetative reproductive mode present to absent than vice versa within Porpidia s.l., were found to be broadly independent of taxon sampling. [Ancestral character state reconstructions; Ascomycota, Bayesian inference; hypothesis testing; likelihood; MCMC; Porpidia; reproductive systems]
Resumo:
We present some additions to a fuzzy variable radius niche technique called Dynamic Niche Clustering (DNC) (Gan and Warwick, 1999; 2000; 2001) that enable the identification and creation of niches of arbitrary shape through a mechanism called Niche Linkage. We show that by using this mechanism it is possible to attain better feature extraction from the underlying population.