878 resultados para Genetic Algorithm optimization
Resumo:
Graph pebbling is a network model for studying whether or not a given supply of discrete pebbles can satisfy a given demand via pebbling moves. A pebbling move across an edge of a graph takes two pebbles from one endpoint and places one pebble at the other endpoint; the other pebble is lost in transit as a toll. It has been shown that deciding whether a supply can meet a demand on a graph is NP-complete. The pebbling number of a graph is the smallest t such that every supply of t pebbles can satisfy every demand of one pebble. Deciding if the pebbling number is at most k is NP 2 -complete. In this paper we develop a tool, called theWeight Function Lemma, for computing upper bounds and sometimes exact values for pebbling numbers with the assistance of linear optimization. With this tool we are able to calculate the pebbling numbers of much larger graphs than in previous algorithms, and much more quickly as well. We also obtain results for many families of graphs, in many cases by hand, with much simpler and remarkably shorter proofs than given in previously existing arguments (certificates typically of size at most the number of vertices times the maximum degree), especially for highly symmetric graphs. Here we apply theWeight Function Lemma to several specific graphs, including the Petersen, Lemke, 4th weak Bruhat, Lemke squared, and two random graphs, as well as to a number of infinite families of graphs, such as trees, cycles, graph powers of cycles, cubes, and some generalized Petersen and Coxeter graphs. This partly answers a question of Pachter, et al., by computing the pebbling exponent of cycles to within an asymptotically small range. It is conceivable that this method yields an approximation algorithm for graph pebbling.
Resumo:
Le rétinoblastome (Rb) est une tumeur provenant des cellules rétiniennes progénitrices des photorécepteurs. C'est la tumeur pédiatrique maligne la plus fréquente avec une incidence par naissance évaluée entre 1/15Ό00 et 1/20Ό00. Les enfants atteints de Rb sont diagnostiqué dans leur grande majorité avant l'âge de 4 ans, soit le temps nécessaire à la différentiation et à la maturation des photorécepteurs et donc à la disparition de la cellule d'origine du Rb. La survie du patient, la sauvegarde oculaire et le pronostic visuel restent excellents pour autant que le traitement ne soit pas différé. Dans sa variante non héréditaire (60%) le Rb est toujours unilatéral et sporadique. Le Rb héréditaire de transmission dominante autosomique (40%), se décline sous toutes les formes, familiale (10%) ou sporadique (30%), que l'atteinte soit unilatérale ou bilatérale. La majorité des mutations causales sont uniques et distribuées de façon aléatoire sur la totalité du gène RB1 sans région prédisposante. La détection de ces mutations est couteuse et chronophage, tout en présentant un taux de détection relativement bas; surtout dans les cas de Rb sporadiques unilatéraux. Dans le but d'identifier les patients présentant un risque réel de développer un Rb, et de réduire le nombre d'examens sous narcose requis pour le dépistage de la maladie chez les sujets à risque, nous avons développé une stratégie sensible, rapide, efficace et peu couteuse basée sur une analyse de l'haplotype intragénique. Cet algorithme prend en compte a) la perte d'hétérozygotie intratumorale du gène RB1, b) l'origine paternelle préférentielle des nouvelles mutations germinales et c) un risque a priori dérivé des données empiriques de Vogel. Pendant la période allant de janvier 1994 à décembre 2006, nous avons comparé l'apparition de nouveau Rb parmi la fratrie et la descendance de patient atteints au nombre de nouveaux cas attendus calculé par notre algorithme. 134 familles ont été étudiées. L'analyse moléculaire a été effectuée chez 570 personnes dont 99 patients âgés de moins de 4 ans et donc à risque de développer un Rb. Parmi cette cohorte, nous avons observé l'apparition d'un cas de Rb, alors que les risques cumulés a posteriori calculé par notre algorithme prédisait l'apparition de 1.77 nouveau cas. Dans cette étude, nous avons pu valider notre algorithme prédisant la récurrence de Rb chez les parents de 1er degré de patients atteints. Cet outil devrait grandement faciliter le conseil génétique ainsi que le suivi des patients à risque de développer un Rb, surtout dans les cas ou le séquençage direct du gène RB1 n'est pas disponible ou est resté non informatif. - Purpose: Most RBI mutations are unique and distributed throughout the RBI gene. Their detection can be time-consuming and the yield especially low in cases of conservatively-treated sporadic unilateral retinoblas-toma (Rb) patients. In order to identify patients with true risk of developing Rb, and to reduce the number of unnecessary examinations under anesthesia in all other cases, we developed a universal sensitive, efficient and cost-effective strategy based on intragenic haplotype analysis. Methods: This algorithm allows the calculation of the a posteriori risk of developing Rb and takes into account (a) RBI loss of heterozygosity in tumors, (b) preferential paternal origin of new germline mutations, (c) a priori risk derived from empirical data by Vogel, and (d) disease penetrance of 90% in most cases. We report the occurrence of Rb in first degree relatives of patients with sporadic Rb who visited the Jules Gonin Eye Hospital, Lausanne, Switzerland, from January 1994 to December 2006 compared to expected new cases of Rb using our algorithm. Results: A total of 134 families with sporadic Rb were enrolled; testing was performed in 570 individuals and 99 patients younger than 4 years old were identified. We observed one new case of Rb. Using our algorithm, the cumulated total a posteriori risk of recurrence was 1.77. Conclusions: This is the first time that linkage analysis has been validated to monitor the risk of recurrence in sporadic Rb. This should be a useful tool in genetic counseling, especially when direct RBI screening for mutations leaves a negative result or is unavailable.
Resumo:
Floor cleaning is a typical robot application. There are several mobile robots aviable in the market for domestic applications most of them with random path-planning algorithms. In this paper we study the cleaning coverage performances of a random path-planning mobile robot and propose an optimized control algorithm, some methods to estimate the are of the room, the evolution of the cleaning and the time needed for complete coverage.
Resumo:
PURPOSE: Most RB1 mutations are unique and distributed throughout the RB1 gene. Their detection can be time-consuming and the yield especially low in cases of conservatively-treated sporadic unilateral retinoblastoma (Rb) patients. In order to identify patients with true risk of developing Rb, and to reduce the number of unnecessary examinations under anesthesia in all other cases, we developed a universal sensitive, efficient and cost-effective strategy based on intragenic haplotype analysis. METHODS: This algorithm allows the calculation of the a posteriori risk of developing Rb and takes into account (a) RB1 loss of heterozygosity in tumors, (b) preferential paternal origin of new germline mutations, (c) a priori risk derived from empirical data by Vogel, and (d) disease penetrance of 90% in most cases. We report the occurrence of Rb in first degree relatives of patients with sporadic Rb who visited the Jules Gonin Eye Hospital, Lausanne, Switzerland, from January 1994 to December 2006 compared to expected new cases of Rb using our algorithm. RESULTS: A total of 134 families with sporadic Rb were enrolled; testing was performed in 570 individuals and 99 patients younger than 4 years old were identified. We observed one new case of Rb. Using our algorithm, the cumulated total a posteriori risk of recurrence was 1.77. CONCLUSIONS: This is the first time that linkage analysis has been validated to monitor the risk of recurrence in sporadic Rb. This should be a useful tool in genetic counseling, especially when direct RB1 screening for mutations leaves a negative result or is unavailable.
Resumo:
BACKGROUND AND OBJECTIVE: Memantine, a frequently prescribed anti-dementia drug, is mainly eliminated unchanged by the kidneys, partly via tubular secretion. Considerable inter-individual variability in plasma concentrations has been reported. We aimed to investigate clinical and genetic factors influencing memantine disposition. METHODS: A population pharmacokinetic study was performed including data from 108 patients recruited in a naturalistic setting. Patients were genotyped for common polymorphisms in renal cation transporters (SLC22A1/2/5, SLC47A1, ABCB1) and nuclear receptors (NR1I2, NR1I3, RXR, PPAR) involved in transporter expression. RESULTS: The average clearance was 5.2 L/h with a 27 % inter-individual variability (percentage coefficient of variation). Glomerular filtration rate (p = 0.007) and sex (p = 0.001) markedly influenced memantine clearance. NR1I2 rs1523130 was identified as the unique significant genetic covariate for memantine clearance (p = 0.006), with carriers of the NR1I2 rs1523130 CT/TT genotypes presenting a 16 % slower memantine elimination than carriers of the CC genotype. CONCLUSION: The better understanding of inter-individual variability of memantine disposition might be beneficial in the context of individual dose optimization.
Resumo:
PRECON S.A is a manufacturing company dedicated to produce prefabricatedconcrete parts to several industries as rail transportation andagricultural industries.Recently, PRECON signed a contract with RENFE,the Spanish Nnational Rail Transportation Company to manufacturepre-stressed concrete sleepers for siding of the new railways of the highspeed train AVE. The scheduling problem associated with the manufacturingprocess of the sleepers is very complex since it involves severalconstraints and objectives. The constraints are related with productioncapacity, the quantity of available moulds, satisfying demand and otheroperational constraints. The two main objectives are related withmaximizing the usage of the manufacturing resources and minimizing themoulds movements. We developed a deterministic crowding genetic algorithmfor this multiobjective problem. The algorithm has proved to be a powerfuland flexible tool to solve the large-scale instance of this complex realscheduling problem.
Resumo:
Abstract : This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of--the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes uses of a functional model (e.g., a neural network) for clustering which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems. Résumé : Ce travail de recherche porte sur le développement et l'application de méthodes d'apprentissage dites non supervisées. Les applications visées par ces méthodes sont l'analyse de données forensiques et la classification d'images hyperspectrales en télédétection. Dans un premier temps, une méthodologie de classification non supervisée fondée sur l'optimisation symbolique d'une mesure de distance inter-échantillons est proposée. Cette mesure est obtenue en optimisant une fonction de coût reliée à la préservation de la structure de voisinage d'un point entre l'espace des variables initiales et l'espace des composantes principales. Cette méthode est appliquée à l'analyse de données forensiques et comparée à un éventail de méthodes déjà existantes. En second lieu, une méthode fondée sur une optimisation conjointe des tâches de sélection de variables et de classification est implémentée dans un réseau de neurones et appliquée à diverses bases de données, dont deux images hyperspectrales. Le réseau de neurones est entraîné à l'aide d'un algorithme de gradient stochastique, ce qui rend cette technique applicable à des images de très haute résolution. Les résultats de l'application de cette dernière montrent que l'utilisation d'une telle technique permet de classifier de très grandes bases de données sans difficulté et donne des résultats avantageusement comparables aux méthodes existantes.
Resumo:
In recent years, protein-ligand docking has become a powerful tool for drug development. Although several approaches suitable for high throughput screening are available, there is a need for methods able to identify binding modes with high accuracy. This accuracy is essential to reliably compute the binding free energy of the ligand. Such methods are needed when the binding mode of lead compounds is not determined experimentally but is needed for structure-based lead optimization. We present here a new docking software, called EADock, that aims at this goal. It uses an hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 A around the center of mass of the ligand position in the crystal structure, and on the contrary to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 A root mean square deviation (RMSD) from the crystal structure, excluding the latter. This validation illustrates the efficiency of our sampling strategy, as correct binding modes, defined by a RMSD to the crystal structure lower than 2 A, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures could be explained by the presence of crystal contacts in the experimental structure. Finally, the ability of EADock to accurately predict binding modes on a real application was illustrated by the successful docking of the RGD cyclic pentapeptide on the alphaVbeta3 integrin, starting far away from the binding pocket.
Resumo:
Pharmacokinetic variability in drug levels represent for some drugs a major determinant of treatment success, since sub-therapeutic concentrations might lead to toxic reactions, treatment discontinuation or inefficacy. This is true for most antiretroviral drugs, which exhibit high inter-patient variability in their pharmacokinetics that has been partially explained by some genetic and non-genetic factors. The population pharmacokinetic approach represents a very useful tool for the description of the dose-concentration relationship, the quantification of variability in the target population of patients and the identification of influencing factors. It can thus be used to make predictions and dosage adjustment optimization based on Bayesian therapeutic drug monitoring (TDM). This approach has been used to characterize the pharmacokinetics of nevirapine (NVP) in 137 HIV-positive patients followed within the frame of a TDM program. Among tested covariates, body weight, co-administration of a cytochrome (CYP) 3A4 inducer or boosted atazanavir as well as elevated aspartate transaminases showed an effect on NVP elimination. In addition, genetic polymorphism in the CYP2B6 was associated with reduced NVP clearance. Altogether, these factors could explain 26% in NVP variability. Model-based simulations were used to compare the adequacy of different dosage regimens in relation to the therapeutic target associated with treatment efficacy. In conclusion, the population approach is very useful to characterize the pharmacokinetic profile of drugs in a population of interest. The quantification and the identification of the sources of variability is a rational approach to making optimal dosage decision for certain drugs administered chronically.
Resumo:
Fetal MRI reconstruction aims at finding a high-resolution image given a small set of low-resolution images. It is usually modeled as an inverse problem where the regularization term plays a central role in the reconstruction quality. Literature has considered several regularization terms s.a. Dirichlet/Laplacian energy, Total Variation (TV)- based energies and more recently non-local means. Although TV energies are quite attractive because of their ability in edge preservation, standard explicit steepest gradient techniques have been applied to optimize fetal-based TV energies. The main contribution of this work lies in the introduction of a well-posed TV algorithm from the point of view of convex optimization. Specifically, our proposed TV optimization algorithm or fetal reconstruction is optimal w.r.t. the asymptotic and iterative convergence speeds O(1/n2) and O(1/√ε), while existing techniques are in O(1/n2) and O(1/√ε). We apply our algorithm to (1) clinical newborn data, considered as ground truth, and (2) clinical fetal acquisitions. Our algorithm compares favorably with the literature in terms of speed and accuracy.
Resumo:
The development and optimization of efficient transformation protocols is essential in new citrus breeding programs, not only for rootstock, but also for scion improvement. Transgenic 'Hamlin' sweet orange (Citrus sinensis (L.) Osbeck) plants were obtained by Agrobacterium tumefaciens-mediated transformation of epicotyl segments collected from seedlings germinated in vitro. Factors influencing genetic transformation efficiency were evaluated including seedling incubation conditions, time of inoculation with Agrobacterium and co-culture conditions. Epicotyl segments were adequate explants for transformation, regenerating plants by direct organogenesis. Higher percentage of transformation was obtained with explants collected from seedlings germinated in darkness, transferred to 16 hours photoperiod for 2-3 weeks, and inoculated with Agrobacterium for 15-45 min. The best co-culture condition was the incubation of the explants in darkness, for three days in culture medium supplemented with 100 muM of acetosyringone. Genetic transformation was confirmed by performing beta-glucoronidase (GUS) assays and, subsequently, by PCR amplification for the nptII and GUS genes.
Resumo:
We evaluate the performance of different optimization techniques developed in the context of optical flow computation with different variational models. In particular, based on truncated Newton methods (TN) that have been an effective approach for large-scale unconstrained optimization, we de- velop the use of efficient multilevel schemes for computing the optical flow. More precisely, we evaluate the performance of a standard unidirectional mul- tilevel algorithm - called multiresolution optimization (MR/OPT), to a bidrec- tional multilevel algorithm - called full multigrid optimization (FMG/OPT). The FMG/OPT algorithm treats the coarse grid correction as an optimiza- tion search direction and eventually scales it using a line search. Experimental results on different image sequences using four models of optical flow com- putation show that the FMG/OPT algorithm outperforms both the TN and MR/OPT algorithms in terms of the computational work and the quality of the optical flow estimation.
Resumo:
Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the amount of hypothesis tests. Gene-gene interaction studies will require a memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent from the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, containing each four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data. Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interactions problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
Resumo:
3 Summary 3. 1 English The pharmaceutical industry has been facing several challenges during the last years, and the optimization of their drug discovery pipeline is believed to be the only viable solution. High-throughput techniques do participate actively to this optimization, especially when complemented by computational approaches aiming at rationalizing the enormous amount of information that they can produce. In siiico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both heavily rely on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several softwares are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with a high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to its affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking software aiming at this goal is presented, EADock. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with .the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 R around the center of mass of the ligand position in the crystal structure, and conversely to other benchmarks, our algorithms was fed with optimized ligand positions up to 10 A root mean square deviation 2MSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by a RMSD to the crystal structure lower than 2 A, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K ATPase, and in the activation of the nuclear hormone peroxisome proliferatoractivated receptors a (PPARa). It also helped to understand the action of common pollutants (phthalates) on PPARy, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, and led to the successful design of new peptidic ligands for the a5ß1 integrin, and for the human PPARa. In both cases, the designed peptides presented activities comparable to that of well-established ligands such as the anticancer drug Cilengitide and Wy14,643, respectively. 3.2 French Les récentes difficultés de l'industrie pharmaceutique ne semblent pouvoir se résoudre que par l'optimisation de leur processus de développement de médicaments. Cette dernière implique de plus en plus. de techniques dites "haut-débit", particulièrement efficaces lorsqu'elles sont couplées aux outils informatiques permettant de gérer la masse de données produite. Désormais, les approches in silico telles que le criblage virtuel ou la conception rationnelle de nouvelles molécules sont utilisées couramment. Toutes deux reposent sur la capacité à prédire les détails de l'interaction moléculaire entre une molécule ressemblant à un principe actif (PA) et une protéine cible ayant un intérêt thérapeutique. Les comparatifs de logiciels s'attaquant à cette prédiction sont flatteurs, mais plusieurs problèmes subsistent. La littérature récente tend à remettre en cause leur fiabilité, affirmant l'émergence .d'un besoin pour des approches plus précises du mode d'interaction. Cette précision est essentielle au calcul de l'énergie libre de liaison, qui est directement liée à l'affinité du PA potentiel pour la protéine cible, et indirectement liée à son activité biologique. Une prédiction précise est d'une importance toute particulière pour la découverte et l'optimisation de nouvelles molécules actives. Cette thèse présente un nouveau logiciel, EADock, mettant en avant une telle précision. Cet algorithme évolutionnaire hybride utilise deux pressions de sélections, combinées à une gestion de la diversité sophistiquée. EADock repose sur CHARMM pour les calculs d'énergie et la gestion des coordonnées atomiques. Sa validation a été effectuée sur 37 complexes protéine-ligand cristallisés, incluant 11 protéines différentes. L'espace de recherche a été étendu à une sphère de 151 de rayon autour du centre de masse du ligand cristallisé, et contrairement aux comparatifs habituels, l'algorithme est parti de solutions optimisées présentant un RMSD jusqu'à 10 R par rapport à la structure cristalline. Cette validation a permis de mettre en évidence l'efficacité de notre heuristique de recherche car des modes d'interactions présentant un RMSD inférieur à 2 R par rapport à la structure cristalline ont été classés premier pour 68% des complexes. Lorsque les cinq meilleures solutions sont prises en compte, le taux de succès grimpe à 78%, et 92% lorsque la totalité de la dernière génération est prise en compte. La plupart des erreurs de prédiction sont imputables à la présence de contacts cristallins. Depuis, EADock a été utilisé pour comprendre les mécanismes moléculaires impliqués dans la régulation de la Na,K ATPase et dans l'activation du peroxisome proliferatoractivated receptor a (PPARa). Il a également permis de décrire l'interaction de polluants couramment rencontrés sur PPARy, ainsi que l'influence de la métabolisation de l'Imatinib (PA anticancéreux) sur la fixation à la kinase Bcr-Abl. Une approche basée sur la prédiction des interactions de fragments moléculaires avec protéine cible est également proposée. Elle a permis la découverte de nouveaux ligands peptidiques de PPARa et de l'intégrine a5ß1. Dans les deux cas, l'activité de ces nouveaux peptides est comparable à celles de ligands bien établis, comme le Wy14,643 pour le premier, et le Cilengitide (PA anticancéreux) pour la seconde.
Genetic diversity between improved banana diploids using canonical variables and the Ward-MLM method
Resumo:
The objective of this work was to estimate the genetic diversity of improved banana diploids using data from quantitative analysis and from simple sequence repeats (SSR) marker, simultaneously. The experiment was carried out with 33 diploids, in an augmented block design with 30 regular treatments and three common ones. Eighteen agronomic characteristics and 20 SSR primers were used. The agronomic characteristics and the SSR were analyzed simultaneously by the Ward-MLM, cluster, and IML procedures. The Ward clustering method considered the combined matrix obtained by the Gower algorithm. The Ward-MLM procedure identified three ideal groups (G1, G2, and G3) based on pseudo-F and pseudo-t² statistics. The dendrogram showed relative similarity between the G1 genotypes, justified by genealogy. In G2, 'Calcutta 4' appears in 62% of the genealogies. Similar behavior was observed in G3, in which the 028003-01 diploid is the male parent of the 086079-10 and 042079-06 genotypes. The method with canonical variables had greater discriminatory power than Ward-MLM. Although reduced, the genetic variability available is sufficient to be used in the development of new hybrids.