937 results for clonal selection algorithm
Abstract:
This paper presents a parallel surrogate-based global optimization method for computationally expensive objective functions that is more effective for larger numbers of processors. To reach this goal, we integrated concepts from multi-objective optimization and tabu search into single-objective surrogate optimization. Our proposed derivative-free algorithm, called SOP, uses non-dominated sorting of points at which the expensive function has previously been evaluated. The two objectives are the expensive function value of the point and the minimum distance of the point to previously evaluated points. Based on the results of non-dominated sorting, P points from the sorted fronts are selected as centers, from which many candidate points are generated by random perturbations. Using the surrogate approximation, the best candidate point is then selected for expensive evaluation for each of the P centers, with simultaneous computation on P processors. Centers that previously did not generate good solutions are declared tabu for a given tenure. We show almost sure convergence of this algorithm under some conditions. The performance of SOP is compared with two RBF-based methods. The test results show that SOP is an efficient method that can reduce the time required to find a good near-optimal solution. In a number of cases the efficiency of SOP is so good that SOP with 8 processors found an accurate answer in less wall-clock time than the other algorithms did with 32 processors.
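The two-objective non-dominated sorting that SOP builds on (minimize the expensive function value, maximize the minimum distance to previously evaluated points) can be sketched as a plain Pareto-front extraction. This is a generic illustration of that sorting step under the two stated objectives, not the authors' implementation:

```python
def pareto_front(points):
    """Indices of non-dominated points, where each point is a pair
    (f, d): f = expensive function value (minimized), d = minimum
    distance to previously evaluated points (maximized)."""
    front = []
    for i, (fi, di) in enumerate(points):
        # Point j dominates i if it is no worse in both objectives
        # and strictly better in at least one.
        dominated = any(
            fj <= fi and dj >= di and (fj < fi or dj > di)
            for j, (fj, dj) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front
```

In SOP, fronts like this one are peeled off repeatedly to rank all evaluated points before the P centers are chosen.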
Abstract:
Random Forests™ is reported to be one of the most accurate classification algorithms for complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study, a novel way of automatically selecting prognostic factors was proposed, and synthetic positive controls were used to validate the Random Forests method. Throughout this study we showed that Random Forests can deal with a large number of weak input variables without overfitting, can account for non-additive interactions between these input variables, and can be used for variable selection without being adversely affected by collinearities. Random Forests can handle large-scale data sets without rigorous data preprocessing, and it has a robust variable-importance ranking measure. We propose a novel variable selection method, in the context of Random Forests, that uses the data noise level as the cut-off value to determine the subset of important predictors. This new approach enhanced the ability of the Random Forests algorithm to automatically identify important predictors in complex data; the cut-off value can also be adjusted based on the results of the synthetic positive control experiments. When the data set had a high variables-to-observations ratio, Random Forests complemented established logistic regression, and this study suggests Random Forests for such high-dimensional data: one can use Random Forests to select the important variables and then use logistic regression, or Random Forests itself, to estimate the effect sizes of the predictors and to classify new observations. We also found that mean decrease in accuracy is a more reliable variable-ranking measure than mean decrease in Gini.
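Mean decrease in accuracy is typically estimated by permuting one predictor at a time and measuring the resulting drop in classification accuracy. A minimal sketch for any fitted classifier exposing a `predict` method follows; the function name and the permutation scheme are illustrative, not taken from the thesis:

```python
import numpy as np

def mean_decrease_accuracy(model, X, y, n_repeats=10, seed=0):
    """Accuracy drop when each column of X is shuffled in turn.

    A large drop means the column carried information the model
    relied on; a near-zero drop means it was unimportant.
    """
    rng = np.random.default_rng(seed)
    base = np.mean(model.predict(X) == y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        accs = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])  # break the column/label link
            accs.append(np.mean(model.predict(Xp) == y))
        drops[j] = base - np.mean(accs)
    return drops
```

In a Random Forests setting the same permutation is usually applied per tree on its out-of-bag samples; the held-out variant above conveys the same idea.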
Abstract:
Mass spectrometry (MS) data provide a promising strategy for biomarker discovery, and the detection of relevant peakbins in MS data is currently under intense research. Data from mass spectrometry are challenging to analyze because of their high dimensionality and the generally low number of available samples. To tackle this problem, the scientific community is increasingly interested in applying feature subset selection techniques based on specialized machine learning algorithms. In this paper, we present a performance comparison of four metaheuristics: best first (BF), genetic algorithm (GA), scatter search (SS) and variable neighborhood search (VNS). To our knowledge, all of these algorithms except GA are applied here for the first time to detect relevant peakbins in MS data. Each metaheuristic search is embedded in two different schemes, filter and wrapper, coupled with Naive Bayes and SVM classifiers.
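In a wrapper scheme, a subset search is driven by a classifier-based score. A minimal best-first search over feature subsets can be sketched with the evaluator abstracted as a callback; in the paper's setting that callback would be cross-validated Naive Bayes or SVM accuracy, but here it is a stand-in, and all names are illustrative:

```python
import heapq
from itertools import count

def best_first_select(n_features, score, max_expansions=100):
    """Best-first search over feature subsets.

    `score` maps a frozenset of feature indices to a quality value
    (higher is better). Subsets are expanded in order of their
    score; the best subset seen is returned.
    """
    start = frozenset()
    tie = count()  # tie-breaker so the heap never compares sets
    best, best_score = start, score(start)
    heap = [(-best_score, next(tie), start)]
    seen = {start}
    expansions = 0
    while heap and expansions < max_expansions:
        neg, _, subset = heapq.heappop(heap)
        if -neg > best_score:
            best, best_score = subset, -neg
        for j in range(n_features):
            if j not in subset:
                child = subset | {j}
                if child not in seen:
                    seen.add(child)
                    heapq.heappush(heap, (-score(child), next(tie), child))
        expansions += 1
    return best
```

A filter scheme would simply swap the wrapper callback for a classifier-independent relevance measure.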
Abstract:
This paper studies feature subset selection in classification using a multiobjective estimation of distribution algorithm. We consider six functions, namely area under the ROC curve, sensitivity, specificity, precision, F1 measure and Brier score, for the evaluation of feature subsets and as the objectives of the problem. One characteristic of these objective functions is the noise in their values, which should be handled appropriately during optimization. Our proposed algorithm consists of two major techniques specifically designed for the feature subset selection problem. The first is a solution ranking method based on interval values to handle the noise in the objectives of this problem. The second is a model estimation method for learning a joint probabilistic model of objectives and variables, which is used to generate new solutions and advance through the search space. To simplify model estimation, l1-regularized regression is used to select a subset of problem variables before model learning. The proposed algorithm is compared with a well-known ranking method for interval-valued objectives and with a standard multiobjective genetic algorithm. In particular, the effects of the two new techniques are investigated experimentally. The experimental results show that the proposed algorithm obtains comparable or better performance on the tested datasets.
Abstract:
This article proposes a method for calibrating the discontinuity sets in rock masses. We present a novel approach for the calibration of stochastic discontinuity network parameters based on genetic algorithms (GAs). To validate the approach, we present examples of its application to cases with known parameters of the original Poisson discontinuity network. Parameters of the model are encoded as chromosomes using a binary representation, and such chromosomes evolve as successive generations of a randomly generated initial population, subjected to the GA operations of selection, crossover and mutation. The back-calculated parameters are employed to assess the inference capabilities of the model using different objective functions with different probabilities of crossover and mutation. Results show that the predictive capabilities of GAs depend significantly on the type of objective function considered; they also show that the calibration capabilities of the genetic algorithm can be acceptable for practical engineering applications, since in most cases they can be expected to provide parameter estimates with relatively small errors for those parameters of the network (such as the intensity and mean size of discontinuities) that have the strongest influence on many engineering applications.
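The binary chromosome encoding with selection, crossover and mutation that the abstract describes can be sketched generically. Parameter ranges, chromosome lengths and rates below are illustrative assumptions, not values from the paper:

```python
import random

def decode(bits, lo, hi):
    """Map a binary chromosome segment to a real parameter in [lo, hi]."""
    n = int("".join(map(str, bits)), 2)
    return lo + (hi - lo) * n / (2 ** len(bits) - 1)

def crossover(a, b, rng):
    """One-point crossover of two equal-length chromosomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(bits, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [1 - bit if rng.random() < p_mut else bit for bit in bits]
```

With an objective function comparing simulated and observed discontinuity statistics, these operators would be iterated over a randomly generated initial population, as in the calibration loop the abstract outlines.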
Abstract:
Automatic blood glucose classification may help specialists provide a better interpretation of blood glucose data downloaded directly from patients' glucose meters, and will contribute to the development of decision support systems for gestational diabetes. This paper presents an automatic blood glucose classifier for gestational diabetes that compares six different feature selection methods for two machine learning algorithms: neural networks and decision trees. Three search algorithms, Greedy, Best First and Genetic, were combined with two different evaluators, CFS and Wrapper, for the feature selection. The study was conducted with 6080 blood glucose measurements from 25 patients. Decision trees with a feature set selected by the Wrapper evaluator and the Best First search algorithm obtained the best accuracy: 95.92%.
Abstract:
With the advent of the cloud computing model, distributed caches have become the cornerstone for building scalable applications. Popular systems like Facebook [1] or Twitter use Memcached [5], a highly scalable distributed object cache, to speed up applications by avoiding database accesses. Distributed object caches assign objects to cache instances based on a hashing function, and objects are not moved from one cache instance to another unless more instances are added to the cache and objects are redistributed. This may lead to situations where some cache instances are overloaded because some of the objects they store are frequently accessed, while other cache instances are used less frequently. In this paper we propose a multi-resource load balancing algorithm for distributed cache systems. The algorithm aims at balancing both CPU and memory resources among cache instances by redistributing stored data. To handle the possible conflict of balancing multiple resources at the same time, we give CPU and memory resources weighted priorities based on the runtime load distributions: a scarcer resource is given a higher weight than a less scarce one when load balancing. The system imbalance degree is evaluated based on monitoring information and on the utility load of a node, a unit of resource consumption. In addition, since continuous rebalancing of the system may affect the QoS of applications using the cache, our data selection policy ensures that each data migration minimizes the system imbalance degree, so that the total reconfiguration cost is minimized. An extensive simulation is conducted to compare our policy with other policies; our policy shows a significant improvement in time efficiency and a decrease in reconfiguration cost.
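The weighted-priority idea, giving the scarcer resource a higher weight when computing the imbalance degree, can be sketched as follows. The weighting and spread measure here (mean utilization as the weight, population standard deviation as the spread) are illustrative assumptions; the paper's exact formulas may differ:

```python
import statistics

def resource_weights(loads):
    """Weight each resource by its mean utilization across nodes,
    so the scarcer (more heavily used) resource counts more.
    Weights sum to 1. `loads` is a list of per-node dicts,
    e.g. {"cpu": 0.7, "mem": 0.3}."""
    means = {r: statistics.mean(node[r] for node in loads)
             for r in loads[0]}
    total = sum(means.values())
    return {r: m / total for r, m in means.items()}

def imbalance_degree(loads):
    """Weighted spread of per-resource utilization across nodes;
    zero when every node carries an identical load."""
    w = resource_weights(loads)
    return sum(w[r] * statistics.pstdev([node[r] for node in loads])
               for r in loads[0])
```

A migration policy of the kind described would then pick, at each step, the data move that most reduces this degree.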
Abstract:
Generator differential protection is one of the most important electrical protections of synchronous generator stator windings. Its operating principle is based on comparing the input current and output current of each phase winding. Unwanted trip commands are usually caused by CT saturation, wrong CT selection, or the fact that the CTs may come from different manufacturers. In generators grounded through high impedance, only phase-to-phase or three-phase faults can be detected by the differential protection; this kind of fault causes differential current to flow in at least two phases of the winding. Several cases of unwanted trip commands caused by the appearance of differential current in only one phase of the generator have been reported. In this paper, a multi-phase criterion is proposed for the generator differential protection algorithm when applied to high-impedance-grounded generators.
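The multi-phase criterion can be illustrated with a deliberately simplified magnitude check. Real relays use percentage-restraint characteristics rather than a fixed pickup; this sketch only shows the logic of requiring differential current in at least two phases before tripping:

```python
def multi_phase_trip(i_in, i_out, pickup, min_phases=2):
    """Trip only when the differential current exceeds the pickup
    threshold in at least `min_phases` phases. Differential current
    in a single phase is treated as spurious (e.g. CT saturation),
    since on a high-impedance-grounded generator a detectable fault
    involves at least two phases."""
    diffs = [abs(a - b) for a, b in zip(i_in, i_out)]
    return sum(d > pickup for d in diffs) >= min_phases
```

Here `i_in` and `i_out` are per-phase current magnitudes at the two ends of the winding; the units and threshold are left to the application.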
Abstract:
In the last decade, Object-Based Image Analysis (OBIA) has been accepted as an effective method for processing high-spatial-resolution multiband images. This image analysis method starts with the segmentation of the image. Image segmentation in general is a procedure for partitioning an image into homogeneous groups (segments). In practice, visual interpretation is often used to assess segmentation quality, and the analysis relies on the experience of the analyst. To address this issue, in this study we evaluate several seed selection strategies for an automatic image segmentation methodology based on a seeded region growing-merging approach. To evaluate segmentation quality, segments were subjected to spatial autocorrelation analysis using Moran's I index and to intra-segment variance analysis. We apply the algorithm to the segmentation of an aerial multiband image.
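Global Moran's I, used here as the spatial autocorrelation measure, can be computed directly from its definition. The spatial weights matrix in the test below (a simple chain adjacency) is an assumption for illustration; in segmentation quality assessment the weights would encode segment adjacency:

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I of `values` under a pairwise spatial
    weights matrix `weights` (zero diagonal). Values near +1
    indicate strong positive spatial autocorrelation, near 0
    spatial randomness."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()  # deviations from the mean
    return (len(x) / w.sum()) * (w * np.outer(z, z)).sum() / (z ** 2).sum()
```

In a good segmentation, pixel values within a segment are similar, so low between-segment autocorrelation of segment means combined with low intra-segment variance is the desired outcome.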
Abstract:
Piotr Omenzetter and Simon Hoell’s work within the Lloyd’s Register Foundation Centre for Safety and Reliability Engineering at the University of Aberdeen is supported by Lloyd’s Register Foundation. The Foundation helps to protect life and property by supporting engineering-related education, public engagement and the application of research.
Abstract:
Plasmodium falciparum, the agent of malignant malaria, is one of mankind’s most severe scourges. Efforts to develop preventive vaccines or remedial drugs are handicapped by the parasite’s rapid evolution of drug resistance and protective antigens. We examine 25 DNA sequences of the gene coding for the highly polymorphic antigenic circumsporozoite protein. We observe total absence of silent nucleotide variation in the two nonrepeated regions of the gene. We propose that this absence reflects a recent origin (within several thousand years) of the world populations of P. falciparum from a single individual; the amino acid polymorphisms observed in these nonrepeat regions would result from strong natural selection. Analysis of these polymorphisms indicates that: (i) the incidence of recombination events does not increase with nucleotide distance; (ii) the strength of linkage disequilibrium between nucleotides is also independent of distance; and (iii) haplotypes in the two nonrepeat regions are correlated with one another, but not with the central repeat region they span. We propose two hypotheses: (i) variation in the highly polymorphic central repeat region arises by mitotic intragenic recombination, and (ii) the population structure of P. falciparum is clonal—a state of affairs that persists in spite of the necessary stage of physiological sexuality that the parasite must sustain in the mosquito vector to complete its life cycle.
Abstract:
Most human cancers are of monoclonal origin and display many genetic alterations. In an effort to determine whether clonal expansion itself could account for the large number of genetic alterations, we compared spontaneous transformation in cloned and uncloned populations of NIH 3T3 cells. We have reported that progressive transformation of these cells, which is driven by the stress of prolonged contact inhibition at confluence, occurs far more frequently in cultures of recent monoclonal origin than in their uncloned progenitors. In the present work we asked how coculturing six clones at early and late stages of progression would affect the dynamics of transformation in repeated rounds of confluence. When coculture started with clones in early stages of transformation, marked by light focus formation, there was a strong inhibition of the progression to the dense focus formation that occurred in separate cultures of the individual clones. In contrast, when coculture started after the individual clones had progressed to dense focus formation, there was selection of transformants from the clone producing the largest and densest foci. Mixing the cells of a single clone with a large excess of uncloned cells from a subline that was refractory to transformation markedly decreased the size of dense foci from clones in transit from light to dense focus formation, but had much less effect on foci from clones with an established capacity for dense focus formation. The major finding of protection against progression by coculturing clones in early stages of transformation suggests that the expansion of a rogue clone in vivo increasingly isolates many of its cells from genetically stabilizing interactions with surrounding clones. Such clonal isolation might account for the increase in mutation rates associated with the dysplasia in colorectal adenomas that signifies the transition between benign and malignant growth.
Abstract:
A comparison was made of the competence for neoplastic transformation in three different sublines of NIH 3T3 cells and multiple clonal derivatives of each. Over 90% of the neoplastic foci produced by an uncloned transformed (t-SA′) subline on a confluent background of nontransformed cells were of the dense, multilayered type, but about half of the t-SA′ clones produced only light foci in assays without background. This asymmetry apparently arose from the failure of the light focus formers to register on a background of nontransformed cells. Comparison was made of the capacity for confluence-mediated transformation between uncloned parental cultures and their clonal derivatives by using two nontransformed sublines, one of which was highly sensitive and the other relatively refractory to confluence-mediated transformation. Transformation was more frequent in the clones than in the uncloned parental cultures for both sublines. This was dramatically so in the refractory subline, where the uncloned culture showed no overt sign of transformation in serially repeated assays but increasing numbers of its clones exhibited progressive transformation. The reason for the greater susceptibility of the pure clones is apparently the suppression of transformation among the diverse membership that makes up the uncloned parental culture. Progressive selection toward increasing degrees of transformation in confluent cultures plays a major role in the development of dense focus formers, but direct induction by the constraint of confluence may contribute by heritably damaging cells. In view of our finding of increased susceptibility to transformation in clonal versus uncloned populations, expansion of some clones at the expense of others during the aging process would contribute to the marked increase of cancer with age.
Abstract:
B cells with a rearranged heavy-chain variable region VHa allotype-encoding VH1 gene segment predominate throughout the life of normal rabbits and appear to be the source of the majority of serum immunoglobulins, which thus bear VHa allotypes. The functional role(s) of these VH framework region (FR) allotypic structures have not been defined. We show here that B cells expressing surface immunoglobulin with VHa2 allotypic specificities are preferentially expanded and positively selected in the appendix of young rabbits. By flow cytometry, a higher proportion of a2+ B cells were progressing through the cell cycle (S/G2/M) compared to a2- B cells, most of which were in the G1/G0 phase of the cell cycle. The majority of appendix B cells in dark zones of germinal centers of normal 6-week-old rabbits were proliferating, and very little apoptosis was observed. In contrast, in 6-week-old VH-mutant ali/ali rabbits, little cell proliferation and extensive apoptosis were observed. Nonetheless, even in the absence of VH1, B cells with a2-like surface immunoglobulin had developed and expanded in the appendix of 11-week-old mutants. The numbers and tissue localization of B cells undergoing apoptosis then appeared similar to those found in the normal 6-week-old appendix. Thus, B cells with immunoglobulin receptors lacking the VHa2 allotypic structures were less likely to undergo clonal expansion and maturation. These data suggest that "positive" selection of B lymphocytes through FR1 and FR3 VHa allotypic structures occurs during their development in the appendix.
Abstract:
We present a modelling method to estimate the 3-D geometry and location of homogeneously magnetized sources from magnetic anomaly data. As input information, the procedure needs the parameters defining the magnetization vector (intensity, inclination and declination) and the Earth's magnetic field direction. When these two vectors are expected to be different in direction, we propose to estimate the magnetization direction from the magnetic map. Then, using this information, we apply an inversion approach based on a genetic algorithm which finds the geometry of the sources by seeking the optimum solution from an initial population of models in successive iterations through an evolutionary process. The evolution consists of three genetic operators (selection, crossover and mutation), which act on each generation, and a smoothing operator, which looks for the best fit to the observed data and a solution consisting of plausible compact sources. The method allows the use of non-gridded, non-planar and inaccurate anomaly data and non-regular subsurface partitions. In addition, neither constraints for the depth to the top of the sources nor an initial model are necessary, although previous models can be incorporated into the process. We show the results of a test using two complex synthetic anomalies to demonstrate the efficiency of our inversion method. The application to real data is illustrated with aeromagnetic data of the volcanic island of Gran Canaria (Canary Islands).