954 results for genetic algorithms.


Relevance: 30.00%

Abstract:

Objective: In Southern European countries, up to one-third of patients with hereditary hemochromatosis (HH) do not present the common HFE risk genotype. To investigate the molecular basis of these cases, we designed a gene panel for rapid and simultaneous analysis of 6 HH-related genes (HFE, TFR2, HJV, HAMP, SLC40A1 and FTL) by next-generation sequencing (NGS). Materials and Methods: Eighty-eight Portuguese patients with iron overload, negative for the common HFE mutations, were analysed. A TruSeq Custom Amplicon kit (TSCA, by Illumina) was designed to generate 97 amplicons covering exons, intron/exon junctions and UTRs of these genes, with a cumulative target sequence of 12,115 bp. Amplicons were sequenced on the MiSeq instrument (Illumina) using 250 bp paired-end reads. Sequences were aligned against human genome reference hg19 using the alignment and variant-calling algorithms in the MiSeq Reporter software. Novel variants were validated by Sanger sequencing and their pathogenic significance was assessed by in silico studies. Results: We found a total of 55 different genetic variants. These include novel pathogenic missense and splicing variants (in HFE and TFR2), a very rare variant in the IRE of FTL, and a variant that creates a novel translation initiation codon in the HAMP gene, among others. Conclusion: Combining TSCA methodology with NGS technology appears to be an appropriate tool for simultaneous and fast analysis of HH-related genes in a large number of samples. However, establishing the clinical relevance of NGS-detected variants for HH development remains a laborious task, requiring further functional studies.

Relevance: 30.00%

Abstract:

Background: The multitude of motif detection algorithms developed to date has largely focused on the detection of patterns in primary sequence. Since sequence-dependent DNA structure and flexibility may also play a role in protein-DNA interactions, the simultaneous exploration of sequence- and structure-based hypotheses about the composition of binding sites and the ordering of features in a regulatory region should be considered as well. The consideration of structural features requires the development of new detection tools that can deal with data types other than primary sequence. Results: GANN (available at http://bioinformatics.org.au/gann) is a machine learning tool for the detection of conserved features in DNA. The software suite contains programs to extract different regions of genomic DNA from flat files and convert these sequences to indices that reflect sequence and structural composition or the presence of specific protein binding sites. The machine learning component allows the classification of different types of sequences based on subsamples of these indices, and can identify the best combinations of indices and machine learning architecture for sequence discrimination. Another key feature of GANN is the replicated splitting of data into training and test sets, and the implementation of negative controls. In validation experiments, GANN successfully merged important sequence and structural features to yield good predictive models for synthetic and real regulatory regions. Conclusion: GANN is a flexible tool that can search through large sets of sequence and structural feature combinations to identify those that best characterize a set of sequences.

Relevance: 30.00%

Abstract:

Background: Women who have germline mutations in the BRCA1 gene are at substantially increased lifetime risk of developing breast and ovarian cancer but are otherwise normal. Currently, early age of onset of cancer and a strong family history are relied upon as the chief clues as to who should be offered genetic testing. Certain morphologic and immunohistochemical features are overrepresented in BRCA1-associated breast cancers, but these differences have not been incorporated into the current selection criteria for genetic testing. Design: Each of the 4 pathologists studied 30 known cases of BRCA1- and BRCA2-associated breast cancer from kConFab families. After reviewing the literature, we agreed on a semiquantitative scoring system for estimating the chance that an underlying BRCA1 mutation is present, based on the number of reported prototypic features present. After a time lag of 12 months, we each examined a series of 62 deidentified cases of breast cancer, including cases of BRCA1-associated breast cancer and controls. The controls included cases of BRCA2-associated breast cancer and sporadic cases. Results: Our predictions had a sensitivity of 92%, specificity of 86%, positive predictive value of 61%, and negative predictive value of 98%. For comparison, the sensitivity of currently used selection criteria is in the range of 25% to 30%. Conclusion: The inclusion of morphologic and immunohistochemical features of breast cancers in algorithms to predict the likelihood of germline mutations in the BRCA1 gene improves the accuracy of the selection process.
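
As a reminder of how the four reported measures relate to one another, the following minimal sketch computes them from a confusion matrix. The abstract does not report raw counts; the counts used here (11 true positives, 7 false positives, 43 true negatives, 1 false negative) are purely illustrative assumptions, chosen only because they sum to 62 and round to the reported percentages. They should not be read as the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of mutation carriers flagged
        "specificity": tn / (tn + fp),  # share of non-carriers not flagged
        "ppv": tp / (tp + fp),          # share of flagged cases that are carriers
        "npv": tn / (tn + fn),          # share of unflagged cases that are non-carriers
    }


# Hypothetical counts (an assumption, not taken from the abstract).
metrics = diagnostic_metrics(tp=11, fp=7, tn=43, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With a low prevalence of carriers among the cases, a modest number of false positives is enough to pull the positive predictive value well below the sensitivity, which is the pattern the reported figures show.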

Relevance: 30.00%

Abstract:

In this letter, we propose a class of self-stabilizing learning algorithms for minor component analysis (MCA), which includes a few well-known MCA learning algorithms. Self-stabilizing means that the sign of the change in the weight vector length is independent of the presented input vector. For these algorithms, a rigorous global convergence proof is given and the convergence rate is also discussed. By combining the positive properties of these algorithms, a new learning algorithm is proposed that can improve performance. Simulations are employed to confirm our theoretical results.

Relevance: 30.00%

Abstract:

In empirical studies of Evolutionary Algorithms, it is usually desirable to evaluate and compare algorithms using as many different parameter settings and test problems as possible, in order to have a clear and detailed picture of their performance. Unfortunately, the total number of experiments required may be very large, which often makes such research work computationally prohibitive. In this paper, the application of a statistical method called racing is proposed as a general-purpose tool to reduce the computational requirements of large-scale experimental studies in evolutionary algorithms. Experimental results are presented that show that racing typically requires only a small fraction of the cost of an exhaustive experimental study.
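
The idea behind racing can be sketched as follows. The abstract does not say which statistical test is used, so this sketch assumes a Hoeffding-bound race with scores bounded in [0, 1]; the function name `race`, its parameters, and the toy candidates are all hypothetical. All surviving candidates are evaluated once per round, and a candidate is dropped as soon as its upper confidence bound falls below the lower bound of the current best.

```python
import math
import random


def race(candidates, evaluate, max_rounds=200, delta=0.05, value_range=1.0):
    """Hoeffding-style race (maximisation); returns survivors and evaluation count."""
    scores = {name: [] for name in candidates}
    evaluations = 0
    for n in range(1, max_rounds + 1):
        # One more noisy evaluation for every candidate still in the race.
        for name in scores:
            scores[name].append(evaluate(name))
            evaluations += 1
        # Hoeffding half-width of the confidence interval after n samples.
        eps = value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))
        means = {name: sum(vals) / n for name, vals in scores.items()}
        best_lower = max(means.values()) - eps
        # Keep only candidates whose upper bound still reaches the best lower bound.
        scores = {k: v for k, v in scores.items() if means[k] + eps >= best_lower}
        if len(scores) == 1:
            break
    return sorted(scores), evaluations


# Toy usage: three hypothetical parameter settings with noisy scores in [0, 1].
rng = random.Random(0)
true_means = {"a": 0.2, "b": 0.5, "c": 0.8}
survivors, used = race(list(true_means),
                       lambda name: true_means[name] + rng.uniform(-0.1, 0.1))
print(survivors, used, "of", 3 * 200, "evaluations in the exhaustive design")
```

Clearly inferior settings leave the race early, so most of the evaluation budget is spent distinguishing the settings that are actually close in performance.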

Relevance: 30.00%

Abstract:

In this paper, we address some issues related to evaluating and testing evolutionary algorithms. A landscape generator based on Gaussian functions is proposed for generating a variety of continuous landscapes as fitness functions. Through some initial experiments, we illustrate the usefulness of this landscape generator in testing evolutionary algorithms.
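
A landscape generator of this kind might look like the minimal sketch below. The abstract gives no construction details, so everything here is an assumption: the function name `make_gaussian_landscape`, the parameter ranges, the use of isotropic Gaussians, and the choice of taking the maximum over components rather than their sum.

```python
import math
import random


def make_gaussian_landscape(n_peaks, dim, seed=0, low=-5.0, high=5.0):
    """Build a continuous fitness function from randomly placed Gaussian bumps."""
    rng = random.Random(seed)
    centres = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_peaks)]
    heights = [rng.uniform(0.1, 1.0) for _ in range(n_peaks)]
    widths = [rng.uniform(0.5, 2.0) for _ in range(n_peaks)]

    def fitness(x):
        # Landscape value at x: the highest Gaussian component at that point.
        best = 0.0
        for c, h, w in zip(centres, heights, widths):
            sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            best = max(best, h * math.exp(-sq_dist / (2.0 * w * w)))
        return best

    # Expose the parameters so that optima are known when testing an algorithm.
    fitness.centres = centres
    fitness.heights = heights
    return fitness


landscape = make_gaussian_landscape(n_peaks=5, dim=2, seed=1)
print(landscape([0.0, 0.0]))
```

Because the peak locations and heights are known, the number, position, and relative quality of the optima can be varied systematically when comparing algorithms.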

Relevance: 30.00%

Abstract:

The generalised transportation problem (GTP) is an extension of the linear Hitchcock transportation problem. However, it does not have the unimodularity property, which means that a linear programming solution (for example, one obtained by the simplex method) is not guaranteed to be integer. This is a major difference between the GTP and the Hitchcock transportation problem. Although some special algorithms, such as the generalised stepping-stone method, have been developed, they are based on the linear programming model, and the integer solution requirement of the GTP is relaxed. This paper proposes a genetic algorithm (GA) to solve the GTP, and a numerical example is presented to show the algorithm and its efficiency.

Relevance: 30.00%

Abstract:

Water-alternating-gas (WAG) is an enhanced oil recovery method combining the improved macroscopic sweep of water flooding with the improved microscopic displacement of gas injection. The optimal design of the WAG parameters is usually based on numerical reservoir simulation via trial and error, limited by the reservoir engineer’s availability. Employing optimisation techniques can guide the simulation runs and reduce the number of function evaluations. In this study, robust evolutionary algorithms are utilized to optimise hydrocarbon WAG performance in the E-segment of the Norne field. The first objective function is selected to be the net present value (NPV) and two global semi-random search strategies, a genetic algorithm (GA) and particle swarm optimisation (PSO) are tested on different case studies with different numbers of controlling variables which are sampled from the set of water and gas injection rates, bottom-hole pressures of the oil production wells, cycle ratio, cycle time, the composition of the injected hydrocarbon gas (miscible/immiscible WAG) and the total WAG period. In progressive experiments, the number of decision-making variables is increased, increasing the problem complexity while potentially improving the efficacy of the WAG process. The second objective function is selected to be the incremental recovery factor (IRF) within a fixed total WAG simulation time and it is optimised using the same optimisation algorithms. The results from the two optimisation techniques are analyzed and their performance, convergence speed and the quality of the optimal solutions found by the algorithms in multiple trials are compared for each experiment. The distinctions between the optimal WAG parameters resulting from NPV and oil recovery optimisation are also examined. This is the first known work optimising over this complete set of WAG variables. The first use of PSO to optimise a WAG project at the field scale is also illustrated. 
Compared to the reference cases, the best overall values of the objective functions found by GA and PSO were 13.8% and 14.2% higher, respectively, when NPV was optimised over all the above variables, and 14.2% and 16.2% higher, respectively, when IRF was optimised.

Relevance: 30.00%

Abstract:

Improvements in genomic technology, both in the increased speed and reduced cost of sequencing, have expanded the appreciation of the abundance of human genetic variation. However, the sheer amount of variation, as well as the varying type and genomic content of variation, poses a challenge in understanding the clinical consequence of a single mutation. This work uses several methodologies to interpret the observed variation in the human genome, and presents novel strategies for the prediction of allele pathogenicity.

Using the zebrafish model system as an in vivo assay of allele function, we identified a novel driver of Bardet-Biedl Syndrome (BBS) in CEP76. A combination of targeted sequencing of 785 cilia-associated genes in a cohort of BBS patients and subsequent in vivo functional assays recapitulating the human phenotype gave strong evidence for the role of CEP76 mutations in the pathology of an affected family. This portion of the work demonstrated the necessity of functional testing in validating disease-associated mutations, and added to the catalogue of known BBS disease genes.

Further study into the role of copy-number variations (CNVs) in a cohort of BBS patients showed the significant contribution of CNVs to disease pathology. Using high-density array comparative genomic hybridization (aCGH), we were able to identify pathogenic CNVs as small as several hundred bp. Dissection of constituent genes and in vivo experiments investigating epistatic interactions between affected genes allowed for an appreciation of several paradigms by which CNVs can contribute to disease. This study revealed that the contribution of CNVs to disease in BBS patients is much higher than previously expected, and demonstrated the necessity of considering CNV contribution in future (and retrospective) investigations of human genetic disease.

Finally, we used a combination of comparative genomics and in vivo complementation assays to identify second-site compensatory modification of pathogenic alleles. These pathogenic alleles, which are found compensated in other species (termed compensated pathogenic deviations [CPDs]), represent a significant fraction (3 to 10%) of human disease-associated alleles. In silico pathogenicity prediction algorithms, a valuable method of allele prioritization, often misrepresent these alleles as benign, leading to omission of possibly informative variants in studies of human genetic disease. We created a mathematical model that was able to predict CPDs and putative compensatory sites, and showed functionally in vivo that second-site mutation can mitigate the pathogenicity of disease alleles. Additionally, we made publicly available an in silico module for the prediction of CPDs and modifier sites.

These studies have advanced the ability to interpret the pathogenicity of multiple types of human variation, and have made tools available for others to do the same.

Relevance: 30.00%

Abstract:

SELECTOR is a software package for studying the evolution of multiallelic genes under balancing or positive selection while simulating complex evolutionary scenarios that integrate demographic growth and migration in a spatially explicit population framework. Parameters can be varied both in space and time to account for geographical, environmental, and cultural heterogeneity. SELECTOR can be used within an approximate Bayesian computation estimation framework. We first describe the principles of SELECTOR and validate the algorithms by comparing its outputs for simple models with theoretical expectations. Then, we show how it can be used to investigate genetic differentiation of loci under balancing selection in interconnected demes with spatially heterogeneous gene flow. We identify situations in which balancing selection reduces genetic differentiation between population groups compared with neutrality and explain conflicting outcomes observed for human leukocyte antigen loci. These results and three previously published applications demonstrate that SELECTOR is efficient and robust for building insight into human settlement history and evolution.

Relevance: 30.00%

Abstract:

In computer vision, training a model that performs classification effectively is highly dependent on the extracted features, and the number of training instances. Conventionally, feature detection and extraction are performed by a domain-expert who, in many cases, is expensive to employ and hard to find. Therefore, image descriptors have emerged to automate these tasks. However, designing an image descriptor still requires domain-expert intervention. Moreover, the majority of machine learning algorithms require a large number of training examples to perform well. However, labelled data is not always available or easy to acquire, and dealing with a large dataset can dramatically slow down the training process. In this paper, we propose a novel Genetic Programming based method that automatically synthesises a descriptor using only two training instances per class. The proposed method combines arithmetic operators to evolve a model that takes an image and generates a feature vector. The performance of the proposed method is assessed using six datasets for texture classification with different degrees of rotation, and is compared with seven domain-expert designed descriptors. The results show that the proposed method is robust to rotation, and has significantly outperformed, or achieved a comparable performance to, the baseline methods.

Relevance: 30.00%

Abstract:

Evolutionary algorithms alone cannot solve optimization problems very efficiently, since they make many random (not very rational) decisions. Combinations of evolutionary algorithms with other techniques have proven to be an efficient optimization methodology. In this talk, I will explain the basic ideas of our three algorithms along this line: (1) the orthogonal genetic algorithm, which treats crossover/mutation as an experimental design problem; (2) the multiobjective evolutionary algorithm based on decomposition (MOEA/D), which uses decomposition techniques from traditional mathematical programming in multiobjective evolutionary optimization; and (3) regular model based multiobjective estimation of distribution algorithms (RM-MEDA), which use the regularity property and machine learning methods to improve multiobjective evolutionary algorithms.

Relevance: 30.00%

Abstract:

Background: Calluna vulgaris is one of the most important landscaping plants produced in Germany. Its enormous economic success is due to the prolonged flower attractiveness of mutants in flower morphology, the so-called bud-bloomers. In this study, we present the first genetic linkage map of C. vulgaris in which we mapped a locus of the economically highly desired trait "flower type". Results: The map was constructed in JoinMap 4.1 using 535 AFLP markers from a single mapping population. A large fraction (40%) of markers showed distorted segregation. To test the effect of segregation distortion on linkage estimation, these markers were sorted regarding their segregation ratio and added in groups to the data set. The plausibility of group formation was evaluated by comparison of the "two-way pseudo-testcross" and the "integrated" mapping approach. Furthermore, regression mapping was compared to the multipoint-likelihood algorithm. The majority of maps constructed by different combinations of these methods consisted of eight linkage groups corresponding to the chromosome number of C. vulgaris. Conclusions: All maps confirmed the independent inheritance of the most important horticultural traits "flower type", "flower colour", and "leaf colour". An AFLP marker for the most important breeding target "flower type" was identified. The presented genetic map of C. vulgaris can now serve as a basis for further molecular marker selection and map-based cloning of the candidate gene encoding the unique flower architecture of C. vulgaris bud-bloomers. © 2013 Behrend et al.; licensee BioMed Central Ltd.

Relevance: 30.00%

Abstract:

Pitch Estimation, also known as Fundamental Frequency (F0) estimation, has been a popular research topic for many years and is still actively investigated. The goal of Pitch Estimation is to find the pitch or fundamental frequency of a digital recording of speech or musical notes. It plays an important role because it is the key to identifying which notes are being played and at what time. Pitch Estimation of real instruments is a very hard task. Each instrument has its own physical characteristics, which are reflected in different spectral characteristics. Furthermore, recording conditions can vary from studio to studio, and background noise must be considered. This dissertation presents a novel approach to the problem of Pitch Estimation, using Cartesian Genetic Programming (CGP). We take advantage of evolutionary algorithms, in particular CGP, to explore and evolve complex mathematical functions that act as classifiers. These classifiers are used to identify the pitches of piano notes in an audio signal. To help us encode the problem, we built a highly flexible CGP Toolbox, generic enough to encode different kinds of programs. The encoded evolutionary algorithm is the one known as 1 + λ, and we can choose the value of λ. The toolbox is very simple to use. Settings such as the mutation probability and the number of runs and generations are configurable. The Cartesian representation of CGP can take multiple forms and is able to encode function parameters. It can handle different types of fitness function (minimization of f(x) and maximization of f(x)) and has a useful system of callbacks. We trained 61 classifiers corresponding to 61 piano notes. A training set of audio signals was used for each classifier: half were signals with the same pitch as the classifier (true positive signals) and the other half were signals with different pitches (true negative signals). F-measure was used as the fitness function.
Signals with the same pitch as the classifier that were correctly identified by the classifier count as true positives. Signals with the same pitch as the classifier that were not identified by the classifier count as false negatives. Signals with a different pitch from the classifier that were not identified by the classifier count as true negatives. Signals with a different pitch from the classifier that were identified by the classifier count as false positives. Our first approach was to evolve classifiers for identifying artificial signals created by mathematical functions: sine, sawtooth and square waves. Our function set is basically composed of filtering operations on vectors and arithmetic operations with constants and vectors. All the classifiers correctly identified true positive signals and did not identify true negative signals. We then moved to real audio recordings. For testing the classifiers, we picked audio signals different from the ones used during the training phase. For a first approach, the results obtained were very promising but could be improved. We made slight changes to our approach, and the number of false positives fell by 33% compared to the first approach. We then applied the evolved classifiers to polyphonic audio signals, and the results indicate that our approach is a good starting point for addressing the problem of Pitch Estimation.
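
The F-measure fitness and the 1 + λ selection scheme described above can be illustrated with the following minimal sketch. It is not the dissertation's CGP Toolbox: the names `f_measure` and `one_plus_lambda`, the default λ of 4, the acceptance of equal-fitness offspring, and the toy bit-string problem are all assumptions made for illustration.

```python
import random


def f_measure(tp, fp, fn, beta=1.0):
    """F-measure from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def one_plus_lambda(parent, mutate, fitness, lam=4, generations=100):
    """(1 + lambda) evolution strategy for a fitness that is maximised."""
    best, best_fit = parent, fitness(parent)
    for _ in range(generations):
        # All lam offspring are mutated copies of the current parent.
        offspring = [mutate(best) for _ in range(lam)]
        for child in offspring:
            child_fit = fitness(child)
            # Ties are accepted (an assumption here) so the search can drift
            # across genotypes of equal fitness.
            if child_fit >= best_fit:
                best, best_fit = child, child_fit
    return best, best_fit


# Toy stand-in for a classifier genome: a bit string scored by its number of ones.
rng = random.Random(0)


def flip_one_bit(bits):
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]


start = [0] * 10
best, best_fit = one_plus_lambda(start, flip_one_bit, sum, lam=4, generations=50)
print(best, best_fit)
print(f_measure(tp=8, fp=2, fn=2))
```

In a setting like the one described, `fitness` would run an evolved classifier over the training signals, count true positives, false positives and false negatives, and return `f_measure(tp, fp, fn)`.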

Relevance: 20.00%

Abstract:

The current dominance of African runners in long-distance running is an intriguing phenomenon that highlights the close relationship between genetics and physical performance. Many factors in the interaction between genotype and phenotype (e.g., high cardiorespiratory fitness, higher hemoglobin concentration, good metabolic efficiency, muscle fiber composition, enzyme profile, diet, altitude training, and psychological aspects) have been proposed in the attempt to explain the extraordinary success of these runners. Increasing evidence shows that genetics may be a determining factor in physical and athletic performance. But could this also be true for African long-distance runners? Based on this question, this brief review proposes a role for genetic factors (mitochondrial deoxyribonucleic acid, the Y chromosome, and the angiotensin-converting enzyme and alpha-actinin-3 genes) in the athletic performance observed in African runners, especially Kenyans and Ethiopians, despite their environmental constraints.