998 results for Algorithms genetics
Abstract:
Distributed Genetic Algorithms (DGAs) designed for the Internet must take its high communication cost into account. For island model GAs, the migration topology has a major impact on DGA performance. This paper describes and evaluates an adaptive migration topology optimizer that keeps the communication load low while maintaining high solution quality. Experiments on benchmark problems show that the optimized topology outperforms static or random topologies of the same degree of connectivity. The applicability of the method to real-world problems is demonstrated on a hard optimization problem in VLSI design.
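To make the terms "island model" and "migration topology" concrete, here is a minimal sketch, assuming a static ring topology and a toy OneMax fitness function. It does not implement the adaptive topology optimizer the paper evaluates; the function names, parameter values and the ring itself are illustrative assumptions only.

```python
import random

def fitness(bits):
    # Toy OneMax objective (assumption): count the 1-bits.
    return sum(bits)

def evolve_island(population, rng, mutation_rate=0.02):
    """One generation: tournament selection, one-point crossover, bit-flip mutation."""
    def pick():
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    children = []
    for _ in range(len(population)):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
        children.append(child)
    return children

def migrate(islands, topology):
    """Copy each island's best individual over the worst individual of each neighbour."""
    bests = [max(pop, key=fitness) for pop in islands]
    for source, neighbours in topology.items():
        for dest in neighbours:
            pop = islands[dest]
            worst = min(range(len(pop)), key=lambda i: fitness(pop[i]))
            pop[worst] = list(bests[source])

def run(n_islands=4, pop_size=20, n_bits=30, generations=40, interval=5, seed=0):
    rng = random.Random(seed)
    islands = [
        [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
        for _ in range(n_islands)
    ]
    # Static ring (assumption): island i sends migrants only to island i + 1.
    ring = {i: [(i + 1) % n_islands] for i in range(n_islands)}
    for gen in range(1, generations + 1):
        islands = [evolve_island(pop, rng) for pop in islands]
        if gen % interval == 0:
            migrate(islands, ring)
    return max(fitness(ind) for pop in islands for ind in pop)

best = run()
```

The `topology` dictionary is the only place where connectivity is encoded, so a different static, random or adaptively chosen topology could be substituted by changing that mapping alone. Each migration event costs one transferred individual per edge, which is the communication load the abstract refers to.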
Abstract:
Advances in algorithms for approximate sampling from a multivariable target function have led to solutions to challenging statistical inference problems that would otherwise not be considered by the applied scientist. Such sampling algorithms are particularly relevant to Bayesian statistics, since the target function is the posterior distribution of the unobservables given the observables. In this thesis we develop, adapt and apply Bayesian algorithms, whilst addressing substantive applied problems in biology and medicine as well as other applications. For an increasing number of high-impact research problems, the primary models of interest are often sufficiently complex that the likelihood function is computationally intractable. Rather than discard these models in favour of inferior alternatives, a class of Bayesian "likelihood-free" techniques (often termed approximate Bayesian computation (ABC)) has emerged in the last few years, which avoids direct likelihood computation through repeated sampling of data from the model and comparing observed and simulated summary statistics. In Part I of this thesis we utilise sequential Monte Carlo (SMC) methodology to develop new algorithms for ABC that are more efficient in terms of the number of model simulations required and are almost black-box since very little algorithmic tuning is required. In addition, we address the issue of deriving appropriate summary statistics to use within ABC via a goodness-of-fit statistic and indirect inference. Another important problem in statistics is the design of experiments, that is, how one should select the values of the controllable variables in order to achieve some design goal. The presence of parameter and/or model uncertainty is a computational obstacle when designing experiments, but it can lead to inefficient designs if not accounted for correctly. The Bayesian framework accommodates such uncertainties in a coherent way. 
If the amount of uncertainty is substantial, it can be of interest to perform adaptive designs in order to accrue information to make better decisions about future design points. This is of particular interest if the data can be collected sequentially. In a sense, the current posterior distribution becomes the new prior distribution for the next design decision. Part II of this thesis creates new algorithms for Bayesian sequential design to accommodate parameter and model uncertainty using SMC. The algorithms are substantially faster than previous approaches, allowing the simulation properties of various design utilities to be investigated in a more timely manner. Furthermore, the approach offers convenient estimation of Bayesian utilities and other quantities that are particularly relevant in the presence of model uncertainty. Finally, Part III of this thesis tackles a substantive medical problem. A neurological disorder known as motor neuron disease (MND) progressively causes motor neurons to no longer have the ability to innervate the muscle fibres, causing the muscles to eventually waste away. When this occurs, the motor unit effectively ‘dies’. There is no cure for MND, and fatality often results from a lack of muscle strength to breathe. The prognosis for many forms of MND (particularly amyotrophic lateral sclerosis (ALS)) is particularly poor, with patients usually only surviving a small number of years after the initial onset of disease. Measuring the progress of diseases of the motor units, such as ALS, is a challenge for clinical neurologists. Motor unit number estimation (MUNE) is an attempt to directly assess underlying motor unit loss, rather than relying on indirect techniques such as muscle strength assessment, which generally is unable to detect progression due to the body’s natural attempts at compensation. 
Part III of this thesis builds upon a previous Bayesian technique, which develops a sophisticated statistical model that takes into account physiological information about motor unit activation and various sources of uncertainties. More specifically, we develop a more reliable MUNE method by applying marginalisation over latent variables in order to improve the performance of a previously developed reversible jump Markov chain Monte Carlo sampler. We make other subtle changes to the model and algorithm to improve the robustness of the approach.
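The likelihood-free idea described in Part I (simulate data from the model, then compare observed and simulated summary statistics) can be illustrated with the simplest ABC variant, plain rejection sampling. This is a sketch under stated assumptions, not the SMC-based algorithms developed in the thesis: the toy Gaussian model, the uniform prior, the tolerance and all function names are hypothetical.

```python
import random

def abc_rejection(observed_summary, simulate, prior_sample, distance,
                  tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated summary lies within `tolerance` of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                 # draw a parameter from the prior
        simulated_summary = simulate(theta, rng)  # simulate data, reduce to a summary
        if distance(simulated_summary, observed_summary) <= tolerance:
            accepted.append(theta)                # no likelihood evaluation is needed
    return accepted

# Toy model (assumption): 20 observations from Normal(theta, 1), summarised by their mean.
def prior_sample(rng):
    return rng.uniform(-5.0, 5.0)

def simulate(theta, rng):
    data = [rng.gauss(theta, 1.0) for _ in range(20)]
    return sum(data) / len(data)

def distance(a, b):
    return abs(a - b)

posterior_draws = abc_rejection(
    observed_summary=2.0,
    simulate=simulate,
    prior_sample=prior_sample,
    distance=distance,
    tolerance=0.2,
    n_draws=5000,
)
posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

Most prior draws are rejected here, which is the inefficiency that motivates sequential schemes such as SMC: they concentrate simulations in regions of parameter space that are already known to produce summaries close to the observed one.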
Abstract:
Chatrooms, for example Internet Relay Chat, are generally multi-user, multi-channel and multi-server chat systems which run over the Internet and provide a protocol for real-time text-based conferencing between users all over the world. While a well-trained human observer is able to understand who is chatting with whom, there are no efficient and accurate automated tools to determine the groups of users conversing with each other. A precursor to analysing evolving cyber-social phenomena is to first determine what the conversations are and which groups of chatters are involved in each conversation. We consider this problem in this paper. We propose an algorithm to discover all groups of users that are engaged in conversation. Our algorithms are based on a statistical model of a chatroom that is founded on our experience with real chatrooms. Our approach does not require any semantic analysis of the conversations; rather, it is based purely on the statistical information contained in the sequence of posts. We improve the accuracy by applying some graph algorithms to clean the statistical information. We present some experimental results which indicate that one can automatically determine the conversing groups in a chatroom purely on the basis of statistical analysis.
Abstract:
The main aim of this paper is to describe an adaptive re-planning algorithm, based on a Rapidly-exploring Random Tree (RRT) and Game Theory, that produces an efficient, collision-free, obstacle-adaptive Mission Path Planner for Search and Rescue (SAR) missions. This will provide UAV autopilots and flight computers with the capability to autonomously avoid static obstacles and No Fly Zones (NFZs) through dynamic adaptive path re-planning. The methods and algorithms produce optimal collision-free paths and can be integrated into a decision aid tool and UAV autopilots.
Abstract:
This PhD study has examined the population genetics of the Russian wheat aphid (RWA, Diuraphis noxia), one of the world’s most invasive agricultural pests, throughout its native and introduced global range. Firstly, this study investigated the geographic distribution of genetic diversity within and among RWA populations in western China. Analysis of mitochondrial data from 18 sites provided evidence for the long-term existence and expansion of RWAs in western China. The results refute the hypothesis that RWA is an exotic species only present in China since 1975. The estimated date of RWA expansion throughout western China coincides with the debut of wheat domestication and cultivation practices in western Asia in the Holocene. It is concluded that western China represents the limit of the far eastern native range of this species. Analysis of microsatellite data indicated high contemporary gene flow among northern populations in western China, while clear geographic isolation between northern and southern populations was identified across the Tianshan mountain range and extensive desert regions. Secondly, this study analyzed the worldwide pathway of invasion using both microsatellite and endosymbiont genetic data. Individual RWAs were obtained from native populations in Central Asia and the Middle East and invasive populations in Africa and the Americas. Results indicated two pathways of RWA invasion from 1) Syria in the Middle East to North Africa and 2) Turkey to South Africa, Mexico and then North and South America. Very little clone diversity was identified among invasive populations suggesting that a limited founder event occurred together with predominantly asexual reproduction and rapid population expansion. The most likely explanation for the rapid spread (within two years) from South Africa to the New World is by human movement, probably as a result of the transfer of wheat breeding material. 
Furthermore, the mitochondrial data revealed the presence of a universal haplotype and it is proposed that this haplotype is representative of a wheat associated super-clone that has gained dominance worldwide as a result of the widespread planting of domesticated wheat. Finally, this study examined salivary gland gene diversity to determine whether a functional basis for RWA invasiveness could be identified. Peroxidase DNA sequence data were obtained for a selection of worldwide RWA samples. Results demonstrated that most native populations were polymorphic while invasive populations were monomorphic, supporting previous conclusions relating to demographic founder effects in invasive populations. Purifying selection most likely explains the existence of a universal allele present in Middle Eastern populations, while balancing selection was evident in East Asian populations. Selection acting on the peroxidase gene may provide an allele-dependent advantage linked to the successful establishment of RWAs on wheat, and ultimately their invasion potential. In conclusion, this study is the most comprehensive molecular genetic investigation of RWA population genetics undertaken to date and provides significant insights into the source and pathway of global invasion and the potential existence of a wheat-adapted genotype that has colonised major wheat growing countries worldwide except for Australia. This research has major biosecurity implications for Australia’s grain industry.
Abstract:
The emergence of pseudo-marginal algorithms has led to improved computational efficiency for dealing with complex Bayesian models with latent variables. Here an unbiased estimator of the likelihood replaces the true likelihood in order to produce a Bayesian algorithm that remains on the marginal space of the model parameter (with latent variables integrated out), with a target distribution that is still the correct posterior distribution. Very efficient proposal distributions can be developed on the marginal space relative to the joint space of model parameter and latent variables. Thus pseudo-marginal algorithms tend to have substantially better mixing properties. However, for pseudo-marginal approaches to perform well, the likelihood has to be estimated rather precisely. This can be difficult to achieve in complex applications. In this paper we propose to take advantage of the multiple central processing units (CPUs) that are readily available on most standard desktop computers. Here the likelihood is estimated independently on the multiple CPUs, with the ultimate estimate of the likelihood being the average of the estimates obtained from the multiple CPUs. The estimate remains unbiased, but the variability is reduced. We compare and contrast two different technologies that allow the implementation of this idea, both of which require a negligible amount of extra programming effort. The superior performance of this idea over the standard approach is demonstrated on simulated data from a stochastic volatility model.
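The averaging step can be sketched on a toy latent-variable model where the exact likelihood is known, so unbiasedness and variance reduction can both be checked. This is an illustration under assumptions, not the paper's stochastic volatility example or either of the two technologies it compares: the model, the particle counts and the function names are hypothetical, and the "workers" are run sequentially here (in practice each call could be dispatched to a separate CPU, for instance through a process pool).

```python
import math
import random
import statistics

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def likelihood_estimate(y, theta, n_particles, rng):
    """Unbiased Monte Carlo estimate of p(y | theta) for the toy model
    z ~ Normal(theta, 1), y | z ~ Normal(z, 1), obtained by averaging p(y | z) over draws of z."""
    total = 0.0
    for _ in range(n_particles):
        z = rng.gauss(theta, 1.0)
        total += normal_pdf(y, z, 1.0)
    return total / n_particles

def averaged_estimate(y, theta, n_particles, n_workers, rng):
    """Average of independent estimates, one per (simulated) worker; still unbiased."""
    estimates = [likelihood_estimate(y, theta, n_particles, rng) for _ in range(n_workers)]
    return sum(estimates) / n_workers

rng = random.Random(1)
y, theta = 0.5, 0.0
exact = normal_pdf(y, theta, math.sqrt(2.0))  # marginally, y ~ Normal(theta, sqrt(2))

single = [averaged_estimate(y, theta, 50, 1, rng) for _ in range(400)]
pooled = [averaged_estimate(y, theta, 50, 4, rng) for _ in range(400)]

var_single = statistics.variance(single)
var_pooled = statistics.variance(pooled)
mean_pooled = statistics.mean(pooled)
```

Because the worker estimates are independent and identically distributed, averaging over four of them divides the variance by roughly four while leaving the expectation unchanged, which is the property the abstract relies on.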
Abstract:
Migraine is a common neurological disorder with a significantly heritable component. It is a complex disease and despite numerous molecular genetic studies, the exact pathogenesis causing the neurological disturbance remains poorly understood. Although several known molecular mechanisms have been associated with an increased risk for developing migraine, there remains significant scope for future studies. The majority of studies have investigated the most plausible candidate genes involved in common migraine pathogenesis utilising criteria that take into account physiological functionality in conjunction with regions of genomic association. Thus far, genes involved in neurological, vascular or hormonal pathways have been identified and investigated on this basis. Genome-wide association studies (GWAS) have helped to identify novel regions that may be associated with migraine and have aided in providing the basis for further molecular investigations. However, further studies utilising sequencing technologies are required to characterise the genetic basis for migraine.
Abstract:
Objectives: To investigate the frequency of the ACTN3 R577X polymorphism in elite endurance triathletes, and whether ACTN3 R577X is significantly associated with performance time. Design: Cross-sectional study. Methods: Saliva samples, questionnaires, and performance times were collected for 196 elite endurance athletes who participated in the 2008 Kona Ironman championship triathlon. Athletes were of predominantly North American, European, and Australian origin. A one-way analysis of variance was conducted to compare performance times between genotype groups. Multiple linear regression analysis was performed to model the effect of questionnaire variables and genotype on performance time. Genotype and allele frequencies were compared to results from different populations using the chi-square test. Results: Performance time did not significantly differ between genotype groups. Age, sex, and continent of origin were significant predictors of finishing time (age and sex: p < 5 × 10⁻⁶; continent: p = 0.003), though genotype was not. Genotype and allele frequencies obtained (RR 26.5%, RX 50.0%, XX 23.5%, R 51.5%, X 48.5%) were found to be not significantly different from Australian, Spanish, and Italian endurance athletes (p > 0.05), but were significantly different from Kenyan, Ethiopian, and Finnish endurance athletes (p < 0.01). Conclusions: Genotype and allele frequencies agreed with those reported for endurance athletes of similar ethnic origin, supporting previous findings for an association between the 577X allele and endurance. However, analysis of performance time suggests that ACTN3 does not alone influence endurance performance, or may have a complex effect on endurance performance due to a speed/endurance trade-off.
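The reported allele frequencies follow directly from the reported genotype frequencies, since each heterozygote (RX) carries one copy of each allele. The short check below reproduces that arithmetic; the function name is illustrative and only the percentages quoted in the abstract are used.

```python
def allele_frequencies(rr, rx, xx):
    """Return (R, X) allele frequencies from genotype frequencies or counts.

    Each RR individual contributes two R alleles, each RX one R and one X,
    and each XX two X alleles, so freq(R) = (RR + RX / 2) / total.
    """
    total = rr + rx + xx
    r = (rr + 0.5 * rx) / total
    return r, 1.0 - r

# Genotype percentages quoted in the abstract: RR 26.5%, RX 50.0%, XX 23.5%.
r_freq, x_freq = allele_frequencies(26.5, 50.0, 23.5)
r_percent = round(100 * r_freq, 1)
x_percent = round(100 * x_freq, 1)
```

With these inputs `r_percent` is 51.5 and `x_percent` is 48.5, matching the R and X frequencies stated in the Results.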
Abstract:
Background: Illumina's Infinium SNP BeadChips are extensively used in both small and large-scale genetic studies. A fundamental step in any analysis is the processing of raw allele A and allele B intensities from each SNP into genotype calls (AA, AB, BB). Various algorithms which make use of different statistical models are available for this task. We compare four methods (GenCall, Illuminus, GenoSNP and CRLMM) on data where the true genotypes are known in advance and data from a recently published genome-wide association study. Results: In general, differences in accuracy are relatively small between the methods evaluated, although CRLMM and GenoSNP were found to consistently outperform GenCall. The performance of Illuminus is heavily dependent on sample size, with lower no-call rates and improved accuracy as the number of samples available increases. For X chromosome SNPs, methods with sex-dependent models (Illuminus, CRLMM) perform better than methods which ignore gender information (GenCall, GenoSNP). We observe that CRLMM and GenoSNP are more accurate at calling SNPs with low minor allele frequency than GenCall or Illuminus. The sample quality metrics from each of the four methods were found to have a high level of agreement at flagging samples with unusual signal characteristics. Conclusions: CRLMM, GenoSNP and GenCall can be applied with confidence in studies of any size, as their performance was shown to be invariant to the number of samples available. Illuminus, on the other hand, requires a larger number of samples to achieve comparable levels of accuracy and its use in smaller studies (50 or fewer individuals) is not recommended.
Abstract:
Migraine is considered to be a multifactorial disorder in which genetic, environmental, and, in the case of menstrual and menstrually related migraine, hormonal events influence the phenotype. Certainly, the role of female sex hormones in migraine has been well established, yet the mechanism behind this well-known relationship remains unclear. This review focuses on the potential role of hormonally related genes in migraine, summarizes results of candidate gene studies to date, and discusses challenges and issues involved in interpreting hormone-related gene results. In light of the molecular evidence presented, we discuss future approaches for analysis with the view to elucidate the complex genetic architecture that underlies the disorder.
Abstract:
Migraine is a common complex neurological disorder with a well-known but poorly characterized genetic liability. The search for migraine susceptibility genes has been the focus of intense research. It is now believed that common migraine is not a single gene disorder, but attributable to several potentially interacting genetic variants. These variants may differ in each sufferer and interact with environmental factors to set the individual migraine threshold. This genetic liability may play an important role in the clinical heterogeneity seen in migraine and also in the variability of treatment response. This review will look at genetic loci implicated in migraine to date and consider their current or prospective role in migraine therapy. To elucidate the complex nature of migraine genetic liability, approaches that consider detailed endophenotypic profiles that encompass treatment response may provide much more relevant information than simple end diagnosis.
Abstract:
Fundamental misconceptions regarding some basic phylogenetic terminology are presented in this opinion piece. An attempt is made to point out why these misconceptions exist and what may be causing the misapplication of terminology. Clarification is provided via basic definitions and simple explanations. Differences between the scientific fields of genetics and population genetics are discussed. The appropriate use of terminology is advocated and alternative terms are proposed to eliminate one potential source of confusion. It is suggested we use 'sequence data' instead of molecular data and 'non-sequence data' instead of morphological data in the field of phylogenetics and systematics.
Abstract:
Motivation: Gene silencing, also called RNA interference, requires reliable assessment of silencer impacts. A critical task is to find matches between silencer oligomers and sites in the genome, in accordance with one-to-many matching rules (G-U matching, with provision for mismatches). Fast search algorithms are required to support silencer impact assessments in procedures for designing effective silencer sequences. Results: The article presents a matching algorithm and data structures specialized for matching searches, including a kernel procedure that addresses a Boolean version of the database task called the skyline search. Besides exact matches, the algorithm is extended to allow for the location-specific mismatches applicable in plants. Computational tests show that the algorithm is significantly faster than suffix-tree alternatives. © The Author 2010. Published by Oxford University Press. All rights reserved.
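To show what a one-to-many matching rule means in practice, the sketch below scans a target sequence with a naive sliding window in which G may pair with C or U, and U with A or G. It is a brute-force baseline for illustration only, not the article's algorithm or data structures. The RNA alphabet, the antiparallel orientation handling, the single global mismatch budget (rather than location-specific mismatches) and all names are assumptions.

```python
# Bases a silencer base may pair with, including G-U wobble pairs (assumed rule set).
PAIRS = {
    "A": {"U"},
    "C": {"G"},
    "G": {"C", "U"},
    "U": {"A", "G"},
}

def find_sites(silencer, target, max_mismatches=0):
    """Return start indices in `target` where `silencer` can pair.

    Both sequences are written 5'->3'. Pairing is antiparallel, so the
    silencer is reversed before being compared position by position.
    """
    probe = silencer[::-1]
    hits = []
    for start in range(len(target) - len(probe) + 1):
        mismatches = 0
        for offset, base in enumerate(probe):
            if target[start + offset] not in PAIRS[base]:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches: abandon this window
        else:
            hits.append(start)  # loop finished without exceeding the budget
    return hits

target = "AAGUGUCCGUACAA"
exact_hits = find_sites("GUAC", target)                    # pairing allowed by the rule set only
loose_hits = find_sites("GUAC", target, max_mismatches=1)  # tolerate one non-pairing position
```

Here `exact_hits` is `[2, 8]`: position 8 is the Watson-Crick site `GUAC`, while position 2 (`GUGU`) pairs only through two G-U wobbles. Allowing one mismatch adds position 4. The scan costs time proportional to the product of the two lengths, which is the kind of cost that specialised indexes are designed to avoid.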