26 results for variable length Markov chains
in CentAUR: Central Archive at the University of Reading - UK
Abstract:
Biological crossover occurs during the early stages of meiosis. During this process, the chromosomes undergoing crossover are synapsed together at a number of homologous sequence sections; it is within such synapsed sections that crossover occurs. The SVLC (Synapsing Variable Length Crossover) algorithm recurrently synapses homologous genetic sequences together in order of length. The genomes are considered to be flexible, with crossover only being permitted within the synapsed sections. Consequently, common sequences are automatically preserved, with only the genetic differences being exchanged, independent of the length of such differences. In addition to providing a rationale for variable length crossover, the algorithm also provides a genotypic similarity metric for variable length genomes, enabling standard niche formation techniques to be utilised. In a simple variable length test problem, the SVLC algorithm outperforms current variable length crossover techniques.
Synapsing variable length crossover: An algorithm for crossing and comparing variable length genomes
Abstract:
The Synapsing Variable Length Crossover (SVLC) algorithm provides a biologically inspired method for performing meaningful crossover between variable length genomes. In addition to providing a rationale for variable length crossover, it also provides a genotypic similarity metric for variable length genomes, enabling standard niche formation techniques to be used with variable length genomes. Unlike other variable length crossover techniques, which consider genomes to be rigid, inflexible arrays and in which some or all of the crossover points are randomly selected, the SVLC algorithm considers genomes to be flexible and chooses non-random crossover points based on the common parental sequence similarity. The SVLC algorithm recurrently "glues" or synapses homologous genetic sub-sequences together. This is done in such a way that common parental sequences are automatically preserved in the offspring, with only the genetic differences being exchanged or removed, independent of the length of such differences. In a variable length test problem, the SVLC algorithm is shown to outperform current variable length crossover techniques. The SVLC algorithm is also shown to work in a more realistic application: the evolution of a neural network controller for a robot.
Abstract:
The synapsing variable-length crossover (SVLC) algorithm provides a biologically inspired method for performing meaningful crossover between variable-length genomes. In addition to providing a rationale for variable-length crossover, it also provides a genotypic similarity metric for variable-length genomes, enabling standard niche formation techniques to be used with variable-length genomes. Unlike other variable-length crossover techniques, which consider genomes to be rigid, inflexible arrays and in which some or all of the crossover points are randomly selected, the SVLC algorithm considers genomes to be flexible and chooses non-random crossover points based on the common parental sequence similarity. The SVLC algorithm recurrently "glues" or synapses homologous genetic subsequences together. This is done in such a way that common parental sequences are automatically preserved in the offspring, with only the genetic differences being exchanged or removed, independent of the length of such differences. In a variable-length test problem, the SVLC algorithm compares favorably with current variable-length crossover techniques. The variable-length approach is further advocated by demonstrating how a variable-length genetic algorithm (GA) can obtain a high-fitness solution in fewer iterations than a traditional fixed-length GA in a two-dimensional vector approximation task.
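As a rough illustration of the synapsing idea (a sketch, not the authors' published implementation; all names are hypothetical), the Python fragment below uses difflib.SequenceMatcher to locate the common subsequences of two variable-length genomes, treats those matches as synapsed sections that must survive intact, and exchanges only the differing segments:

    import difflib
    import random

    def svlc_style_crossover(parent_a, parent_b, swap_prob=0.5):
        """Preserve common subsequences of two variable-length genomes;
        probabilistically exchange the differing segments."""
        matcher = difflib.SequenceMatcher(a=parent_a, b=parent_b)
        child_a, child_b = [], []
        for op, a0, a1, b0, b1 in matcher.get_opcodes():
            if op == "equal":                    # a synapsed (common) section
                child_a.extend(parent_a[a0:a1])
                child_b.extend(parent_b[b0:b1])
            elif random.random() < swap_prob:    # exchange the difference
                child_a.extend(parent_b[b0:b1])
                child_b.extend(parent_a[a0:a1])
            else:                                # keep the difference
                child_a.extend(parent_a[a0:a1])
                child_b.extend(parent_b[b0:b1])
        return child_a, child_b

    # Example: the shared subsequences appear unchanged in both offspring.
    a, b = list("ACGTACGGTT"), list("ACGGGTACTT")
    print(svlc_style_crossover(a, b))

On the same assumption, matcher.ratio() would serve as a crude stand-in for the genotypic similarity metric mentioned in the abstracts.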
Abstract:
Monte Carlo algorithms often aim to draw from a distribution π by simulating a Markov chain with transition kernel P such that π is invariant under P. However, there are many situations for which it is impractical or impossible to draw from the transition kernel P. For instance, this is the case with massive datasets, where it is prohibitively expensive to calculate the likelihood, and is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace P by an approximation P̂. Using theory from the stability of Markov chains, we explore a variety of situations where it is possible to quantify how 'close' the chain given by the transition kernel P̂ is to the chain given by P. We apply these results to several examples from spatial statistics and network analysis.
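One common way such an approximate kernel P̂ arises is by subsampling the data inside the likelihood. The following sketch (a minimal illustration under that assumption, not the construction analysed in the paper; all names are hypothetical) replaces the exact log-likelihood in a random-walk Metropolis step with a mini-batch estimate scaled up to the full data size:

    import numpy as np

    def approx_mh_step(theta, data, loglik, batch_size, step=0.5, rng=None):
        """One random-walk Metropolis step with a subsampled log-likelihood,
        i.e. a draw from an approximate kernel P-hat rather than P."""
        rng = rng or np.random.default_rng()
        batch = rng.choice(data, size=batch_size, replace=False)
        scale = len(data) / batch_size            # rescale to the full dataset
        proposal = theta + step * rng.standard_normal()
        log_ratio = scale * (loglik(proposal, batch) - loglik(theta, batch))
        return proposal if np.log(rng.uniform()) < log_ratio else theta

    # Example: posterior over a Gaussian mean (flat prior, known variance).
    rng = np.random.default_rng(0)
    data = rng.normal(3.0, 1.0, size=100_000)
    loglik = lambda mu, x: -0.5 * np.sum((x - mu) ** 2)
    theta = 0.0
    for _ in range(1_000):
        theta = approx_mh_step(theta, data, loglik, batch_size=500, rng=rng)
    print(theta)   # close to 3.0, up to the bias introduced by P-hat

How far the resulting chain drifts from the exact one is precisely the kind of question the stability theory referred to above is meant to answer.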
Abstract:
The cupin superfamily is a group of functionally diverse proteins that are found in all three kingdoms of life, Archaea, Eubacteria, and Eukaryota. These proteins have a characteristic signature domain comprising two histidine-containing motifs separated by an intermotif region of variable length. This domain consists of six beta strands within a conserved beta barrel structure. Most cupins, such as microbial phosphomannose isomerases (PMIs), AraC-type transcriptional regulators, and cereal oxalate oxidases (OXOs), contain only a single domain, whereas others, such as seed storage proteins and oxalate decarboxylases (OXDCs), are bi-cupins with two pairs of motifs. Although some cupins have known functions and have been characterized at the biochemical level, the majority are known only from gene cloning or sequencing projects. In this study, phylogenetic analyses were conducted on the conserved domain to investigate the evolution and structure/function relationships of cupins, with an emphasis on single-domain plant germin-like proteins (GLPs). An unrooted phylogeny of cupins from a wide spectrum of evolutionary lineages identified three main clusters: microbial PMIs, OXDCs, and plant GLPs. The sister group to the plant GLPs in the global analysis was then used to root a phylogeny of all available plant GLPs. The resulting phylogeny contained three main clades, classifying the GLPs into distinct subfamilies. It is suggested that these subfamilies correlate with functional categories, one of which contains the bifunctional barley germin that has both OXO and superoxide dismutase (SOD) activity. It is proposed that GLPs function primarily as SODs, enzymes that protect plants from the effects of oxidative stress. Closer inspection of the DNA sequence encoding the intermotif region in plant GLPs showed global conservation of thymine in the second codon position, a character associated with hydrophobic residues. Since many of these proteins are multimeric and enzymatically inactive in their monomeric state, this conservation of hydrophobicity is thought to be associated with the need to maintain the various monomer-monomer interactions. The type of structure-based predictive analysis presented in this paper is an important approach for understanding gene function and evolution in an era when genomes from a wide range of organisms are being sequenced at a rapid rate.
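The codon-position observation is straightforward to check computationally. The sketch below (a hypothetical check, not the analysis pipeline used in the study) tallies nucleotide frequencies at each codon position across a set of in-frame coding sequences; a thymine bias at position 2 would be the hydrophobicity signature noted above:

    from collections import Counter

    def codon_position_composition(sequences):
        """Tally nucleotide counts at each of the three codon positions
        across a list of in-frame coding DNA sequences."""
        counts = [Counter(), Counter(), Counter()]
        for seq in sequences:
            for i, base in enumerate(seq.upper()):
                counts[i % 3][base] += 1
        return counts

    # Toy in-frame sequences standing in for GLP intermotif regions.
    seqs = ["ATGGTTCTTATT", "ATGATCGTGCTG"]
    for pos, c in enumerate(codon_position_composition(seqs), start=1):
        total = sum(c.values())
        print(f"position {pos}:",
              {b: round(n / total, 2) for b, n in sorted(c.items())})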
Abstract:
This paper compares and contrasts, for the first time, one- and two-component gelation systems that are direct structural analogues and draws conclusions about the molecular recognition pathways that underpin fibrillar self-assembly. The new one-component systems comprise L-lysine-based dendritic headgroups covalently connected to an aliphatic diamine spacer chain via an amide bond. One-component gelators with different generations of headgroup (from first to third generation) and spacer chains of different lengths are reported. The self-assembly of these dendrimers in toluene was elucidated using thermal measurements, circular dichroism (CD) and NMR spectroscopies, scanning electron microscopy (SEM), and small-angle X-ray scattering (SAXS). The observations are compared with previous results for the analogous two-component gelation system in which the dendritic headgroups are bound to the aliphatic spacer chain noncovalently via acid-amine interactions. The one-component system is inherently a more effective gelator, partly as a consequence of the additional covalent amide groups that provide a new hydrogen bonding molecular recognition pathway, whereas the two-component analogue relies solely on intermolecular hydrogen bond interactions between the chiral dendritic headgroups. Furthermore, because these amide groups are important in the assembly process for the one-component system, the chiral information preset in the dendritic headgroups is not always transcribed into the nanoscale assembly, whereas for the two-component system, fiber formation is always accompanied by chiral ordering because the molecular recognition pathway is completely dependent on hydrogen bond interactions between well-organized chiral dendritic headgroups.
Abstract:
Numerous techniques exist which can be used for the task of behavioural analysis and recognition. Common amongst these are Bayesian networks and hidden Markov models. Although these techniques are extremely powerful and well developed, both have important limitations. By fusing them together to form Bayes-Markov chains, the advantages of both techniques can be preserved while their limitations are reduced. The Bayes-Markov technique forms the basis of a common, flexible framework for supplementing Markov chains with additional features. This results in improved user output and aids the rapid development of flexible and efficient behaviour recognition systems.
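The abstract does not spell out the fusion mechanics, but one natural reading is a Markov chain over behaviour states whose observation likelihoods come from a Bayesian (here naive-Bayes) combination of sensor features. The sketch below is a speculative illustration under that assumption only; the states, features, and probabilities are all invented:

    import numpy as np

    # Hypothetical two-state behaviour model: 0 = "idle", 1 = "active".
    TRANS = np.array([[0.9, 0.1],
                      [0.2, 0.8]])            # Markov transition matrix

    # Naive-Bayes likelihoods P(feature value | state), one 2x2 table per
    # binary feature (rows index the state, columns the observed value).
    FEATURE_LIK = [np.array([[0.8, 0.2], [0.3, 0.7]]),
                   np.array([[0.7, 0.3], [0.4, 0.6]])]

    def bayes_markov_filter(observations, belief=np.array([0.5, 0.5])):
        """Forward filtering: a Markov prediction step followed by a
        naive-Bayes update from each observed feature vector."""
        for features in observations:
            belief = TRANS.T @ belief                      # Markov predict
            for lik, value in zip(FEATURE_LIK, features):  # Bayes update
                belief = belief * lik[:, value]
            belief = belief / belief.sum()
        return belief

    print(bayes_markov_filter([(1, 1), (1, 0), (1, 1)]))   # P(state | data)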
Abstract:
In this paper we consider hybrid (fast stochastic approximation and deterministic refinement) algorithms for Matrix Inversion (MI) and Solving Systems of Linear Algebraic Equations (SLAE). Monte Carlo methods are used for the stochastic approximation, since it is known that they are very efficient at finding a quick rough approximation of an element or a row of the inverse matrix, or of a component of the solution vector. We show how the stochastic approximation of the MI can be combined with a deterministic refinement procedure to obtain the MI with the required precision, and how to then solve the SLAE using the MI. We employ a splitting A = D − C of a given non-singular matrix A, where D is a diagonally dominant matrix and matrix C is a diagonal matrix. In our algorithms for solving the SLAE and the MI, different choices of D can be considered in order to control the norm of the matrix T = D⁻¹C of the resulting SLAE and to minimize the number of Markov chains required to reach a given precision. Further, we run the algorithms on a mini-Grid and investigate their efficiency depending on the granularity. Corresponding experimental results are presented.
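To make the hybrid scheme concrete, the sketch below (a toy illustration, not the paper's mini-Grid implementation; it uses the simplest diagonal choice of D rather than a general diagonally dominant one) estimates the rows of (I − T)⁻¹ by sampling random walks over the Neumann series for T = D⁻¹C and then polishes the rough inverse with deterministic Newton-Schulz refinement:

    import numpy as np

    def mc_neumann_row(T, i, n_walks=2000, p_stop=0.2, rng=None):
        """Monte Carlo estimate of row i of (I - T)^-1 = sum_k T^k,
        using random walks with absorption probability p_stop."""
        rng = rng or np.random.default_rng()
        n = T.shape[0]
        est = np.zeros(n)
        for _ in range(n_walks):
            state, weight = i, 1.0
            est[state] += weight                 # the k = 0 term
            while rng.uniform() > p_stop:
                row = np.abs(T[state])
                if row.sum() == 0.0:
                    break                        # no onward transitions
                probs = row / row.sum()
                nxt = rng.choice(n, p=probs)
                weight *= T[state, nxt] / (probs[nxt] * (1.0 - p_stop))
                state = nxt
                est[state] += weight
        return est / n_walks

    def hybrid_inverse(A, refine_steps=10, rng=None):
        """Rough Monte Carlo inverse followed by deterministic
        Newton-Schulz refinement, X <- X (2I - A X)."""
        D = np.diag(np.diag(A))                  # one simple choice of D
        T = np.linalg.solve(D, D - A)            # T = D^-1 C with C = D - A
        rough = np.array([mc_neumann_row(T, i, rng=rng)
                          for i in range(A.shape[0])])
        X = rough @ np.linalg.inv(D)             # A^-1 ~ (I - T)^-1 D^-1
        I = np.eye(A.shape[0])
        for _ in range(refine_steps):            # deterministic refinement
            X = X @ (2.0 * I - A @ X)
        return X

    A = np.array([[4.0, 1.0], [1.0, 3.0]])
    print(hybrid_inverse(A, rng=np.random.default_rng(1)) @ A)  # ~ identity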
Abstract:
A reconfigurable scalar quantiser capable of accepting n-bit input data is presented. The data length n can be varied in the range 1... N-1 under partial run-time reconfiguration (p-RTR). Issues such as the improvement in throughput obtained by using p-RTR rather than RTR for data of variable length are considered. The quantiser design, referred to as the priority quantiser (PQ), is then compared against a direct design of the quantiser (DIQ). It is shown that, for practical quantiser sizes, the PQ makes better use of area when both are targeted at the same FPGA. Other benefits are also identified.
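For background only (a generic illustration, not the PQ or DIQ architectures from the paper), a scalar quantiser over variable-width inputs can be viewed as comparing an n-bit sample against a set of thresholds derived from the current word length:

    from bisect import bisect_right

    def scalar_quantise(sample, n_bits, levels):
        """Map an n-bit input sample to the index of the uniform
        quantisation interval it falls in, for any input width n."""
        full_scale = (1 << n_bits) - 1       # maximum value of an n-bit word
        thresholds = [full_scale * (k + 1) / levels for k in range(levels - 1)]
        return bisect_right(thresholds, sample)

    # The same quantiser handles different data lengths n.
    for n in (4, 8, 12):
        x = (1 << n) // 3                    # a sample at ~1/3 of full scale
        print(f"n={n:2d}: sample {x:5d} -> level {scalar_quantise(x, n, 8)}")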
Abstract:
The UK has a target for an 80% reduction in CO2 emissions by 2050 from a 1990 base. Domestic energy use accounts for around 30% of total emissions. This paper presents a comprehensive review of existing models and modelling techniques and indicates how they might be improved by considering individual buying behaviour. Macro (top-down) and micro (bottom-up) models have been reviewed and analysed. It is found that bottom-up models can project technology diffusion owing to their higher resolution, but that existing bottom-up models are weak at capturing individual green-technology buying behaviour. Consequently, Markov chains, neural networks and agent-based modelling are proposed as possible methods of incorporating buying behaviour within a domestic energy forecast model. Among the three methods, agent-based models are found to be the most promising, although a successful agent approach requires large amounts of input data. A prototype agent-based model has been developed and tested; it demonstrates the feasibility of the approach and its promise as a means of predicting the effectiveness of various policy measures.
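To indicate what such an agent-based formulation might look like (a toy sketch, not the paper's prototype; every parameter here is invented), the code below simulates households that adopt a green technology once peer adoption plus a policy subsidy crosses an individual threshold:

    import random

    def simulate_adoption(n_agents=1000, years=10, subsidy=0.1, seed=0):
        """Toy agent-based diffusion model: each household adopts when
        social influence plus a policy subsidy exceeds its threshold."""
        rng = random.Random(seed)
        thresholds = [rng.uniform(0.0, 0.9) for _ in range(n_agents)]
        adopted = [False] * n_agents
        history = []
        for _ in range(years):
            share = sum(adopted) / n_agents      # current adoption share
            for i in range(n_agents):
                if not adopted[i] and share + subsidy > thresholds[i]:
                    adopted[i] = True
            history.append(sum(adopted) / n_agents)
        return history

    # Compare two policy scenarios by their adoption trajectories.
    print("low subsidy :",
          [round(x, 2) for x in simulate_adoption(subsidy=0.05)])
    print("high subsidy:",
          [round(x, 2) for x in simulate_adoption(subsidy=0.20)])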
Abstract:
Crop specimens archived in herbaria and old seed collections represent valuable resources for the analysis of plant genetic diversity and crop domestication. The ability to extract ancient DNA (aDNA) from such samples has recently allowed molecular genetic investigations to be undertaken in ancient materials. While analyses of aDNA initially focused on markers which occur in multiple copies, such as the internal transcribed spacer region (ITS) within ribosomal DNA, and on those requiring amplification of short DNA regions of variable length, such as simple sequence repeats (SSRs), emphasis is now moving towards the genotyping of single nucleotide polymorphisms (SNPs), traditionally undertaken in aDNA by Sanger sequencing. Here, using a panel of barley aDNA samples previously surveyed by Sanger sequencing for putative causative SNPs within the flowering-time gene PPD-H1, we assess the utility of the Kompetitive Allele Specific PCR (KASP) genotyping platform for aDNA analysis. We find KASP to outperform Sanger sequencing in the genotyping of aDNA samples (78% versus 61% success, respectively), as well as being robust to contamination. The small template size (≥46 bp) and one-step, closed-tube amplification/genotyping process make this platform ideally suited to the genotypic analysis of aDNA, a process which is often hampered by template DNA degradation and sample cross-contamination. Such attributes, as well as its flexibility of use and relatively low cost, make KASP particularly relevant to the genetic analysis of aDNA samples. Furthermore, KASP provides a common platform for the genotyping and analysis of corresponding SNPs in ancient, landrace and modern plant materials. The extended haplotype analysis of PPD-H1 undertaken here (allelic variation at this gene is thought to be important for the spread of domestication and local adaptation) provides further resolution to the previously identified geographic cline of flowering-time allele distribution, illustrating how KASP can aid genetic analyses of aDNA from plant species. We further demonstrate the utility of KASP by genotyping ten additional genetic markers diagnostic for morphological traits in barley, shedding light on the phenotypic traits, alleles and allele combinations present in these unviable ancient specimens, as well as on their geographic distributions.
Abstract:
Searching for the optimum tap-length that best balances the complexity and steady-state performance of an adaptive filter has attracted attention recently. Among the algorithms in the literature, two, namely the segmented filter (SF) and gradient descent (GD) algorithms, are of particular interest as they can search for the optimum tap-length quickly. In this paper, we first carefully compare the SF and GD algorithms and show that the two are equivalent in performance under some constraints, but that each has advantages and disadvantages relative to the other. We then propose an improved variable tap-length algorithm using the concept of the pseudo-fractional tap-length (FT). By updating the tap-length with instantaneous errors in a style similar to that used in the stochastic gradient [or least mean squares (LMS)] algorithm, the proposed FT algorithm not only retains the advantages of both the SF and GD algorithms but also has significantly lower complexity than existing algorithms. Both a performance analysis and numerical simulations are given to verify the proposed algorithm.
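To convey the flavour of a pseudo-fractional tap-length update (a hedged reconstruction in the spirit of the FT idea, not the paper's exact recursion; the constants are invented), the sketch below runs LMS with the active tap-length set by a real-valued variable that grows when the last delta taps clearly reduce the squared error and otherwise leaks downwards:

    import numpy as np

    def ft_lms(x, d, l_init=8, l_max=64, delta=4, mu=0.01,
               alpha=0.01, gamma=0.1):
        """LMS with a pseudo-fractional tap-length l_f: the integer
        tap-length is floor(l_f), which grows when the last `delta`
        taps noticeably reduce the error and shrinks via the leak alpha."""
        w = np.zeros(l_max)                      # full-length weight buffer
        l_f = float(l_init)
        for n in range(l_max, len(x)):
            N = int(l_f)
            u = x[n - l_max + 1:n + 1][::-1]     # most recent sample first
            e_full = d[n] - w[:N] @ u[:N]        # error with all N taps
            e_short = d[n] - w[:N - delta] @ u[:N - delta]
            w[:N] += mu * e_full * u[:N]         # standard LMS update
            # fractional tap-length: leak down, grow on clear evidence
            l_f = min(l_max, max(delta + 1.0,
                                 l_f - alpha + gamma * (e_short**2 - e_full**2)))
        return int(l_f), w

    # Identify a 20-tap FIR channel from noisy input/output data.
    rng = np.random.default_rng(2)
    h = rng.standard_normal(20)
    x = rng.standard_normal(20_000)
    d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
    print("converged tap-length:", ft_lms(x, d)[0])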