27 resultados para local sequence alignment problem

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motivation: Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function, and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-base similarity has performed poorly. Results: Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from “first passage probability distribution” to summarize statistics of ensemble averaged amino acid propensity values. In this paper, we introduce and elaborate this approach.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Subunit vaccine discovery is an accepted clinical priority. The empirical approach is time- and labor-consuming and can often end in failure. Rational information-driven approaches can overcome these limitations in a fast and efficient manner. However, informatics solutions require reliable algorithms for antigen identification. All known algorithms use sequence similarity to identify antigens. However, antigenicity may be encoded subtly in a sequence and may not be directly identifiable by sequence alignment. We propose a new alignment-independent method for antigen recognition based on the principal chemical properties of protein amino acid sequences. The method is tested by cross-validation on a training set of bacterial antigens and external validation on a test set of known antigens. The prediction accuracy is 83% for the cross-validation and 80% for the external test set. Our approach is accurate and robust, and provides a potent tool for the in silico discovery of medically relevant subunit vaccines.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background - Vaccine development in the post-genomic era often begins with the in silico screening of genome information, with the most probable protective antigens being predicted rather than requiring causative microorganisms to be grown. Despite the obvious advantages of this approach – such as speed and cost efficiency – its success remains dependent on the accuracy of antigen prediction. Most approaches use sequence alignment to identify antigens. This is problematic for several reasons. Some proteins lack obvious sequence similarity, although they may share similar structures and biological properties. The antigenicity of a sequence may be encoded in a subtle and recondite manner not amendable to direct identification by sequence alignment. The discovery of truly novel antigens will be frustrated by their lack of similarity to antigens of known provenance. To overcome the limitations of alignment-dependent methods, we propose a new alignment-free approach for antigen prediction, which is based on auto cross covariance (ACC) transformation of protein sequences into uniform vectors of principal amino acid properties. Results - Bacterial, viral and tumour protein datasets were used to derive models for prediction of whole protein antigenicity. Every set consisted of 100 known antigens and 100 non-antigens. The derived models were tested by internal leave-one-out cross-validation and external validation using test sets. An additional five training sets for each class of antigens were used to test the stability of the discrimination between antigens and non-antigens. The models performed well in both validations showing prediction accuracy of 70% to 89%. The models were implemented in a server, which we call VaxiJen. Conclusion - VaxiJen is the first server for alignment-independent prediction of protective antigens. It was developed to allow antigen classification solely based on the physicochemical properties of proteins without recourse to sequence alignment. The server can be used on its own or in combination with alignment-based prediction methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights into the domain which can be helpful in developing powerful models but they need a modelling framework that helps them to use these insights. Data visualisation is an effective technique for presenting data and requiring feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and it also lacks involvement of the domain experts to guide a meaningful segmentation of the input space. In this paper we addresses this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then further used to develop effective local regression models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We study a variation of the graph coloring problem on random graphs of finite average connectivity. Given the number of colors, we aim to maximize the number of different colors at neighboring vertices (i.e. one edge distance) of any vertex. Two efficient algorithms, belief propagation and Walksat are adapted to carry out this task. We present experimental results based on two types of random graphs for different system sizes and identify the critical value of the connectivity for the algorithms to find a perfect solution. The problem and the suggested algorithms have practical relevance since various applications, such as distributed storage, can be mapped onto this problem.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We study the equilibrium states of energy functions involving a large set of real variables, defined on the links of sparsely connected networks, and interacting at the network nodes, using the cavity and replica methods. When applied to the representative problem of network resource allocation, an efficient distributed algorithm is devised, with simulations showing full agreement with theory. Scaling properties with the network connectivity and the resource availability are found. © 2006 The American Physical Society.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Purpose – A binary integer programming model for the simple assembly line balancing problem (SALBP), which is well known as SALBP-1, was formulated more than 30 years ago. Since then, a number of researchers have extended the model for the variants of assembly line balancing problem.The model is still prevalent nowadays mainly because of the lower and upper bounds on task assignment. These properties avoid significant increase of decision variables. The purpose of this paper is to use an example to show that the model may lead to a confusing solution. Design/methodology/approach – The paper provides a remedial constraint set for the model to rectify the disordered sequence problem. Findings – The paper presents proof that the assembly line balancing model formulated by Patterson and Albracht may lead to a confusing solution. Originality/value – No one previously has found that the commonly used model is incorrect.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper formulates several mathematical models for determining the optimal sequence of component placements and assignment of component types to feeders simultaneously or the integrated scheduling problem for a type of surface mount technology placement machines, called the sequential pick-andplace (PAP) machine. A PAP machine has multiple stationary feeders storing components, a stationary working table holding a printed circuit board (PCB), and a movable placement head to pick up components from feeders and place them to a board. The objective of integrated problem is to minimize the total distance traveled by the placement head. Two integer nonlinear programming models are formulated first. Then, each of them is equivalently converted into an integer linear type. The models for the integrated problem are verified by two commercial packages. In addition, a hybrid genetic algorithm previously developed by the authors is adopted to solve the models. The algorithm not only generates the optimal solutions quickly for small-sized problems, but also outperforms the genetic algorithms developed by other researchers in terms of total traveling distance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A chip shooter machine in printed circuit board (PCB) assembly has three movable mechanisms: an X-Y table carrying a PCB, a feeder carrier with several feeders holding components and a rotary turret with multiple assembly heads to pick up and place components. In order to get the minimal placement or assembly time for a PCB on the machine, all the components on the board should be placed in a perfect sequence, and the components should be set up on a right feeder, or feeders since two feeders can hold the same type of components, and additionally, the assembly head should retrieve or pick up a component from a right feeder. The entire problem is very complicated, and this paper presents a genetic algorithm approach to tackle it.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper formulates a logistics distribution problem as the multi-depot travelling salesman problem (MDTSP). The decision makers not only have to determine the travelling sequence of the salesman for delivering finished products from a warehouse or depot to a customer, but also need to determine which depot stores which type of products so that the total travelling distance is minimised. The MDTSP is similar to the combination of the travelling salesman and quadratic assignment problems. In this paper, the two individual hard problems or models are formulated first. Then, the problems are integrated together, that is, the MDTSP. The MDTSP is constructed as both integer nonlinear and linear programming models. After formulating the models, we verify the integrated models using commercial packages, and most importantly, investigate whether an iterative approach, that is, solving the individual models repeatedly, can generate an optimal solution to the MDTSP. Copyright © 2006 Inderscience Enterprises Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Edges are key points of information in visual scenes. One important class of models supposes that edges correspond to the steepest parts of the luminance profile, implying that they can be found as peaks and troughs in the response of a gradient (1st derivative) filter, or as zero-crossings in the 2nd derivative (ZCs). We tested those ideas using a stimulus that has no local peaks of gradient and no ZCs, at any scale. The stimulus profile is analogous to the Mach ramp, but it is the luminance gradient (not the absolute luminance) that increases as a linear ramp between two plateaux; the luminance profile is a blurred triangle-wave. For all image-blurs tested, observers marked edges at or close to the corner points in the gradient profile, even though these were not gradient maxima. These Mach edges correspond to peaks and troughs in the 3rd derivative. Thus Mach edges are inconsistent with many standard edge-detection schemes, but are nicely predicted by a recent model that finds edge points with a 2-stage sequence of 1st then 2nd derivative operators, each followed by a half-wave rectifier.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article examines the adoption, by the New Labour government, of a mixed communities approach to the renewal of disadvantaged neighbourhoods in England. It argues that while there are continuities with previous policy, the new approach represents a more neoliberal policy turn in three respects: its identification of concentrated poverty as the problem; its faith in market-led regeneration; and its alignment with a new urban policy agenda in which cities are gentrified and remodelled as sites for capital accumulation through entrepreneurial local governance. The article then draws on evidence from the early phases of the evaluation of the mixed community demonstration projects to explore how the new policy approach is playing out at a local level, where it is layered upon existing policies, politics and institutional relationships. Tensions between neighbourhood and strategic interests, community and capital are evident as the local projects attempt neighbourhood transformation, while seeking to protect the rights and interests of existing residents. Extensive community consultation efforts run parallel with emergent governance structures, in which local state and capital interests combine and communities may effectively be disempowered. Policies and structures are still evolving and it is not yet entirely clear how these tensions will be resolved, especially in the light of a collapsing housing market, increased poverty and demand for affordable housing, and a shortage of private investment.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This research is concerned with the development of distributed real-time systems, in which software is used for the control of concurrent physical processes. These distributed control systems are required to periodically coordinate the operation of several autonomous physical processes, with the property of an atomic action. The implementation of this coordination must be fault-tolerant if the integrity of the system is to be maintained in the presence of processor or communication failures. Commit protocols have been widely used to provide this type of atomicity and ensure consistency in distributed computer systems. The objective of this research is the development of a class of robust commit protocols, applicable to the coordination of distributed real-time control systems. Extended forms of the standard two phase commit protocol, that provides fault-tolerant and real-time behaviour, were developed. Petri nets are used for the design of the distributed controllers, and to embed the commit protocol models within these controller designs. This composition of controller and protocol model allows the analysis of the complete system in a unified manner. A common problem for Petri net based techniques is that of state space explosion, a modular approach to both the design and analysis would help cope with this problem. Although extensions to Petri nets that allow module construction exist, generally the modularisation is restricted to the specification, and analysis must be performed on the (flat) detailed net. The Petri net designs for the type of distributed systems considered in this research are both large and complex. The top down, bottom up and hybrid synthesis techniques that are used to model large systems in Petri nets are considered. A hybrid approach to Petri net design for a restricted class of communicating processes is developed. Designs produced using this hybrid approach are modular and allow re-use of verified modules. In order to use this form of modular analysis, it is necessary to project an equivalent but reduced behaviour on the modules used. These projections conceal events local to modules that are not essential for the purpose of analysis. To generate the external behaviour, each firing sequence of the subnet is replaced by an atomic transition internal to the module, and the firing of these transitions transforms the input and output markings of the module. Thus local events are concealed through the projection of the external behaviour of modules. This hybrid design approach preserves properties of interest, such as boundedness and liveness, while the systematic concealment of local events allows the management of state space. The approach presented in this research is particularly suited to distributed systems, as the underlying communication model is used as the basis for the interconnection of modules in the design procedure. This hybrid approach is applied to Petri net based design and analysis of distributed controllers for two industrial applications that incorporate the robust, real-time commit protocols developed. Temporal Petri nets, which combine Petri nets and temporal logic, are used to capture and verify causal and temporal aspects of the designs in a unified manner.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We conducted a detailed study of a case of linguistic talent in the context of autism spectrum disorder, specifically Asperger syndrome. I.A. displays language strengths at the level of morphology and syntax. Yet, despite this grammar advantage, processing of figurative language and inferencing based on context presents a problem for him. The morphology advantage for I.A. is consistent with the weak central coherence (WCC) account of autism. From this account, the presence of a local processing bias is evident in the ways in which autistic individuals solve common problems, such as assessing similarities between objects and finding common patterns, and may therefore provide an advantage in some cognitive tasks compared to typical individuals. We extend the WCC account to language and provide evidence for a connection between the local processing bias and the acquisition of morphology and grammar.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The G-protein coupled receptors--or GPCRs--comprise simultaneously one of the largest and one of the most multi-functional protein families known to modern-day molecular bioscience. From a drug discovery and pharmaceutical industry perspective, the GPCRs constitute one of the most commercially and economically important groups of proteins known. The GPCRs undertake numerous vital metabolic functions and interact with a hugely diverse range of small and large ligands. Many different methodologies have been developed to efficiently and accurately classify the GPCRs. These range from motif-based techniques to machine learning as well as a variety of alignment-free techniques based on the physiochemical properties of sequences. We review here the available methodologies for the classification of GPCRs. Part of this work focuses on how we have tried to build the intrinsically hierarchical nature of sequence relations, implicit within the family, into an adaptive approach to classification. Importantly, we also allude to some of the key innate problems in developing an effective approach to classifying the GPCRs: the lack of sequence similarity between the six classes that comprise the GPCR family and the low sequence similarity to other family members evinced by many newly revealed members of the family.