43 resultados para genetic algorithm-kernel partial least squares
Resumo:
Quantitative structure-activity relationship (QSAR) analysis is a cornerstone of modern informatics. Predictive computational models of peptide-major histocompatibility complex (MHC)-binding affinity based on QSAR technology have now become important components of modern computational immunovaccinology. Historically, such approaches have been built around semiqualitative, classification methods, but these are now giving way to quantitative regression methods. We review three methods--a 2D-QSAR additive-partial least squares (PLS) and a 3D-QSAR comparative molecular similarity index analysis (CoMSIA) method--which can identify the sequence dependence of peptide-binding specificity for various class I MHC alleles from the reported binding affinities (IC50) of peptide sets. The third method is an iterative self-consistent (ISC) PLS-based additive method, which is a recently developed extension to the additive method for the affinity prediction of class II peptides. The QSAR methods presented here have established themselves as immunoinformatic techniques complementary to existing methodology, useful in the quantitative prediction of binding affinity: current methods for the in silico identification of T-cell epitopes (which form the basis of many vaccines, diagnostics, and reagents) rely on the accurate computational prediction of peptide-MHC affinity. We have reviewed various human and mouse class I and class II allele models. Studied alleles comprise HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3101, HLA-A*6801, HLA-A*6802, HLA-B*3501, H2-K(k), H2-K(b), H2-D(b) HLA-DRB1*0101, HLA-DRB1*0401, HLA-DRB1*0701, I-A(b), I-A(d), I-A(k), I-A(S), I-E(d), and I-E(k). In this chapter we show a step-by-step guide into predicting the reliability and the resulting models to represent an advance on existing methods. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made are freely available online at the URL http://www.jenner.ac.uk/MHCPred.
Resumo:
With its implications for vaccine discovery, the accurate prediction of T cell epitopes is one of the key aspirations of computational vaccinology. We have developed a robust multivariate statistical method, based on partial least squares, for the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the principal checkpoint on the antigen presentation pathway. As a service to the immunobiology community, we have made a Perl implementation of the method available via a World Wide Web server. We call this server MHCPred. Access to the server is freely available from the URL: http://www.jenner.ac.uk/MHCPred. We have exemplified our method with a model for peptides binding to the common human MHC molecule HLA-B*3501.
Resumo:
Accurate T-cell epitope prediction is a principal objective of computational vaccinology. As a service to the immunology and vaccinology communities at large, we have implemented, as a server on the World Wide Web, a partial least squares-base multivariate statistical approach to the quantitative prediction of peptide binding to major histocom-patibility complexes (MHC), the key checkpoint on the antigen presentation pathway within adaptive,cellular immunity. MHCPred implements robust statistical models for both Class I alleles (HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203,HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3301, HLA-A*6801, HLA-A*6802 and HLA-B*3501) and Class II alleles (HLA-DRB*0401, HLA-DRB*0401and HLA-DRB* 0701).
Resumo:
Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.
Resumo:
From a Service-Dominant Logic (S-DL) perspective, employees constitute operant resources that firms can draw to enhance the outcomes of innovation efforts. While research acknowledges that frontline employees (FLEs) constitute, through service encounters, a key interface for the transfer of valuable external knowledge into the firm, the range of potential benefits derived from FLE-driven innovation deserves more investigation. Using a sample of knowledge intensive business services firms (KIBS), this study examines how the collaboration with FLEs along the new service development (NSD) process, namely FLE co-creation, impacts on service innovation performance following two routes of different effects. Partial least squares structural equation modeling (PLS-SEM) results indicate that FLE co-creation benefits the NS success among FLEs and firm’s customers, the constituents of the resources route. FLE co-creation also has a positive effect on the NSD speed, which in turn enhances the NS quality. NSD speed and NS quality integrate the operational route, which proves to be the most effective path to impact the NS market performance. Accordingly, KIBS managers must value their FLEs as essential partners to achieve successful innovation from an internal and external perspective, and develop the appropriate mechanisms to guarantee their effective involvement along the NSD process.
Resumo:
Background: Allergy is a form of hypersensitivity to normally innocuous substances, such as dust, pollen, foods or drugs. Allergens are small antigens that commonly provoke an IgE antibody response. There are two types of bioinformatics-based allergen prediction. The first approach follows FAO/WHO Codex alimentarius guidelines and searches for sequence similarity. The second approach is based on identifying conserved allergenicity-related linear motifs. Both approaches assume that allergenicity is a linearly coded property. In the present study, we applied ACC pre-processing to sets of known allergens, developing alignment-independent models for allergen recognition based on the main chemical properties of amino acid sequences.Results: A set of 684 food, 1,156 inhalant and 555 toxin allergens was collected from several databases. A set of non-allergens from the same species were selected to mirror the allergen set. The amino acids in the protein sequences were described by three z-descriptors (z1, z2 and z3) and by auto- and cross-covariance (ACC) transformation were converted into uniform vectors. Each protein was presented as a vector of 45 variables. Five machine learning methods for classification were applied in the study to derive models for allergen prediction. The methods were: discriminant analysis by partial least squares (DA-PLS), logistic regression (LR), decision tree (DT), naïve Bayes (NB) and k nearest neighbours (kNN). The best performing model was derived by kNN at k = 3. It was optimized, cross-validated and implemented in a server named AllerTOP, freely accessible at http://www.pharmfac.net/allertop. AllerTOP also predicts the most probable route of exposure. In comparison to other servers for allergen prediction, AllerTOP outperforms them with 94% sensitivity.Conclusions: AllerTOP is the first alignment-free server for in silico prediction of allergens based on the main physicochemical properties of proteins. Significantly, as well allergenicity AllerTOP is able to predict the route of allergen exposure: food, inhalant or toxin. © 2013 Dimitrov et al.; licensee BioMed Central Ltd.
Resumo:
This paper analyzes the relationship between freight accessibility and logistics employment in the US. It develops an accessibility measure relevant for logistics companies based on a gravity model. This allows for an analysis of the accessibility of US counties focusing on four different modes of transportation: road, rail, air, and maritime. Using a Partial Least Squares model, these four different freight accessibility measures are combined into two constructs, continental and intercontinental freight accessibility, and related to logistics employment. Results show that highly accessible counties attract more logistics employment than other counties. The analyses show that it is very important to control for the effect of the county population on both freight accessibility and logistics employment. While county population explains the most variation in the logistics employment per county, there is a significant relationship between freight accessibility and logistics employment, when controlling for this effect.
Resumo:
This paper provides the most fully comprehensive evidence to date on whether or not monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of different definitions of money, including different methods of aggregation and different collections of included monetary assets. In our forecasting experiment we use two nonlinear techniques, namely, recurrent neural networks and kernel recursive least squares regressiontechniques that are new to macroeconomics. Recurrent neural networks operate with potentially unbounded input memory, while the kernel regression technique is a finite memory predictor. The two methodologies compete to find the best fitting US inflation forecasting models and are then compared to forecasts from a nave random walk model. The best models were nonlinear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation. Beyond its economic findings, our study is in the tradition of physicists' long-standing interest in the interconnections among statistical mechanics, neural networks, and related nonparametric statistical methods, and suggests potential avenues of extension for such studies. © 2010 Elsevier B.V. All rights reserved.
Resumo:
Purpose – The purpose of this empirical paper is to investigate internal marketing from a behavioural perspective. The impact of internal marketing behaviours, operationalised as an internal market orientation (IMO), on employees' marketing and other in/role behaviours (IRB) were examined. Design/methodology/approach – Survey data measuring IMO, market orientation and a range of constructs relevant to the nomological network in which they are embedded were collected from the UK retail managers. These were tested to establish their psychometric properties and the conceptual model was analysed using structural equations modelling, employing a partial least squares methodology. Findings – IMO has positive consequences for employees' market/oriented and other IRB. These, in turn, influence marketing success. Research limitations/implications – The paper provides empirical support for the long/held assumption that internal and external marketing are related and that organisations should balance their external focus with some attention to employees. Future research could measure the attitudes and behaviours of managers, employees and customers directly and explore the relationships between them. Practical implications – Firm must ensure that they do not put the needs of their employees second to those of managers and shareholders; managers must develop their listening skills and organisations must become more responsive to the needs of their employees. Originality/value – The paper contributes to the scarce body of empirical support for the role of internal marketing in services organisations. For researchers, this paper legitimises the study of internal marketing as a route to external market success; for managers, the study provides quantifiable evidence that focusing on employees' wants and needs impacts their behaviours towards the market. © 2010, Emerald Group Publishing Limited
Resumo:
A simple method for training the dynamical behavior of a neural network is derived. It is applicable to any training problem in discrete-time networks with arbitrary feedback. The method resembles back-propagation in that it is a least-squares, gradient-based optimization method, but the optimization is carried out in the hidden part of state space instead of weight space. A straightforward adaptation of this method to feedforward networks offers an alternative to training by conventional back-propagation. Computational results are presented for simple dynamical training problems, with varied success. The failures appear to arise when the method converges to a chaotic attractor. A patch-up for this problem is proposed. The patch-up involves a technique for implementing inequality constraints which may be of interest in its own right.
Resumo:
A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.
Resumo:
We present a parallel genetic algorithm for nding matrix multiplication algo-rithms. For 3 x 3 matrices our genetic algorithm successfully discovered algo-rithms requiring 23 multiplications, which are equivalent to the currently best known human-developed algorithms. We also studied the cases with less mul-tiplications and evaluated the suitability of the methods discovered. Although our evolutionary method did not reach the theoretical lower bound it led to an approximate solution for 22 multiplications.
Resumo:
Transportation service operators are witnessing a growing demand for bi-directional movement of goods. Given this, the following thesis considers an extension to the vehicle routing problem (VRP) known as the delivery and pickup transportation problem (DPP), where delivery and pickup demands may occupy the same route. The problem is formulated here as the vehicle routing problem with simultaneous delivery and pickup (VRPSDP), which requires the concurrent service of the demands at the customer location. This formulation provides the greatest opportunity for cost savings for both the service provider and recipient. The aims of this research are to propose a new theoretical design to solve the multi-objective VRPSDP, provide software support for the suggested design and validate the method through a set of experiments. A new real-life based multi-objective VRPSDP is studied here, which requires the minimisation of the often conflicting objectives: operated vehicle fleet size, total routing distance and the maximum variation between route distances (workload variation). The former two objectives are commonly encountered in the domain and the latter is introduced here because it is essential for real-life routing problems. The VRPSDP is defined as a hard combinatorial optimisation problem, therefore an approximation method, Simultaneous Delivery and Pickup method (SDPmethod) is proposed to solve it. The SDPmethod consists of three phases. The first phase constructs a set of diverse partial solutions, where one is expected to form part of the near-optimal solution. The second phase determines assignment possibilities for each sub-problem. The third phase solves the sub-problems using a parallel genetic algorithm. The suggested genetic algorithm is improved by the introduction of a set of tools: genetic operator switching mechanism via diversity thresholds, accuracy analysis tool and a new fitness evaluation mechanism. This three phase method is proposed to address the shortcoming that exists in the domain, where an initial solution is built only then to be completely dismantled and redesigned in the optimisation phase. In addition, a new routing heuristic, RouteAlg, is proposed to solve the VRPSDP sub-problem, the travelling salesman problem with simultaneous delivery and pickup (TSPSDP). The experimental studies are conducted using the well known benchmark Salhi and Nagy (1999) test problems, where the SDPmethod and RouteAlg solutions are compared with the prominent works in the VRPSDP domain. The SDPmethod has demonstrated to be an effective method for solving the multi-objective VRPSDP and the RouteAlg for the TSPSDP.