39 results for Lanczos, Linear systems, Generalized cross validation


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Since Shannon derived the seminal formula for the capacity of the additive linear white Gaussian noise channel, it has commonly been interpreted as the ultimate limit of error-free information transmission rate. However, capacity above the corresponding linear channel limit can be achieved when noise is suppressed using nonlinear elements; that is, by the regenerative function not available in linear systems. Regeneration is a fundamental concept that extends from biology to optical communications. All-optical regeneration of coherent signals has attracted particular attention. Surprisingly, the quantitative impact of regeneration on the Shannon capacity has remained unstudied. Here we propose a new method of designing regenerative transmission systems with capacity that is higher than that of the corresponding linear channel, and illustrate it by proposing application of the Fourier transform for efficient regeneration of multilevel multidimensional signals. The regenerative Shannon limit, the upper bound of regeneration efficiency, is derived. © 2014 Macmillan Publishers Limited. All rights reserved.
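For orientation, the linear-channel baseline that this abstract refers to can be sketched numerically. The snippet below only evaluates the classical Shannon formula C = B log2(1 + SNR) for a linear additive white Gaussian noise channel; it does not reproduce the regenerative limit derived in the paper, and the function names are illustrative assumptions.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio from decibels to a linear power ratio."""
    return 10.0 ** (snr_db / 10.0)


def awgn_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Classical Shannon capacity (bit/s) of a linear AWGN channel.

    C = B * log2(1 + SNR). This is the linear baseline only, not the
    regenerative limit discussed in the abstract.
    """
    if bandwidth_hz <= 0 or snr_linear < 0:
        raise ValueError("bandwidth must be positive and SNR non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)


if __name__ == "__main__":
    # Hypothetical example values: 1 MHz of bandwidth at 20 dB SNR.
    print(awgn_capacity(1e6, db_to_linear(20.0)))
```

Any capacity gain from regeneration would be measured against this value for the same bandwidth and signal-to-noise ratio.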

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background - Vaccine development in the post-genomic era often begins with the in silico screening of genome information, with the most probable protective antigens being predicted rather than requiring causative microorganisms to be grown. Despite the obvious advantages of this approach, such as speed and cost efficiency, its success remains dependent on the accuracy of antigen prediction. Most approaches use sequence alignment to identify antigens. This is problematic for several reasons. Some proteins lack obvious sequence similarity, although they may share similar structures and biological properties. The antigenicity of a sequence may be encoded in a subtle and recondite manner not amenable to direct identification by sequence alignment. The discovery of truly novel antigens will be frustrated by their lack of similarity to antigens of known provenance. To overcome the limitations of alignment-dependent methods, we propose a new alignment-free approach for antigen prediction, which is based on auto cross covariance (ACC) transformation of protein sequences into uniform vectors of principal amino acid properties. Results - Bacterial, viral and tumour protein datasets were used to derive models for prediction of whole protein antigenicity. Every set consisted of 100 known antigens and 100 non-antigens. The derived models were tested by internal leave-one-out cross-validation and external validation using test sets. An additional five training sets for each class of antigens were used to test the stability of the discrimination between antigens and non-antigens. The models performed well in both validations, showing prediction accuracy of 70% to 89%. The models were implemented in a server, which we call VaxiJen. Conclusion - VaxiJen is the first server for alignment-independent prediction of protective antigens. It was developed to allow antigen classification solely based on the physicochemical properties of proteins without recourse to sequence alignment. 
The server can be used on its own or in combination with alignment-based prediction methods.
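The idea of turning variable-length sequences into uniform vectors can be illustrated with a small sketch. The code below implements one common form of the auto cross covariance transform, ACC_jk(lag) = sum_i E_j(i) * E_k(i + lag) / (n - lag); the exact variant used by VaxiJen is not stated in the abstract, and the two-component descriptor table is a made-up toy, not real amino acid property data.

```python
# Hypothetical toy descriptors (two values per residue); NOT real property scales.
TOY_DESCRIPTORS = {
    "A": (0.1, -1.0),
    "G": (2.0, 0.5),
    "K": (-1.5, 1.0),
}


def encode(sequence):
    """Map each residue to its descriptor tuple using the toy table above."""
    return [TOY_DESCRIPTORS[residue] for residue in sequence]


def acc_transform(descriptors, max_lag):
    """Auto cross covariance transform of a per-residue descriptor list.

    Returns a flat list of d * d * max_lag values, where d is the number of
    descriptors per residue, so the output length does not depend on the
    sequence length n.
    """
    n = len(descriptors)
    d = len(descriptors[0])
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                total = sum(
                    descriptors[i][j] * descriptors[i + lag][k]
                    for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features


if __name__ == "__main__":
    short = acc_transform(encode("AGKAG"), max_lag=2)
    longer = acc_transform(encode("AGKAGKA"), max_lag=2)
    print(len(short), len(longer))  # both vectors have the same length
```

Because every protein maps to a vector of the same length, standard classifiers can be trained on the result without aligning the sequences.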

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background - Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, causing ambiguity in the positional alignment between the groove and peptide, as well as creating uncertainty as to what parts of the peptide interact with the MHC. Moreover, the antigenic peptides have variable lengths, making naive modelling methods difficult to apply. This paper introduces a kernel method that can handle variable length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel. Results - The kernel approach presented here shows increased prediction accuracy with a significantly higher number of true positives and negatives on multiple MHC class II alleles, when testing data sets from MHCPEP [1], MHCBN [2], and MHCBench [3]. Evaluation by cross validation, when segregating binders and non-binders, produced an average of 0.824 AROC for the MHCBench data sets (up from 0.756), and an average of 0.96 AROC for multiple alleles of the MHCPEP database. Conclusion - The method improves performance over existing state-of-the-art methods of MHC class II peptide binding predictions by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding, and can easily be applied to other dynamic sequence problems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Subunit vaccine discovery is an accepted clinical priority. The empirical approach is time- and labor-consuming and can often end in failure. Rational information-driven approaches can overcome these limitations in a fast and efficient manner. However, informatics solutions require reliable algorithms for antigen identification. All known algorithms use sequence similarity to identify antigens. However, antigenicity may be encoded subtly in a sequence and may not be directly identifiable by sequence alignment. We propose a new alignment-independent method for antigen recognition based on the principal chemical properties of protein amino acid sequences. The method is tested by cross-validation on a training set of bacterial antigens and external validation on a test set of known antigens. The prediction accuracy is 83% for the cross-validation and 80% for the external test set. Our approach is accurate and robust, and provides a potent tool for the in silico discovery of medically relevant subunit vaccines.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide−MHC binding affinity. The ISC−PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide−MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms - q2, SEP, and NC - ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. 
The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
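The leave-one-out statistic q2 quoted in this abstract can be illustrated as follows. The sketch computes q2 = 1 - PRESS / SS, where PRESS is the sum of squared leave-one-out prediction errors and SS is the total sum of squares about the mean. To stay self-contained it swaps in an ordinary one-variable least-squares line for the PLS model used in the paper, so it shows only how the statistic is formed, not the ISC-PLS method itself.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept (stand-in for PLS)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        # Refit the model with sample i held out, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / n
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / total_ss


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the AntiJen database.
    print(loo_q2([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 3.0, 2.0, 5.0, 4.0]))
```

Because each prediction is made by a model that never saw the held-out sample, q2 is usually lower than the non-cross-validated r2, which matches the pattern of values reported in the abstract.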

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A set of 38 epitopes and 183 non-epitopes, which bind to alleles of the HLA-A3 supertype, was subjected to a combination of comparative molecular similarity indices analysis (CoMSIA) and soft independent modeling of class analogy (SIMCA). During the process of T cell recognition, T cell receptors (TCR) interact with the central section of the bound nonamer peptide; thus only positions 4−8 were considered in the study. The derived model distinguished 82% of the epitopes and 73% of the non-epitopes after cross-validation in five groups. The overall preference from the model is for polar amino acids with high electron density and the ability to form hydrogen bonds. These so-called “aggressive” amino acids are flanked by small-sized residues, which enable such residues to protrude from the binding cleft and take an active role in TCR-mediated T cell recognition. Combinations of “aggressive” and “passive” amino acids in the middle part of epitopes constitute a putative TCR binding motif.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Allergy is an overreaction by the immune system to a previously encountered, ordinarily harmless substance - typically proteins - resulting in skin rash, swelling of mucous membranes, sneezing or wheezing, or other abnormal conditions. The use of modified proteins is increasingly widespread: their presence in food, commercial products, such as washing powder, and medical therapeutics and diagnostics, makes predicting and identifying potential allergens a crucial societal issue. The prediction of allergens has been explored widely using bioinformatics, with many tools being developed in the last decade; many of these are freely available online. Here, we report a set of novel models for allergen prediction utilizing amino acid E-descriptors, auto- and cross-covariance transformation, and several machine learning methods for classification, including logistic regression (LR), decision tree (DT), naïve Bayes (NB), random forest (RF), multilayer perceptron (MLP) and k nearest neighbours (kNN). The best performing method was kNN with 85.3% accuracy at 5-fold cross-validation. The resulting model has been implemented in a revised version of the AllerTOP server (http://www.ddg-pharmfac.net/AllerTOP). © Springer-Verlag 2014.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm, which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means, with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Dirichlet process mixture model (DPMM) is a ubiquitous, flexible Bayesian nonparametric statistical model. However, full probabilistic inference in this model is analytically intractable, so that computationally intensive techniques such as Gibbs sampling are required. As a result, DPMM-based methods, which have considerable potential, are restricted to applications in which computational resources and time for inference are plentiful. For example, they would not be practical for digital signal processing on embedded hardware, where computational resources are at a serious premium. Here, we develop a simplified yet statistically rigorous approximate maximum a-posteriori (MAP) inference algorithm for DPMMs. This algorithm is as simple as DP-means clustering and solves the MAP problem as well as Gibbs sampling, while requiring only a fraction of the computational effort. (For freely available code that implements the MAP-DP algorithm for Gaussian mixtures see http://www.maxlittle.net/.) Unlike related small variance asymptotics (SVA), our method is non-degenerate and so inherits the “rich get richer” property of the Dirichlet process. It also retains a non-degenerate closed-form likelihood which enables out-of-sample calculations and the use of standard tools such as cross-validation. We illustrate the benefits of our algorithm on a range of examples and contrast it to variational, SVA and sampling approaches, both from a computational complexity perspective and in terms of clustering performance. We demonstrate the wide applicability of our approach by presenting an approximate MAP inference method for the infinite hidden Markov model whose performance contrasts favorably with a recently proposed hybrid SVA approach. 
Similarly, we show how our algorithm can be applied to a semiparametric mixed-effects regression model where the random effects distribution is modelled using an infinite mixture model, as used in longitudinal progression modelling in population health science. Finally, we propose directions for future research on approximate MAP inference in Bayesian nonparametrics.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

In the Bayesian framework, predictions for a regression problem are expressed in terms of a distribution of output values. The mode of this distribution corresponds to the most probable output, while the uncertainty associated with the predictions can conveniently be expressed in terms of error bars. In this paper we consider the evaluation of error bars in the context of the class of generalized linear regression models. We provide insights into the dependence of the error bars on the location of the data points and we derive an upper bound on the true error bars in terms of the contributions from individual data points which are themselves easily evaluated.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

There is an increase in the use of multi-pulse, rectifier-fed motor-drive equipment on board more-electric aircraft. Motor drives with feedback control appear as constant power loads to the rectifiers, which can cause instability of the DC filter capacitor voltage at the output of the rectifier. This problem can be exacerbated by interactions between rectifiers that share a common source impedance. In order that such a system can be analysed, there is a need for average, dynamic models of systems of rectifiers. In this study, an efficient, compact method for deriving the approximate, linear, large-signal, average models of two heterogeneous systems of rectifiers, which are fed from a common source impedance, is presented. The models give insight into significant interaction effects that occur between the converters, and that arise through the shared source impedance. First, a 6-pulse and doubly wound, transformer-fed, 12-pulse rectifier system is considered, followed by a 6-pulse and autotransformer-fed, 12-pulse rectifier system. The system models are validated against detailed simulations and laboratory prototypes, and key characteristics of the two system types are compared.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

We compare the Q parameter obtained from scalar, semi-analytical and full vector models for realistic transmission systems. One set of systems is operated in the linear regime, while another is using solitons at high peak power. We report in detail on the different results obtained for the same system using different models. Polarisation mode dispersion is also taken into account and a novel method to average Q parameters over several independent simulation runs is described. © 2006 Elsevier B.V. All rights reserved.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

We compare the Q parameter obtained from the semi-analytical model with scalar and vector models for two realistic transmission systems. The first is a linear system with a compensated dispersion map and the second is a soliton transmission system.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The kinematic mapping of a rigid open-link manipulator is a homomorphism between Lie groups. The homomorphism has solution groups that act on an inverse kinematic solution element. A canonical representation of solution group operators that act on a solution element of three and seven degree-of-freedom (dof) dextrous manipulators is determined by geometric analysis. Seven canonical solution groups are determined for the seven-dof Robotics Research K-1207 and Hollerbach arms. The solution element of a dextrous manipulator is a collection of trivial fibre bundles with solution fibres homotopic to the Torus. If fibre solutions are parameterised by a scalar, a direct inverse function that maps the scalar and Cartesian base space coordinates to solution element fibre coordinates may be defined. A direct inverse parameterisation of a solution element may be approximated by a local linear map generated by an inverse augmented Jacobian correction of a linear interpolation. The action of canonical solution group operators on a local linear approximation of the solution element of inverse kinematics of dextrous manipulators generates cyclical solutions. The solution representation is proposed as a model of inverse kinematic transformations in primate nervous systems. Simultaneous calibration of a composition of stereo-camera and manipulator kinematic models is under-determined by equi-output parameter groups in the composition of stereo-camera and Denavit-Hartenberg (DH) models. An error measure for simultaneous calibration of a composition of models is derived and parameter subsets with no equi-output groups are determined by numerical experiments to simultaneously calibrate the composition of homogeneous or pan-tilt stereo-camera with DH models. 
For acceleration of exact Newton second-order re-calibration of DH parameters after a sequential calibration of stereo-camera and DH parameters, an optimal numerical evaluation of DH matrix first order and second order error derivatives with respect to a re-calibration error function is derived, implemented and tested. A distributed object environment for point and click image-based tele-command of manipulators and stereo-cameras is specified and implemented that supports rapid prototyping of numerical experiments in distributed system control. The environment is validated by a hierarchical k-fold cross validated calibration to Cartesian space of a radial basis function regression correction of an affine stereo model. Basic design and performance requirements are defined for scalable virtual micro-kernels that broker inter-Java-virtual-machine remote method invocations between components of secure manageable fault-tolerant open distributed agile Total Quality Managed ISO 9000+ conformant Just in Time manufacturing systems.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The tribology of linear tape storage systems, including Linear Tape Open (LTO) and Travan5, was investigated by combining X-ray Photoelectron Spectroscopy (XPS), Auger Electron Spectroscopy (AES), Optical Microscopy and Atomic Force Microscopy (AFM). The purpose of this study was to understand the tribological mechanisms of linear tape systems so that projected recording densities may be achieved in future systems. Water vapour pressure, or Normalized Water Content (NWC), rather than the Relative Humidity (RH) values used almost universally in this field, determined the extent of PTR and stain (if produced) in linear heads. Saturated PTR increased approximately linearly with normalized water content over the range studied using the same tape. Fe stain (if produced) formed preferentially on the head surfaces at the lower water contents. The stain formation mechanism was identified. Adhesive bond formation is a chemical process that is governed by temperature. Thus, the higher the contact pressure, the higher the contact temperature at the interface of head and tape, and so the higher the probability of adhesive bond formation and the greater the amount of transferred material (stain). Water molecules at the interface saturate the surface bonds and make adhesive junctions less likely. Tape polymeric binder formulation also has a significant role in stain formation, with the latest generation binders producing less transfer of material. This is almost certainly due to higher cohesive bonds within the body of the magnetic layer. TiC in the two-phase ceramic tape-bearing surface (AlTiC) was found to oxidise to form TiO2. The oxidation rate of TiC increased with increasing water content. The oxide was less dense than the underlying carbide; hence the interface between the TiO2 oxide and TiC was stressed. 
Removal of the oxide phase results in the formation of three-body abrasive particles that are swept across the tape head and give rise to three-body abrasive wear, particularly in the pole regions. This leads to PTR and subsequent signal loss and error growth. The lower contact pressure of the LTO system compared with the Travan5 system ensures that fewer and smaller three-body abrasive particles are swept across the pole and insulator regions. Hence, lower contact pressure, as well as reducing stain, at the same time significantly reduces PTR in the LTO system.