12 results for Propagation prediction models
in Aston University Research Archive
Abstract:
This thesis studies three techniques for improving the performance of some standard forecasting models, applied to energy demand and prices. We focus on forecasting demand and price one day ahead. First, the wavelet transform was used as a pre-processing procedure with two approaches: multicomponent forecasts and direct forecasts. We empirically compared these approaches and found that the former consistently outperformed the latter. Second, adaptive models were introduced to continuously update model parameters in the testing period by combining filters with standard forecasting methods. Among these adaptive models, the adaptive LR-GARCH model was proposed for the first time in the thesis. Third, with regard to noise distributions of the dependent variables in the forecasting models, we used either Gaussian or Student-t distributions. This thesis proposed a novel algorithm to infer parameters of Student-t noise models. The method extends earlier work on models that are linear in their parameters to the non-linear multilayer perceptron. Therefore, the proposed method broadens the range of models that can use a Student-t noise distribution. Because these techniques cannot stand alone, they must be combined with prediction models to improve their performance. We combined these techniques with some standard forecasting models: multilayer perceptron, radial basis functions, linear regression, and linear regression with GARCH. These techniques and forecasting models were applied to two datasets from the UK energy markets: daily electricity demand (which is stationary) and gas forward prices (non-stationary). The results showed that these techniques provided good improvement to prediction performance.
Abstract:
This paper presents some forecasting techniques for energy demand and price prediction, one day ahead. These techniques combine wavelet transform (WT) with fixed and adaptive machine learning/time series models (multi-layer perceptron (MLP), radial basis functions, linear regression, or GARCH). To create an adaptive model, we use an extended Kalman filter or particle filter to update the parameters continuously on the test set. The adaptive GARCH model is a new contribution, broadening the applicability of GARCH methods. We empirically compared two approaches to combining the WT with prediction models: multicomponent forecasts and direct forecasts. These techniques are applied to large sets of real data (both stationary and non-stationary) from the UK energy markets, so as to provide comparative results that are statistically stronger than those previously reported. The results showed that the forecasting accuracy is significantly improved by using the WT and adaptive models. The best models on the electricity demand/gas price forecast are the adaptive MLP/GARCH with the multicomponent forecast; their MSEs are 0.02314 and 0.15384 respectively.
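The multicomponent idea (decompose the series, forecast each component separately, then add the component forecasts) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a simple additive Haar-style smoothing stands in for the wavelet transform, a least-squares AR(1) model stands in for the MLP/GARCH predictors, and the function names are hypothetical. It is not the pipeline used in the paper.

```python
def additive_decompose(x, levels=2):
    """Split x into `levels` detail series plus one smooth series.

    The components sum back to x at every time step. This is a simple
    stand-in for a wavelet decomposition, not the paper's transform.
    """
    current = [float(v) for v in x]
    components = []
    for j in range(levels):
        step = 2 ** j
        # Causal Haar-style smoothing: average each point with the one
        # `step` samples earlier (the first value pads the start).
        smooth = [0.5 * (current[t] + current[max(t - step, 0)])
                  for t in range(len(current))]
        components.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    components.append(current)  # final smooth (trend) component
    return components


def ar1_forecast(series):
    """One-step-ahead AR(1) forecast fitted by least squares (no intercept)."""
    num = sum(a * b for a, b in zip(series[1:], series[:-1]))
    den = sum(a * a for a in series[:-1])
    phi = num / den if den > 0 else 0.0
    return phi * series[-1]


def multicomponent_forecast(x, levels=2):
    """Forecast each component separately and add the forecasts."""
    return sum(ar1_forecast(c) for c in additive_decompose(x, levels))
```

A direct forecast would instead feed all components as inputs to a single model that predicts the original series; the sketch above covers only the multicomponent variant.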
Abstract:
Proteins of the Major Histocompatibility Complex (MHC) bind self and nonself peptide antigens or epitopes within the cell and present them at the cell surface for recognition by T cells. All T-cell epitopes are MHC binders but not all MHC binders are T-cell epitopes. The MHC class II proteins are extremely polymorphic. Polymorphic residues cluster in the peptide-binding region and largely determine the MHC's peptide selectivity. The peptide binding site on MHC class II proteins consists of five binding pockets. Using molecular docking, we have modelled the interactions between peptide and MHC class II proteins from locus DRB1. A combinatorial peptide library was generated by mutation of residues at peptide positions which correspond to binding pockets (so-called anchor positions). The binding affinities were assessed using different scoring functions. The normalized scoring functions for each amino acid at each anchor position were used to construct quantitative matrices (QM) for MHC class II binding prediction. Models were validated by external test sets comprising 4540 known binders. Eighty percent of the known binders are identified in the best predicted 15% of all overlapping peptides, originating from one protein. © 2011 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
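As a rough illustration of how a quantitative matrix can rank the overlapping peptides of one protein, the sketch below sums per-position matrix entries at the anchor positions and keeps the best-scoring fraction. The peptide length of 9, the anchor indices, the matrix values, and all function names are assumptions chosen for the example; none of them are taken from the paper.

```python
def score_peptide(peptide, qm, anchors):
    """Sum the matrix entries for the residues at the anchor positions.

    `qm` maps a 0-based anchor index to a dict of residue -> score;
    residues missing from the matrix contribute 0.
    """
    return sum(qm[pos].get(peptide[pos], 0.0) for pos in anchors)


def rank_overlapping_peptides(protein, qm, anchors, length=9):
    """Score every overlapping window of `length` residues, best first."""
    peptides = [protein[i:i + length]
                for i in range(len(protein) - length + 1)]
    return sorted(peptides,
                  key=lambda p: score_peptide(p, qm, anchors),
                  reverse=True)


def top_fraction(ranked, fraction=0.15):
    """Keep the best-scoring fraction of peptides (at least one)."""
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Toy matrix with made-up values for two hypothetical anchor positions.
toy_qm = {0: {"F": 1.0}, 3: {"K": 0.5}}
ranked = rank_overlapping_peptides("ACDEFGHIKLMNPQRSTVWY", toy_qm, anchors=(0, 3))
best = top_fraction(ranked)
```

Validation in the style described above would then check what share of known binders falls inside `best` for each protein.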
Abstract:
In many problems in spatial statistics it is necessary to infer a global problem solution by combining local models. A principled approach to this problem is to develop a global probabilistic model for the relationships between local variables and to use this as the prior in a Bayesian inference procedure. We show how a Gaussian process with hyper-parameters estimated from Numerical Weather Prediction Models yields meteorologically convincing wind fields. We use neural networks to make local estimates of wind vector probabilities. The resulting inference problem cannot be solved analytically, but Markov Chain Monte Carlo methods allow us to retrieve accurate wind fields.
Abstract:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscattered microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snapshots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish whether the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal is developed.
The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.
Abstract:
Protein-DNA interactions are an essential feature in the genetic activities of life, and the ability to predict and manipulate such interactions has applications in a wide range of fields. This thesis presents methods for modelling the properties of protein-DNA interactions. In particular, it investigates methods for visualising and predicting the specificity of DNA-binding Cys2His2 zinc finger interactions. The Cys2His2 zinc finger proteins interact via their individual fingers with base pair subsites on the target DNA. Four key residue positions on the α-helix of the zinc fingers make non-covalent interactions with the DNA with sequence specificity. Mutating these key residues generates combinatorial possibilities that could potentially bind to any DNA segment of interest. Many attempts have been made to predict the binding interaction using structural and chemical information, but with only limited success. The most important contribution of the thesis is that the developed model allows the binding properties of a given protein-DNA binding to be visualised in relation to other protein-DNA combinations without having to explicitly physically model the specific protein molecule and specific DNA sequence. To prove this, various databases were generated, including a synthetic database which includes all possible combinations of the DNA-binding Cys2His2 zinc finger interactions. NeuroScale, a topographic visualisation technique, is exploited to represent the geometric structures of the protein-DNA interactions by measuring dissimilarity between the data points. In order to verify the effect of visualisation on understanding the binding properties of the DNA-binding Cys2His2 zinc finger interaction, various prediction models are constructed by using both the high dimensional original data and the represented data in low dimensional feature space.
Finally, novel data sets are studied through the selected visualisation models based on the experimental DNA-zinc finger protein database. The result of the NeuroScale projection shows that different dissimilarity representations give distinctive structural groupings, but cluster in biologically interesting ways. This method can be used to forecast the physicochemical properties of novel proteins, which may be beneficial for therapeutic purposes involving genome targeting in general.
Abstract:
Pavement analysis and design for fatigue cracking involves a number of practical problems like material assessment/screening and performance prediction. A mechanics-aided method can answer these questions with satisfactory accuracy in a convenient way when it is appropriately implemented. This paper presents two techniques to implement the pseudo J-integral based Paris’ law to evaluate and predict fatigue cracking in asphalt mixtures and pavements. The first technique, quasi-elastic simulation, provides a rational and appropriate reference modulus for the pseudo analysis (i.e., viscoelastic to elastic conversion) by making use of a widely used material property: dynamic modulus. The physical significance of the quasi-elastic simulation is clarified. Introduction of this technique facilitates the implementation of the fracture mechanics models as well as continuum damage mechanics models to characterize fatigue cracking in asphalt pavements. The second technique, modeling the fracture coefficients of the pseudo J-integral based Paris’ law, simplifies the prediction of fatigue cracking without performing fatigue tests. The developed prediction models for the fracture coefficients rely on readily available mixture design properties that directly affect the fatigue performance, including the relaxation modulus, air void content, asphalt binder content, and aggregate gradation. Sufficient data are collected to develop such prediction models and the R² values are around 0.9. The presented case studies serve as examples to illustrate how the pseudo J-integral based Paris’ law predicts fatigue resistance of asphalt mixtures and assesses fatigue performance of asphalt pavements. Future applications include the estimation of fatigue life of asphalt mixtures/pavements through a distinct criterion that defines fatigue failure by its physical significance.
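For orientation, Paris' law in its generic power-law form, with the pseudo J-integral range as the driving quantity, is commonly written as below. The abstract does not state the equation, so the symbols here are assumptions: c is the crack size, N the number of load cycles, ΔJ_R the change in pseudo J-integral per cycle, and A and n the fracture coefficients that the second technique aims to predict from mixture design properties.

```latex
\frac{\mathrm{d}c}{\mathrm{d}N} = A \left( \Delta J_R \right)^{n}
```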
Abstract:
In the analysis and prediction of many real-world time series, the assumption of stationarity is not valid. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We introduce a new model which combines a dynamic switching (controlled by a hidden Markov model) and a non-linear dynamical system. We show how to train this hybrid model in a maximum likelihood approach and evaluate its performance on both synthetic and financial data.
Abstract:
Jackson (2005) developed a hybrid model of personality and learning, known as the learning styles profiler (LSP) which was designed to span biological, socio-cognitive, and experiential research foci of personality and learning research. The hybrid model argues that functional and dysfunctional learning outcomes can be best understood in terms of how cognitions and experiences control, discipline, and re-express the biologically based scale of sensation-seeking. In two studies with part-time workers undertaking tertiary education (N=137 and 58), established models of approach and avoidance from each of the three different research foci were compared with Jackson's hybrid model in their predictiveness of leadership, work, and university outcomes using self-report and supervisor ratings. Results showed that the hybrid model was generally optimal and, as hypothesized, that goal orientation was a mediator of sensation-seeking on outcomes (work performance, university performance, leader behaviours, and counterproductive work behaviour). Our studies suggest that the hybrid model has considerable promise as a predictor of work and educational outcomes as well as dysfunctional outcomes.
Abstract:
Based on Bayesian Networks, methods were created that address protein sequence-based bacterial subcellular location prediction. Distinct predictive algorithms for the eight bacterial subcellular locations were created. Several variant methods were explored. These variations included differences in the number of residues considered within the query sequence - which ranged from the N-terminal 10 residues to the whole sequence - and residue representation - which took the form of amino acid composition, percentage amino acid composition, or normalised amino acid composition. The accuracies of the best performing networks were then compared to PSORTB. All individual location methods outperform PSORTB except for the Gram+ cytoplasmic protein predictor, for which accuracies were essentially equal, and for outer membrane protein prediction, where PSORTB outperforms the binary predictor. The method described here is an important new approach to method development for subcellular location prediction. It is also a new, potentially valuable tool for candidate subunit vaccine selection.
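The residue representations mentioned above can be sketched as simple feature vectors. This is a minimal illustration only: the 20-letter alphabet ordering, the function names, and the handling of non-standard residues (ignored) are assumptions, and the normalised composition is omitted because the abstract does not define it.

```python
from collections import Counter

# Standard 20 amino acids in one-letter code; the ordering is an arbitrary choice.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence, n_terminal=None):
    """Count each amino acid in the whole sequence or its first `n_terminal` residues."""
    region = sequence if n_terminal is None else sequence[:n_terminal]
    counts = Counter(region)
    return [counts.get(aa, 0) for aa in AMINO_ACIDS]


def percentage_composition(sequence, n_terminal=None):
    """Express the counts as percentages of the counted residues."""
    counts = composition(sequence, n_terminal)
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


# Features from the N-terminal 10 residues of a made-up sequence.
features = percentage_composition("MKKLLALAVSAGLLSACTTG", n_terminal=10)
```

Vectors like `features` would then serve as the inputs to one binary predictor per subcellular location.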
Abstract:
Background - The binding between peptide epitopes and major histocompatibility complex proteins (MHCs) is an important event in the cellular immune response. Accurate prediction of the binding between short peptides and the MHC molecules has long been a principal challenge for immunoinformatics. Recently, the modeling of MHC-peptide binding has come to emphasize quantitative predictions: instead of categorizing peptides as "binders" or "non-binders" or as "strong binders" and "weak binders", recent methods seek to make predictions about precise binding affinities. Results - We developed a quantitative support vector machine regression (SVR) approach, called SVRMHC, to model peptide-MHC binding affinities. As a non-linear method, SVRMHC was able to generate models that out-performed existing linear models, such as the "additive method". By adopting a new "11-factor encoding" scheme, SVRMHC takes into account similarities in the physicochemical properties of the amino acids constituting the input peptides. When applied to MHC-peptide binding data for three mouse class I MHC alleles, the SVRMHC models produced more accurate predictions than those produced previously. Furthermore, comparisons based on Receiver Operating Characteristic (ROC) analysis indicated that SVRMHC was able to out-perform several prominent methods in identifying strongly binding peptides. Conclusion - As a method with demonstrated performance in the quantitative modeling of MHC-peptide binding and in identifying strong binders, SVRMHC is a promising immunoinformatics tool with considerable future potential.