957 resultados para Test Set


Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents some forecasting techniques for energy demand and price prediction, one day ahead. These techniques combine wavelet transform (WT) with fixed and adaptive machine learning/time series models (multi-layer perceptron (MLP), radial basis functions, linear regression, or GARCH). To create an adaptive model, we use an extended Kalman filter or particle filter to update the parameters continuously on the test set. The adaptive GARCH model is a new contribution, broadening the applicability of GARCH methods. We empirically compared two approaches of combining the WT with prediction models: multicomponent forecasts and direct forecasts. These techniques are applied to large sets of real data (both stationary and non-stationary) from the UK energy markets, so as to provide comparative results that are statistically stronger than those previously reported. The results showed that the forecasting accuracy is significantly improved by using the WT and adaptive models. The best models on the electricity demand/gas price forecast are the adaptive MLP/GARCH with the multicomponent forecast; their MSEs are 0.02314 and 0.15384 respectively.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscatter microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snap-shots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish if the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network, that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal, is developed. The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Subunit vaccine discovery is an accepted clinical priority. The empirical approach is time- and labor-consuming and can often end in failure. Rational information-driven approaches can overcome these limitations in a fast and efficient manner. However, informatics solutions require reliable algorithms for antigen identification. All known algorithms use sequence similarity to identify antigens. However, antigenicity may be encoded subtly in a sequence and may not be directly identifiable by sequence alignment. We propose a new alignment-independent method for antigen recognition based on the principal chemical properties of protein amino acid sequences. The method is tested by cross-validation on a training set of bacterial antigens and external validation on a test set of known antigens. The prediction accuracy is 83% for the cross-validation and 80% for the external test set. Our approach is accurate and robust, and provides a potent tool for the in silico discovery of medically relevant subunit vaccines.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Cleavage by the proteasome is responsible for generating the C terminus of T-cell epitopes. Modeling the process of proteasome cleavage as part of a multi-step algorithm for T-cell epitope prediction will reduce the number of non-binders and increase the overall accuracy of the predictive algorithm. Quantitative matrix-based models for prediction of the proteasome cleavage sites in a protein were developed using a training set of 489 naturally processed T-cell epitopes (nonamer peptides) associated with HLA-A and HLA-B molecules. The models were validated using an external test set of 227 T-cell epitopes. The performance of the models was good, identifying 76% of the C-termini correctly. The best model of proteasome cleavage was incorporated as the first step in a three-step algorithm for T-cell epitope prediction, where subsequent steps predicted TAP affinity and MHC binding using previously derived models.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Background: HLA-DPs are class II MHC proteins mediating immune responses to many diseases. Peptides bind MHC class II proteins in the acidic environment within endosomes. Acidic pH markedly elevates association rate constants but dissociation rates are almost unchanged in the pH range 5.0 - 7.0. This pH-driven effect can be explained by the protonation/deprotonation states of Histidine, whose imidazole has a pKa of 6.0. At pH 5.0, imidazole ring is protonated, making Histidine positively charged and very hydrophilic, while at pH 7.0 imidazole is unprotonated, making Histidine less hydrophilic. We develop here a method to predict peptide binding to the four most frequent HLA-DP proteins: DP1, DP41, DP42 and DP5, using a molecular docking protocol. Dockings to virtual combinatorial peptide libraries were performed at pH 5.0 and pH 7.0. Results: The X-ray structure of the peptide - HLA-DP2 protein complex was used as a starting template to model by homology the structure of the four DP proteins. The resulting models were used to produce virtual combinatorial peptide libraries constructed using the single amino acid substitution (SAAS) principle. Peptides were docked into the DP binding site using AutoDock at pH 5.0 and pH 7.0. The resulting scores were normalized and used to generate Docking Score-based Quantitative Matrices (DS-QMs). The predictive ability of these QMs was tested using an external test set of 484 known DP binders. They were also compared to existing servers for DP binding prediction. The models derived at pH 5.0 predict better than those derived at pH 7.0 and showed significantly improved predictions for three of the four DP proteins, when compared to the existing servers. They are able to recognize 50% of the known binders in the top 5% of predicted peptides. Conclusions: The higher predictive ability of DS-QMs derived at pH 5.0 may be rationalised by the additional hydrogen bond formed between the backbone carbonyl oxygen belonging to the peptide position before p1 (p-1) and the protonated ε-nitrogen of His 79β. Additionally, protonated His residues are well accepted at most of the peptide binding core positions which is in a good agreement with the overall negatively charged peptide binding site of most MHC proteins. © 2012 Patronov et al.; licensee BioMed Central Ltd.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The relationship between sleep apnoea–hypopnoea syndrome (SAHS) severity and the regularity of nocturnal oxygen saturation (SaO2) recordings was analysed. Three different methods were proposed to quantify regularity: approximate entropy (AEn), sample entropy (SEn) and kernel entropy (KEn). A total of 240 subjects suspected of suffering from SAHS took part in the study. They were randomly divided into a training set (96 subjects) and a test set (144 subjects) for the adjustment and assessment of the proposed methods, respectively. According to the measurements provided by AEn, SEn and KEn, higher irregularity of oximetry signals is associated with SAHS-positive patients. Receiver operating characteristic (ROC) and Pearson correlation analyses showed that KEn was the most reliable predictor of SAHS. It provided an area under the ROC curve of 0.91 in two-class classification of subjects as SAHS-negative or SAHS-positive. Moreover, KEn measurements from oximetry data exhibited a linear dependence on the apnoea–hypopnoea index, as shown by a correlation coefficient of 0.87. Therefore, these measurements could be used for the development of simplified diagnostic techniques in order to reduce the demand for polysomnographies. Furthermore, KEn represents a convincing alternative to AEn and SEn for the diagnostic analysis of noisy biomedical signals.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

To provide biological insights into transcriptional regulation, a couple of groups have recently presented models relating the promoter DNA-bound transcription factors (TFs) to downstream gene’s mean transcript level or transcript production rates over time. However, transcript production is dynamic in response to changes of TF concentrations over time. Also, TFs are not the only factors binding to promoters; other DNA binding factors (DBFs) bind as well, especially nucleosomes, resulting in competition between DBFs for binding at same genomic location. Additionally, not only TFs, but also some other elements regulate transcription. Within core promoter, various regulatory elements influence RNAPII recruitment, PIC formation, RNAPII searching for TSS, and RNAPII initiating transcription. Moreover, it is proposed that downstream from TSS, nucleosomes resist RNAPII elongation.

Here, we provide a machine learning framework to predict transcript production rates from DNA sequences. We applied this framework in the S. cerevisiae yeast for two scenarios: a) to predict the dynamic transcript production rate during the cell cycle for native promoters; b) to predict the mean transcript production rate over time for synthetic promoters. As far as we know, our framework is the first successful attempt to have a model that can predict dynamic transcript production rates from DNA sequences only: with cell cycle data set, we got Pearson correlation coefficient Cp = 0.751 and coefficient of determination r2 = 0.564 on test set for predicting dynamic transcript production rate over time. Also, for DREAM6 Gene Promoter Expression Prediction challenge, our fitted model outperformed all participant teams, best of all teams, and a model combining best team’s k-mer based sequence features and another paper’s biologically mechanistic features, in terms of all scoring metrics.

Moreover, our framework shows its capability of identifying generalizable fea- tures by interpreting the highly predictive models, and thereby provide support for associated hypothesized mechanisms about transcriptional regulation. With the learned sparse linear models, we got results supporting the following biological insights: a) TFs govern the probability of RNAPII recruitment and initiation possibly through interactions with PIC components and transcription cofactors; b) the core promoter amplifies the transcript production probably by influencing PIC formation, RNAPII recruitment, DNA melting, RNAPII searching for and selecting TSS, releasing RNAPII from general transcription factors, and thereby initiation; c) there is strong transcriptional synergy between TFs and core promoter elements; d) the regulatory elements within core promoter region are more than TATA box and nucleosome free region, suggesting the existence of still unidentified TAF-dependent and cofactor-dependent core promoter elements in yeast S. cerevisiae; e) nucleosome occupancy is helpful for representing +1 and -1 nucleosomes’ regulatory roles on transcription.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this chapter four combinations of input features and the feedforward, cascade forward and recurrent architectures are compared for the task of forecast tourism time series. The input features of the ANNs consist in the combination of the previous 12 months, the index time modeled by two nodes used to the year and month and one input with the daily hours of sunshine (insolation duration). The index time features associated to the previous twelve values of the time series proved its relevance in this forecast task. The insolation variable can improved results with some architectures, namely the cascade forward architecture. Finally, the experimented ANN models/architectures produced a mean absolute percentage error between 4 and 6%, proving the ability of the ANN models based to forecast this time series. Besides, the feedforward architecture behaved better considering validation and test sets, with 4.2% percentage error in test set.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Dissertação (mestrado)—Universidade de Brasília, Faculdade Gama, Programa de Pós-Graduação em Engenharia Biomédica, 2016.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Dissertação (mestrado)—Universidade de Brasília, Faculdade de Tecnologia, Departamento de Engenharia Mecânica, 2016.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Abstract : Recently, there is a great interest to study the flow characteristics of suspensions in different environmental and industrial applications, such as snow avalanches, debris flows, hydrotransport systems, and material casting processes. Regarding rheological aspects, the majority of these suspensions, such as fresh concrete, behave mostly as non-Newtonian fluids. Concrete is the most widely used construction material in the world. Due to the limitations that exist in terms of workability and formwork filling abilities of normal concrete, a new class of concrete that is able to flow under its own weight, especially through narrow gaps in the congested areas of the formwork was developed. Accordingly, self-consolidating concrete (SCC) is a novel construction material that is gaining market acceptance in various applications. Higher fluidity characteristics of SCC enable it to be used in a number of special applications, such as densely reinforced sections. However, higher flowability of SCC makes it more sensitive to segregation of coarse particles during flow (i.e., dynamic segregation) and thereafter at rest (i.e., static segregation). Dynamic segregation can increase when SCC flows over a long distance or in the presence of obstacles. Therefore, there is always a need to establish a trade-off between the flowability, passing ability, and stability properties of SCC suspensions. This should be taken into consideration to design the casting process and the mixture proportioning of SCC. This is called “workability design” of SCC. An efficient and non-expensive workability design approach consists of the prediction and optimization of the workability of the concrete mixtures for the selected construction processes, such as transportation, pumping, casting, compaction, and finishing. Indeed, the mixture proportioning of SCC should ensure the construction quality demands, such as demanded levels of flowability, passing ability, filling ability, and stability (dynamic and static). This is necessary to develop some theoretical tools to assess under what conditions the construction quality demands are satisfied. Accordingly, this thesis is dedicated to carry out analytical and numerical simulations to predict flow performance of SCC under different casting processes, such as pumping and tremie applications, or casting using buckets. The L-Box and T-Box set-ups can evaluate flow performance properties of SCC (e.g., flowability, passing ability, filling ability, shear-induced and gravitational dynamic segregation) in casting process of wall and beam elements. The specific objective of the study consists of relating numerical results of flow simulation of SCC in L-Box and T-Box test set-ups, reported in this thesis, to the flow performance properties of SCC during casting. Accordingly, the SCC is modeled as a heterogeneous material. Furthermore, an analytical model is proposed to predict flow performance of SCC in L-Box set-up using the Dam Break Theory. On the other hand, results of the numerical simulation of SCC casting in a reinforced beam are verified by experimental free surface profiles. The results of numerical simulations of SCC casting (modeled as a single homogeneous fluid), are used to determine the critical zones corresponding to the higher risks of segregation and blocking. The effects of rheological parameters, density, particle contents, distribution of reinforcing bars, and particle-bar interactions on flow performance of SCC are evaluated using CFD simulations of SCC flow in L-Box and T-box test set-ups (modeled as a heterogeneous material). Two new approaches are proposed to classify the SCC mixtures based on filling ability and performability properties, as a contribution of flowability, passing ability, and dynamic stability of SCC.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A fully relativistic four-component Dirac-Fock-Slater program for diatomics, with numerically given AO's as basis functions is presented. We discuss the problem of the errors due to the finite basis-set, and due to the influence of the negative energy solutions of the Dirac Hamiltonian. The negative continuum contributions are found to be very small.