11 results for degenerate test set

in Aston University Research Archive


Relevance:

80.00%

Abstract:

In the present study, multilayer perceptron (MLP) neural networks were applied to help in the diagnosis of obstructive sleep apnoea syndrome (OSAS). Oxygen saturation (SaO2) recordings from nocturnal pulse oximetry were used for this purpose. We performed time and spectral analysis of these signals to extract 14 features related to OSAS. The performance of two different MLP classifiers was compared: maximum likelihood (ML) and Bayesian (BY) MLP networks. A total of 187 subjects suspected of suffering from OSAS took part in the study. Their SaO2 signals were divided into a training set with 74 recordings and a test set with 113 recordings. BY-MLP networks achieved the best performance on the test set with 85.58% accuracy (87.76% sensitivity and 82.39% specificity). These results were substantially better than those provided by ML-MLP networks, which were affected by overfitting and achieved an accuracy of 76.81% (86.42% sensitivity and 62.83% specificity). Our results suggest that the Bayesian framework is preferable for implementing our MLP classifiers. The proposed BY-MLP networks could be used for early OSAS detection. They could help to overcome the difficulties of nocturnal polysomnography (PSG) and thus reduce the demand for these studies.
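The accuracy, sensitivity and specificity figures quoted above are all derived from a confusion matrix. The sketch below shows that arithmetic only; the function name and the counts are hypothetical, chosen to add up to a 113-recording test set, and are not the study's data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)   # share of positives detected
    specificity = 100.0 * tn / (tn + fp)   # share of negatives rejected
    return accuracy, sensitivity, specificity


# Hypothetical counts (60 + 8 + 36 + 9 = 113), for illustration only.
acc, sens, spec = diagnostic_metrics(tp=60, fn=8, tn=36, fp=9)
print(f"accuracy={acc:.2f}% sensitivity={sens:.2f}% specificity={spec:.2f}%")
```

A classifier can keep a high sensitivity while losing specificity, which is the pattern the abstract reports for the ML-MLP networks.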

Relevance:

80.00%

Abstract:

Urinary proteomics is emerging as a powerful non-invasive tool for diagnosis and monitoring of a variety of human diseases. We tested whether signatures of urinary polypeptides can contribute to the existing biomarkers for coronary artery disease (CAD). We examined a total of 359 urine samples from 88 patients with severe CAD and 282 controls. Spot urine was analyzed using capillary electrophoresis on-line coupled to ESI-TOF-MS enabling characterization of more than 1000 polypeptides per sample. In a first step a "training set" for biomarker definition was created. Multiple biomarker patterns clearly distinguished healthy controls from CAD patients, and we extracted 15 peptides that define a characteristic CAD signature panel. In a second step, the ability of the CAD-specific panel to predict the presence of CAD was evaluated in a blinded study using a "test set." The signature panel showed sensitivity of 98% (95% confidence interval, 88.7-99.6) and 83% specificity (95% confidence interval, 51.6-97.4). Furthermore, the peptide pattern significantly changed toward the healthy signature, correlating with the level of physical activity after therapeutic intervention. Our results show that urinary proteomics can identify CAD patients with high confidence and might also play a role in monitoring the effects of therapeutic interventions. The workflow is amenable to clinical routine testing, suggesting that non-invasive proteomics analysis can become a valuable addition to other biomarkers used in cardiovascular risk assessment.

Relevance:

80.00%

Abstract:

Major histocompatibility complex (MHC) II proteins bind peptide fragments derived from pathogen antigens and present them at the cell surface for recognition by T cells. MHC proteins are divided into Class I and Class II. Human MHC Class II alleles are grouped into three loci: HLA-DP, HLA-DQ, and HLA-DR. They are involved in many autoimmune diseases. In contrast to HLA-DR and HLA-DQ proteins, the X-ray structure of the HLA-DP2 protein has been solved quite recently. In this study, we have used structure-based molecular dynamics simulation to derive a tool for rapid and accurate virtual screening for the prediction of HLA-DP2-peptide binding. A combinatorial library of 247 peptides was built using the "single amino acid substitution" approach and docked into the HLA-DP2 binding site. The complexes were simulated for 1 ns and the short range interaction energies (Lennard-Jones and Coulomb) were used as binding scores after normalization. The normalized values were collected into quantitative matrices (QMs) and their predictive abilities were validated on a large external test set. The validation shows that the best performing QM consisted of Lennard-Jones energies normalized over all positions for anchor residues only plus cross terms between anchor residues.
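A quantitative matrix assigns a score to each amino acid at each position of the binding core, and a peptide is usually scored by summing the entries for its residues. The sketch below illustrates that additive scheme under stated assumptions: the matrix values are made up, the core is shortened to three positions for readability, residues absent from the matrix are taken to contribute 0, and the cross terms mentioned in the abstract are not modelled. It is not the authors' implementation.

```python
def score_core(qm, core):
    """Sum the matrix entry for each residue at its position in the core."""
    return sum(qm[pos].get(residue, 0.0) for pos, residue in enumerate(core))


def scan_peptide(qm, peptide):
    """Score every window of the matrix width along a longer peptide."""
    width = len(qm)
    return [
        (peptide[i:i + width], score_core(qm, peptide[i:i + width]))
        for i in range(len(peptide) - width + 1)
    ]


# Hypothetical 3-position matrix with invented normalized scores.
toy_qm = [
    {"F": 1.0, "L": 0.5},
    {"A": 0.25},
    {"K": 0.75, "R": 0.5},
]

windows = scan_peptide(toy_qm, "FAKLAR")
for core, score in windows:
    print(core, score)
```

Whether the best window is the one with the highest or the lowest sum depends on how the energies were normalized, which the abstract does not state.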

Relevance:

80.00%

Abstract:

This paper presents some forecasting techniques for energy demand and price prediction, one day ahead. These techniques combine wavelet transform (WT) with fixed and adaptive machine learning/time series models (multi-layer perceptron (MLP), radial basis functions, linear regression, or GARCH). To create an adaptive model, we use an extended Kalman filter or particle filter to update the parameters continuously on the test set. The adaptive GARCH model is a new contribution, broadening the applicability of GARCH methods. We empirically compared two approaches to combining the WT with prediction models: multicomponent forecasts and direct forecasts. These techniques are applied to large sets of real data (both stationary and non-stationary) from the UK energy markets, so as to provide comparative results that are statistically stronger than those previously reported. The results showed that the forecasting accuracy is significantly improved by using the WT and adaptive models. The best models on the electricity demand/gas price forecast are the adaptive MLP/GARCH with the multicomponent forecast; their MSEs are 0.02314 and 0.15384 respectively.

Relevance:

80.00%

Abstract:

The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of backscattered microwave radiation reflected by small ripples on the ocean surface induced by sea-surface winds, and so provides instantaneous snapshots of wind flow over large areas of the ocean surface, known as wind fields. Inherent in the physics of the observation process is an ambiguity in wind direction; the scatterometer cannot distinguish if the wind is blowing toward or away from the sensor device. This ambiguity implies that there is a one-to-many mapping between scatterometer data and wind direction. Current operational methods for wind field retrieval are based on the retrieval of wind vectors from satellite scatterometer data, followed by a disambiguation and filtering process that is reliant on numerical weather prediction models. The wind vectors are retrieved by the local inversion of a forward model, mapping scatterometer observations to wind vectors, and minimising a cost function in scatterometer measurement space. This thesis applies a pragmatic Bayesian solution to the problem. The likelihood is a combination of conditional probability distributions for the local wind vectors given the scatterometer data. The prior distribution is a vector Gaussian process that provides the geophysical consistency for the wind field. The wind vectors are retrieved directly from the scatterometer data by using mixture density networks, a principled method to model multi-modal conditional probability density functions. The complexity of the mapping and the structure of the conditional probability density function are investigated. A hybrid mixture density network that incorporates the knowledge that the conditional probability distribution of the observation process is predominantly bi-modal is developed.
The optimal model, which generalises across a swathe of scatterometer readings, is better on key performance measures than the current operational model. Wind field retrieval is approached from three perspectives. The first is a non-autonomous method that confirms the validity of the model by retrieving the correct wind field 99% of the time from a test set of 575 wind fields. The second technique takes the maximum a posteriori probability wind field retrieved from the posterior distribution as the prediction. For the third technique, Markov Chain Monte Carlo (MCMC) techniques were employed to estimate the mass associated with significant modes of the posterior distribution, and make predictions based on the mode with the greatest mass associated with it. General methods for sampling from multi-modal distributions were benchmarked against a specific MCMC transition kernel designed for this problem. It was shown that the general methods were unsuitable for this application due to computational expense. On a test set of 100 wind fields the MAP estimate correctly retrieved 72 wind fields, whilst the sampling method correctly retrieved 73 wind fields.

Relevance:

80.00%

Abstract:

Subunit vaccine discovery is an accepted clinical priority. The empirical approach is time- and labor-consuming and can often end in failure. Rational information-driven approaches can overcome these limitations in a fast and efficient manner. However, informatics solutions require reliable algorithms for antigen identification. All known algorithms use sequence similarity to identify antigens. However, antigenicity may be encoded subtly in a sequence and may not be directly identifiable by sequence alignment. We propose a new alignment-independent method for antigen recognition based on the principal chemical properties of protein amino acid sequences. The method is tested by cross-validation on a training set of bacterial antigens and external validation on a test set of known antigens. The prediction accuracy is 83% for the cross-validation and 80% for the external test set. Our approach is accurate and robust, and provides a potent tool for the in silico discovery of medically relevant subunit vaccines.

Relevance:

80.00%

Abstract:

Cleavage by the proteasome is responsible for generating the C terminus of T-cell epitopes. Modeling the process of proteasome cleavage as part of a multi-step algorithm for T-cell epitope prediction will reduce the number of non-binders and increase the overall accuracy of the predictive algorithm. Quantitative matrix-based models for prediction of the proteasome cleavage sites in a protein were developed using a training set of 489 naturally processed T-cell epitopes (nonamer peptides) associated with HLA-A and HLA-B molecules. The models were validated using an external test set of 227 T-cell epitopes. The performance of the models was good, identifying 76% of the C-termini correctly. The best model of proteasome cleavage was incorporated as the first step in a three-step algorithm for T-cell epitope prediction, where subsequent steps predicted TAP affinity and MHC binding using previously derived models.

Relevance:

80.00%

Abstract:

Background: HLA-DPs are class II MHC proteins mediating immune responses to many diseases. Peptides bind MHC class II proteins in the acidic environment within endosomes. Acidic pH markedly elevates association rate constants but dissociation rates are almost unchanged in the pH range 5.0–7.0. This pH-driven effect can be explained by the protonation/deprotonation states of histidine, whose imidazole has a pKa of 6.0. At pH 5.0, the imidazole ring is protonated, making histidine positively charged and very hydrophilic, while at pH 7.0 imidazole is unprotonated, making histidine less hydrophilic. We develop here a method to predict peptide binding to the four most frequent HLA-DP proteins: DP1, DP41, DP42 and DP5, using a molecular docking protocol. Dockings to virtual combinatorial peptide libraries were performed at pH 5.0 and pH 7.0. Results: The X-ray structure of the peptide–HLA-DP2 protein complex was used as a starting template to model by homology the structure of the four DP proteins. The resulting models were used to produce virtual combinatorial peptide libraries constructed using the single amino acid substitution (SAAS) principle. Peptides were docked into the DP binding site using AutoDock at pH 5.0 and pH 7.0. The resulting scores were normalized and used to generate Docking Score-based Quantitative Matrices (DS-QMs). The predictive ability of these QMs was tested using an external test set of 484 known DP binders. They were also compared to existing servers for DP binding prediction. The models derived at pH 5.0 predict better than those derived at pH 7.0 and showed significantly improved predictions for three of the four DP proteins, when compared to the existing servers. They are able to recognize 50% of the known binders in the top 5% of predicted peptides.
Conclusions: The higher predictive ability of DS-QMs derived at pH 5.0 may be rationalised by the additional hydrogen bond formed between the backbone carbonyl oxygen belonging to the peptide position before p1 (p-1) and the protonated ε-nitrogen of His 79β. Additionally, protonated His residues are well accepted at most of the peptide binding core positions, which is in good agreement with the overall negatively charged peptide binding site of most MHC proteins. © 2012 Patronov et al.; licensee BioMed Central Ltd.
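The "known binders in the top 5% of predicted peptides" figure is a recall measured at a ranking cutoff. A minimal sketch of that calculation follows; the function name, the peptide identifiers and the scores are hypothetical, and it assumes that a higher score means a stronger predicted binder, which may not match the sign convention of the normalized docking scores.

```python
def recall_in_top_fraction(scores, binders, fraction=0.05):
    """Fraction of known binders ranked within the top `fraction` of peptides.

    scores:  dict mapping peptide id -> predicted score (higher = better here).
    binders: set of peptide ids known to bind.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))  # keep at least one peptide
    top = set(ranked[:cutoff])
    return len(top & binders) / len(binders)


# Hypothetical ranking of 20 peptides; "p0" scores highest, "p19" lowest.
toy_scores = {f"p{i}": 20 - i for i in range(20)}
toy_binders = {"p0", "p7"}

recall = recall_in_top_fraction(toy_scores, toy_binders)
print(recall)
```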

Relevance:

80.00%

Abstract:

The relationship between sleep apnoea–hypopnoea syndrome (SAHS) severity and the regularity of nocturnal oxygen saturation (SaO2) recordings was analysed. Three different methods were proposed to quantify regularity: approximate entropy (AEn), sample entropy (SEn) and kernel entropy (KEn). A total of 240 subjects suspected of suffering from SAHS took part in the study. They were randomly divided into a training set (96 subjects) and a test set (144 subjects) for the adjustment and assessment of the proposed methods, respectively. According to the measurements provided by AEn, SEn and KEn, higher irregularity of oximetry signals is associated with SAHS-positive patients. Receiver operating characteristic (ROC) and Pearson correlation analyses showed that KEn was the most reliable predictor of SAHS. It provided an area under the ROC curve of 0.91 in two-class classification of subjects as SAHS-negative or SAHS-positive. Moreover, KEn measurements from oximetry data exhibited a linear dependence on the apnoea–hypopnoea index, as shown by a correlation coefficient of 0.87. Therefore, these measurements could be used for the development of simplified diagnostic techniques in order to reduce the demand for polysomnographies. Furthermore, KEn represents a convincing alternative to AEn and SEn for the diagnostic analysis of noisy biomedical signals.
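Of the three regularity measures named above, sample entropy has a compact standard definition: the negative logarithm of the ratio between the number of matching template pairs of length m + 1 and of length m. The sketch below follows that textbook definition with a Chebyshev distance and an absolute tolerance `r`; the parameter defaults and the toy signal are assumptions, not the settings used in the study, and published variants differ on details such as scaling `r` by the signal's standard deviation.

```python
import math


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy of a sequence; higher values mean less regularity."""
    n = len(signal)

    def count_matches(length):
        # The same n - m starting points are used for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # self-matches excluded
                distance = max(abs(a - b)
                               for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


# A perfectly regular (constant) toy signal has zero sample entropy.
flat = sample_entropy([1.0] * 20)
print(flat)
```

Because every match of length m + 1 is also a match of length m, the ratio never exceeds 1 and the result is never negative.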

Relevance:

30.00%

Abstract:

Women are under-represented at senior levels within organisations. They also fare less well than their male counterparts in reward and career opportunities. Attitudes toward women in the workplace are thought to underpin these disparities and more and more organisations are introducing attitude measures into diversity and inclusion initiatives to: 1) raise awareness amongst employees of implicit attitudes, 2) educate employees on how these attitudes can influence behaviour and 3) re-measure the attitude after an intervention to assess whether the attitude has changed. The Implicit Association Test (IAT: Greenwald et al., 1998) is the most popular tool used to assess attitudes. However, questions over the predictive validity of the measure have been raised and the evidence for the real world impact of the implicit attitudes is limited (Blanton et al., 2009; Landy, 2008; Tetlock & Mitchell, 2009; Wax, 2010). Whilst there is growing research in the area of race, little research has explored the ability of the IAT to predict gender discrimination. This thesis addresses this important gap in the literature. Three empirical studies were conducted. The first study explored whether gender IATs were predictive of personnel decisions that favour men and whether affect- and cognition-based gender IATs were equally predictive of behaviour. The second two studies explored the predictive validity of the IAT in comparison to an explicit measure of one type of gender attitude, benevolent sexism. The results revealed implicit gender attitudes were strongly held. However, they did not consistently predict behaviour across the studies. Overall, the results suggest that the IAT may only predict workplace gender discrimination in a very select set of circumstances. The attitude component that an IAT assesses, the personnel decision and participant demographics all impact the predictive validity of the tool. The interplay between the IAT and behaviour therefore appears to be more complex than is assumed.

Relevance:

30.00%

Abstract:

There may be circumstances where it is necessary for microbiologists to compare variances rather than means, e.g., in analysing data from experiments to determine whether a particular treatment alters the degree of variability or testing the assumption of homogeneity of variance prior to other statistical tests. All of the tests described in this Statnote have their limitations. Bartlett’s test may be too sensitive but Levene’s and the Brown-Forsythe tests also have problems. We would recommend the use of the variance-ratio test to compare two variances and the careful application of Bartlett’s test if there are more than two groups. Considering that these tests are not particularly robust, it should be remembered that the homogeneity of variance assumption is usually the least important of those considered when carrying out an ANOVA. If there is concern about this assumption and especially if the other assumptions of the analysis are also not likely to be met, e.g., lack of normality or non-additivity of treatment effects, then it may be better either to transform the data or to carry out a non-parametric test on the data.
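The variance-ratio test recommended above for two groups reduces to dividing the larger sample variance by the smaller and comparing the result with an F distribution. The sketch below computes only the ratio and its degrees of freedom using Python's standard library; the function name and the measurements are hypothetical, and the critical value or p-value would still have to come from F tables or a statistics package.

```python
from statistics import variance


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top.

    Assumes both samples have at least two values and non-zero variance.
    """
    var_a = variance(sample_a)  # unbiased sample variance (n - 1 divisor)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical measurements from two treatments (illustration only).
treated = [2.0, 4.0, 6.0, 8.0]
control = [4.0, 5.0, 5.0, 6.0]

f_ratio, df_num, df_den = variance_ratio(treated, control)
print(f"F = {f_ratio:.2f} on {df_num} and {df_den} degrees of freedom")
```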