14 resultados para degenerate test set
em CentAUR: Central Archive University of Reading - UK
Resumo:
BACKGROUND: The serum peptidome may be a valuable source of diagnostic cancer biomarkers. Previous mass spectrometry (MS) studies have suggested that groups of related peptides discriminatory for different cancer types are generated ex vivo from abundant serum proteins by tumor-specific exopeptidases. We tested 2 complementary serum profiling strategies to see if similar peptides could be found that discriminate ovarian cancer from benign cases and healthy controls. METHODS: We subjected identically collected and processed serum samples from healthy volunteers and patients to automated polypeptide extraction on octadecylsilane-coated magnetic beads and separately on ZipTips before MALDI-TOF MS profiling at 2 centers. The 2 platforms were compared and case control profiling data analyzed to find altered MS peak intensities. We tested models built from training datasets for both methods for their ability to classify a blinded test set. RESULTS: Both profiling platforms had CVs of approximately 15% and could be applied for high-throughput analysis of clinical samples. The 2 methods generated overlapping peptide profiles, with some differences in peak intensity in different mass regions. In cross-validation, models from training data gave diagnostic accuracies up to 87% for discriminating malignant ovarian cancer from healthy controls and up to 81% for discriminating malignant from benign samples. Diagnostic accuracies up to 71% (malignant vs healthy) and up to 65% (malignant vs benign) were obtained when the models were validated on the blinded test set. CONCLUSIONS: For ovarian cancer, altered MALDI-TOF MS peptide profiles alone cannot be used for accurate diagnoses.
Resumo:
Motivation: We compare phylogenetic approaches for inferring functional gene links. The approaches detect independent instances of the correlated gain and loss of pairs of genes from species' genomes. We investigate the effect on results of basing evidence of correlations on two phylogenetic approaches, Dollo parsminony and maximum likelihood (ML). We further examine the effect of constraining the ML model by fixing the rate of gene gain at a low value, rather than estimating it from the data. Results: We detect correlated evolution among a test set of pairs of yeast (Saccharomyces cerevisiae) genes, with a case study of 21 eukaryotic genomes and test data derived from known yeast protein complexes. If the rate at which genes are gained is constrained to be low, ML achieves by far the best results at detecting known functional links. The model then has fewer parameters but it is more realistic by preventing genes from being gained more than once. Availability: BayesTraits by M. Pagel and A. Meade, and a script to configure and repeatedly launch it by D. Barker and M. Pagel, are available at http://www.evolution.reading.ac.uk .
Resumo:
Quantitative structure activity relationships (QSARs) have been developed to optimise the choice of nitrogen heterocyclic molecules that can be used to separate the minor actinides such as americium(III) from europium(III) in the aqueous PUREX raffinate of nuclear waste. Experimental data on distribution coefficients and separation factors (SFs) for 47 such ligands have been obtained and show SF values ranging from 0.61 to 100. The ligands were divided into a training set of 36 molecules to develop the QSAR and a test set of 11 molecules to validate the QSAR. Over 1500 molecular descriptors were calculated for each heterocycle and the Genetic Algorithm was used to select the most appropriate for use in multiple regression equations. Equations were developed fitting the separation factors to 6-8 molecular descriptors which gave r(2) values of >0.8 for the training set and values of >0.7 for the test set, thus showing good predictive quality. The descriptors used in the equations were primarily electronic and steric. These equations can be used to predict the separation factors of nitrogen heterocycles not yet synthesised and/or tested and hence obtain the most efficient ligands for lanthanide and actinide separation. (C) 2003 Elsevier B.V. All rights reserved.
Resumo:
One of the enablers for new consumer electronics based products to be accepted in to the market is the availability of inexpensive, flexible and multi-standard chipsets and services. DVB-T, the principal standard for terrestrial broadcast of digital video in Europe, has been extremely successful in leading to governments reconsidering their targets for analogue television broadcast switch-off. To enable one further small step in creating increasingly cost effective chipsets, the ODFM deterministic equalizer has been presented before with its application to DVB-T. This paper discusses the test set-up of a DVB-T compliant baseband simulation that includes the deterministic equalizer and DVB-T standard propagation channels. This is then followed by a presentation of the found inner and outer Bit Error Rate (BER) results using various modulation levels, coding rates and propagation channels in order to ascertain the actual performance of the deterministic equalizer(1).
Resumo:
Motivation: A new method that uses support vector machines (SVMs) to predict protein secondary structure is described and evaluated. The study is designed to develop a reliable prediction method using an alternative technique and to investigate the applicability of SVMs to this type of bioinformatics problem. Methods: Binary SVMs are trained to discriminate between two structural classes. The binary classifiers are combined in several ways to predict multi-class secondary structure. Results: The average three-state prediction accuracy per protein (Q3) is estimated by cross-validation to be 77.07 ± 0.26% with a segment overlap (Sov) score of 73.32 ± 0.39%. The SVM performs similarly to the 'state-of-the-art' PSIPRED prediction method on a non-homologous test set of 121 proteins despite being trained on substantially fewer examples. A simple consensus of the SVM, PSIPRED and PROFsec achieves significantly higher prediction accuracy than the individual methods. Availability: The SVM classifier is available from the authors. Work is in progress to make the method available on-line and to integrate the SVM predictions into the PSIPRED server.
Resumo:
Recent studies showed that features extracted from brain MRIs can well discriminate Alzheimer’s disease from Mild Cognitive Impairment. This study provides an algorithm that sequentially applies advanced feature selection methods for findings the best subset of features in terms of binary classification accuracy. The classifiers that provided the highest accuracies, have been then used for solving a multi-class problem by the one-versus-one strategy. Although several approaches based on Regions of Interest (ROIs) extraction exist, the prediction power of features has not yet investigated by comparing filter and wrapper techniques. The findings of this work suggest that (i) the IntraCranial Volume (ICV) normalization can lead to overfitting and worst the accuracy prediction of test set and (ii) the combined use of a Random Forest-based filter with a Support Vector Machines-based wrapper, improves accuracy of binary classification.
Resumo:
Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n = 30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org.
Resumo:
SCOPE: A high intake of n-3 PUFA provides health benefits via changes in the n-6/n-3 ratio in blood. In addition to such dietary PUFAs, variants in the fatty acid desaturase 1 (FADS1) gene are also associated with altered PUFA profiles. METHODS AND RESULTS: We used mathematical modelling to predict levels of PUFA in whole blood, based on MHT and bolasso selected food items, anthropometric and lifestyle factors, and the rs174546 genotypes in FADS1 from 1,607 participants (Food4Me Study). The models were developed using data from the first reported time point (training set) and their predictive power was evaluated using data from the last reported time point (test set). Amongst other food items, fish, pizza, chicken and cereals were identified as being associated with the PUFA profiles. Using these food items and the rs174546 genotypes as predictors, models explained 26% to 43% of the variability in PUFA concentrations in the training set and 22% to 33% in the test set. CONCLUSIONS: Selecting food items using MHT is a valuable contribution to determine predictors, as our models' predictive power is higher compared to analogue studies. As unique feature, we additionally confirmed our models' power based on a test set.
Resumo:
The no response test is a new scheme in inverse problems for partial differential equations which was recently proposed in [D. R. Luke and R. Potthast, SIAM J. Appl. Math., 63 (2003), pp. 1292–1312] in the framework of inverse acoustic scattering problems. The main idea of the scheme is to construct special probing waves which are small on some test domain. Then the response for these waves is constructed. If the response is small, the unknown object is assumed to be a subset of the test domain. The response is constructed from one, several, or many particular solutions of the problem under consideration. In this paper, we investigate the convergence of the no response test for the reconstruction information about inclusions D from the Cauchy values of solutions to the Helmholtz equation on an outer surface $\partial\Omega$ with $\overline{D} \subset \Omega$. We show that the one‐wave no response test provides a criterion to test the analytic extensibility of a field. In particular, we investigate the construction of approximations for the set of singular points $N(u)$ of the total fields u from one given pair of Cauchy data. Thus, the no response test solves a particular version of the classical Cauchy problem. Also, if an infinite number of fields is given, we prove that a multifield version of the no response test reconstructs the unknown inclusion D. This is the first convergence analysis which could be achieved for the no response test.
Resumo:
We describe, and make publicly available, two problem instance generators for a multiobjective version of the well-known quadratic assignment problem (QAP). The generators allow a number of instance parameters to be set, including those controlling epistasis and inter-objective correlations. Based on these generators, several initial test suites are provided and described. For each test instance we measure some global properties and, for the smallest ones, make some initial observations of the Pareto optimal sets/fronts. Our purpose in providing these tools is to facilitate the ongoing study of problem structure in multiobjective (combinatorial) optimization, and its effects on search landscape and algorithm performance.
Resumo:
In order to assist in comparing the computational techniques used in different models, the authors propose a standardized set of one-dimensional numerical experiments that could be completed for each model. The results of these experiments, with a simplified form of the computational representation for advection, diffusion, pressure gradient term, Coriolis term, and filter used in the models, should be reported in the peer-reviewed literature. Specific recommendations are described in this paper.
Resumo:
Developing models to predict the effects of social and economic change on agricultural landscapes is an important challenge. Model development often involves making decisions about which aspects of the system require detailed description and which are reasonably insensitive to the assumptions. However, important components of the system are often left out because parameter estimates are unavailable. In particular, measurements of the relative influence of different objectives, such as risk, environmental management, on farmer decision making, have proven difficult to quantify. We describe a model that can make predictions of land use on the basis of profit alone or with the inclusion of explicit additional objectives. Importantly, our model is specifically designed to use parameter estimates for additional objectives obtained via farmer interviews. By statistically comparing the outputs of this model with a large farm-level land-use data set, we show that cropping patterns in the United Kingdom contain a significant contribution from farmer’s preference for objectives other than profit. In particular, we found that risk aversion had an effect on the accuracy of model predictions, whereas preference for a particular number of crops grown was less important. While nonprofit objectives have frequently been identified as factors in farmers’ decision making, our results take this analysis further by demonstrating the relationship between these preferences and actual cropping patterns.
Resumo:
In financial research, the sign of a trade (or identity of trade aggressor) is not always available in the transaction dataset and it can be estimated using a simple set of rules called the tick test. In this paper we investigate the accuracy of the tick test from an analytical perspective by providing a closed formula for the performance of the prediction algorithm. By analyzing the derived equation, we provide formal arguments for the use of the tick test by proving that it is bounded to perform better than chance (50/50) and that the set of rules from the tick test provides an unbiased estimator of the trade signs. On the empirical side of the research, we compare the values from the analytical formula against the empirical performance of the tick test for fifteen heavily traded stocks in the Brazilian equity market. The results show that the formula is quite realistic in assessing the accuracy of the prediction algorithm in a real data situation.
Resumo:
Purpose The relative efficiency of different eye exercise regimes is unclear, and in particular the influences of practice, placebo and the amount of effort required are rarely considered. This study measured conventional clinical measures after different regimes in typical young adults. Methods 156 asymptomatic young adults were directed to carry out eye exercises 3 times daily for two weeks. Exercises were directed at improving blur responses (accommodation), disparity responses (convergence), both in a naturalistic relationship, convergence in excess of accommodation, accommodation in excess of convergence, and a placebo regime. They were compared to two control groups, neither of which were given exercises, but the second of which were asked to make maximum effort during the second testing. Results Instruction set and participant effort were more effective than many exercises. Convergence exercises independent of accommodation were the most effective treatment, followed by accommodation exercises, and both regimes resulted in changes in both vergence and accommodation test responses. Exercises targeting convergence and accommodation working together were less effective than those where they were separated. Accommodation measures were prone to large instruction/effort effects and monocular accommodation facility was subject to large practice effects. Conclusions Separating convergence and accommodation exercises seemed more effective than exercising both systems concurrently and suggests that stimulation of accommodation and convergence may act in an additive fashion to aid responses. Instruction/effort effects are large and should be carefully controlled if claims for the efficacy of any exercise regime are to be made.