899 resultados para binary to multi-class classifiers
Resumo:
The achievable region approach seeks solutions to stochastic optimisation problems by: (i) characterising the space of all possible performances(the achievable region) of the system of interest, and (ii) optimisingthe overall system-wide performance objective over this space. This isradically different from conventional formulations based on dynamicprogramming. The approach is explained with reference to a simpletwo-class queueing system. Powerful new methodologies due to the authorsand co-workers are deployed to analyse a general multiclass queueingsystem with parallel servers and then to develop an approach to optimalload distribution across a network of interconnected stations. Finally,the approach is used for the first time to analyse a class of intensitycontrol problems.
Resumo:
Summary The specific CD8+ T cell immune response against tumors relies on the recognition by the T cell receptor (TCR) on cytotoxic T lymphocytes (CTL) of antigenic peptides bound to the class I major histocompatibility complex (MHC) molecule. Such tumor associated antigenic peptides are the focus of tumor immunotherapy with peptide vaccines. The strategy for obtaining an improved immune response often involves the design of modified tumor associated antigenic peptides. Such modifications aim at creating higher affinity and/or degradation resistant peptides and require precise structures of the peptide-MHC class I complex. In addition, the modified peptide must be cross-recognized by CTLs specific for the parental peptide, i.e. preserve the structure of the epitope. Detailed structural information on the modified peptide in complex with MHC is necessary for such predictions. In this thesis, the main focus is the development of theoretical in silico methods for prediction of both structure and cross-reactivity of peptide-MHC class I complexes. Applications of these methods in the context of immunotherapy are also presented. First, a theoretical method for structure prediction of peptide-MHC class I complexes is developed and validated. The approach is based on a molecular dynamics protocol to sample the conformational space of the peptide in its MHC environment. The sampled conformers are evaluated using conformational free energy calculations. The method, which is evaluated for its ability to reproduce 41 X-ray crystallographic structures of different peptide-MHC class I complexes, shows an overall prediction success of 83%. Importantly, in the clinically highly relevant subset of peptide-HLAA*0201 complexes, the prediction success is 100%. Based on these structure predictions, a theoretical approach for prediction of cross-reactivity is developed and validated. This method involves the generation of quantitative structure-activity relationships using three-dimensional molecular descriptors and a genetic neural network. The generated relationships are highly predictive as proved by high cross-validated correlation coefficients (0.78-0.79). Together, the here developed theoretical methods open the door for efficient rational design of improved peptides to be used in immunotherapy. Résumé La réponse immunitaire spécifique contre des tumeurs dépend de la reconnaissance par les récepteurs des cellules T CD8+ de peptides antigéniques présentés par les complexes majeurs d'histocompatibilité (CMH) de classe I. Ces peptides sont utilisés comme cible dans l'immunothérapie par vaccins peptidiques. Afin d'augmenter la réponse immunitaire, les peptides sont modifiés de façon à améliorer l'affinité et/ou la résistance à la dégradation. Ceci nécessite de connaître la structure tridimensionnelle des complexes peptide-CMH. De plus, les peptides modifiés doivent être reconnus par des cellules T spécifiques du peptide natif. La structure de l'épitope doit donc être préservée et des structures détaillées des complexes peptide-CMH sont nécessaires. Dans cette thèse, le thème central est le développement des méthodes computationnelles de prédiction des structures des complexes peptide-CMH classe I et de la reconnaissance croisée. Des applications de ces méthodes de prédiction à l'immunothérapie sont également présentées. Premièrement, une méthode théorique de prédiction des structures des complexes peptide-CMH classe I est développée et validée. Cette méthode est basée sur un échantillonnage de l'espace conformationnel du peptide dans le contexte du récepteur CMH classe I par dynamique moléculaire. Les conformations sont évaluées par leurs énergies libres conformationnelles. La méthode est validée par sa capacité à reproduire 41 structures des complexes peptide-CMH classe I obtenues par cristallographie aux rayons X. Le succès prédictif général est de 83%. Pour le sous-groupe HLA-A*0201 de complexes de grande importance pour l'immunothérapie, ce succès est de 100%. Deuxièmement, à partir de ces structures prédites in silico, une méthode théorique de prédiction de la reconnaissance croisée est développée et validée. Celle-ci consiste à générer des relations structure-activité quantitatives en utilisant des descripteurs moléculaires tridimensionnels et un réseau de neurones couplé à un algorithme génétique. Les relations générées montrent une capacité de prédiction remarquable avec des valeurs de coefficients de corrélation de validation croisée élevées (0.78-0.79). Les méthodes théoriques développées dans le cadre de cette thèse ouvrent la voie du design de vaccins peptidiques améliorés.
Resumo:
SUMMARY: A top scoring pair (TSP) classifier consists of a pair of variables whose relative ordering can be used for accurately predicting the class label of a sample. This classification rule has the advantage of being easily interpretable and more robust against technical variations in data, as those due to different microarray platforms. Here we describe a parallel implementation of this classifier which significantly reduces the training time, and a number of extensions, including a multi-class approach, which has the potential of improving the classification performance. AVAILABILITY AND IMPLEMENTATION: Full C++ source code and R package Rgtsp are freely available from http://lausanne.isb-sib.ch/~vpopovic/research/. The implementation relies on existing OpenMP libraries.
Resumo:
The purpose of this article was to review the strategies to control patient dose in adult and pediatric computed tomography (CT), taking into account the change of technology from single-detector row CT to multi-detector row CT. First the relationships between computed tomography dose index, dose length product, and effective dose in adult and pediatric CT are revised, along with the diagnostic reference level concept. Then the effect of image noise as a function of volume computed tomography dose index, reconstructed slice thickness, and the size of the patient are described. Finally, the potential of tube current modulation CT is discussed.
Resumo:
In this paper we present a multi-stage classifier for magnetic resonance spectra of human brain tumours which is being developed as part of a decision support system for radiologists. The basic idea is to decompose a complex classification scheme into a sequence of classifiers, each specialising in different classes of tumours and trying to reproducepart of the WHO classification hierarchy. Each stage uses a particular set of classification features, which are selected using a combination of classical statistical analysis, splitting performance and previous knowledge.Classifiers with different behaviour are combined using a simple voting scheme in order to extract different error patterns: LDA, decision trees and the k-NN classifier. A special label named "unknown¿ is used when the outcomes of the different classifiers disagree. Cascading is alsoused to incorporate class distances computed using LDA into decision trees. Both cascading and voting are effective tools to improve classification accuracy. Experiments also show that it is possible to extract useful information from the classification process itself in order to helpusers (clinicians and radiologists) to make more accurate predictions and reduce the number of possible classification mistakes.
Resumo:
Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text and inabilities to utilize it create risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim in this doctoral dissertation is to study machine learning and clinical text in order to support health information flow.First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications.The contributions are a model of the ideal information flow,a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases,the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described: The first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking.The third and fourth application are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling.These four applications are tested with Finnish intensive care patient records.The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports.The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated.The associations between performance evaluation measures and methods are addressed,and a new hold-out method is introduced.This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration. Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics,machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user-feedback.
Resumo:
Symmetry group methods are applied to obtain all explicit group-invariant radial solutions to a class of semilinear Schr¨odinger equations in dimensions n = 1. Both focusing and defocusing cases of a power nonlinearity are considered, including the special case of the pseudo-conformal power p = 4/n relevant for critical dynamics. The methods involve, first, reduction of the Schr¨odinger equations to group-invariant semilinear complex 2nd order ordinary differential equations (ODEs) with respect to an optimal set of one-dimensional point symmetry groups, and second, use of inherited symmetries, hidden symmetries, and conditional symmetries to solve each ODE by quadratures. Through Noether’s theorem, all conservation laws arising from these point symmetry groups are listed. Some group-invariant solutions are found to exist for values of n other than just positive integers, and in such cases an alternative two-dimensional form of the Schr¨odinger equations involving an extra modulation term with a parameter m = 2−n = 0 is discussed.
Resumo:
De nombreux problèmes pratiques qui se posent dans dans le domaine de la logistique, peuvent être modélisés comme des problèmes de tournées de véhicules. De façon générale, cette famille de problèmes implique la conception de routes, débutant et se terminant à un dépôt, qui sont utilisées pour distribuer des biens à un nombre de clients géographiquement dispersé dans un contexte où les coûts associés aux routes sont minimisés. Selon le type de problème, un ou plusieurs dépôts peuvent-être présents. Les problèmes de tournées de véhicules sont parmi les problèmes combinatoires les plus difficiles à résoudre. Dans cette thèse, nous étudions un problème d’optimisation combinatoire, appartenant aux classes des problèmes de tournées de véhicules, qui est liée au contexte des réseaux de transport. Nous introduisons un nouveau problème qui est principalement inspiré des activités de collecte de lait des fermes de production, et de la redistribution du produit collecté aux usines de transformation, pour la province de Québec. Deux variantes de ce problème sont considérées. La première, vise la conception d’un plan tactique de routage pour le problème de la collecte-redistribution de lait sur un horizon donné, en supposant que le niveau de la production au cours de l’horizon est fixé. La deuxième variante, vise à fournir un plan plus précis en tenant compte de la variation potentielle de niveau de production pouvant survenir au cours de l’horizon considéré. Dans la première partie de cette thèse, nous décrivons un algorithme exact pour la première variante du problème qui se caractérise par la présence de fenêtres de temps, plusieurs dépôts, et une flotte hétérogène de véhicules, et dont l’objectif est de minimiser le coût de routage. À cette fin, le problème est modélisé comme un problème multi-attributs de tournées de véhicules. L’algorithme exact est basé sur la génération de colonnes impliquant un algorithme de plus court chemin élémentaire avec contraintes de ressources. Dans la deuxième partie, nous concevons un algorithme exact pour résoudre la deuxième variante du problème. À cette fin, le problème est modélisé comme un problème de tournées de véhicules multi-périodes prenant en compte explicitement les variations potentielles du niveau de production sur un horizon donné. De nouvelles stratégies sont proposées pour résoudre le problème de plus court chemin élémentaire avec contraintes de ressources, impliquant dans ce cas une structure particulière étant donné la caractéristique multi-périodes du problème général. Pour résoudre des instances de taille réaliste dans des temps de calcul raisonnables, une approche de résolution de nature heuristique est requise. La troisième partie propose un algorithme de recherche adaptative à grands voisinages où de nombreuses nouvelles stratégies d’exploration et d’exploitation sont proposées pour améliorer la performances de l’algorithme proposé en termes de la qualité de la solution obtenue et du temps de calcul nécessaire.
Resumo:
Reinforcement Learning (RL) refers to a class of learning algorithms in which learning system learns which action to take in different situations by using a scalar evaluation received from the environment on performing an action. RL has been successfully applied to many multi stage decision making problem (MDP) where in each stage the learning systems decides which action has to be taken. Economic Dispatch (ED) problem is an important scheduling problem in power systems, which decides the amount of generation to be allocated to each generating unit so that the total cost of generation is minimized without violating system constraints. In this paper we formulate economic dispatch problem as a multi stage decision making problem. In this paper, we also develop RL based algorithm to solve the ED problem. The performance of our algorithm is compared with other recent methods. The main advantage of our method is it can learn the schedule for all possible demands simultaneously.
Resumo:
We consider a fully complex-valued radial basis function (RBF) network for regression and classification applications. For regression problems, the locally regularised orthogonal least squares (LROLS) algorithm aided with the D-optimality experimental design, originally derived for constructing parsimonious real-valued RBF models, is extended to the fully complex-valued RBF (CVRBF) network. Like its real-valued counterpart, the proposed algorithm aims to achieve maximised model robustness and sparsity by combining two effective and complementary approaches. The LROLS algorithm alone is capable of producing a very parsimonious model with excellent generalisation performance while the D-optimality design criterion further enhances the model efficiency and robustness. By specifying an appropriate weighting for the D-optimality cost in the combined model selecting criterion, the entire model construction procedure becomes automatic. An example of identifying a complex-valued nonlinear channel is used to illustrate the regression application of the proposed fully CVRBF network. The proposed fully CVRBF network is also applied to four-class classification problems that are typically encountered in communication systems. A complex-valued orthogonal forward selection algorithm based on the multi-class Fisher ratio of class separability measure is derived for constructing sparse CVRBF classifiers that generalise well. The effectiveness of the proposed algorithm is demonstrated using the example of nonlinear beamforming for multiple-antenna aided communication systems that employ complex-valued quadrature phase shift keying modulation scheme. (C) 2007 Elsevier B.V. All rights reserved.
Resumo:
We propose a simple and computationally efficient construction algorithm for two class linear-in-the-parameters classifiers. In order to optimize model generalization, a forward orthogonal selection (OFS) procedure is used for minimizing the leave-one-out (LOO) misclassification rate directly. An analytic formula and a set of forward recursive updating formula of the LOO misclassification rate are developed and applied in the proposed algorithm. Numerical examples are used to demonstrate that the proposed algorithm is an excellent alternative approach to construct sparse two class classifiers in terms of performance and computational efficiency.
Resumo:
This paper presents a novel approach to the automatic classification of very large data sets composed of terahertz pulse transient signals, highlighting their potential use in biochemical, biomedical, pharmaceutical and security applications. Two different types of THz spectra are considered in the classification process. Firstly a binary classification study of poly-A and poly-C ribonucleic acid samples is performed. This is then contrasted with a difficult multi-class classification problem of spectra from six different powder samples that although have fairly indistinguishable features in the optical spectrum, they also possess a few discernable spectral features in the terahertz part of the spectrum. Classification is performed using a complex-valued extreme learning machine algorithm that takes into account features in both the amplitude as well as the phase of the recorded spectra. Classification speed and accuracy are contrasted with that achieved using a support vector machine classifier. The study systematically compares the classifier performance achieved after adopting different Gaussian kernels when separating amplitude and phase signatures. The two signatures are presented as feature vectors for both training and testing purposes. The study confirms the utility of complex-valued extreme learning machine algorithms for classification of the very large data sets generated with current terahertz imaging spectrometers. The classifier can take into consideration heterogeneous layers within an object as would be required within a tomographic setting and is sufficiently robust to detect patterns hidden inside noisy terahertz data sets. The proposed study opens up the opportunity for the establishment of complex-valued extreme learning machine algorithms as new chemometric tools that will assist the wider proliferation of terahertz sensing technology for chemical sensing, quality control, security screening and clinic diagnosis. Furthermore, the proposed algorithm should also be very useful in other applications requiring the classification of very large datasets.
Resumo:
Aeromonas salmonicida AS03, a potential fish pathogen, was isolated from Atlantic salmon, Salmo salar, in 2003. This strain was found to be resistant to ≥1000 mM HgCl2 and ≥32 mM phenylmercuric acetate as well as multiple antimicrobials. Mercury (Hg) and antibiotic resistance genes are often located on the same mobile genetic elements, so the genetic determinants of both resistances and the possibility of horizontal gene transfer were examined. Specific PCR primers were used to amplify and sequence distinctive regions of the mer operon. A. salmonicida AS03 was found to have a pDU1358-like broad-spectrum mer operon, containing merB as well as merA, merD, merP, merR and merT, most similar to Klebsiella pneumonaie plasmid pRMH760. To our knowledge, the mer operon has never before been documented in Aeromonas spp. PCR and gene sequencing were used to identify class 1 integron associated antibiotic resistance determinants and the Tet A tetracycline resistance gene. The transposase and resolvase genes of Tn1696 were identified through PCR and sequencing with Tn21 specific PCR primers. We provide phenotypic and genotypic evidence that the mer operon, the aforementioned antibiotic resistances, and the Tn1696 transposition module are located on a single plasmid or conjugative transposon that can be transferred to E. coli DH5α through conjugation in the presence of low level Hg and absence of any antibiotic selective pressure. Additionally, the presence of low-level Hg or chloramphenicol in the mating media was found to stimulate conjugation, significantly increasing the transfer frequency of conjugation above the transfer frequency measured with mating media lacking both antibiotics and Hg. This research demonstrates that mercury indirectly selects for the dissemination of the antibiotic resistance genes of A. salmonicida AS03.
Resumo:
Aeromonas salmonicida AS03, a potential fish pathogen, was isolated from Atlantic salmon, Salmo salar, in 2003. This strain was found to be resistant to ≥1000 mM HgCl2 and ≥32 mM phenylmercuric acetate as well as multiple antimicrobials. Mercury (Hg) and antibiotic resistance genes are often located on the same mobile genetic elements, so the genetic determinants of both resistances and the possibility of horizontal gene transfer were examined. Specific PCR primers were used to amplify and sequence distinctive regions of the mer operon. A. salmonicida AS03 was found to have a pDU1358-like broad-spectrum mer operon, containing merB as well as merA, merD, merP, merR and merT, most similar to Klebsiella pneumonaie plasmid pRMH760. To our knowledge, the mer operon has never before been documented in Aeromonas spp. PCR and gene sequencing were used to identify class 1 integron associated antibiotic resistance determinants and the Tet A tetracycline resistance gene. The transposase and resolvase genes of Tn1696 were identified through PCR and sequencing with Tn21 specific PCR primers. We provide phenotypic and genotypic evidence that the mer operon, the aforementioned antibiotic resistances, and the Tn1696 transposition module are located on a single plasmid or conjugative transposon that can be transferred to E. coli DH5α through conjugation in the presence of low level Hg and absence of any antibiotic selective pressure. Additionally, the presence of low-level Hg or chloramphenicol in the mating media was found to stimulate conjugation, significantly increasing the transfer frequency of conjugation above the transfer frequency measured with mating media lacking both antibiotics and Hg. This research demonstrates that mercury indirectly selects for the dissemination of the antibiotic resistance genes of A. salmonicida AS03.
Resumo:
Metabolomics as one of the most rapidly growing technologies in the "-omics" field denotes the comprehensive analysis of low molecular-weight compounds and their pathways. Cancer-specific alterations of the metabolome can be detected by high-throughput mass-spectrometric metabolite profiling and serve as a considerable source of new markers for the early differentiation of malignant diseases as well as their distinction from benign states. However, a comprehensive framework for the statistical evaluation of marker panels in a multi-class setting has not yet been established. We collected serum samples of 40 pancreatic carcinoma patients, 40 controls, and 23 pancreatitis patients according to standard protocols and generated amino acid profiles by routine mass-spectrometry. In an intrinsic three-class bioinformatic approach we compared these profiles, evaluated their selectivity and computed multi-marker panels combined with the conventional tumor marker CA 19-9. Additionally, we tested for non-inferiority and superiority to determine the diagnostic surplus value of our multi-metabolite marker panels. Compared to CA 19-9 alone, the combined amino acid-based metabolite panel had a superior selectivity for the discrimination of healthy controls, pancreatitis, and pancreatic carcinoma patients [Formula: see text] We combined highly standardized samples, a three-class study design, a high-throughput mass-spectrometric technique, and a comprehensive bioinformatic framework to identify metabolite panels selective for all three groups in a single approach. Our results suggest that metabolomic profiling necessitates appropriate evaluation strategies and-despite all its current limitations-can deliver marker panels with high selectivity even in multi-class settings.