947 results for naive bayes classifier
Abstract:
BACKGROUND: Poor tolerance and adverse drug reactions are main reasons for discontinuation of antiretroviral therapy (ART). Identifying predictors of ART discontinuation is a priority in HIV care. METHODS: A genetic association study in an observational cohort to evaluate the association of pharmacogenetic markers with time to treatment discontinuation during the first year of ART. Analysis included 577 treatment-naive individuals initiating tenofovir (n = 500) or abacavir (n = 77), with efavirenz (n = 272), lopinavir/ritonavir (n = 184), or atazanavir/ritonavir (n = 121). Genotyping included 23 genetic markers in 15 genes associated with toxicity or pharmacokinetics of the study medication. Rates of ART discontinuation between groups with and without genetic risk markers were assessed by survival analysis using Cox regression models. RESULTS: During the first year of ART, 190 individuals (33%) stopped 1 or more drugs. For efavirenz and atazanavir, individuals with genetic risk markers experienced higher discontinuation rates than individuals without (71.15% vs 28.10%, and 62.5% vs 14.6%, respectively). The efavirenz discontinuation hazard ratio (HR) was 3.14 (95% confidence interval (CI): 1.35-7.33, P = .008). The atazanavir discontinuation HR was 9.13 (95% CI: 3.38-24.69, P < .0001). CONCLUSIONS: Several pharmacogenetic markers identify individuals at risk for early treatment discontinuation. These markers should be considered for validation in the clinical setting.
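The reported hazard ratios, confidence intervals and P values can be checked against each other with a few lines of arithmetic. The sketch below assumes the intervals are Wald-type intervals, symmetric on the log scale, which is common for Cox models but is not stated in the abstract; the function name `p_value_from_hr` is hypothetical.

```python
import math

def p_value_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value
    from a hazard ratio and its 95% confidence interval.

    Assumption: the interval is exp(log(hr) +/- z_crit * se).
    """
    # The width of the interval on the log scale is 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the abstract.
se_efv, z_efv, p_efv = p_value_from_hr(3.14, 1.35, 7.33)
se_atv, z_atv, p_atv = p_value_from_hr(9.13, 3.38, 24.69)
print(f"efavirenz:  z = {z_efv:.2f}, p = {p_efv:.4f}")
print(f"atazanavir: z = {z_atv:.2f}, p = {p_atv:.6f}")
```

Under that assumption the efavirenz figures give a p-value close to the reported .008, and the atazanavir figures give one below .0001, so the reported numbers are mutually consistent.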
Abstract:
Discriminating complex sounds relies on multiple stages of differential brain activity. The specific roles of these stages and their links to perception were the focus of the present study. We presented 250ms duration sounds of living and man-made objects while recording 160-channel electroencephalography (EEG). Subjects categorized each sound as that of a living, man-made or unknown item. We tested whether/when the brain discriminates between sound categories even when not transpiring behaviorally. We applied a single-trial classifier that identified voltage topographies and latencies at which brain responses are most discriminative. For sounds that the subjects could not categorize, we could successfully decode the semantic category based on differences in voltage topographies during the 116-174ms post-stimulus period. Sounds that were correctly categorized as that of a living or man-made item by the same subjects exhibited two periods of differences in voltage topographies at the single-trial level. Subjects exhibited differential activity before the sound ended (starting at 112ms) and on a separate period at ~270ms post-stimulus onset. Because each of these periods could be used to reliably decode semantic categories, we interpreted the first as being related to an implicit tuning for sound representations and the second as being linked to perceptual decision-making processes. Collectively, our results show that the brain discriminates environmental sounds during early stages and independently of behavioral proficiency and that explicit sound categorization requires a subsequent processing stage.
Abstract:
Soil information is needed for managing the agricultural environment. The aim of this study was to apply artificial neural networks (ANNs) to the prediction of soil classes using orbital remote sensing products, terrain attributes derived from a digital elevation model and local geology information as data sources. This approach to digital soil mapping was evaluated in an area with a high degree of lithologic diversity in the Serra do Mar. The neural network simulator used in this study was JavaNNS, with the backpropagation learning algorithm. For soil class prediction, different combinations of the selected discriminant variables were tested: elevation, slope, aspect, curvature, plan curvature, profile curvature, topographic index, solar radiation, LS topographic factor, local geology information, and clay mineral indices, iron oxides and the normalized difference vegetation index (NDVI) derived from an image of a Landsat-7 Enhanced Thematic Mapper Plus (ETM+) sensor. With the tested sets, the best results were obtained when all discriminant variables were associated with geological information (overall accuracy 93.2-95.6%, Kappa index 0.924-0.951, for set 13). Excluding the variable profile curvature (set 12), overall accuracy ranged from 93.9 to 95.4% and the Kappa index from 0.932 to 0.948. The maps based on the neural network classifier were consistent with and similar to conventional soil maps drawn for the study area, although with more spatial detail. The results show the potential of ANNs for soil class prediction in mountainous areas with lithological diversity.
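Overall accuracy and the Kappa index quoted above are both derived from a confusion matrix of mapped versus reference classes. The sketch below shows the standard computation of Cohen's kappa; the two-class matrix is invented purely for illustration and is not data from the study, and the function name `accuracy_and_kappa` is hypothetical.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of reference class i
    that were predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    k = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals per class.
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (illustration only).
example = [[45, 5],
           [10, 40]]
acc, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it sits slightly below overall accuracy in the reported ranges.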
Abstract:
Background and Aims: The three anti-TNF agents infliximab (IFX), adalimumab (ADA) and certolizumab pegol (CZP) have demonstrated similar efficacy in induction and maintenance of response and remission in Crohn's disease (CD) treatment. Given the comparability of these drugs, patients' preferences may influence the choice of product. However, data on patients' preferences for choosing anti-TNF agents are lacking. We therefore aimed to assess how CD patients select the drug of their choice and to identify factors guiding this decision. Methods: A prospective survey among anti-TNF-naive CD patients was performed. Patients were provided a description of the three anti-TNF agents focusing on indication, application mode (s.c. vs. i.v.), application time intervals, setting of application (hospital vs. private practice vs. patient's home), average time to apply the medication per month, typical side effects, and the scientific evidence of efficacy and safety available for every drug. Patients answered a questionnaire consisting of 17 questions, covering demographic, disease-specific, and medication data. Results: One hundred patients (47f/53m, mean age 45±16 years) completed the questionnaire. Disease duration was <1 year in 7%, 1-5 years in 31%, and >5 years in 62% of patients. Disease location was ileal in 33%, colonic in 40%, and ileocolonic in 27%. Disease phenotype was inflammatory in 68%, stenosing in 29%, and internally fistulizing in 3% of patients. Additionally, 20% had perianal fistulizing disease. Patients were already treated with the following drugs: mesalamines 61%, budesonide 44%, prednisone 97%, thiopurines 78%, methotrexate 16%. In total, 30% had already heard about IFX, 20% about ADA, and 11% about CZP. Thirty-six percent voted for treatment with ADA, 28% for CZP, and 25% for IFX, whereas 11% were undecided.
The following factors influenced the patient's decision for choosing a specific anti-TNF drug (several answers possible): side effects 76%, physician's recommendation 66%, application mode 54%, efficacy experience 52%, time to spend for therapy 27%, patients' recommendations 21%, interactions with other medications 12%. The single most important factor for choosing a specific anti-TNF was (1 answer): side effect profile 35%, physician's recommendation 22%, efficacy experience 21%, application mode 13%, patients' recommendations 5%, time spent for therapy 3%, interaction with other medications 1%. Conclusions: The majority of patients preferred anti-TNF syringes to infusions. The safety profile of the drugs and the physician's recommendation are major factors influencing the patient's choice for a specific anti-TNF drug. Patients' issues about safety and lifestyle habits should be taken into account when prescribing specific anti-TNF formulations.
Abstract:
Naive scale invariance is not a true property of natural images. Natural monochrome images possess a much richer geometrical structure, which is particularly well described in terms of multiscaling relations. This means that the pixels of a given image can be decomposed into sets, the fractal components of the image, with well-defined scaling exponents [Turiel and Parga, Neural Comput. 12, 763 (2000)]. Here it is shown that hyperspectral representations of natural scenes also exhibit multiscaling properties, observing the same kind of behavior. A precise measure of the informational relevance of the fractal components is also given, and it is shown that there are important differences between the intrinsically redundant red-green-blue system and the decorrelated one defined in Ruderman, Cronin, and Chiao [J. Opt. Soc. Am. A 15, 2036 (1998)].
Abstract:
BACKGROUND: Gefitinib is active in patients with pretreated non-small-cell lung cancer (NSCLC). We evaluated the activity and toxicity of gefitinib first-line treatment in advanced NSCLC followed by chemotherapy at disease progression. PATIENTS AND METHODS: In all, 63 patients with chemotherapy-naive stage IIIB/IV NSCLC received gefitinib 250 mg/day. At disease progression, gefitinib was replaced by cisplatin 80 mg/m(2) on day 1 and gemcitabine 1250 mg/m(2) on days 1 and 8 for up to six 3-week cycles. The primary end point was the disease stabilization rate (DSR) after 12 weeks of gefitinib. RESULTS: After 12 weeks of gefitinib, the DSR was 24% and the response rate (RR) was 8%. Median time to progression (TtP) was 2.5 months and median overall survival (OS) 11.5 months. Never-smokers (n = 9) had a DSR of 56% and a median OS of 20.2 months; patients with epidermal growth factor receptor (EGFR) mutation (n = 4) had a DSR of 75% and the median OS was not reached after the follow-up of 21.6 months. In all, 41 patients received chemotherapy with an overall RR of 34%, DSR of 71% and median TtP of 6.7 months. CONCLUSIONS: First-line gefitinib monotherapy led to a DSR of 24% at 12 weeks in an unselected patient population. Never-smokers and patients with EGFR mutations tend to have a better outcome; hence, further trials in selected patients are warranted.
Abstract:
Summary: Following recent technological advances, digital image archives have grown in quality and quantity at an unprecedented rate. Despite the enormous possibilities they offer, these advances raise new questions about how to process the masses of data acquired. This question is the basis of this Thesis: problems of processing digital information at very high spatial and/or spectral resolution are addressed using statistical learning approaches, namely kernel methods. This Thesis studies image classification problems, that is, the categorization of pixels into a small number of classes reflecting the spectral and contextual properties of the objects they represent. The emphasis is on the efficiency of the algorithms and on their simplicity, so as to increase their potential for adoption by users. Moreover, the challenge of this Thesis is to stay close to the concrete problems of satellite image users without losing sight of the interest of the proposed methods for the machine learning community from which they originate. In this sense, the work is deliberately transdisciplinary, maintaining a strong link between the two sciences in all the proposed developments. Four models are proposed: the first addresses the problem of high dimensionality and data redundancy with a model that optimizes classification performance by adapting to the particularities of the image. This is made possible by a ranking of the variables (the bands) that is optimized together with the base model: as a result, only the variables important for solving the problem are used by the classifier.
The lack of labeled information and the uncertainty about its relevance to the problem motivate the next two models, based respectively on active learning and semi-supervised methods: the first improves the quality of a training set through direct interaction between the user and the machine, while the second uses unlabeled pixels to improve the description of the available data and the robustness of the model. Finally, the last model proposed considers the more theoretical question of structure among the outputs: integrating this source of information, which has so far never been considered in remote sensing, opens new research challenges. Advanced kernel methods for remote sensing image classification. Devis Tuia, Institut de Géomatique et d'Analyse du Risque, September 2009. Abstract: The technical developments in recent years have brought the quantity and quality of digital information to an unprecedented level, as enormous archives of satellite images are available to the users. However, even if these advances open more and more possibilities in the use of digital imagery, they also raise several problems of storage and treatment. The latter is considered in this Thesis: the processing of very high spatial and spectral resolution images is treated with approaches based on data-driven algorithms relying on kernel methods. In particular, the problem of image classification, i.e. the categorization of the image's pixels into a reduced number of classes reflecting spectral and contextual properties, is studied through the different models presented. The emphasis is put on algorithmic efficiency and the simplicity of the approaches proposed, to avoid overly complex models that would not be used by users.
The major challenge of the Thesis is to remain close to concrete remote sensing problems, without losing the methodological interest from the machine learning viewpoint: in this sense, this work aims at building a bridge between the machine learning and remote sensing communities, and all the models proposed have been developed keeping in mind the need for such a synergy. Four models are proposed: first, an adaptive model learning the relevant image features has been proposed to solve the problem of high dimensionality and collinearity of the image features. This model automatically provides an accurate classifier and a ranking of the relevance of the single features. The scarcity and unreliability of labeled information were the common root of the second and third models proposed: when confronted with such problems, the user can either construct the labeled set iteratively by direct interaction with the machine or use the unlabeled data to increase robustness and quality of the description of data. Both solutions have been explored, resulting in two methodological contributions, based respectively on active learning and semisupervised learning. Finally, the more theoretical issue of structured outputs has been considered in the last model, which, by integrating output similarity into a model, opens new challenges and opportunities for remote sensing image processing.
Abstract:
Context: Foreign body aspiration (FbA) is a serious problem in children. Accurate clinical and radiographic diagnosis is important because missed or delayed diagnosis can result in respiratory difficulties ranging from life-threatening airway obstruction to chronic wheezing or recurrent pneumonia. Bronchoscopy also has risks, and accurate clinical and radiographic diagnosis can support the decision to perform bronchoscopy. Objective: To review the diagnostic accuracy of clinical presentation (CP) and pulmonary radiograph (PR) for the diagnosis of FbA. There is no previous review. Methods: A Medline search was conducted for articles containing data on CP and PR signs of FbA. Likelihood ratios (LR) and pre- and post-test probabilities were calculated using Bayes' theorem for all CP and PR signs. Inclusion criteria: Articles containing prospective data regarding CP and PR of FbA. Exclusion criteria: Retrospective studies; articles containing incomplete data for calculation of LR. Results: Five prospective studies were included, with a total of 585 patients. The prevalence of FbA is 63% in children suspected of FbA. If CP is normal, the probability of FbA is 25%, and if PR is normal, the probability is 14%. If CP is pathologic, the probability of FbA is 69-76% in the presence of cough (LR = 1.32), dyspnea (LR = 1.84) or localized crackles (LR = 1.5). The probability is 81-88% if cyanosis (LR = 4.8), decreased breath sounds (LR = 4.3), asymmetric auscultation (LR = 2.9) or localized wheezing (LR = 2.5) is present. When CP is abnormal and PR shows mediastinal shift (LR = 100), pneumomediastinum (LR = 100), radiopaque foreign body (LR = 100), lobar distention (LR = 4), atelectasis (LR = 2.5) or an inspiratory/expiratory abnormality (LR = 7), the probability of FbA is 96-100%. If CP is normal and PR is abnormal, the probability is 40-100%. If CP is abnormal and PR is normal, the probability is 55-75%.
Conclusions: This review of prospective studies demonstrates the importance of CP and PR, and an algorithm can be proposed. When CP is abnormal, with or without a pathologic PR, the probability of FbA is high and bronchoscopy is indicated. When CP and PR are normal, the probability of FbA is low and bronchoscopy is not immediately necessary; observation should be proposed. This approach should be validated in a prospective study.
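The post-test probabilities quoted in this abstract follow from the odds form of Bayes' theorem: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. The sketch below assumes the 63% prevalence is used as the pre-test probability, which the abstract implies but does not spell out; the function name `post_test_probability` is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply Bayes' theorem in odds form.

    post-test odds = pre-test odds * likelihood ratio
    """
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

PREVALENCE = 0.63  # pre-test probability, taken from the abstract

p_cough = post_test_probability(PREVALENCE, 1.32)    # cough
p_dyspnea = post_test_probability(PREVALENCE, 1.84)  # dyspnea
print(f"cough:   {p_cough:.0%}")
print(f"dyspnea: {p_dyspnea:.0%}")
```

With these two likelihood ratios the computation gives roughly 69% and 76%, the endpoints of the 69-76% range reported for a pathologic clinical presentation.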
Abstract:
SUMMARY: A top scoring pair (TSP) classifier consists of a pair of variables whose relative ordering can be used for accurately predicting the class label of a sample. This classification rule has the advantage of being easily interpretable and more robust against technical variations in data, such as those due to different microarray platforms. Here we describe a parallel implementation of this classifier which significantly reduces the training time, and a number of extensions, including a multi-class approach, which has the potential of improving the classification performance. AVAILABILITY AND IMPLEMENTATION: Full C++ source code and R package Rgtsp are freely available from http://lausanne.isb-sib.ch/~vpopovic/research/. The implementation relies on existing OpenMP libraries.
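The basic two-class TSP rule can be sketched in a few lines. This is a minimal serial illustration of the general idea, not the authors' parallel C++ implementation or the Rgtsp API; the function names `train_tsp` and `predict_tsp` and the toy data are hypothetical.

```python
from itertools import combinations

def train_tsp(samples, labels):
    """Pick the feature pair (i, j) whose ordering x[i] < x[j]
    differs most in frequency between class 0 and class 1."""
    n_features = len(samples[0])
    best = None
    for i, j in combinations(range(n_features), 2):
        freq = {}
        for c in (0, 1):
            rows = [x for x, y in zip(samples, labels) if y == c]
            # Fraction of class-c samples in which feature i is below feature j.
            freq[c] = sum(x[i] < x[j] for x in rows) / len(rows)
        score = abs(freq[0] - freq[1])
        if best is None or score > best[0]:
            # Class to predict when x[i] < x[j] holds.
            low_class = 0 if freq[0] > freq[1] else 1
            best = (score, i, j, low_class)
    return best

def predict_tsp(model, x):
    """Classify a sample using only the ordering of the selected pair."""
    _, i, j, low_class = model
    return low_class if x[i] < x[j] else 1 - low_class

# Toy data: three features, two samples per class (illustration only).
samples = [[1, 5, 3], [2, 6, 1], [7, 3, 2], [8, 1, 9]]
labels = [0, 0, 1, 1]
model = train_tsp(samples, labels)
print(model)
print(predict_tsp(model, [0, 4, 9]), predict_tsp(model, [9, 2, 0]))
```

Because the rule depends only on which of two values is larger within a sample, it is unchanged by any monotone rescaling of that sample, which is the source of the robustness across platforms mentioned above. The exhaustive pair search is quadratic in the number of features, which is what makes a parallel implementation attractive.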
Abstract:
The TCR repertoire of CD8+ T cells specific for Moloney murine leukemia virus (M-MuLV)-associated Ags has been investigated in vitro and in vivo. Analysis of a large panel of established CD8+ CTL clones specific for M-MuLV indicated an overwhelming bias for V beta4 in BALB/c mice and for V beta5.2 in C57BL/6 mice. These V beta biases were already detectable in mixed lymphocyte:tumor cell cultures established from virus-immune spleen cells. Furthermore, direct ex vivo analysis of PBL from BALB/c or C57BL/6 mice immunized with syngeneic M-MuLV-infected tumor cells revealed a dramatic increase in CD8+ cells expressing V beta4 or V beta5.2, respectively. M-MuLV-specific CD8+ cells with an activated (CD62L-) phenotype persisted in blood of immunized mice for at least 2 mo, and exhibited decreased TCR and CD8 levels compared with their naive counterparts. In C57BL/6 mice, most M-MuLV-specific CD8+ CTL clones and immune PBL coexpressed V alpha3.2 in association with V beta5.2. Moreover, these V beta5.2+ V alpha3.2+ cells were shown to recognize the recently described H-2Db-restricted epitope (CCLCLTVFL) encoded in the leader sequence of the M-MuLV gag polyprotein. Collectively, our data demonstrate a highly restricted TCR repertoire in the CD8+ T cell response to M-MuLV-associated Ags in vivo, and suggest the potential utility of flow-microfluorometric analysis of V beta and V alpha expression in the diagnosis and monitoring of viral infections.
Abstract:
Abstract: This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems. Summary: This research concerns the development and application of so-called unsupervised learning methods. The target applications of these methods are the analysis of forensic data and the classification of hyperspectral images in remote sensing. First, an unsupervised classification methodology based on the symbolic optimization of an inter-sample distance measure is proposed.
This measure is obtained by optimizing a cost function related to the preservation of a point's neighborhood structure between the space of the initial variables and the space of principal components. The method is applied to the analysis of forensic data and compared with a range of existing methods. Second, a method based on a joint optimization of the variable selection and classification tasks is implemented in a neural network and applied to various databases, including two hyperspectral images. The neural network is trained with a stochastic gradient algorithm, which makes the technique applicable to very high resolution images. The results of applying it show that such a technique can classify very large databases without difficulty and gives results that compare favorably with existing methods.
Abstract:
We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of semi-naive Bayesian classifiers. Altogether, 16 model selection and weighting schemes, 58 benchmark data sets, and various statistical tests are employed. This paper's main contributions are threefold. First, it formally presents each scheme's definition, rationale, and time complexity and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme's classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms which have an immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.
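To make the idea of linearly combining SPODEs concrete: each SPODE lets every attribute depend on the class and on one shared "superparent" attribute, and the ensemble sums the per-SPODE joint estimates with weights. The sketch below is a minimal illustration for categorical data with Laplace smoothing and, by default, uniform weights; it does not reproduce any of the 16 schemes compared in the paper, and all names (`fit_counts`, `spode_joint`, `predict_proba`) and the toy data are hypothetical.

```python
from collections import Counter

def fit_counts(X, y):
    """Collect the frequency tables needed by every SPODE."""
    n_attr = len(X[0])
    joint = Counter()   # (p, x_p, class) -> count
    triple = Counter()  # (p, x_p, i, x_i, class) -> count
    for row, c in zip(X, y):
        for p in range(n_attr):
            joint[(p, row[p], c)] += 1
            for i in range(n_attr):
                if i != p:
                    triple[(p, row[p], i, row[i], c)] += 1
    values = [sorted({row[a] for row in X}) for a in range(n_attr)]
    return {"joint": joint, "triple": triple, "values": values,
            "classes": sorted(set(y)), "n": len(X)}

def spode_joint(model, p, x, c):
    """Smoothed estimate of P(class=c, x) with attribute p as superparent:
    P(c, x_p) * product over i != p of P(x_i | c, x_p)."""
    vals, k = model["values"], len(model["classes"])
    n_pc = model["joint"][(p, x[p], c)]
    prob = (n_pc + 1) / (model["n"] + k * len(vals[p]))
    for i in range(len(x)):
        if i != p:
            n_ipc = model["triple"][(p, x[p], i, x[i], c)]
            prob *= (n_ipc + 1) / (n_pc + len(vals[i]))
    return prob

def predict_proba(model, x, weights=None):
    """Linearly combine all SPODEs; uniform weights unless given."""
    n_attr = len(x)
    weights = weights or [1 / n_attr] * n_attr
    scores = {c: sum(w * spode_joint(model, p, x, c)
                     for p, w in enumerate(weights))
              for c in model["classes"]}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

# Toy categorical data (illustration only).
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
y = ["no", "no", "yes", "yes"]
model = fit_counts(X, y)
probs = predict_proba(model, ("sunny", "hot"))
print(probs)
```

Passing a different `weights` list is where a model selection or weighting scheme would plug in: a weight of zero drops a SPODE from the ensemble, and non-uniform weights favor the superparents judged more reliable.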
Abstract:
A common way to model multiclass classification problems is by means of Error-Correcting Output Codes (ECOCs). Given a multiclass problem, the ECOC technique designs a code word for each class, where each position of the code identifies the membership of the class for a given binary problem. A classification decision is obtained by assigning the label of the class with the closest code. One of the main requirements of the ECOC design is that the base classifier is capable of splitting each subgroup of classes from each binary problem. However, we cannot guarantee that a linear classifier can model convex regions. Furthermore, nonlinear classifiers also fail to manage some types of surfaces. In this paper, we present a novel strategy to model multiclass classification problems using subclass information in the ECOC framework. Complex problems are solved by splitting the original set of classes into subclasses and embedding the binary problems in a problem-dependent ECOC design. Experimental results show that the proposed splitting procedure yields better performance when the class overlap or the distribution of the training objects conceals the decision boundaries for the base classifier. The results are even more significant when one has a sufficiently large training size.
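The decoding step described above, assigning the class whose code word is closest to the vector of binary classifier outputs, can be sketched with Hamming distance. The code matrix below is invented for illustration and is not the problem-dependent subclass design proposed in the paper; the names `hamming` and `ecoc_decode` are hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two code words differ."""
    return sum(x != y for x, y in zip(a, b))

def ecoc_decode(code_matrix, outputs):
    """Return the class whose code word is nearest to the binary outputs."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], outputs))

# Hypothetical code matrix: one 5-bit code word per class, where each
# position corresponds to one binary problem. The minimum pairwise
# distance is 3, so a single wrong binary output can still be corrected.
code_matrix = {
    "a": (1, 1, 1, 0, 0),
    "b": (0, 0, 1, 1, 0),
    "c": (1, 0, 0, 1, 1),
}

exact = ecoc_decode(code_matrix, (1, 1, 1, 0, 0))      # matches "a" exactly
corrected = ecoc_decode(code_matrix, (1, 0, 1, 0, 0))  # one bit of "a" flipped
print(exact, corrected)
```

The error-correcting behavior depends on the distance between code words, which is why the design of the code matrix, and whether each binary problem is learnable by the base classifier, matters as much as the decoding rule.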
Abstract:
During chronic infection, pathogen-specific CD8(+) T cells upregulate expression of molecules such as the inhibitory surface receptor PD-1, have diminished cytokine production and are thought to undergo terminal differentiation into exhausted cells. Here we found that T cells with memory-like properties were generated during chronic infection. After transfer into naive mice, these cells robustly proliferated and controlled a viral infection. The reexpanded T cell populations continued to have the exhausted phenotype they acquired during the chronic infection. Thus, the cells underwent a form of differentiation that was stably transmitted to daughter cells. We therefore propose that during persistent infection, effector T cells stably differentiate into a state that is optimized to limit viral replication without causing overwhelming immunological pathology.