186 resultados para Classification algorithm

em Université de Lausanne, Switzerland


Relevância:

70.00% 70.00%

Publicador:

Resumo:

Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Epoetin-delta (Dynepo Shire Pharmaceuticals, Basing stoke, UK) is a synthetic form of erythropoietin (EPO) whose resemblance with endogenous EPO makes it hard to identify using the classical identification criteria. Urine samples collected from six healthy volunteers treated with epoetin-delta injections and from a control population were immuno-purified and analyzed with the usual IEF method. On the basis of the EPO profiles integration, a linear multivariate model was computed for discriminant analysis. For each sample, a pattern classification algorithm returned a bands distribution and intensity score (bands intensity score) saying how representative this sample is of one of the two classes, positive or negative. Effort profiles were also integrated in the model. The method yielded a good sensitivity versus specificity relation and was used to determine the detection window of the molecule following multiple injections. The bands intensity score, which can be generalized to epoetin-alpha and epoetin-beta, is proposed as an alternative criterion and a supplementary evidence for the identification of EPO abuse.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We propose and validate a multivariate classification algorithm for characterizing changes in human intracranial electroencephalographic data (iEEG) after learning motor sequences. The algorithm is based on a Hidden Markov Model (HMM) that captures spatio-temporal properties of the iEEG at the level of single trials. Continuous intracranial iEEG was acquired during two sessions (one before and one after a night of sleep) in two patients with depth electrodes implanted in several brain areas. They performed a visuomotor sequence (serial reaction time task, SRTT) using the fingers of their non-dominant hand. Our results show that the decoding algorithm correctly classified single iEEG trials from the trained sequence as belonging to either the initial training phase (day 1, before sleep) or a later consolidated phase (day 2, after sleep), whereas it failed to do so for trials belonging to a control condition (pseudo-random sequence). Accurate single-trial classification was achieved by taking advantage of the distributed pattern of neural activity. However, across all the contacts the hippocampus contributed most significantly to the classification accuracy for both patients, and one fronto-striatal contact for one patient. Together, these human intracranial findings demonstrate that a multivariate decoding approach can detect learning-related changes at the level of single-trial iEEG. Because it allows an unbiased identification of brain sites contributing to a behavioral effect (or experimental condition) at the level of single subject, this approach could be usefully applied to assess the neural correlates of other complex cognitive functions in patients implanted with multiple electrodes.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Behavioral and brain responses to identical stimuli can vary with experimental and task parameters, including the context of stimulus presentation or attention. More surprisingly, computational models suggest that noise-related random fluctuations in brain responses to stimuli would alone be sufficient to engender perceptual differences between physically identical stimuli. In two experiments combining psychophysics and EEG in healthy humans, we investigated brain mechanisms whereby identical stimuli are (erroneously) perceived as different (higher vs lower in pitch or longer vs shorter in duration) in the absence of any change in the experimental context. Even though, as expected, participants' percepts to identical stimuli varied randomly, a classification algorithm based on a mixture of Gaussians model (GMM) showed that there was sufficient information in single-trial EEG to reliably predict participants' judgments of the stimulus dimension. By contrasting electrical neuroimaging analyses of auditory evoked potentials (AEPs) to the identical stimuli as a function of participants' percepts, we identified the precise timing and neural correlates (strength vs topographic modulations) as well as intracranial sources of these erroneous perceptions. In both experiments, AEP differences first occurred ∼100 ms after stimulus onset and were the result of topographic modulations following from changes in the configuration of active brain networks. Source estimations localized the origin of variations in perceived pitch of identical stimuli within right temporal and left frontal areas and of variations in perceived duration within right temporoparietal areas. We discuss our results in terms of providing neurophysiologic evidence for the contribution of random fluctuations in brain activity to conscious perception.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Counterfeit pharmaceutical products have become a widespread problem in the last decade. Various analytical techniques have been applied to discriminate between genuine and counterfeit products. Among these, Near-infrared (NIR) and Raman spectroscopy provided promising results.The present study offers a methodology allowing to provide more valuable information fororganisations engaged in the fight against counterfeiting of medicines.A database was established by analyzing counterfeits of a particular pharmaceutical product using Near-infrared (NIR) and Raman spectroscopy. Unsupervised chemometric techniques (i.e. principal component analysis - PCA and hierarchical cluster analysis - HCA) were implemented to identify the classes within the datasets. Gas Chromatography coupled to Mass Spectrometry (GC-MS) and Fourier Transform Infrared Spectroscopy (FT-IR) were used to determine the number of different chemical profiles within the counterfeits. A comparison with the classes established by NIR and Raman spectroscopy allowed to evaluate the discriminating power provided by these techniques. Supervised classifiers (i.e. k-Nearest Neighbors, Partial Least Squares Discriminant Analysis, Probabilistic Neural Networks and Counterpropagation Artificial Neural Networks) were applied on the acquired NIR and Raman spectra and the results were compared to the ones provided by the unsupervised classifiers.The retained strategy for routine applications, founded on the classes identified by NIR and Raman spectroscopy, uses a classification algorithm based on distance measures and Receiver Operating Characteristics (ROC) curves. The model is able to compare the spectrum of a new counterfeit with that of previously analyzed products and to determine if a new specimen belongs to one of the existing classes, consequently allowing to establish a link with other counterfeits of the database.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Résumé La théorie de l'autocatégorisation est une théorie de psychologie sociale qui porte sur la relation entre l'individu et le groupe. Elle explique le comportement de groupe par la conception de soi et des autres en tant que membres de catégories sociales, et par l'attribution aux individus des caractéristiques prototypiques de ces catégories. Il s'agit donc d'une théorie de l'individu qui est censée expliquer des phénomènes collectifs. Les situations dans lesquelles un grand nombre d'individus interagissent de manière non triviale génèrent typiquement des comportements collectifs complexes qui sont difficiles à prévoir sur la base des comportements individuels. La simulation informatique de tels systèmes est un moyen fiable d'explorer de manière systématique la dynamique du comportement collectif en fonction des spécifications individuelles. Dans cette thèse, nous présentons un modèle formel d'une partie de la théorie de l'autocatégorisation appelée principe du métacontraste. À partir de la distribution d'un ensemble d'individus sur une ou plusieurs dimensions comparatives, le modèle génère les catégories et les prototypes associés. Nous montrons que le modèle se comporte de manière cohérente par rapport à la théorie et est capable de répliquer des données expérimentales concernant divers phénomènes de groupe, dont par exemple la polarisation. De plus, il permet de décrire systématiquement les prédictions de la théorie dont il dérive, notamment dans des situations nouvelles. Au niveau collectif, plusieurs dynamiques peuvent être observées, dont la convergence vers le consensus, vers une fragmentation ou vers l'émergence d'attitudes extrêmes. Nous étudions également l'effet du réseau social sur la dynamique et montrons qu'à l'exception de la vitesse de convergence, qui augmente lorsque les distances moyennes du réseau diminuent, les types de convergences dépendent peu du réseau choisi. Nous constatons d'autre part que les individus qui se situent à la frontière des groupes (dans le réseau social ou spatialement) ont une influence déterminante sur l'issue de la dynamique. Le modèle peut par ailleurs être utilisé comme un algorithme de classification automatique. Il identifie des prototypes autour desquels sont construits des groupes. Les prototypes sont positionnés de sorte à accentuer les caractéristiques typiques des groupes, et ne sont pas forcément centraux. Enfin, si l'on considère l'ensemble des pixels d'une image comme des individus dans un espace de couleur tridimensionnel, le modèle fournit un filtre qui permet d'atténuer du bruit, d'aider à la détection d'objets et de simuler des biais de perception comme l'induction chromatique. Abstract Self-categorization theory is a social psychology theory dealing with the relation between the individual and the group. It explains group behaviour through self- and others' conception as members of social categories, and through the attribution of the proto-typical categories' characteristics to the individuals. Hence, it is a theory of the individual that intends to explain collective phenomena. Situations involving a large number of non-trivially interacting individuals typically generate complex collective behaviours, which are difficult to anticipate on the basis of individual behaviour. Computer simulation of such systems is a reliable way of systematically exploring the dynamics of the collective behaviour depending on individual specifications. In this thesis, we present a formal model of a part of self-categorization theory named metacontrast principle. Given the distribution of a set of individuals on one or several comparison dimensions, the model generates categories and their associated prototypes. We show that the model behaves coherently with respect to the theory and is able to replicate experimental data concerning various group phenomena, for example polarization. Moreover, it allows to systematically describe the predictions of the theory from which it is derived, specially in unencountered situations. At the collective level, several dynamics can be observed, among which convergence towards consensus, towards frag-mentation or towards the emergence of extreme attitudes. We also study the effect of the social network on the dynamics and show that, except for the convergence speed which raises as the mean distances on the network decrease, the observed convergence types do not depend much on the chosen network. We further note that individuals located at the border of the groups (whether in the social network or spatially) have a decisive influence on the dynamics' issue. In addition, the model can be used as an automatic classification algorithm. It identifies prototypes around which groups are built. Prototypes are positioned such as to accentuate groups' typical characteristics and are not necessarily central. Finally, if we consider the set of pixels of an image as individuals in a three-dimensional color space, the model provides a filter that allows to lessen noise, to help detecting objects and to simulate perception biases such as chromatic induction.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

La théorie de l'autocatégorisation est une théorie de psychologie sociale qui porte sur la relation entre l'individu et le groupe. Elle explique le comportement de groupe par la conception de soi et des autres en tant que membres de catégories sociales, et par l'attribution aux individus des caractéristiques prototypiques de ces catégories. Il s'agit donc d'une théorie de l'individu qui est censée expliquer des phénomènes collectifs. Les situations dans lesquelles un grand nombre d'individus interagissent de manière non triviale génèrent typiquement des comportements collectifs complexes qui sont difficiles à prévoir sur la base des comportements individuels. La simulation informatique de tels systèmes est un moyen fiable d'explorer de manière systématique la dynamique du comportement collectif en fonction des spécifications individuelles. Dans cette thèse, nous présentons un modèle formel d'une partie de la théorie de l'autocatégorisation appelée principe du métacontraste. À partir de la distribution d'un ensemble d'individus sur une ou plusieurs dimensions comparatives, le modèle génère les catégories et les prototypes associés. Nous montrons que le modèle se comporte de manière cohérente par rapport à la théorie et est capable de répliquer des données expérimentales concernant divers phénomènes de groupe, dont par exemple la polarisation. De plus, il permet de décrire systématiquement les prédictions de la théorie dont il dérive, notamment dans des situations nouvelles. Au niveau collectif, plusieurs dynamiques peuvent être observées, dont la convergence vers le consensus, vers une fragmentation ou vers l'émergence d'attitudes extrêmes. Nous étudions également l'effet du réseau social sur la dynamique et montrons qu'à l'exception de la vitesse de convergence, qui augmente lorsque les distances moyennes du réseau diminuent, les types de convergences dépendent peu du réseau choisi. Nous constatons d'autre part que les individus qui se situent à la frontière des groupes (dans le réseau social ou spatialement) ont une influence déterminante sur l'issue de la dynamique. Le modèle peut par ailleurs être utilisé comme un algorithme de classification automatique. Il identifie des prototypes autour desquels sont construits des groupes. Les prototypes sont positionnés de sorte à accentuer les caractéristiques typiques des groupes, et ne sont pas forcément centraux. Enfin, si l'on considère l'ensemble des pixels d'une image comme des individus dans un espace de couleur tridimensionnel, le modèle fournit un filtre qui permet d'atténuer du bruit, d'aider à la détection d'objets et de simuler des biais de perception comme l'induction chromatique. Abstract Self-categorization theory is a social psychology theory dealing with the relation between the individual and the group. It explains group behaviour through self- and others' conception as members of social categories, and through the attribution of the proto-typical categories' characteristics to the individuals. Hence, it is a theory of the individual that intends to explain collective phenomena. Situations involving a large number of non-trivially interacting individuals typically generate complex collective behaviours, which are difficult to anticipate on the basis of individual behaviour. Computer simulation of such systems is a reliable way of systematically exploring the dynamics of the collective behaviour depending on individual specifications. In this thesis, we present a formal model of a part of self-categorization theory named metacontrast principle. Given the distribution of a set of individuals on one or several comparison dimensions, the model generates categories and their associated prototypes. We show that the model behaves coherently with respect to the theory and is able to replicate experimental data concerning various group phenomena, for example polarization. Moreover, it allows to systematically describe the predictions of the theory from which it is derived, specially in unencountered situations. At the collective level, several dynamics can be observed, among which convergence towards consensus, towards frag-mentation or towards the emergence of extreme attitudes. We also study the effect of the social network on the dynamics and show that, except for the convergence speed which raises as the mean distances on the network decrease, the observed convergence types do not depend much on the chosen network. We further note that individuals located at the border of the groups (whether in the social network or spatially) have a decisive influence on the dynamics' issue. In addition, the model can be used as an automatic classification algorithm. It identifies prototypes around which groups are built. Prototypes are positioned such as to accentuate groups' typical characteristics and are not necessarily central. Finally, if we consider the set of pixels of an image as individuals in a three-dimensional color space, the model provides a filter that allows to lessen noise, to help detecting objects and to simulate perception biases such as chromatic induction.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

BACKGROUND: Surveillance of multiple congenital anomalies is considered to be more sensitive for the detection of new teratogens than surveillance of all or isolated congenital anomalies. Current literature proposes the manual review of all cases for classification into isolated or multiple congenital anomalies. METHODS: Multiple anomalies were defined as two or more major congenital anomalies, excluding sequences and syndromes. A computer algorithm for classification of major congenital anomaly cases in the EUROCAT database according to International Classification of Diseases (ICD)v10 codes was programmed, further developed, and implemented for 1 year's data (2004) from 25 registries. The group of cases classified with potential multiple congenital anomalies were manually reviewed by three geneticists to reach a final agreement of classification as "multiple congenital anomaly" cases. RESULTS: A total of 17,733 cases with major congenital anomalies were reported giving an overall prevalence of major congenital anomalies at 2.17%. The computer algorithm classified 10.5% of all cases as "potentially multiple congenital anomalies". After manual review of these cases, 7% were agreed to have true multiple congenital anomalies. Furthermore, the algorithm classified 15% of all cases as having chromosomal anomalies, 2% as monogenic syndromes, and 76% as isolated congenital anomalies. The proportion of multiple anomalies varies by congenital anomaly subgroup with up to 35% of cases with bilateral renal agenesis. CONCLUSIONS: The implementation of the EUROCAT computer algorithm is a feasible, efficient, and transparent way to improve classification of congenital anomalies for surveillance and research.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Defining an efficient training set is one of the most delicate phases for the success of remote sensing image classification routines. The complexity of the problem, the limited temporal and financial resources, as well as the high intraclass variance can make an algorithm fail if it is trained with a suboptimal dataset. Active learning aims at building efficient training sets by iteratively improving the model performance through sampling. A user-defined heuristic ranks the unlabeled pixels according to a function of the uncertainty of their class membership and then the user is asked to provide labels for the most uncertain pixels. This paper reviews and tests the main families of active learning algorithms: committee, large margin, and posterior probability-based. For each of them, the most recent advances in the remote sensing community are discussed and some heuristics are detailed and tested. Several challenging remote sensing scenarios are considered, including very high spatial resolution and hyperspectral image classification. Finally, guidelines for choosing the good architecture are provided for new and/or unexperienced user.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The 2008 Data Fusion Contest organized by the IEEE Geoscience and Remote Sensing Data Fusion Technical Committee deals with the classification of high-resolution hyperspectral data from an urban area. Unlike in the previous issues of the contest, the goal was not only to identify the best algorithm but also to provide a collaborative effort: The decision fusion of the best individual algorithms was aiming at further improving the classification performances, and the best algorithms were ranked according to their relative contribution to the decision fusion. This paper presents the five awarded algorithms and the conclusions of the contest, stressing the importance of decision fusion, dimension reduction, and supervised classification methods, such as neural networks and support vector machines.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Diagnosis of several neurological disorders is based on the detection of typical pathological patterns in the electroencephalogram (EEG). This is a time-consuming task requiring significant training and experience. Automatic detection of these EEG patterns would greatly assist in quantitative analysis and interpretation. We present a method, which allows automatic detection of epileptiform events and discrimination of them from eye blinks, and is based on features derived using a novel application of independent component analysis. The algorithm was trained and cross validated using seven EEGs with epileptiform activity. For epileptiform events with compensation for eyeblinks, the sensitivity was 65 +/- 22% at a specificity of 86 +/- 7% (mean +/- SD). With feature extraction by PCA or classification of raw data, specificity reduced to 76 and 74%, respectively, for the same sensitivity. On exactly the same data, the commercially available software Reveal had a maximum sensitivity of 30% and concurrent specificity of 77%. Our algorithm performed well at detecting epileptiform events in this preliminary test and offers a flexible tool that is intended to be generalized to the simultaneous classification of many waveforms in the EEG.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

HAMAP (High-quality Automated and Manual Annotation of Proteins-available at http://hamap.expasy.org/) is a system for the automatic classification and annotation of protein sequences. HAMAP provides annotation of the same quality and detail as UniProtKB/Swiss-Prot, using manually curated profiles for protein sequence family classification and expert curated rules for functional annotation of family members. HAMAP data and tools are made available through our website and as part of the UniRule pipeline of UniProt, providing annotation for millions of unreviewed sequences of UniProtKB/TrEMBL. Here we report on the growth of HAMAP and updates to the HAMAP system since our last report in the NAR Database Issue of 2013. We continue to augment HAMAP with new family profiles and annotation rules as new protein families are characterized and annotated in UniProtKB/Swiss-Prot; the latest version of HAMAP (as of 3 September 2014) contains 1983 family classification profiles and 1998 annotation rules (up from 1780 and 1720). We demonstrate how the complex logic of HAMAP rules allows for precise annotation of individual functional variants within large homologous protein families. We also describe improvements to our web-based tool HAMAP-Scan which simplify the classification and annotation of sequences, and the incorporation of an improved sequence-profile search algorithm.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The potential of type-2 fuzzy sets for managing high levels of uncertainty in the subjective knowledge of experts or of numerical information has focused on control and pattern classification systems in recent years. One of the main challenges in designing a type-2 fuzzy logic system is how to estimate the parameters of type-2 fuzzy membership function (T2MF) and the Footprint of Uncertainty (FOU) from imperfect and noisy datasets. This paper presents an automatic approach for learning and tuning Gaussian interval type-2 membership functions (IT2MFs) with application to multi-dimensional pattern classification problems. T2MFs and their FOUs are tuned according to the uncertainties in the training dataset by a combination of genetic algorithm (GA) and crossvalidation techniques. In our GA-based approach, the structure of the chromosome has fewer genes than other GA methods and chromosome initialization is more precise. The proposed approach addresses the application of the interval type-2 fuzzy logic system (IT2FLS) for the problem of nodule classification in a lung Computer Aided Detection (CAD) system. The designed IT2FLS is compared with its type-1 fuzzy logic system (T1FLS) counterpart. The results demonstrate that the IT2FLS outperforms the T1FLS by more than 30% in terms of classification accuracy.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents 3-D brain tissue classificationschemes using three recent promising energy minimizationmethods for Markov random fields: graph cuts, loopybelief propagation and tree-reweighted message passing.The classification is performed using the well knownfinite Gaussian mixture Markov Random Field model.Results from the above methods are compared with widelyused iterative conditional modes algorithm. Theevaluation is performed on a dataset containing simulatedT1-weighted MR brain volumes with varying noise andintensity non-uniformities. The comparisons are performedin terms of energies as well as based on ground truthsegmentations, using various quantitative metrics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The research considers the problem of spatial data classification using machine learning algorithms: probabilistic neural networks (PNN) and support vector machines (SVM). As a benchmark model simple k-nearest neighbor algorithm is considered. PNN is a neural network reformulation of well known nonparametric principles of probability density modeling using kernel density estimator and Bayesian optimal or maximum a posteriori decision rules. PNN is well suited to problems where not only predictions but also quantification of accuracy and integration of prior information are necessary. An important property of PNN is that they can be easily used in decision support systems dealing with problems of automatic classification. Support vector machine is an implementation of the principles of statistical learning theory for the classification tasks. Recently they were successfully applied for different environmental topics: classification of soil types and hydro-geological units, optimization of monitoring networks, susceptibility mapping of natural hazards. In the present paper both simulated and real data case studies (low and high dimensional) are considered. The main attention is paid to the detection and learning of spatial patterns by the algorithms applied.