842 results for Graph Based Algorithms


Relevance:

90.00%

Publisher:

Abstract:

Integrated master's dissertation in Information Systems Engineering and Management

Relevance:

90.00%

Publisher:

Abstract:

PURPOSE: To assess how different diagnostic decision aids perform in terms of sensitivity, specificity, and harm. METHODS: Four diagnostic decision aids were compared, as applied to a simulated patient population: a findings-based algorithm following a linear or branched pathway, a serial threshold-based strategy, and a parallel threshold-based strategy. Headache in immune-compromised HIV patients in a developing country was used as an example. Diagnoses included cryptococcal meningitis, cerebral toxoplasmosis, tuberculous meningitis, bacterial meningitis, and malaria. Data were derived from literature and expert opinion. Diagnostic strategies' validity was assessed in terms of sensitivity, specificity, and harm related to mortality and morbidity. Sensitivity analyses and Monte Carlo simulation were performed. RESULTS: The parallel threshold-based approach led to a sensitivity of 92% and a specificity of 65%. Sensitivities of the serial threshold-based approach and the branched and linear algorithms were 47%, 47%, and 74%, respectively, and the specificities were 85%, 95%, and 96%. The parallel threshold-based approach resulted in the least harm, with the serial threshold-based approach, the branched algorithm, and the linear algorithm being associated with 1.56-, 1.44-, and 1.17-times higher harm, respectively. Findings were corroborated by sensitivity and Monte Carlo analyses. CONCLUSION: A threshold-based diagnostic approach is designed to find the optimal trade-off that minimizes expected harm, enhancing sensitivity and lowering specificity when appropriate, as in the given example of a symptom pointing to several life-threatening diseases. Findings-based algorithms, in contrast, solely consider clinical observations. A parallel workup, as opposed to a serial workup, additionally allows for all potential diseases to be reviewed, further reducing false negatives. The parallel threshold-based approach might, however, not be as good in other disease settings.
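The two validity measures reported above follow directly from confusion-matrix counts. The sketch below only illustrates the definitions; the function name and the patient counts are hypothetical, chosen so that the output matches the 92% and 65% quoted for the parallel threshold-based approach, and are not data from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of diseased patients correctly flagged
    specificity = tn / (tn + fp)  # share of disease-free patients correctly cleared
    return sensitivity, specificity


# Hypothetical counts for 100 diseased and 100 disease-free simulated patients.
sens, spec = sensitivity_specificity(tp=92, fn=8, tn=65, fp=35)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising sensitivity at the expense of specificity, as the parallel strategy does, amounts to trading false negatives (fn) for false positives (fp), which is acceptable when missing a life-threatening disease is the costlier error.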

Relevance:

90.00%

Publisher:

Abstract:

Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which are used to take advantage of various non-vectorial data representations, and preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in the bioinformatics domain.
Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be efficiently trained with large amounts of data. For situations when a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
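To make the difference between predicting an ordering and predicting a value concrete, the sketch below scores predictions by the fraction of item pairs they order incorrectly. It is a generic pairwise ranking error, not the thesis's regularized least-squares method; the function name is hypothetical, and counting predicted ties as errors is an assumption of this example.

```python
from itertools import combinations


def pairwise_ranking_error(predicted, target):
    """Fraction of comparable item pairs that the predictions order incorrectly."""
    wrong = 0
    total = 0
    for i, j in combinations(range(len(target)), 2):
        if target[i] == target[j]:
            continue  # no preference is expressed between tied items
        total += 1
        # A pair is misordered when predicted and true differences disagree in sign.
        # Assumption: a predicted tie also counts as an error.
        if (predicted[i] - predicted[j]) * (target[i] - target[j]) <= 0:
            wrong += 1
    return wrong / total if total else 0.0


# The predicted values are far from the targets, yet only one of three pairs is misordered.
error = pairwise_ranking_error([0.1, 0.4, 0.3], [1, 2, 3])
```

Only the relative order of the predictions matters here: rescaling or shifting all predicted scores leaves the error unchanged, which a regression loss would not.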

Relevance:

90.00%

Publisher:

Abstract:

Despite steady progress in computing power, memory, and the amount of available data, machine learning algorithms must use these resources efficiently. Minimizing cost is obviously an important factor, but another motivation is the search for learning mechanisms capable of reproducing the behavior of intelligent beings. This thesis addresses the problem of efficiency through several articles dealing with a variety of learning algorithms: the problem is considered not only from the standpoint of computational efficiency (computation time and memory used) but also from that of statistical efficiency (the number of examples required to accomplish a given task). A first contribution of this thesis is to shed light on statistical inefficiencies in existing algorithms. We show that decision trees generalize poorly on certain types of tasks (chapter 3), as do classical graph-based semi-supervised learning algorithms (chapter 5), each being affected by a particular form of the curse of dimensionality. For a certain class of neural networks, called sum-product networks, we show that representing certain functions with networks having a single hidden layer can be exponentially less efficient than with deep networks (chapter 4). Our analyses provide a better understanding of some intrinsic problems of these algorithms and point research in directions that could resolve them. We also identify computational inefficiencies in graph-based semi-supervised learning algorithms (chapter 5) and in learning mixtures of Gaussians in the presence of missing values (chapter 6).
In both cases, we propose new algorithms able to handle significantly larger data sets. The last two chapters address computational efficiency from a different angle. In chapter 7, we theoretically analyze an existing algorithm for efficient learning in restricted Boltzmann machines (contrastive divergence), in order to better understand the reasons behind its success. Finally, in chapter 8 we present an application of machine learning to video games, in which the problem of computational efficiency is tied to software and hardware engineering considerations that are often ignored in research but are very important in practice.

Relevance:

90.00%

Publisher:

Abstract:

Adolescent idiopathic scoliosis (AIS) is a three-dimensional deformity of the spine. Its treatment includes observation, bracing to limit progression, or surgery to correct the skeletal deformity and halt its progression. Surgical treatment remains controversial with respect to both its indications and the choice of procedure. Although classifications exist to guide the treatment of AIS, intra- and inter-observer variability in operative strategy has been described in the literature. This variability is growing with the evolution of surgical techniques and available instrumentation. Advances in technology and their integration into the medical field have led to the use of artificial intelligence algorithms to assist the classification and three-dimensional evaluation of scoliosis. Some algorithms have proven effective in reducing variability in scoliosis classification and in guiding treatment. The general objective of this thesis is to develop an application that uses artificial intelligence tools to integrate a new patient's data with the evidence available in the literature in order to guide the surgical treatment of AIS. To this end, a literature review of existing applications for the evaluation of AIS was undertaken to gather the elements needed to build an application that is effective and accepted in the clinical setting. This review showed that the presence of a “black box” in the applications developed is a limitation for clinical integration, where evidence-based justification is essential.
In a first study, we developed a decision tree for classifying idiopathic scoliosis based on the Lenke classification, which is the most commonly used today but has been criticized for its complexity and its inter- and intra-observer variability. This decision tree was shown to increase classification accuracy in proportion to the time spent classifying, independently of the level of knowledge of AIS. In a second study, a surgical strategy algorithm based on rules extracted from the literature was developed to guide surgeons in selecting the approach and the fusion levels for AIS. When applied to a large database of 1556 AIS cases, this algorithm proposes an operative strategy similar to that of an expert surgeon in nearly 70% of cases. This study confirmed that valid operative strategies can be extracted using a decision tree built on rules extracted from the literature. In a third study, the classification of 1776 patients with AIS using a Kohonen map, a type of neural network, showed that there are typical scolioses (single-curve or double thoracic scolioses) for which surgical treatment varies little from the recommendations of the Lenke classification, whereas scolioses with multiple curves, or those tangential to two groups of typical curves, showed the greatest variation in operative strategy. Finally, a software platform was developed that integrates each of the above studies. This software interface allows radiological data to be entered for a scoliotic patient, classifies the AIS using the classification decision tree, and suggests a surgical approach based on the operative strategy decision tree.
An analysis of the postoperative correction obtained shows a trend, although not statistically significant, toward better balance in patients operated on according to the strategy recommended by the software platform than in those who received a different treatment. The studies presented in this thesis emphasize that artificial intelligence algorithms for the classification and the development of operative strategies for AIS can be integrated into a software platform and could assist surgeons in their preoperative planning.

Relevance:

90.00%

Publisher:

Abstract:

Most active-contour methods are based either on maximizing the image contrast under the contour or on minimizing the sum of squared distances between contour and image 'features'. The Marginalized Likelihood Ratio (MLR) contour model uses a contrast-based measure of goodness-of-fit for the contour and thus falls into the first class. The point of departure from previous models consists in marginalizing this contrast measure over unmodelled shape variations. The MLR model naturally leads to the EM Contour algorithm, in which pose optimization is carried out by iterated least-squares, as in feature-based contour methods. The difference with respect to other feature-based algorithms is that the EM Contour algorithm minimizes squared distances from Bayes least-squares (marginalized) estimates of contour locations, rather than from 'strongest features' in the neighborhood of the contour. Within the framework of the MLR model, alternatives to the EM algorithm can also be derived: one of these alternatives is the empirical-information method. Tracking experiments demonstrate the robustness of pose estimates given by the MLR model, and support the theoretical expectation that the EM Contour algorithm is more robust than either feature-based methods or the empirical-information method. (c) 2005 Elsevier B.V. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

The dependence of much of Africa on rain-fed agriculture leads to a high vulnerability to fluctuations in rainfall amount. Hence, accurate monitoring of near-real-time rainfall is particularly useful, for example in forewarning possible crop shortfalls in drought-prone areas. Unfortunately, ground-based observations are often inadequate. Rainfall estimates from satellite-based algorithms and numerical model outputs can fill this data gap; however, rigorous assessment of such estimates is required. In this case, three satellite-based products (NOAA-RFE 2.0, GPCP-1DD and TAMSAT) and two numerical model outputs (ERA-40 and ERA-Interim) have been evaluated for Uganda in East Africa using a network of 27 rain gauges. The study focuses on the years 2001 to 2005 and considers the main rainy season (February to June). All data sets were converted to the same temporal and spatial scales. Kriging was used for the spatial interpolation of the gauge data. All three satellite products showed similar characteristics and had a high level of skill that exceeded both model outputs. ERA-Interim had a tendency to overestimate whilst ERA-40 consistently underestimated the Ugandan rainfall.

Relevance:

90.00%

Publisher:

Abstract:

The assessment of routing protocols for mobile wireless networks is a difficult task because of the networks' dynamic behavior and the absence of benchmarks. However, some of these networks, such as intermittent wireless sensor networks, periodic or cyclic networks, and some delay tolerant networks (DTNs), have more predictable dynamics, as the temporal variations in the network topology can be considered deterministic, which may make them easier to study. Recently, a graph-theoretic model, evolving graphs, was proposed to help capture the dynamic behavior of such networks, in view of the construction of least-cost routing and other algorithms. The algorithms and insights obtained through this model are theoretically very efficient and intriguing. However, there is no study of the use of such theoretical results in practical situations. Therefore, the objective of our work is to analyze the applicability of evolving graph theory to the construction of efficient routing protocols in realistic scenarios. In this paper, we use the NS2 network simulator first to implement an evolving-graph-based routing protocol, and then to use it as a benchmark when comparing the four major ad hoc routing protocols (AODV, DSR, OLSR and DSDV). Interestingly, our experiments show that evolving graphs have the potential to be an effective and powerful tool in the development and analysis of algorithms for dynamic networks, at least those with predictable dynamics. In order to make this model widely applicable, however, some practical issues, such as adaptive algorithms, still have to be addressed and incorporated into the model. We also discuss such issues in this paper, as a result of our experience.
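One way to picture routing on an evolving graph is an earliest-arrival search over links that are only available at known times. The sketch below is an illustration under stated assumptions, not the protocol implemented in the paper: the `schedule` format, the function name, and the one-time-unit traversal delay are all hypothetical choices made for this example.

```python
import heapq
from bisect import bisect_left


def earliest_arrival(schedule, source, start=0):
    """Earliest arrival time at every reachable node of an evolving graph.

    schedule maps a directed link (u, v) to a sorted list of integer times
    at which that link is available (hypothetical input format).
    """
    neighbors = {}
    for (u, v), times in schedule.items():
        neighbors.setdefault(u, []).append((v, times))

    arrival = {source: start}
    heap = [(start, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > arrival.get(u, float("inf")):
            continue  # stale queue entry; a better time was already found
        for v, times in neighbors.get(u, []):
            k = bisect_left(times, t)  # first availability at or after time t
            if k == len(times):
                continue  # the link never comes up again
            reach = times[k] + 1  # assumption: crossing a link takes one time unit
            if reach < arrival.get(v, float("inf")):
                arrival[v] = reach
                heapq.heappush(heap, (reach, v))
    return arrival


# Waiting at b for the b->c link (up at time 2) beats the late direct link a->c.
links = {("a", "b"): [0, 3], ("b", "c"): [0, 2], ("a", "c"): [5]}
times = earliest_arrival(links, "a")
```

The search is a Dijkstra variant in which the cost of a link depends on when it is reached, which is valid here because a node may always wait for the next availability, so arriving earlier never hurts.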

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Concept drift is a problem of increasing importance in machine learning and data mining. Data sets under analysis are no longer only static databases, but also data streams in which concepts and data distributions may not be stable over time. However, most learning algorithms produced so far are based on the assumption that data comes from a fixed distribution, so they are not suitable for handling concept drift. Moreover, some concept drift applications require a fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of labeling data is usually expensive and/or time-consuming compared to unlabeled data acquisition, so only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are also based on the assumption that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach was used to realize graph-based semi-supervised learning from static data. In this paper, we extend that approach to handle data streams and concept drift. The result is a passive algorithm using a single classifier, which naturally adapts to concept changes without any explicit drift detection mechanism. Its built-in mechanisms provide a natural way of learning from new data, gradually forgetting older knowledge as older labeled data items become less influential on the classification of newer data items. Some computer simulations are presented, showing the effectiveness of the proposed method.

Relevance:

90.00%

Publisher:

Abstract:

Since Sharir and Pnueli, algorithms for context-sensitivity have been defined in terms of 'valid' paths in an interprocedural flow graph. The definition of valid paths requires atomic call and ret statements, and encapsulated procedures. Thus, the resulting algorithms are not directly applicable when behavior similar to call and ret instructions may be realized using non-atomic statements, or when procedures do not have rigid boundaries, such as with programs in low-level languages like assembly or RTL. We present a framework for context-sensitive analysis that requires neither atomic call and ret instructions nor encapsulated procedures. The framework decouples the transfer-of-control semantics and the context-manipulation semantics of statements. A new definition of context-sensitivity, called stack contexts, is developed. A stack context, which is defined using trace semantics, is more general than Sharir and Pnueli's interprocedural path-based calling context. An abstract-interpretation-based framework is developed to reason about stack contexts and to derive analogues of calling-context-based algorithms using stack contexts. The framework is suitable for deriving algorithms for analyzing binary programs, such as malware, that employ obfuscations with the deliberate intent of defeating automated analysis. The framework is used to create a context-sensitive version of Venable et al.'s algorithm for analyzing x86 binaries without requiring that a binary conform to a standard compilation model for maintaining procedures, calls, and returns. Experimental results show that a context-sensitive analysis using stack contexts performs just as well for programs where the use of Sharir and Pnueli's calling context produces correct approximations. However, if those programs are transformed to use call obfuscations, a context-sensitive analysis using stack contexts still provides the same, correct results without any additional overhead.
© Springer Science+Business Media, LLC 2011.

Relevance:

90.00%

Publisher:

Abstract:

Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest, which are mapped to a Hilbert space representing a visual dictionary that aims to comprise the most discriminative features in a set of images. Notwithstanding, the main problem of such approaches is finding a compact and representative dictionary. Finding such a representative dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation. © 2012 IEEE.

Relevance:

90.00%

Publisher:

Abstract:

An important tool for heart disease diagnosis is the analysis of electrocardiogram (ECG) signals, owing to the non-invasive nature and simplicity of the ECG exam. Depending on the application, ECG data analysis consists of steps such as preprocessing, segmentation, feature extraction, and classification, aiming to detect cardiac arrhythmias (i.e., cardiac rhythm abnormalities). Aiming to make the cardiac arrhythmia signal classification process fast and accurate, we apply and analyze a recent and robust supervised graph-based pattern recognition technique, the optimum-path forest (OPF) classifier. To the best of our knowledge, this is the first time that the OPF classifier has been applied to the ECG heartbeat signal classification task. We then compare the performance (in terms of training and testing time, accuracy, specificity, and sensitivity) of the OPF classifier with that of three other well-known expert system classifiers, i.e., support vector machine (SVM), Bayesian, and multilayer artificial neural network (MLP), using features extracted from six main approaches considered in the literature for ECG arrhythmia analysis. In our experiments, we use the MIT-BIH Arrhythmia Database and the evaluation protocol recommended by the Association for the Advancement of Medical Instrumentation. A discussion of the results shows that the OPF classifier presents robust performance, i.e., there is no need for parameter setup, as well as high accuracy at an extremely low computational cost. Moreover, on average, the OPF classifier outperformed the MLP and SVM classifiers in terms of classification time and accuracy and performed quite similarly to the Bayesian classifier, making it a promising technique for ECG signal analysis. © 2012 Elsevier Ltd. All rights reserved.
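The abstract does not describe how the OPF classifier works, so the following is a simplified sketch of the supervised optimum-path forest idea as it is commonly described, not the authors' implementation: prototypes are taken as the endpoints of minimum-spanning-tree edges that join different classes, each training sample is then conquered by the prototype offering the path with the smallest maximum edge weight, and a new sample takes the label of the training sample that offers it the cheapest such path. The names `train_opf` and `classify_opf`, the Euclidean distance, and the toy data are assumptions of this example, which also assumes at least two classes are present.

```python
import heapq
import math


def train_opf(samples, labels):
    """Return (cost, label) per training sample for a simplified OPF classifier."""
    n = len(samples)

    def dist(a, b):
        return math.dist(samples[a], samples[b])

    # Step 1: Prim's algorithm on the complete graph; endpoints of tree edges
    # that connect different classes become prototypes.
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0 and labels[parent[u]] != labels[u]:
            prototypes.update((u, parent[u]))
        for v in range(n):
            if not in_tree[v] and dist(u, v) < best[v]:
                best[v], parent[v] = dist(u, v), u

    # Step 2: prototypes compete; the cost of a path is its largest edge weight.
    cost = [math.inf] * n
    label = list(labels)
    heap = []
    for p in prototypes:
        cost[p] = 0.0
        heap.append((0.0, p))
    heapq.heapify(heap)
    done = [False] * n
    while heap:
        c, s = heapq.heappop(heap)
        if done[s]:
            continue  # stale queue entry
        done[s] = True
        for t in range(n):
            if not done[t]:
                new_cost = max(c, dist(s, t))
                if new_cost < cost[t]:
                    cost[t], label[t] = new_cost, label[s]
                    heapq.heappush(heap, (new_cost, t))
    return cost, label


def classify_opf(x, samples, cost, label):
    """Label x with the training sample that offers the cheapest max-edge path."""
    winner = min(range(len(samples)),
                 key=lambda t: max(cost[t], math.dist(samples[t], x)))
    return label[winner]


# Toy two-class data (hypothetical), standing in for extracted ECG features.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [0, 0, 0, 1, 1, 1]
costs, forest_labels = train_opf(train_x, train_y)
```

Note that this sketch has no tunable parameters beyond the choice of distance, which is consistent with the abstract's remark that the OPF classifier needs no parameter setup.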

Relevance:

90.00%

Publisher:

Abstract:

Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest, which are mapped to a Hilbert space representing a visual dictionary that aims to comprise the most discriminative features in a set of images. Notwithstanding, the main problem of such approaches is finding a compact and representative dictionary. Finding such a representative dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation.