938 resultados para Graph-based segmentation


Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper addresses biometric identification using large databases, in particular, iris databases. In such applications, it is critical to have low response time, while maintaining an acceptable recognition rate. Thus, the trade-off between speed and accuracy must be evaluated for processing and recognition parts of an identification system. In this paper, a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), is utilized as a classifier in a pre-developed iris recognition system. The aim of this paper is to verify the effectiveness of OPF in the field of iris recognition, and its performance for various scale iris databases. The existing Gauss-Laguerre Wavelet based coding scheme is used for iris encoding. The performance of the OPF and two other - Hamming and Bayesian - classifiers, is compared using small, medium, and large-scale databases. Such a comparison shows that the OPF has faster response for large-scale databases, thus performing better than the more accurate, but slower, classifiers.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Concept drift is a problem of increasing importance in machine learning and data mining. Data sets under analysis are no longer only static databases, but also data streams in which concepts and data distributions may not be stable over time. However, most learning algorithms produced so far are based on the assumption that data comes from a fixed distribution, so they are not suitable to handle concept drifts. Moreover, some concept drifts applications requires fast response, which means an algorithm must always be (re) trained with the latest available data. But the process of labeling data is usually expensive and/or time consuming when compared to unlabeled data acquisition, thus only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are also based on the assumption that the data is static. Therefore, semi-supervised learning with concept drifts is still an open challenge in machine learning. Recently, a particle competition and cooperation approach was used to realize graph-based semi-supervised learning from static data. In this paper, we extend that approach to handle data streams and concept drift. The result is a passive algorithm using a single classifier, which naturally adapts to concept changes, without any explicit drift detection mechanism. Its built-in mechanisms provide a natural way of learning from new data, gradually forgetting older knowledge as older labeled data items became less influent on the classification of newer data items. Some computer simulation are presented, showing the effectiveness of the proposed method.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Majority of biometric researchers focus on the accuracy of matching using biometrics databases, including iris databases, while the scalability and speed issues have been neglected. In the applications such as identification in airports and borders, it is critical for the identification system to have low-time response. In this paper, a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), is utilized as a classifier in a pre-developed iris recognition system. The aim of this paper is to verify the effectiveness of OPF in the field of iris recognition, and its performance for various scale iris databases. This paper investigates several classifiers, which are widely used in iris recognition papers, and the response time along with accuracy. The existing Gauss-Laguerre Wavelet based iris coding scheme, which shows perfect discrimination with rotary Hamming distance classifier, is used for iris coding. The performance of classifiers is compared using small, medium, and large scale databases. Such comparison shows that OPF has faster response for large scale database, thus performing better than more accurate but slower Bayesian classifier.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The increase of computing power of the microcomputers has stimulated the building of direct manipulation interfaces that allow graphical representation of Linear Programming (LP) models. This work discusses the components of such a graphical interface as the basis for a system to assist users in the process of formulating LP problems. In essence, this work proposes a methodology which considers the modelling task as divided into three stages which are specification of the Data Model, the Conceptual Model and the LP Model. The necessity for using Artificial Intelligence techniques in the problem conceptualisation and to help the model formulation task is illustrated.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Image categorization by means of bag of visual words has received increasing attention by the image processing and vision communities in the last years. In these approaches, each image is represented by invariant points of interest which are mapped to a Hilbert Space representing a visual dictionary which aims at comprising the most discriminative features in a set of images. Notwithstanding, the main problem of such approaches is to find a compact and representative dictionary. Finding such representative dictionary automatically with no user intervention is an even more difficult task. In this paper, we propose a method to automatically find such dictionary by employing a recent developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation. © 2012 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion or even the entire data set. This paper aims to address this problem by presenting a graph-based (network-based) semi-supervised learning method, specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, for label propagation in the network constructed from the input data set. The proposed model is nature-inspired and it incorporates some features to make it robust to a considerable amount of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentage of mislabeled data, in networks of different sizes and average node degree. Importantly, these simulations reveals the existence of the critical points of the mislabeled subset size, below which the network is free of wrong label contamination, but above which the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made among the proposed method and other representative graph-based semi-supervised learning methods using both artificial and real-world data sets. Interestingly, the proposed method has increasing better performance than the others as the percentage of mislabeled samples is getting larger. © 2012 IEEE.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Identification and classification of overlapping nodes in networks are important topics in data mining. In this paper, a network-based (graph-based) semi-supervised learning method is proposed. It is based on competition and cooperation among walking particles in a network to uncover overlapping nodes by generating continuous-valued outputs (soft labels), corresponding to the levels of membership from the nodes to each of the communities. Moreover, the proposed method can be applied to detect overlapping data items in a data set of general form, such as a vector-based data set, once it is transformed to a network. Usually, label propagation involves risks of error amplification. In order to avoid this problem, the proposed method offers a mechanism to identify outliers among the labeled data items, and consequently prevents error propagation from such outliers. Computer simulations carried out for synthetic and real-world data sets provide a numeric quantification of the performance of the method. © 2012 Springer-Verlag.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Pós-graduação em Ciência da Computação - IBILCE

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Concept drift, which refers to non stationary learning problems over time, has increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But the process of data labeling is usually expensive and/or time consuming when compared to acquisition of unlabeled data, thus usually only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them are based on assumptions that the data is static. Therefore, semi-supervised learning with concept drifts is still an open challenging task in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extend that approach to handle data streams and concept drift. The result is a passive algorithm which uses a single classifier approach, naturally adapted to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 Data of network intrusion, showing its effectiveness.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Image categorization by means of bag of visual words has received increasing attention by the image processing and vision communities in the last years. In these approaches, each image is represented by invariant points of interest which are mapped to a Hilbert Space representing a visual dictionary which aims at comprising the most discriminative features in a set of images. Notwithstanding, the main problem of such approaches is to find a compact and representative dictionary. Finding such representative dictionary automatically with no user intervention is an even more difficult task. In this paper, we propose a method to automatically find such dictionary by employing a recent developed graph-based clustering algorithm called Optimum-Path Forest, which does not make any assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Software product line (SPL) engineering offers several advantages in the development of families of software products such as reduced costs, high quality and a short time to market. A software product line is a set of software intensive systems, each of which shares a common core set of functionalities, but also differs from the other products through customization tailored to fit the needs of individual groups of customers. The differences between products within the family are well-understood and organized into a feature model that represents the variability of the SPL. Products can then be built by generating and composing features described in the feature model. Testing of software product lines has become a bottleneck in the SPL development lifecycle, since many of the techniques used in their testing have been borrowed from traditional software testing and do not directly take advantage of the similarities between products. This limits the overall gains that can be achieved in SPL engineering. Recent work proposed by both industry and the research community for improving SPL testing has begun to consider this problem, but there is still a need for better testing techniques that are tailored to SPL development. In this thesis, I make two primary contributions to software product line testing. First I propose a new definition for testability of SPLs that is based on the ability to re-use test cases between products without a loss of fault detection effectiveness. I build on this idea to identify elements of the feature model that contribute positively and/or negatively towards SPL testability. Second, I provide a graph based testing approach called the FIG Basis Path method that selects products and features for testing based on a feature dependency graph. This method should increase our ability to re-use results of test cases across successive products in the family and reduce testing effort. I report the results of a case study involving several non-trivial SPLs and show that for these objects, the FIG Basis Path method is as effective as testing all products, but requires us to test no more than 24% of the products in the SPL.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Semi-supervised learning is one of the important topics in machine learning, concerning with pattern classification where only a small subset of data is labeled. In this paper, a new network-based (or graph-based) semi-supervised classification model is proposed. It employs a combined random-greedy walk of particles, with competition and cooperation mechanisms, to propagate class labels to the whole network. Due to the competition mechanism, the proposed model has a local label spreading fashion, i.e., each particle only visits a portion of nodes potentially belonging to it, while it is not allowed to visit those nodes definitely occupied by particles of other classes. In this way, a "divide-and-conquer" effect is naturally embedded in the model. As a result, the proposed model can achieve a good classification rate while exhibiting low computational complexity order in comparison to other network-based semi-supervised algorithms. Computer simulations carried out for synthetic and real-world data sets provide a numeric quantification of the performance of the method.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Due to the growing interest in social networks, link prediction has received significant attention. Link prediction is mostly based on graph-based features, with some recent approaches focusing on domain semantics. We propose algorithms for link prediction that use a probabilistic ontology to enhance the analysis of the domain and the unavoidable uncertainty in the task (the ontology is specified in the probabilistic description logic crALC). The scalability of the approach is investigated, through a combination of semantic assumptions and graph-based features. We evaluate empirically our proposal, and compare it with standard solutions in the literature.