982 resultados para graph matching algorithms


Relevância:

30.00% 30.00%

Publicador:

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With an increasing emphasis on the emerging automatic person identification application, biometrics based, especially fingerprint-based identification, is receiving a lot of attention. This research developed an automatic fingerprint recognition system (AFRS) based on a hybrid between minutiae and correlation based techniques to represent and to match fingerprint; it improved each technique individually. It was noticed that, in the hybrid approach, as a result of an improvement of minutiae extraction algorithm in post-process phase that combines the two algorithms, the performance of the minutia algorithm improved. An improvement in the ridge algorithm that used centre point in fingerprint instead of reference point was also observed. Experiments indicate that the hybrid technique performs much better than each algorithm individually.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Lung modelling has emerged as a useful method for diagnosing lung diseases. Image segmentation is an important part of lung modelling systems. The ill-defined nature of image segmentation makes automated lung modelling difficult. Also, low resolution of lung images further increases the difficulty of the lung image segmentation. It is therefore important to identify a suitable segmentation algorithm that can enhance lung modelling accuracies. This paper investigates six image segmentation algorithms, used in medical imaging, and also their application to lung modelling. The algorithms are: normalised cuts, graph, region growing, watershed, Markov random field, and mean shift. The performance of the six segmentation algorithms is determined through a set of experiments on realistic 2D CT lung images. An experimental procedure is devised to measure the performance of the tested algorithms. The measured segmentation accuracies as well as execution times of the six algorithms are then compared and discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

As one of the primary substances in a living organism, protein defines the character of each cell by interacting with the cellular environment to promote the cell’s growth and function [1]. Previous studies on proteomics indicate that the functions of different proteins could be assigned based upon protein structures [2,3]. The knowledge on protein structures gives us an overview of protein fold space and is helpful for the understanding of the evolutionary principles behind structure. By observing the architectures and topologies of the protein families, biological processes can be investigated more directly with much higher resolution and finer detail. For this reason, the analysis of protein, its structure and the interaction with the other materials is emerging as an important problem in bioinformatics. However, the determination of protein structures is experimentally expensive and time consuming, this makes scientists largely dependent on sequence rather than more general structure to infer the function of the protein at the present time. For this reason, data mining technology is introduced into this area to provide more efficient data processing and knowledge discovery approaches.

Unlike many data mining applications which lack available data, the protein structure determination problem and its interaction study, on the contrary, could utilize a vast amount of biologically relevant information on protein and its interaction, such as the protein data bank (PDB) [4], the structural classification of proteins (SCOP) databases [5], CATH databases [6], UniProt [7], and others. The difficulty of predicting protein structures, specially its 3D structures, and the interactions between proteins as shown in Figure 6.1, lies in the computational complexity of the data. Although a large number of approaches have been developed to determine the protein structures such as ab initio modelling [8], homology modelling [9] and threading [10], more efficient and reliable methods are still greatly needed.

In this chapter, we will introduce a state-of-the-art data mining technique, graph mining, which is good at defining and discovering interesting structural patterns in graphical data sets, and take advantage of its expressive power to study protein structures, including protein structure prediction and comparison, and protein-protein interaction (PPI). The current graph pattern mining methods will be described, and typical algorithms will be presented, together with their applications in the protein structure analysis.

The rest of the chapter is organized as follows: Section 6.2 will give a brief introduction of the fundamental knowledge of protein, the publicly accessible protein data resources and the current research status of protein analysis; in Section 6.3, we will pay attention to one of the state-of-the-art data mining methods, graph mining; then Section 6.4 surveys several existing work for protein structure analysis using advanced graph mining methods in the recent decade; finally, in Section 6.5, a conclusion with potential further work will be summarized.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Texture synthesis employs neighbourhood matching to generate appropriate new content. Terrain synthesis has the added constraint that new content must be geographically plausible. The profile recognition and polygon breaking algorithm (PPA) [Chang et al. 1998] provides a robust mechanism for characterizing terrain as systems of valley and ridge lines in digital elevation maps. We exploit this to create a terrain characterization metric that is robust, efficient to compute and is sensitive to terrain properties.

Terrain regions are characterized as a minimum spanning tree derived from a graph created from the sample points of the elevation map which are encoded as weights in the edges of the graph. This formulation allows us to provide a single consistent feature definition that is sensitive to the pattern of ridges and valleys in the terrain Alternative formulations of these weights provide richer characteristicmeasures and we provide examples of alternate definitions based on curvature and contour measures.

We show that the measure is robust, with a significant portion derived directly from information local to the terrain sample. Global terrain characteristics introduce the issue of over- and underconnected valley/ridge lines when working with sub-regions. This is addressed by providing two graph construction strategies, which respectively provide an upper bound on connectivity as a single spanning tree, and a lower bound as a forest of trees.

Efficient minimum spanning tree algorithms are adapted to the context of terrain data and are shown to provide substantially better performance than previous PPA implementations. In particular, these are able to characterize valley and ridge behaviour at every point even in large elevation maps, providing a measure sensitive to terrain features at all scales.

The resulting graph based formulation provides an efficient and elegant algorithm for characterizing terrain features. The measure can be calculated efficiently, is robust under changes of neighbourhood position, size and resolution and the hybrid measure is sensitive to terrain features both locally and globally.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A critical question in data mining is that can we always trust what discovered by a data mining system unconditionally? The answer is obviously not. If not, when can we trust the discovery then? What are the factors that affect the reliability of the discovery? How do they affect the reliability of the discovery? These are some interesting questions to be investigated. In this chapter we will firstly provide a definition and the measurements of reliability, and analyse the factors that affect the reliability. We then examine the impact of model complexity, weak links, varying sample sizes and the ability of different learners to the reliability of graphical model discovery. The experimental results reveal that (1) the larger sample size for the discovery, the higher reliability we will get; (2) the stronger a graph link is, the easier the discovery will be and thus the higher the reliability it can achieve; (3) the complexity of a graph also plays an important role in the discovery. The higher the complexity of a graph is, the more difficult to induce the graph and the lower reliability it would be. We also examined the performance difference of different discovery algorithms. This reveals the impact of discovery process. The experimental results show the superior reliability and robustness of MML method to standard significance tests in the recovery of graph links with small samples and weak links.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes object-centered symbolic representation and distributed matching strategies of 3D objects in a schematic form which occur in engineering drawings and maps. The object-centered representation has a hierarchical structure and is constructed from symbolic representations of schematics. With this representation, two independent schematics representing the same object can be matched. We also consider matching strategies using distributed algorithms. The object recognition is carried out with two matching methods: (1) matching between an object model and observed data at the lowest level of the hierarchy, and (2) constraints propagation. The first is carried out with symbolic Hopfield-type neural networks and the second is achieved via hierarchical winner-takes-all algorithms

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Content authenticity and correctness is one of the important challenges in eLearning as there can be many solutions for one specific problem in the cyber space. Therefore, we feel the necessity of mapping problem to solutions using graph partition and weighted bipartite matching. This paper presents a novel architecture and methodology for a personal eLearning system called PELS that is developed by us. We also present an efficient algorithm to partition question-answer (QA) space and explore best possible solution to a particular problem. Our approach can be efficiently applied to social eLearning space where there is one-to-many and many-to-many relationship with a level of bonding. The main advantage of our approach is that we use QA ranking by adjusted edge weights provided by subject matter experts (SME) or expert database. Finally, we use statistical methods called confidence interval and hypothesis test on the data to check the reliability and dependability of the quality of results.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Smartphone applications are getting more and more popular and pervasive in our daily life, and are also attractive to malware writers due to their limited computing source and vulnerabilities. At the same time, we possess limited understanding of our opponents in cyberspace. In this paper, we investigate the propagation model of SMS/MMS-based worms through integrating semi-Markov process and social relationship graph. In our modeling, we use semi-Markov process to characterize state transition among mobile nodes, and hire social network theory, a missing element in many previous works, to enhance the proposed mobile malware propagation model. In order to evaluate the proposed models, we have developed a specific software, and collected a large scale real-world data for this purpose. The extensive experiments indicate that the proposed models and algorithms are effective and practical. © 2014 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recommender systems have been successfully dealing with the problem of information overload. However, most recommendation methods suit to the scenarios where explicit feedback, e.g. ratings, are available, but might not be suitable for the most common scenarios with only implicit feedback. In addition, most existing methods only focus on user and item dimensions and neglect any additional contextual information, such as time and location. In this paper, we propose a graph-based generic recommendation framework, which constructs a Multi-Layer Context Graph (MLCG) from implicit feedback data, and then performs ranking algorithms in MLCG for context-aware recommendation. Specifically, MLCG incorporates a variety of contextual information into a recommendation process and models the interactions between users and items. Moreover, based on MLCG, two novel ranking methods are developed: Context-aware Personalized Random Walk (CPRW) captures user preferences and current situations, and Semantic Path-based Random Walk (SPRW) incorporates semantics of paths in MLCG into random walk model for recommendation. The experiments on two real-world datasets demonstrate the effectiveness of our approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, the concept of Matching Parallelepiped (MP) is presented. It is shown that the volume of the MP can be used as an additional measure of `distance' between a pair of candidate points in a matching algorithm by Relaxation Labeling (RL). The volume of the MP is related with the Epipolar Geometry and the use of this measure works as an epipolar constraint in a RL process, decreasing the efforts in the matching algorithm since it is not necessary to explicitly determine the equations of the epipolar lines and to compute the distance of a candidate point to each epipolar line. As at the beginning of the process the Relative Orientation (RO) parameters are unknown, a initial matching based on gradient, intensities and correlation is obtained. Based on this set of labeled points the RO is determined and the epipolar constraint included in the algorithm. The obtained results shown that the proposed approach is suitable to determine feature-point matching with simultaneous estimation of camera orientation parameters even for the cases where the pair of optical axes are not parallel.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dental recognition is very important for forensic human identification, mainly regarding the mass disasters, which have frequently happened due to tsunamis, airplanes crashes, etc. Algorithms for automatic, precise, and robust teeth segmentation from radiograph images are crucial for dental recognition. In this work we propose the use of a graph-based algorithm to extract the teeth contours from panoramic dental radiographs that are used as dental features. In order to assess our proposal, we have carried out experiments using a database of 1126 tooth images, obtained from 40 panoramic dental radiograph images from 20 individuals. The results of the graph-based algorithm was qualitatively assessed by a human expert who reported excellent scores. For dental recognition we propose the use of the teeth shapes as biometric features, by the means of BAS (Bean Angle Statistics) and Shape Context descriptors. The BAS descriptors showed, on the same database, a better performance (EER 14%) than the Shape Context (EER 20%). © 2012 IEEE.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we deal with the problem of boosting the Optimum-Path Forest (OPF) clustering approach using evolutionary-based optimization techniques. As the OPF classifier performs an exhaustive search to find out the size of sample's neighborhood that allows it to reach the minimum graph cut as a quality measure, we compared several optimization techniques that can obtain close graph cut values to the ones obtained by brute force. Experiments in two public datasets in the context of unsupervised network intrusion detection have showed the evolutionary optimization techniques can find suitable values for the neighborhood faster than the exhaustive search. Additionally, we have showed that it is not necessary to employ many agents for such task, since the neighborhood size is defined by discrete values, with constrain the set of possible solution to a few ones.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we investigate the problem of routing connections in all-optical networks while allowing for degradation of routed signals by different optical components. To overcome the complexity of the problem, we divide it into two parts. First, we solve the pure RWA problem using fixed routes for every connection. Second, power assignment is accomplished by either using the smallest-gain first (SGF) heuristic or using a genetic algorithm. Numerical examples on a wide variety of networks show that (a) the number of connections established without considering the signal attenuation was most of the time greater than that achievable considering attenuation and (b) the genetic solution quality was much better than that of SGF, especially when the conflict graph of the connections generated by the linear solver is denser.