940 results for Graph Colourings
Abstract:
We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications. In this paper we focus on approximate methods for distance estimation, in particular landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. We therefore develop and experimentally compare a number of simple methods that scale well to large graphs. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally on five different real-world graphs with millions of edges; for a given accuracy, they require up to 250 times less space than the current approach in the literature, which selects landmarks at random. Finally, we study applications of our method to two problems arising naturally in large-scale networks, namely social search and community detection.
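A minimal sketch of the landmark indexing idea described above, assuming an unweighted, undirected graph stored as an adjacency dictionary: BFS from each landmark offline, then answer a query with the triangle-inequality upper bound, minimised over landmarks. The landmark choice and function names are illustrative, not the selection strategies studied in the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted single-source shortest-path distances via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def build_landmark_index(adj, landmarks):
    """Offline step: distances from every node to each landmark."""
    return {l: bfs_distances(adj, l) for l in landmarks}

def estimate_distance(index, u, v):
    """Upper bound d(u,v) <= d(u,l) + d(l,v), minimised over landmarks."""
    best = float("inf")
    for dist in index.values():
        if u in dist and v in dist:
            best = min(best, dist[u] + dist[v])
    return best

# Toy usage: a small undirected path graph given as an adjacency dict.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
index = build_landmark_index(adj, landmarks=[3])   # illustrative landmark choice
print(estimate_distance(index, 1, 5))              # exact here: 4
```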
Abstract:
Large probabilistic graphs arise in various domains, ranging from social networks to biological and communication networks. An important query on these graphs is the k nearest-neighbor query, which involves finding and reporting the k closest nodes to a specific node. This query assumes the existence of a measure of the "proximity" or the "distance" between any two nodes in the graph. To that end, we propose several novel distance functions that extend well-known notions of classical graph theory, such as shortest paths and random walks. We argue that many meaningful distance functions are computationally intractable to compute exactly. Thus, in order to process nearest-neighbor queries, we resort to Monte Carlo sampling and exploit novel graph-transformation ideas and pruning opportunities. In our extensive experimental analysis, we explore the trade-offs of our approximation algorithms and demonstrate that they scale well on real-world probabilistic graphs with tens of millions of edges.
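A rough sketch of the possible-worlds Monte Carlo idea, under the simplifying assumptions that each edge exists independently with a known probability and that nodes are ranked by the median of their sampled shortest-path distances; the paper's distance functions, graph transformations, and pruning are considerably richer than this.

```python
import random
from collections import deque

def sample_world(prob_edges):
    """Keep each edge independently with its probability (one possible world)."""
    adj = {}
    for (u, v), p in prob_edges.items():
        if random.random() < p:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
    return adj

def bfs_from(adj, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def knn_median_distance(prob_edges, nodes, query, k, samples=200):
    """Rank nodes by the median of their sampled shortest-path distances."""
    draws = {v: [] for v in nodes if v != query}
    for _ in range(samples):
        dist = bfs_from(sample_world(prob_edges), query)
        for v in draws:
            draws[v].append(dist.get(v, float("inf")))
    def median(xs):
        xs = sorted(xs)
        return xs[len(xs) // 2]
    ranked = sorted(draws, key=lambda v: median(draws[v]))
    return ranked[:k]

# Toy usage with three uncertain edges.
prob_edges = {(1, 2): 0.9, (2, 3): 0.5, (1, 3): 0.1}
print(knn_median_distance(prob_edges, nodes={1, 2, 3}, query=1, k=2))
```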
Abstract:
Research on the construction of logical overlay networks has gained significance in recent times. This is partly due to work on peer-to-peer (P2P) systems for locating and retrieving distributed data objects, and also scalable content distribution using end-system multicast techniques. However, there are emerging applications that require the real-time transport of data from various sources to potentially many thousands of subscribers, each having their own quality-of-service (QoS) constraints. This paper primarily focuses on the properties of two popular topologies found in interconnection networks, namely k-ary n-cubes and de Bruijn graphs. The regular structure of these graph topologies makes them easier to analyze and determine possible routes for real-time data than complete or irregular graphs. We show how these overlay topologies compare in their ability to deliver data according to the QoS constraints of many subscribers, each receiving data from specific publishing hosts. Comparisons are drawn on the ability of each topology to route data in the presence of dynamic system effects, due to end-hosts joining and departing the system. Finally, experimental results show the service guarantees and physical link stress resulting from efficient multicast trees constructed over both kinds of overlay networks.
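As a small illustration of why the regular structure of de Bruijn overlays simplifies route determination, the sketch below (with an illustrative bit-width and labels) lists a node's out-neighbours and routes by shifting in the destination's bits, reaching the destination in exactly b overlay hops for b-bit labels.

```python
def debruijn_neighbors(label, bits):
    """Out-neighbors of a node in a binary de Bruijn graph on 2**bits labels."""
    mask = (1 << bits) - 1
    return [((label << 1) | b) & mask for b in (0, 1)]

def debruijn_route(src, dst, bits):
    """Shift-and-append routing: inject dst's bits from the most significant end."""
    path = [src]
    cur = src
    mask = (1 << bits) - 1
    for i in range(bits - 1, -1, -1):
        bit = (dst >> i) & 1
        cur = ((cur << 1) | bit) & mask
        path.append(cur)
    return path

# Toy usage: 3-bit labels, so exactly 3 overlay hops between any two nodes.
print(debruijn_neighbors(0b101, bits=3))       # [2, 3]
print(debruijn_route(0b101, 0b011, bits=3))    # [5, 2, 5, 3]
```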
Abstract:
Nearest neighbor search is commonly employed in face recognition, but it does not scale well to large datasets. This paper proposes a strategy for combining rejection classifiers into a cascade for face identification. A rejection classifier for a pair of classes is defined to reject at least one of the classes with high confidence. These rejection classifiers are able to share discriminants in feature space and at the same time have high confidence in the rejection decision. In the face identification problem, a pair of known individual faces may be very dissimilar; it is then very unlikely that both of them are close to an unknown face in the feature space, so only one of them needs to be considered. Using a cascade structure of rejection classifiers, the scope of the nearest neighbor search can be reduced significantly. Experiments on Face Recognition Grand Challenge (FRGC) version 1 data demonstrate that the proposed method achieves a significant speed-up and an accuracy comparable with the brute-force nearest neighbor method. In addition, a graph-cut-based clustering technique is employed to demonstrate that the pairwise separability of these rejection classifiers is capable of semantic grouping.
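A schematic of the pruning logic behind such a rejection cascade, assuming a hypothetical rejector(i, j, probe) that returns the class it rejects with high confidence (or None); it is not the paper's learned discriminants, only an illustration of how pairwise rejection can shrink the nearest-neighbour search scope.

```python
def cascade_prune(candidates, probe, rejector):
    """Run pairwise rejection classifiers until no candidate can be removed."""
    alive = set(candidates)
    changed = True
    while changed and len(alive) > 1:
        changed = False
        for i in list(alive):
            for j in list(alive):
                if i < j and i in alive and j in alive:
                    rejected = rejector(i, j, probe)   # hypothetical classifier
                    if rejected in (i, j):
                        alive.discard(rejected)
                        changed = True
    return alive

def identify(probe, gallery, rejector, distance):
    """Nearest-neighbour search restricted to the surviving candidate classes."""
    alive = cascade_prune(gallery.keys(), probe, rejector)
    return min(alive, key=lambda c: distance(probe, gallery[c]))

# Toy usage with 1-D "faces": reject whichever class is farther from the probe
# by a large margin (an illustrative stand-in for the learned rejection classifiers).
gallery = {"a": 0.0, "b": 5.0, "c": 9.0}
def toy_rejector(i, j, probe):
    di, dj = abs(probe - gallery[i]), abs(probe - gallery[j])
    if abs(di - dj) > 3.0:
        return i if di > dj else j
    return None
print(identify(8.5, gallery, toy_rejector, distance=lambda p, t: abs(p - t)))  # "c"
```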
Abstract:
With the proliferation of mobile wireless communication and embedded systems, energy efficiency has become a major design constraint. The dissipated energy is often expressed as the product of power dissipation and input-output delay. Most electronic design automation techniques focus on optimising only one of these parameters, either power or delay. Industry-standard design flows integrate systematic methods for optimising either area or timing, while power consumption is often optimised with heuristics that are specific to a particular design. In this work we answer three questions in our quest to provide a systematic approach to joint power and delay optimisation. The first question of our research is: how can we build a design flow which incorporates academic and industry-standard design flows for power optimisation? To address this question, we use a reference design flow provided by Synopsys and integrate academic tools and methodologies into this flow. The proposed design flow is used as a platform for analysing some novel algorithms and methodologies for optimisation in the context of digital circuits. The second question we answer is: is it possible to apply a systematic approach to power optimisation in the context of combinational digital circuits? The starting point is the selection of a suitable data structure which can easily incorporate information about delay, power and area, and which then allows optimisation algorithms to be applied. In particular, we address the implications of a systematic power optimisation methodology and the potential degradation of other (often conflicting) parameters such as area or the delay of the implementation. Finally, the third question which this thesis attempts to answer is: is there a systematic approach to multi-objective optimisation of delay and power? A delay-driven power and power-driven delay optimisation is proposed in order to obtain balanced delay and power values. This implies that each power optimisation step is constrained not only by the decrease in power but also by the increase in delay. Similarly, each delay optimisation step is governed not only by the decrease in delay but also by the increase in power. The goal is multi-objective optimisation of digital circuits where the two conflicting objectives are power and delay. The logic synthesis and optimisation methodology is based on AND-Inverter Graphs (AIGs), which represent the functionality of the circuit. The switching activities and arrival times of circuit nodes are annotated onto the AND-Inverter Graph under zero-delay and non-zero-delay models. We then introduce several reordering rules which are applied to the AIG nodes to minimise switching power or longest-path delay of the circuit at the pre-technology-mapping level. The academic Electronic Design Automation (EDA) tool ABC is used for the manipulation of AND-Inverter Graphs. We have implemented various combinatorial optimisation algorithms often used in Electronic Design Automation, such as Simulated Annealing and Uniform Cost Search. Simulated Annealing (SA) is a probabilistic metaheuristic for the global optimisation problem of locating a good approximation to the global optimum of a given function in a large search space. We use SA to decide probabilistically between moving from one optimised solution to another, such that dynamic power is optimised under given delay constraints and delay is optimised under given power constraints.
A good approximation to the global optimum of the energy constraint is obtained. Uniform Cost Search (UCS) is a tree search algorithm used for traversing or searching a weighted tree or graph. We use Uniform Cost Search to find, within the AIG network, a specific AIG node order for applying the reordering rules. After the reordering rules are applied, the AIG network is mapped to an AIG netlist using specific library cells. Our approach combines network restructuring, AIG node reordering, dynamic power and longest-path delay estimation and optimisation, and finally technology mapping to an AIG netlist. A set of MCNC benchmark circuits and large combinational circuits of up to 100,000 gates have been used to validate our methodology. Comparisons for power and delay optimisation are made with the best synthesis scripts used in ABC. A reduction of 23% in power and 15% in delay with minimal overhead is achieved, compared to the best known ABC results. Our approach has also been applied to a number of processors with combinational and sequential components, and significant savings are achieved.
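A generic sketch of the annealing loop described above, with hypothetical power(sol), delay(sol) and random_neighbour(sol) callbacks standing in for the AIG power/delay estimation and reordering moves; it is not the thesis's ABC-based implementation.

```python
import math
import random

def anneal_power_under_delay(initial, power, delay, random_neighbour,
                             delay_budget, t0=1.0, cooling=0.95, steps=1000):
    """Minimise dynamic power while rejecting moves that violate the delay budget."""
    current = best = initial
    temperature = t0
    for _ in range(steps):
        candidate = random_neighbour(current)       # e.g. one AIG reordering move
        if delay(candidate) > delay_budget:         # hard delay constraint
            continue
        gain = power(current) - power(candidate)    # positive gain = less power
        if gain >= 0 or random.random() < math.exp(gain / temperature):
            current = candidate
            if power(current) < power(best):
                best = current
        temperature *= cooling
    return best

# Toy usage: "solutions" are integers; power = x**2, delay = |x|, neighbour = +/-1 step.
best = anneal_power_under_delay(
    initial=10, power=lambda x: x * x, delay=abs,
    random_neighbour=lambda x: x + random.choice((-1, 1)), delay_budget=12)
print(best)  # drifts toward 0, the minimum-power point within the delay budget
```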
Abstract:
Simulation of pedestrian evacuations of smart buildings in an emergency is a powerful tool for building analysis, dynamic evacuation planning and real-time response to the evolving state of an evacuation. Macroscopic pedestrian models are low-complexity models that are well suited to algorithmic analysis and planning, but are quite abstract. Microscopic simulation models allow a high level of simulation detail but can be computationally intensive. By combining micro- and macro-models we can use each to overcome the shortcomings of the other and enable new capabilities and applications for pedestrian evacuation simulation that would not be possible with either alone. We develop the EvacSim multi-agent pedestrian simulator and procedurally generate macroscopic flow graph models of building space, integrating micro- and macroscopic approaches to simulation of the same emergency space. By “coupling” flow graph parameters to microscopic simulation results, the graph model captures some of the higher detail and fidelity of the complex microscopic simulation model. The coupled flow graph is used for analysis and prediction of the movement of pedestrians in the microscopic simulation, and we investigate the performance of dynamic evacuation planning in simulated emergencies using a variety of strategies for allocating macroscopic evacuation routes to microscopic pedestrian agents. The predictive capability of the coupled flow graph is exploited for the decomposition of microscopic simulation space into multiple future states in a scalable manner. By simulating multiple future states of the emergency over short time frames, this enables a sensing strategy based on simulation scenario pattern matching, which we show achieves fast scenario matching, enabling rich, real-time feedback in emergencies in buildings with meagre sensing capabilities.
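A minimal illustration of the “coupling” idea: a macroscopic flow-graph edge whose travel-time parameter is refitted from traversal times observed in the microscopic simulation. The class and field names are assumptions for illustration, not EvacSim's actual interfaces.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class FlowEdge:
    """One corridor/doorway in the macroscopic flow graph."""
    travel_time: float            # seconds, current macroscopic estimate
    capacity: float               # pedestrians per second
    observed_times: list = field(default_factory=list)

    def record(self, measured_time: float):
        """Log a traversal time observed for an agent in the microscopic model."""
        self.observed_times.append(measured_time)

    def recouple(self, blend: float = 0.5):
        """Pull the macroscopic parameter toward the microscopic observations."""
        if self.observed_times:
            self.travel_time = ((1 - blend) * self.travel_time
                                + blend * mean(self.observed_times))
            self.observed_times.clear()

# Toy usage: microscopic agents took longer than the static estimate suggested.
edge = FlowEdge(travel_time=8.0, capacity=2.5)
for t in (11.0, 12.5, 10.5):
    edge.record(t)
edge.recouple()
print(round(edge.travel_time, 2))  # 9.67: estimate shifted toward observed behaviour
```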
Abstract:
We revisit the well-known problem of sorting under partial information: sort a finite set given the outcomes of comparisons between some pairs of elements. The input is a partially ordered set P, and solving the problem amounts to discovering an unknown linear extension of P, using pairwise comparisons. The information-theoretic lower bound on the number of comparisons needed in the worst case is log e(P), the binary logarithm of the number of linear extensions of P. In a breakthrough paper, Jeff Kahn and Jeong Han Kim (STOC 1992) showed that there exists a polynomial-time algorithm for the problem achieving this bound up to a constant factor. Their algorithm invokes the ellipsoid algorithm at each iteration for determining the next comparison, making it impractical. We develop efficient algorithms for sorting under partial information. Like Kahn and Kim, our approach relies on graph entropy. However, our algorithms differ in essential ways from theirs. Rather than resorting to convex programming for computing the entropy, we approximate the entropy, or make sure it is computed only once in a restricted class of graphs, permitting the use of a simpler algorithm. Specifically, we present: an O(n^2) algorithm performing O(log n · log e(P)) comparisons; an O(n^2.5) algorithm performing at most (1+ε) log e(P) + O_ε(n) comparisons; an O(n^2.5) algorithm performing O(log e(P)) comparisons. All our algorithms are simple to implement. © 2010 ACM.
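A brute-force illustration (feasible only for tiny posets) of the information-theoretic lower bound log e(P): enumerate permutations, keep those consistent with the known relations, and take the binary logarithm of the count. The algorithms in the paper, of course, never enumerate linear extensions; the poset below is an illustrative example.

```python
from itertools import permutations
from math import log2

def count_linear_extensions(elements, relations):
    """Count orderings of `elements` consistent with every (a, b) pair meaning a < b."""
    count = 0
    for perm in permutations(elements):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in relations):
            count += 1
    return count

# Toy poset on {1,2,3,4} with 1 < 2, 1 < 3 and 3 < 4 already known.
e = count_linear_extensions([1, 2, 3, 4], [(1, 2), (1, 3), (3, 4)])
print(e, log2(e))  # 3 linear extensions, so about 1.58 further comparisons are unavoidable
```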
Abstract:
Segmentation of anatomical and pathological structures in ophthalmic images is crucial for the diagnosis and study of ocular diseases. However, manual segmentation is often a time-consuming and subjective process. This paper presents an automatic approach for segmenting retinal layers in Spectral Domain Optical Coherence Tomography images using graph theory and dynamic programming. Results show that this method accurately segments eight retinal layer boundaries in normal adult eyes, agreeing with an expert grader more closely than a second expert grader does.
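A toy sketch of the graph/dynamic-programming idea for tracing a single layer boundary: treat pixels as nodes, allow transitions to the next column within a small vertical window, and extract the minimum-cost left-to-right path. The costs below are a simple stand-in, not the paper's gradient-based weighting scheme.

```python
def trace_boundary(cost, window=1):
    """cost[row][col]: per-pixel cost (low where the boundary is likely).
    Returns one row index per column along the minimum-cost left-to-right path."""
    rows, cols = len(cost), len(cost[0])
    INF = float("inf")
    dp = [[INF] * cols for _ in range(rows)]
    back = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        dp[r][0] = cost[r][0]
    for c in range(1, cols):
        for r in range(rows):
            for pr in range(max(0, r - window), min(rows, r + window + 1)):
                cand = dp[pr][c - 1] + cost[r][c]
                if cand < dp[r][c]:
                    dp[r][c], back[r][c] = cand, pr
    # Backtrack from the cheapest endpoint in the last column.
    r = min(range(rows), key=lambda i: dp[i][cols - 1])
    path = [r]
    for c in range(cols - 1, 0, -1):
        r = back[r][c]
        path.append(r)
    return path[::-1]

# Toy usage: the low-cost band (zeros) drifts downward across columns.
cost = [[9, 9, 9, 9],
        [0, 0, 9, 9],
        [9, 9, 0, 0],
        [9, 9, 9, 9]]
print(trace_boundary(cost))  # [1, 1, 2, 2]
```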
Abstract:
We consider the problem of variable selection in regression modeling in high-dimensional spaces where there is known structure among the covariates. This is an unconventional variable selection problem for two reasons: (1) the dimension of the covariate space is comparable to, and often much larger than, the number of subjects in the study, and (2) the covariate space is highly structured, and in some cases it is desirable to incorporate this structural information into the model-building process. We approach this problem through the Bayesian variable selection framework, where we assume that the covariates lie on an undirected graph and formulate an Ising prior on the model space for incorporating structural information. Certain computational and statistical problems arise that are unique to such high-dimensional, structured settings, the most interesting being the phenomenon of phase transitions. We propose theoretical and computational schemes to mitigate these problems. We illustrate our methods on two different graph structures: the linear chain and the regular graph of degree k. Finally, we use our methods to study a specific application in genomics: the modeling of transcription factor binding sites in DNA sequences. © 2010 American Statistical Association.
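A small sketch of an Ising prior of the kind described above: for inclusion indicators γ ∈ {0,1}^p on an undirected graph, the (unnormalised) log-prior rewards sparsity through one parameter and agreement between neighbouring covariates through another. The parameterisation is illustrative and not necessarily the paper's exact form.

```python
def ising_log_prior(gamma, edges, a, b):
    """Unnormalised log p(gamma) = a * sum_i gamma_i + b * sum_{(i,j) in E} gamma_i * gamma_j."""
    main = a * sum(gamma)
    interaction = b * sum(gamma[i] * gamma[j] for i, j in edges)
    return main + interaction

# Toy usage: a linear-chain graph on 4 covariates; selecting adjacent covariates
# together is rewarded relative to selecting isolated ones.
chain_edges = [(0, 1), (1, 2), (2, 3)]
print(ising_log_prior([1, 1, 0, 0], chain_edges, a=-2.0, b=1.0))  # -3.0
print(ising_log_prior([1, 0, 1, 0], chain_edges, a=-2.0, b=1.0))  # -4.0
```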
Abstract:
We describe a strategy for Markov chain Monte Carlo analysis of non-linear, non-Gaussian state-space models involving batch analysis for inference on dynamic, latent state variables and fixed model parameters. The key innovation is a Metropolis-Hastings method for the time series of state variables based on sequential approximation of filtering and smoothing densities using normal mixtures. These mixtures are propagated through the non-linearities using an accurate, local mixture approximation method, and we use a regenerating procedure to deal with potential degeneracy of mixture components. This provides accurate, direct approximations to sequential filtering and retrospective smoothing distributions, and hence a useful construction of global Metropolis proposal distributions for simulation of posteriors for the set of states. This analysis is embedded within a Gibbs sampler to include uncertain fixed parameters. We give an example motivated by an application in systems biology. Supplemental materials provide an example based on a stochastic volatility model as well as MATLAB code.
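A generic sketch of an independence Metropolis-Hastings step with a normal-mixture proposal, the elementary move that the strategy above builds on; in the paper the mixtures come from sequential filtering/smoothing approximations and the step sits inside a Gibbs sampler over states and parameters. The target and mixture parameters below are illustrative, and numpy is assumed.

```python
import numpy as np

def mixture_sample(rng, weights, means, sds):
    k = rng.choice(len(weights), p=weights)
    return rng.normal(means[k], sds[k])

def mixture_logpdf(x, weights, means, sds):
    comps = [w * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
             for w, m, s in zip(weights, means, sds)]
    return np.log(sum(comps))

def mh_step(rng, x, log_target, weights, means, sds):
    """Independence MH: propose from the mixture, accept with the usual ratio."""
    prop = mixture_sample(rng, weights, means, sds)
    log_alpha = (log_target(prop) - log_target(x)
                 + mixture_logpdf(x, weights, means, sds)
                 - mixture_logpdf(prop, weights, means, sds))
    return prop if np.log(rng.uniform()) < log_alpha else x

# Toy usage: the target is N(0,1); a two-component mixture serves as the proposal.
rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * x ** 2
w, m, s = [0.5, 0.5], [-1.0, 1.0], [1.5, 1.5]
x = 0.0
for _ in range(5):
    x = mh_step(rng, x, log_target, w, m, s)
print(x)
```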