851 results for Graph mining
Abstract:
We give a detailed construction of a finite-state transition system for a com-connected Message Sequence Graph. Though this result is well-known in the literature and forms the basis for the solution to several analysis and verification problems concerning MSG specifications, the constructions given in the literature are either not amenable to implementation, or imprecise, or simply incorrect. In contrast we give a detailed construction along with a proof of its correctness. Our transition system is amenable to implementation, and can also be used for a bounded analysis of general (not necessarily com-connected) MSG specifications.
Abstract:
In data mining, an important goal is to generate an abstraction of the data. Such an abstraction helps in reducing the space and search time requirements of the overall decision making process. Further, it is important that the abstraction is generated from the data with a small number of disk scans. We propose a novel data structure, pattern count tree (PC-tree), that can be built by scanning the database only once. PC-tree is a minimal size complete representation of the data and it can be used to represent dynamic databases with the help of knowledge that is either static or changing. We show that further compactness can be achieved by constructing the PC-tree on segmented patterns. We exploit the flexibility offered by rough sets to realize a rough PC-tree and use it for efficient and effective rough classification. To be consistent with the sizes of the branches of the PC-tree, we use upper and lower approximations of feature sets in a manner different from the conventional rough set theory. We conducted experiments using the proposed classification scheme on a large-scale hand-written digit data set. We use the experimental results to establish the efficacy of the proposed approach. (C) 2002 Elsevier Science B.V. All rights reserved.
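The abstract does not give the PC-tree's construction, so the following is only a rough sketch of one plausible reading: a prefix tree in which each node stores a feature value and a count, built in a single pass over the patterns. The names `PCNode`, `build_pc_tree` and `count_nodes` are hypothetical, and the segmented and rough-set variants are not covered.

```python
class PCNode:
    """One node of a count-annotated prefix tree (hypothetical layout)."""

    def __init__(self, feature=None):
        self.feature = feature   # feature value stored at this node
        self.count = 0           # number of patterns passing through it
        self.children = {}       # feature value -> child PCNode


def build_pc_tree(patterns):
    """Build the tree by visiting every pattern exactly once."""
    root = PCNode()
    for pattern in patterns:
        node = root
        for feature in pattern:
            # Reuse the child if this prefix was seen before, else create it.
            node = node.children.setdefault(feature, PCNode(feature))
            node.count += 1
    return root


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Three patterns that share the prefix (1, 2); one of them occurs twice.
example_patterns = [(1, 2, 3), (1, 2, 4), (1, 2, 3)]
example_tree = build_pc_tree(example_patterns)
```

In this toy input, nine feature values are stored in four nodes, which illustrates how shared prefixes and counts compress the data.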
Abstract:
Wireless sensor networks can often be viewed in terms of a uniform deployment of a large number of nodes on a region in Euclidean space, e.g., the unit square. After deployment, the nodes self-organise into a mesh topology. In a dense, homogeneous deployment, a frequently used approximation is to take the hop distance between nodes to be proportional to the Euclidean distance between them. In this paper, we analyse the performance of this approximation. We show that nodes with a certain hop distance from a fixed anchor node lie within a certain annulus with probability approaching unity as the number of nodes n → ∞. We take a uniform, i.i.d. deployment of n nodes on a unit square, and consider the geometric graph on these nodes with radius $r(n) = c\sqrt{\ln n / n}$. We show that, for a given hop distance h of a node from a fixed anchor on the unit square, the Euclidean distance lies within $[(1-\epsilon)(h-1)r(n),\ h\,r(n)]$, for $\epsilon > 0$, with probability approaching unity as n → ∞. This result shows that a node with hop distance h from the anchor is more likely to lie within this annulus, centred at the anchor location and of width roughly r(n), than close to a circle whose radius is exactly proportional to h. We show that if the radius r of the geometric graph is fixed, the convergence of the probability is exponentially fast. Similar results hold for a randomised lattice deployment. We provide simulation results that illustrate the theory, and serve to show how large n needs to be for the asymptotics to be useful.
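The setting can be explored with a small simulation. The sketch below is not the authors' code: it assumes the connection radius is r(n) = c·sqrt(ln n / n), uses node 0 as the anchor, and picks n = 400, c = 1.5 and a fixed seed purely for illustration. It computes hop distances by breadth-first search and pairs each with the Euclidean distance to the anchor.

```python
import math
import random
from collections import deque


def hop_distances(points, radius, anchor=0):
    """BFS hop counts from the anchor in the geometric graph on `points`."""
    hops = {anchor: 0}
    queue = deque([anchor])
    while queue:
        u = queue.popleft()
        for v in range(len(points)):
            # Two nodes are neighbours when they are within `radius`.
            if v not in hops and math.dist(points[u], points[v]) <= radius:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops


def simulate(n=400, c=1.5, seed=0):
    """Return the radius and (hop count, Euclidean distance) pairs."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    radius = c * math.sqrt(math.log(n) / n)   # assumed form of r(n)
    hops = hop_distances(points, radius)
    pairs = [(h, math.dist(points[0], points[v])) for v, h in hops.items()]
    return radius, pairs


radius, pairs = simulate()
# The upper end of the annulus, d <= h * r(n), follows from the triangle
# inequality and holds for every reachable node.
upper_bound_holds = all(d <= h * radius + 1e-12 for h, d in pairs)
```

The lower end, (1−ε)(h−1)r(n), holds only with high probability, so the sketch does not assert it; plotting `pairs` for increasing n is one way to see how quickly it takes effect.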
Abstract:
With the emergence of large-volume and high-speed streaming data, recent techniques for stream mining of closed frequent itemsets (CFIs) will become inefficient. When concept drift occurs at a slow rate in high-speed data streams, the rate of change of information across different sliding windows will be negligible. So the user will not miss changes in information if we slide the window by multiple transactions at a time. Therefore, we propose a novel approach for mining CFIs cumulatively by allowing a sliding width (≥ 1) over high-speed data streams. However, it is nontrivial to mine CFIs cumulatively over a stream, because such growth may lead to the generation of an exponential number of candidates for closure checking. In this study, we develop an efficient algorithm, stream-close, for mining CFIs over a stream by exploring some interesting properties. Our performance study reveals that stream-close achieves good scalability and has promising results.
Abstract:
An axis-parallel box in $b$-dimensional space is a Cartesian product $R_1 \times R_2 \times \cdots \times R_b$ where $R_i$ (for $1 \leq i \leq b$) is a closed interval of the form $[a_i, b_i]$ on the real line. For a graph $G$, its boxicity is the minimum dimension $b$ such that $G$ is representable as the intersection graph of (axis-parallel) boxes in $b$-dimensional space. The concept of boxicity finds application in various areas of research such as ecology and operations research. Chandran, Francis and Sivadasan gave an $O(\Delta n^2 \ln^2 n)$ randomized algorithm to construct a box representation for any graph $G$ on $n$ vertices in $\lceil (\Delta + 2)\ln n \rceil$ dimensions, where $\Delta$ is the maximum degree of the graph. They also came up with a deterministic algorithm that runs in $O(n^4 \Delta)$ time. Here, we present an $O(n^2 \Delta^2 \ln n)$ deterministic algorithm that constructs the box representation for any graph in $\lceil (\Delta + 2)\ln n \rceil$ dimensions.
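To make the definition concrete, the following sketch checks a given box representation; it does not implement the construction algorithms mentioned in the abstract. Two boxes intersect exactly when their closed intervals overlap in every dimension. The function names and the example boxes are illustrative assumptions.

```python
def boxes_intersect(box_a, box_b):
    """Boxes are sequences of closed intervals (lo, hi), one per dimension."""
    return all(
        lo_a <= hi_b and lo_b <= hi_a
        for (lo_a, hi_a), (lo_b, hi_b) in zip(box_a, box_b)
    )


def intersection_graph(boxes):
    """Edge set of the intersection graph of a {vertex: box} mapping."""
    names = sorted(boxes)
    return {
        (u, v)
        for i, u in enumerate(names)
        for v in names[i + 1:]
        if boxes_intersect(boxes[u], boxes[v])
    }


# A 2-dimensional box representation of the 4-cycle a-b-c-d-a:
# two vertical strips (a, c) crossed by two horizontal strips (b, d).
cycle_boxes = {
    "a": [(0, 1), (0, 3)],
    "b": [(0, 3), (0, 1)],
    "c": [(2, 3), (0, 3)],
    "d": [(0, 3), (2, 3)],
}
cycle_edges = intersection_graph(cycle_boxes)
```

Here `cycle_edges` contains the four cycle edges and neither chord, so the example boxes represent the 4-cycle in two dimensions.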
Abstract:
The scan circuit is a widely practiced DFT technology. The scan testing procedure consists of state initialization, test application, response capture and observation. During state initialization, the scan vectors are shifted into the scan cells and, simultaneously, the responses captured in the last cycle are shifted out. During this shift operation, the transitions that arise in the scan cells are propagated to the combinational circuit, which in turn creates many more toggling activities in the combinational block and hence increases the dynamic power consumption. The dynamic power consumed during the scan shift operation is much higher than that of normal mode operation.
Abstract:
Rapid urbanisation in India has posed serious challenges to decision makers in regional planning, involving a plethora of issues including the provision of basic amenities (electricity, water, sanitation, transport, etc.). Urban planning entails an understanding of landscape and urban dynamics with causal factors. Identifying, delineating and mapping landscapes on a temporal scale provides an opportunity to monitor the changes, which is important for natural resource management and sustainable planning activities. Multi-source, multi-sensor, multi-temporal, multi-frequency or multi-polarization remote sensing data with efficient classification algorithms and pattern recognition techniques aid in capturing these dynamics. This paper analyses the landscape dynamics of Greater Bangalore by: (i) characterisation of direct impervious surface, (ii) computation of forest fragmentation indices and (iii) modeling to quantify and categorise urban changes. Linear unmixing is used for solving the mixed pixel problem of coarse resolution super spectral MODIS data for impervious surface characterisation. Fragmentation indices were used to classify forests as interior, perforated, edge, transitional, patch or undetermined. Based on this, an urban growth model was developed to determine the type of urban growth: infill, expansion or outlying growth. This helped in visualising urban growth poles and the consequences of earlier policy decisions, which can help in evolving strategies for effective land use policies.
Abstract:
We consider evolving exponential RGGs in one dimension and characterize the time-dependent behavior of some of their topological properties. We consider two evolution models and study one of them in detail while providing a summary of the results for the other. In the first model, the inter-nodal gaps evolve according to an exponential AR(1) process that makes the stationary distribution of the node locations exponential. For this model we obtain the one-step conditional connectivity probabilities and extend them to the k-step case. Finite and asymptotic analyses are given. We then obtain the k-step connectivity probability conditioned on the network being disconnected. We also derive the pmf of the first passage time for a connected network to become disconnected. We then describe a random birth-death model where, at each instant, the node locations evolve according to an AR(1) process. In addition, a random node is allowed to die while giving birth to a node at another location. We derive properties similar to those above.
Abstract:
We consider the problem of computing a minimum cycle basis in a directed graph G. The input to this problem is a directed graph whose arcs have positive weights. In this problem a $\{-1, 0, 1\}$ incidence vector is associated with each cycle and the vector space over $\mathbb{Q}$ generated by these vectors is the cycle space of G. A set of cycles is called a cycle basis of G if it forms a basis for its cycle space. A cycle basis where the sum of weights of the cycles is minimum is called a minimum cycle basis of G. The current fastest algorithm for computing a minimum cycle basis in a directed graph with m arcs and n vertices runs in $O(m^{\omega+1} n)$ time (where $\omega < 2.376$ is the exponent of matrix multiplication). If one allows randomization, then an $\tilde{O}(m^3 n)$ algorithm is known for this problem. In this paper we present a simple $\tilde{O}(m^2 n)$ randomized algorithm for this problem. The problem of computing a minimum cycle basis in an undirected graph has been well-studied. In this problem a $\{0, 1\}$ incidence vector is associated with each cycle and the vector space over $\mathbb{F}_2$ generated by these vectors is the cycle space of the graph. The fastest known algorithm for computing a minimum cycle basis in an undirected graph runs in $O(m^2 n + m n^2 \log n)$ time and our randomized algorithm for directed graphs almost matches this running time.
Abstract:
Mining association rules from a large collection of databases is based on two main tasks. One is the generation of large itemsets; the other is finding associations between the discovered large itemsets. Existing formalisms for association rules are based on a single transaction database, which is not sufficient to describe association rules in a multiple-database environment. In this paper, we give a general characterization of association rules and also give a framework for knowledge-based mining of multiple databases for association rules.
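The two tasks can be illustrated on a single transaction database using the standard support and confidence measures. This is a minimal brute-force sketch, not the multi-database framework proposed in the paper; the function names, the toy transactions and the 0.5 support threshold are assumptions made for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def large_itemsets(transactions, min_support):
    """Task 1: enumerate all itemsets whose support meets the threshold."""
    items = sorted(set().union(*transactions))
    return [
        frozenset(combo)
        for k in range(1, len(items) + 1)
        for combo in combinations(items, k)
        if support(combo, transactions) >= min_support
    ]


def confidence(antecedent, consequent, transactions):
    """Task 2: confidence of the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, transactions) / support(antecedent, transactions)


toy_transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
toy_large = large_itemsets(toy_transactions, min_support=0.5)
toy_conf = confidence({"bread"}, {"milk"}, toy_transactions)
```

The exhaustive enumeration is exponential in the number of items and is only meant to show the definitions; practical miners prune the candidate space.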
Abstract:
Data mining is concerned with analysing large volumes of (often unstructured) data to automatically discover interesting regularities or relationships which in turn lead to better understanding of the underlying processes. The field of temporal data mining is concerned with such analysis in the case of ordered data streams with temporal interdependencies. Over the last decade many interesting techniques of temporal data mining were proposed and shown to be useful in many applications. Since temporal data mining brings together techniques from different fields such as statistics, machine learning and databases, the literature is scattered among many different sources. In this article, we present an overview of techniques of temporal data mining. We mainly concentrate on algorithms for pattern discovery in sequential data streams. We also describe some recent results regarding statistical analysis of pattern discovery methods.
Abstract:
A method, system, and computer program product for fault data correlation in a diagnostic system are provided. The method includes receiving the fault data including a plurality of faults collected over a period of time, and identifying a plurality of episodes within the fault data, where each episode includes a sequence of the faults. The method further includes calculating a frequency of the episodes within the fault data, calculating a correlation confidence of the faults relative to the episodes as a function of the frequency of the episodes, and outputting a report of the faults with the correlation confidence.