13 results for Subgraph isomorphism

in CentAUR: Central Archive University of Reading - UK


Relevance:

20.00%

Publisher:

Abstract:

Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.

Relevance:

20.00%

Publisher:

Abstract:

From a statistician's standpoint, the interesting kind of isomorphism for fractional factorial designs depends on the statistical application. Combinatorially isomorphic fractional factorial designs may have different statistical properties when factors are quantitative. This idea is illustrated by using Latin squares of order 3 to obtain fractions of the $3^3$ factorial design in 18 runs.

Relevance:

20.00%

Publisher:

Abstract:

The recursive circulant $RC(2^n, 4)$ enjoys several attractive topological properties. Let $\max_\epsilon(G)(m)$ denote the maximum number of edges in a subgraph of graph $G$ induced by $m$ nodes. In this paper, we show that $\max_\epsilon(RC(2^n, 4))(m) = \sum_{i=0}^{r} (p_i/2 + i)\, 2^{p_i}$, where $p_0 > p_1 > \cdots > p_r$ are nonnegative integers defined by $m = \sum_{i=0}^{r} 2^{p_i}$. We then apply this formula to find the bisection width of $RC(2^n, 4)$. The conclusion shows that, like the $n$-dimensional cube, $RC(2^n, 4)$ enjoys a linear bisection width. (c) 2005 Elsevier B.V. All rights reserved.
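The closed form stated in this abstract can be evaluated directly from the binary expansion of m. The following is a minimal sketch, not code from the paper: the function name `max_induced_edges` is hypothetical, and it simply evaluates the sum of (p_i/2 + i) * 2^(p_i) over the set bits of m, assuming m lies in the range for which the abstract's formula is claimed.

```python
def max_induced_edges(m: int) -> int:
    """Evaluate sum_{i=0}^{r} (p_i/2 + i) * 2^{p_i}, where
    m = sum_{i=0}^{r} 2^{p_i} and p_0 > p_1 > ... > p_r >= 0.

    Hypothetical helper: it only evaluates the formula quoted in the
    abstract and does not construct or inspect the graph itself.
    """
    if m < 0:
        raise ValueError("m must be a nonnegative integer")
    # Exponents of the set bits of m, largest first: p_0 > p_1 > ... > p_r.
    exponents = [p for p in range(m.bit_length() - 1, -1, -1) if (m >> p) & 1]
    total = 0
    for i, p in enumerate(exponents):
        # (p/2 + i) * 2^p in integer arithmetic: p * 2^p is even for p >= 1
        # and zero for p == 0, so the floor division is exact.
        total += (p * (1 << p)) // 2 + i * (1 << p)
    return total


print(max_induced_edges(8))  # 12
```

For m = 8 the only exponent is p_0 = 3, so the sum is (3/2) * 8 = 12, which is the edge count of a 3-dimensional cube. This is consistent with the abstract's comparison between the recursive circulant and the n-dimensional cube.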

Relevance:

10.00%

Publisher:

Abstract:

In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.

Relevance:

10.00%

Publisher:

Abstract:

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability.

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we present a distributed computing framework for problems characterized by a highly irregular search tree, whereby no reliable workload prediction is available. The framework is based on a peer-to-peer computing environment and dynamic load balancing. The system allows for dynamic resource aggregation, does not depend on any specific meta-computing middleware and is suitable for large-scale, multi-domain, heterogeneous environments, such as computational Grids. Dynamic load balancing policies based on global statistics are known to provide optimal load balancing performance, while randomized techniques provide high scalability. The proposed method combines both advantages and adopts distributed job-pools and a randomized polling technique. The framework has been successfully adopted in a parallel search algorithm for subgraph mining and evaluated on a molecular compounds dataset. The parallel application has shown good scalability and close-to-linear speedup in a distributed network of workstations.
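The combination of distributed job pools and randomized polling described in this abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' framework: the names `expand` and `simulate`, the Fibonacci-shaped task tree standing in for an irregular search tree, and the "donate half of the pool" policy are all hypothetical choices made for illustration.

```python
import random
from collections import deque


def expand(task: int) -> list[int]:
    """Hypothetical irregular search tree: node n has children n-1 and n-2."""
    return [task - 1, task - 2] if task > 1 else []


def simulate(num_workers: int, root_task: int, seed: int = 0) -> list[int]:
    """Round-based toy simulation of receiver-initiated load balancing.

    Each worker owns a local job pool. A worker with an empty pool polls
    one peer chosen uniformly at random and, if that peer has more than
    one task, receives half of the peer's pool (an assumed policy).
    Returns the number of tasks processed by each worker.
    """
    rng = random.Random(seed)
    pools = [deque() for _ in range(num_workers)]
    pools[0].append(root_task)  # all work starts on worker 0
    processed = [0] * num_workers
    while any(pools):
        for w in range(num_workers):
            if pools[w]:
                # Busy worker: process one task depth-first and push its children.
                task = pools[w].pop()
                processed[w] += 1
                pools[w].extend(expand(task))
            elif num_workers > 1:
                # Idle worker initiates the transfer by polling a random peer.
                donor = rng.choice([v for v in range(num_workers) if v != w])
                if len(pools[donor]) > 1:
                    # Donor hands over the older half of its pool.
                    for _ in range(len(pools[donor]) // 2):
                        pools[w].append(pools[donor].popleft())
    return processed


counts = simulate(num_workers=4, root_task=10, seed=1)
print(counts, sum(counts))
```

Whatever the number of workers or the seed, the total number of processed tasks equals the size of the task tree (177 nodes for a root task of 10), since tasks are only moved between pools and never duplicated or dropped. How evenly the counts are spread depends on the polling outcomes.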

Relevance:

10.00%

Publisher:

Abstract:

In real-world applications, sequential algorithms for data mining and data exploration are often unsuitable for datasets with enormous size, high dimensionality and complex data structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context there is a need to develop high-performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large-scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well-known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.

Relevance:

10.00%

Publisher:

Abstract:

Frequent pattern discovery in structured data is receiving increasing attention in many application areas of science. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution in a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.

Relevance:

10.00%

Publisher:

Abstract:

Operator spaces of Hilbertian JC*-triples E are considered in the light of the universal ternary ring of operators (TRO) introduced in recent work. For these operator spaces, it is shown that their triple envelope (in the sense of Hamana) is the TRO they generate, that a complete isometry between any two of them is always the restriction of a TRO isomorphism and that distinct operator space structures on a fixed E are never completely isometric. In the infinite-dimensional cases, operator space structure is shown to be characterized by severe and definite restrictions upon finite-dimensional subspaces. Injective envelopes are explicitly computed.

Relevance:

10.00%

Publisher:

Abstract:

A model based on graph isomorphisms is used to formalize software evolution. Step by step we narrow the search space by an informed selection of the attributes based on the current state of the art in software engineering and generate a seed solution. We then traverse the resulting space using graph isomorphisms and other set operations over the vertex sets. The new solutions will preserve the desired attributes. The goal of defining an isomorphism-based search mechanism is to construct predictors of evolution that can facilitate the automation of the 'software factory' paradigm. The model allows for automation via software tools implementing the concepts.

Relevance:

10.00%

Publisher:

Abstract:

We propose adding a temporal dimension to stakeholder management theory, and assess the implications thereof for firm-level competitive advantage. We argue that a firm’s competitive advantage fundamentally depends on its capacity for transformational adaptation in stakeholder management over time. Our new temporal stakeholder management approach builds upon insights from both the resource-based view (RBV) in strategic management and institutional theory. Stakeholder agendas and their relative salience to the firm evolve over time, a phenomenon well understood in the literature, and requiring what we call level 1 adaptation. However, the dominant direction of stakeholder pressures can also change, namely, from supporting resource heterogeneity at the firm level to fostering industry homogeneity, and vice versa. When dominant stakeholder pressures shift from supporting heterogeneity towards stimulating homogeneity in industry, the firm must engage in level 2 or transformational adaptation. Stakeholders typically provide valuable resources to the firm in an early stage. Without these resources, which foster heterogeneity (in line with RBV thinking), the firm would not exist. At a later stage, stakeholders also contribute to inter-firm homogeneity via isomorphism pressures (in line with institutional theory thinking). Adding a temporal dimension to stakeholder management theory has far-reaching implications for this theory’s practical relevance to senior-level management in business.