165 results for stream mining


Relevance: 20.00%

Abstract:

In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.
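As an illustration of the receiver-initiated idea mentioned in this abstract, the following is a minimal single-process simulation, not the authors' algorithm. The toy tree in `children`, the "ask the busiest peer for half its queue" donor rule, and all function names are assumptions made for this sketch; the abstract does not specify them.

```python
from collections import deque


def children(node):
    """Toy irregular search tree (an assumption, not the paper's search space)."""
    value, depth = node
    if depth >= 6:
        return []
    # The branching factor depends on the node value, so subtree sizes vary widely.
    return [(value * 3 + i + 1, depth + 1) for i in range(value % 4)]


def count_nodes(root):
    """Sequential reference: count every node reachable from root."""
    stack, total = [root], 0
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(children(node))
    return total


def run_workers(num_workers, root):
    """Simulate workers in round-robin steps; idle workers request work themselves."""
    queues = [deque() for _ in range(num_workers)]
    queues[0].append(root)
    processed = [0] * num_workers
    while any(queues):
        for w in range(num_workers):
            if not queues[w]:
                # Receiver-initiated: the idle worker picks a donor (here, the
                # peer with the longest queue) and takes half of its oldest tasks.
                donor = max(range(num_workers), key=lambda i: len(queues[i]))
                for _ in range(len(queues[donor]) // 2):
                    queues[w].append(queues[donor].popleft())
            if queues[w]:
                node = queues[w].pop()  # depth-first on the worker's own queue
                processed[w] += 1
                queues[w].extend(children(node))
    return processed
```

Calling `run_workers(4, (7, 0))` returns the number of nodes each simulated worker expanded; the per-worker counts sum to `count_nodes((7, 0))`. Taking tasks from the old end of the donor's queue tends to hand over shallow nodes, which usually root larger subtrees.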

Relevance: 20.00%

Abstract:

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design separates the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability.

Relevance: 20.00%

Abstract:

We present a method to enhance fault localization for software systems based on a frequent pattern mining algorithm. Our method is based on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles, the tests can be classified into successful and failing tests. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of the Siemens benchmark programs.
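The ranking step described in this abstract can be sketched in a much simplified form. The sketch below swaps frequent subtree mining for single caller-callee edges, and its scoring rule (support among failing runs minus support among passing runs) is an assumption; the abstract does not give the actual score. The function name `rank_functions` is hypothetical.

```python
from collections import Counter


def rank_functions(passing, failing):
    """Rank functions by how much more often their call edges occur in failing runs.

    passing, failing: lists of test runs, each run a set of (caller, callee) edges.
    Simplification (assumption): single call edges stand in for frequent subtrees.
    """
    def support(runs):
        if not runs:
            return {}
        counts = Counter(edge for run in runs for edge in run)
        return {edge: n / len(runs) for edge, n in counts.items()}

    pass_support = support(passing)
    fail_support = support(failing)

    scores = {}
    for edge, fail_share in fail_support.items():
        gain = fail_share - pass_support.get(edge, 0.0)
        if gain > 0:
            # Both endpoints of a suspicious edge inherit its score.
            for function in edge:
                scores[function] = max(scores.get(function, 0.0), gain)

    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda fn: (-scores[fn], fn))
```

Given passing runs that never call `validate` and failing runs that always do, `validate` and its caller come first in the returned list, which is the order in which one would inspect them.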

Relevance: 20.00%

Abstract:

Recently, two approaches have been introduced that distribute the molecular fragment mining problem. The first approach applies a master/worker topology; the second, a completely distributed peer-to-peer system, solves the scalability problem caused by the bottleneck at the master node. However, in many real-world scenarios the participating computing nodes cannot communicate directly due to administrative policies such as security restrictions. Thus, potential computing power is not accessible to accelerate the mining run. To address this shortcoming, this work introduces a hierarchical topology of computing resources, which distributes the management over several levels and adapts to the natural structure of those multi-domain architectures. The most important aspect is the load balancing scheme, which has been designed and optimized for the hierarchical structure. The approach allows dynamic aggregation of heterogeneous computing resources and is applied to wide area network scenarios.

Relevance: 20.00%

Abstract:

In real-world applications, sequential algorithms for data mining and data exploration are often unsuitable for datasets of enormous size, high dimensionality and complex structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context, there is a need to develop high-performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large-scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well-known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.

Relevance: 20.00%

Abstract:

Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.

Relevance: 20.00%

Abstract:

Frequent pattern discovery in structured data is receiving increasing attention in many areas of science. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimation is available for task workloads, which show a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute’s HIV-screening dataset.

Relevance: 20.00%

Abstract:

Much of the atmospheric variability in the North Atlantic sector is associated with variations in the eddy-driven component of the zonal flow. Here we present a simple method to specifically diagnose this component of the flow using the low-level wind field (925–700 hPa). We focus on the North Atlantic winter season in the ERA-40 reanalysis. Diagnostics of the latitude and speed of the eddy-driven jet stream are compared with conventional diagnostics of the North Atlantic Oscillation (NAO) and the East Atlantic (EA) pattern. This shows that the NAO and the EA both describe combined changes in the latitude and speed of the jet stream. It is therefore necessary, but not always sufficient, to consider both the NAO and the EA in identifying changes in the jet stream. The jet stream analysis suggests that there are three preferred latitudinal positions of the North Atlantic eddy-driven jet stream in winter. This result is in very good agreement with the application of a statistical mixture model to the two-dimensional state space defined by the NAO and the EA. These results are consistent with several other studies which identify four European/Atlantic regimes, comprising three jet stream patterns plus European blocking events.
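A diagnostic of jet latitude and speed of the kind this abstract mentions could look roughly like the sketch below. It assumes the diagnostic is the latitude and value of the maximum of the zonal wind averaged over the low levels and over longitude; the abstract gives no such detail, and any filtering or choice of longitude sector is left out. The function name and the nested-list input layout are hypothetical.

```python
def jet_latitude_and_speed(u, lats):
    """Return (latitude, speed) of the maximum level- and longitude-mean zonal wind.

    u:    nested list u[level][lat][lon] of zonal wind in m/s, restricted by the
          caller to the low levels and longitude sector of interest (assumption).
    lats: list of latitudes in degrees, one per lat index of u.
    """
    profile = []
    for j in range(len(lats)):
        # Pool every level and longitude at this latitude into one mean value.
        values = [wind for level in u for wind in level[j]]
        profile.append(sum(values) / len(values))

    # The jet is taken to sit where the averaged westerly wind is strongest.
    j_max = max(range(len(lats)), key=profile.__getitem__)
    return lats[j_max], profile[j_max]
```

Applied to one time step at a time, this yields a time series of jet latitude and speed that could then be compared with indices such as the NAO and the EA.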

Relevance: 20.00%

Abstract:

Information is provided on phosphorus in the River Kennet and the adjacent Kennet and Avon Canal in southern England to assess their interactions and the changes following phosphorus reductions in sewage treatment works (STW) effluent inputs. A step reduction in soluble reactive phosphorus (SRP) concentration within the effluent (5- to 13-fold) was observed from several STWs discharging to the river in the mid-2000s. This translated to a more than halving of SRP concentrations within the lower Kennet. Lower Kennet SRP concentrations changed from being highest under base-flow to highest under storm-flow conditions. This represented a major shift from direct effluent inputs to a within-catchment source dominated system characteristic of the upper part of the catchment. Average SRP concentrations in the lower Kennet reduced over time towards the target for good water quality. Critically, there was no corresponding reduction in chlorophyll-a concentration, the waters remaining eutrophic when set against standards for lakes. Following the up-gradient input of the main water and SRP source (Wilton Water), SRP concentrations in the canal reduced down-gradient to below detection limits at times near its junction with the Kennet downstream. However, chlorophyll concentrations in the canal were an order of magnitude higher than in the river. This probably resulted from long water residence times and higher temperatures promoting progressive algal and suspended sediment generations that consumed SRP. The canal acted as a point source for sediment, algae and total phosphorus to the river, especially during the summer months when boat traffic disturbed the canal's bottom sediments and the locks were being regularly opened. The short-term dynamics of this transfer were complex. For the canal and the supply source at Wilton Water, conditions remained hypertrophic when set against standards for lakes even when SRP concentrations were extremely low.

Relevance: 20.00%

Abstract:

This paper provides an extended analysis of the tensions that have surfaced between large-scale mine operators and artisanal miners in gold-rich areas of rural Tanzania. The literature on grievance is used to contextualise these disputes, the underlying cause of which is artisanal miners' mounting frustration over not being able to secure viable concessions to work. Newly implemented legislation has, for the most part, empowered foreign large-scale mine operators, while simultaneously disempowering indigenous small-scale miners. In many cases, the former have addressed mounting security and community problems on their own. Until the country's major mine operators extend assistance to marginalised small-scale mining groups, the likelihood of violent conflict unfolding between these parties will increase.

Relevance: 20.00%

Abstract:

This paper critiques contemporary research and policy approaches taken toward the analysis and abatement of mercury pollution in the small-scale gold mining sector. Unmonitored releases of mercury from gold amalgamation have caused considerable environmental contamination and human health complications in rural reaches of sub-Saharan Africa, Latin America and Asia. Whilst these problems have caught the attention of the scientific community over the past 15-20 years, the research that has since been undertaken has failed to identify appropriate mitigation measures, and has done little to advance understanding of why contamination persists. Moreover, the strategies used to educate operators about the impacts of acute mercury exposure, and the technologies implemented to prevent further pollution, have been marginally effective at best. The mercury pollution problem will not be resolved until governments and donor agencies commit to carrying out research aimed at improving understanding of the dynamics of small-scale gold mining communities. Acquisition of this knowledge is the key to designing and implementing appropriate support and abatement measures. (c) 2005 Elsevier B.V. All rights reserved.

Relevance: 20.00%

Abstract:

The World Bank, United Nations and UK Department for International Development (DfID) have spearheaded a recent global drive to regularize artisanal and small-scale mining (ASM), and provide assistance to its predominantly impoverished participants. To date, millions of dollars have been pledged toward the design of industry-specific policies and regulations; implementation of mechanized equipment; extension; and the launch of alternative livelihood (AL) programmes aimed at diversifying local economies. Much of this funding, however, has failed to facilitate marked improvements, and in many cases, has exacerbated problems. This paper argues that a poor understanding of artisanal, mine-community dynamics and operators’ needs has, in a number of cases, led to the design and implementation of inappropriate industry support schemes and interventions. The discussion focuses upon experiences from sub-Saharan Africa, where ASM is in the most rudimentary of states.

Relevance: 20.00%

Abstract:

This paper contributes to a growing body of literature that critically examines how mining companies are embracing community development challenges in developing countries, drawing on experiences from Ghana. Despite receiving considerable praise from the donor and industry communities, the actions being taken by Ghana's major mining companies to foster community development are facilitating few improvements in the rural regions where activities take place. Companies are generally implementing community development programmes that are incapable of alleviating rural hardship and are coordinating destructive displacement exercises. The analysis serves as a stark reminder that mining companies are not charities and engage with African countries strictly for commercial purposes.

Relevance: 20.00%

Abstract:

This paper provides an extended analysis of the child labor problem in the artisanal and small-scale mining (ASM) sector, focusing specifically on the situation in sub-Saharan Africa. In recent years, the issue of child labor in ASM has garnered significant attention from the International Labor Organization (ILO), which has been particularly active in raising public awareness of the problem and has proceeded to implement policies and collaborative project work aimed at curtailing children's participation in ASM activities in a number of African countries. The analysis concludes with a critical appraisal of an ILO project recently launched in the Talensi-Nabdam District in the Upper East Region of Ghana, which sheds light on how the child labor problem is being tackled in practice in ASM communities in sub-Saharan Africa. (c) 2008 Elsevier Ltd. All rights reserved.

Relevance: 20.00%

Abstract:

Artisanal and small-scale mining (ASM), a low-tech, labour-intensive mineral processing and excavation activity, is an economic mainstay in rural sub-Saharan Africa, providing direct employment to over two million people. This paper introduces a special issue on 'Small-scale mining, poverty and development in sub-Saharan Africa'. It focuses on the core conceptual issues covered in the literature, and the policy implications of the findings reported in the papers in this special issue. (c) 2009 Elsevier Ltd. All rights reserved.