150 results for distributed meta classifiers

in CentAUR: Central Archive University of Reading - UK


Relevance:

20.00%

Abstract:

In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to-linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non-dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.

Relevance:

20.00%

Abstract:

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability.

Relevance:

20.00%

Abstract:

In this paper, we present a distributed computing framework for problems characterized by a highly irregular search tree, whereby no reliable workload prediction is available. The framework is based on a peer-to-peer computing environment and dynamic load balancing. The system allows for dynamic resource aggregation, does not depend on any specific meta-computing middleware and is suitable for large-scale, multi-domain, heterogeneous environments, such as computational Grids. Dynamic load balancing policies based on global statistics are known to provide optimal load balancing performance, while randomized techniques provide high scalability. The proposed method combines both advantages and adopts distributed job-pools and a randomized polling technique. The framework has been successfully adopted in a parallel search algorithm for subgraph mining and evaluated on a molecular compounds dataset. The parallel application has shown good scalability and close-to-linear speedup in a distributed network of workstations.
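To make the idea of distributed job-pools with randomized polling more concrete, here is a minimal single-process sketch. It is not the framework described in the abstract: the toy search tree, the function names (`children`, `simulate_polling`), the "take half of the victim's pool" rule and the round-based scheduling are all assumptions made for illustration. Each simulated worker owns a local job pool; a worker whose pool is empty polls one randomly chosen peer and, if that peer has spare jobs, takes part of its pool.

```python
import random
from collections import deque

def children(node, max_depth=10):
    """Hypothetical irregular search tree: fan-out depends on the node itself."""
    depth, key = node
    if depth >= max_depth:
        return []
    rng = random.Random(key)  # deterministic per node, so the tree is fixed
    fanout = 3 if depth == 0 else rng.choice([0, 0, 1, 2, 3])
    # 5-ary style numbering keeps the keys of distinct nodes distinct
    return [(depth + 1, key * 5 + i + 1) for i in range(fanout)]

def sequential_count(root):
    """Reference: count all nodes with a single depth-first traversal."""
    stack, seen = [root], 0
    while stack:
        node = stack.pop()
        seen += 1
        stack.extend(children(node))
    return seen

def simulate_polling(root, n_workers=4, seed=0):
    """Round-based simulation; returns the number of nodes each worker processed."""
    rng = random.Random(seed)
    pools = [deque() for _ in range(n_workers)]
    pools[0].append(root)  # all work starts on worker 0
    processed = [0] * n_workers
    while any(pools):  # stop once every job pool is empty
        for w in range(n_workers):
            if pools[w]:
                # Busy worker: expand one node from the local pool (depth-first).
                node = pools[w].pop()
                processed[w] += 1
                pools[w].extend(children(node))
            else:
                # Idle worker: poll one random peer and take half of its jobs,
                # drawn from the opposite end of the victim's pool.
                victim = rng.randrange(n_workers)
                if victim != w and len(pools[victim]) > 1:
                    for _ in range(len(pools[victim]) // 2):
                        pools[w].append(pools[victim].popleft())
    return processed

if __name__ == "__main__":
    root = (0, 0)
    print("nodes per worker:", simulate_polling(root))
    print("total nodes     :", sequential_count(root))
```

Because requests originate from idle workers, no worker needs a global view of the load; the trade-off is that a poll may hit a peer that is itself idle, in which case the requester simply tries again in the next round.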

Relevance:

20.00%

Abstract:

Recently, two approaches have been introduced that distribute the molecular fragment mining problem. The first approach applies a master/worker topology; the second, a completely distributed peer-to-peer system, solves the scalability problem caused by the bottleneck at the master node. However, in many real-world scenarios the participating computing nodes cannot communicate directly due to administrative policies such as security restrictions. Thus, potential computing power is not accessible to accelerate the mining run. To solve this shortcoming, this work introduces a hierarchical topology of computing resources, which distributes the management over several levels and adapts to the natural structure of those multi-domain architectures. The most important aspect is the load balancing scheme, which has been designed and optimized for the hierarchical structure. The approach allows dynamic aggregation of heterogeneous computing resources and is applied to wide area network scenarios.

Relevance:

20.00%

Abstract:

This paper focuses on improving computer network management by the adoption of artificial intelligence techniques. A logical inference system has been devised to enable automated isolation, diagnosis, and even repair of network problems, thus enhancing the reliability, performance, and security of networks. We propose a distributed multi-agent architecture for network management, where a logical reasoner acts as an external managing entity capable of directing, coordinating, and stimulating actions in an active management architecture. The active networks technology represents the lower-level layer, which makes possible the deployment of code that implements teleo-reactive agents distributed across the whole network. We adopt the Situation Calculus to define a network model and the Reactive Golog language to implement the logical reasoner. An active network management architecture is used by the reasoner to inject and execute operational tasks in the network. The integrated system combines the advantages of logical reasoning and network programmability, and provides a powerful system capable of performing high-level management tasks in order to deal with network faults.

Relevance:

20.00%

Abstract:

In real-world applications, sequential algorithms for data mining and data exploration are often unsuitable for datasets of enormous size, high dimensionality and complex data structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context there is a need to develop high-performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large-scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well-known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.

Relevance:

20.00%

Abstract:

The paper considers meta-analysis of diagnostic studies that use a continuous score for classification of study participants into healthy or diseased groups. Classification is often done on the basis of a threshold or cut-off value, which might vary between studies. Consequently, conventional meta-analysis methodology focusing solely on separate analysis of sensitivity and specificity might be confounded by a potentially unknown variation of the cut-off value. To cope with this phenomenon, it is suggested to use, instead, an overall estimate of the misclassification error previously suggested and used as Youden’s index; furthermore, it is argued that this index is less prone to between-study variation of cut-off values. A simple Mantel–Haenszel estimator as a summary measure of the overall misclassification error is suggested, which adjusts for a potential study effect. The measure of the misclassification error based on Youden’s index is advantageous in that it easily allows an extension to a likelihood approach, which is then able to cope with unobserved heterogeneity via a nonparametric mixture model. All methods are illustrated by means of an example of a diagnostic meta-analysis on duplex Doppler ultrasound, with angiography as the standard for stroke prevention.
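As a rough illustration of the quantities named above, the sketch below computes Youden's index, J = sensitivity + specificity - 1, for each study and then pools the values. The pooling step uses the standard Mantel-Haenszel weights for a risk difference, n_d * n_h / (n_d + n_h), on the grounds that J can be read as the difference in positive-test rates between diseased and healthy participants. This is an assumption: the abstract does not give the paper's estimator, so the exact form may differ. The function names and the 2x2 counts are hypothetical.

```python
def youden(tp, fn, tn, fp):
    """Youden's index from one study's 2x2 table: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1

def mh_pooled_youden(studies):
    """Pool per-study Youden indices with Mantel-Haenszel risk-difference weights.

    Assumption: weights n_d * n_h / (n_d + n_h), with n_d diseased and n_h healthy
    participants per study. `studies` is a list of (tp, fn, tn, fp) tuples.
    """
    numerator = denominator = 0.0
    for tp, fn, tn, fp in studies:
        n_diseased = tp + fn
        n_healthy = tn + fp
        weight = n_diseased * n_healthy / (n_diseased + n_healthy)
        numerator += weight * youden(tp, fn, tn, fp)
        denominator += weight
    return numerator / denominator

if __name__ == "__main__":
    # Hypothetical counts (tp, fn, tn, fp), not taken from any real study.
    studies = [(80, 20, 90, 10), (30, 20, 55, 5), (45, 5, 60, 40)]
    for s in studies:
        print(s, "J =", round(youden(*s), 3))
    print("pooled J =", round(mh_pooled_youden(studies), 3))
```

The weights depend only on group sizes, not on estimated variances, so the pooled value is always a weighted average of the per-study indices and stays within their range.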

Relevance:

20.00%

Abstract:

Background: The objective was to evaluate the efficacy and tolerability of donepezil (5 and 10 mg/day) compared with placebo in alleviating manifestations of mild to moderate Alzheimer's disease (AD). Method: A systematic review of individual patient data from Phase II and III double-blind, randomised, placebo-controlled studies of up to 24 weeks and completed by 20 December 1999. The main outcome measures were the ADAS-cog, the CIBIC-plus, and reports of adverse events. Results: A total of 2376 patients from ten trials were randomised to either donepezil 5 mg/day (n = 821), 10 mg/day (n = 662) or placebo (n = 893). Cognitive performance was better in patients receiving donepezil than in patients receiving placebo. At 12 weeks the differences in ADAS-cog scores were 5 mg/day-placebo: -2.1 [95% confidence interval (CI), -2.6 to -1.6; p < 0.001], 10 mg/day-placebo: -2.5 (-3.1 to -2.0; p < 0.001). The corresponding results at 24 weeks were -2.0 (-2.7 to -1.3; p < 0.001) and -3.1 (-3.9 to -2.4; p < 0.001). The difference between the 5 and 10 mg/day doses was significant at 24 weeks (p = 0.005). The odds ratios (OR) of improvement on the CIBIC-plus at 12 weeks were: 5 mg/day-placebo 1.8 (1.5 to 2.1; p < 0.001), 10 mg/day-placebo 1.9 (1.5 to 2.4; p < 0.001). The corresponding values at 24 weeks were 1.9 (1.5 to 2.4; p = 0.001) and 2.1 (1.6 to 2.8; p < 0.001). Donepezil was well tolerated; adverse events were cholinergic in nature and generally of mild severity and brief in duration. Conclusion: Donepezil (5 and 10 mg/day) provides meaningful benefits in alleviating deficits in cognitive and clinician-rated global function in AD patients relative to placebo. Increased improvements in cognition were indicated for the higher dose. Copyright © 2004 John Wiley & Sons, Ltd.

Relevance:

20.00%

Abstract:

Background: Meta-analyses based on individual patient data (IPD) are regarded as the gold standard for systematic reviews. However, the methods used for analysing and presenting results from IPD meta-analyses have received little discussion. Methods: We review 44 IPD meta-analyses published during the years 1999–2001. We summarize whether they obtained all the data they sought, what types of approaches were used in the analysis, including assumptions of common or random effects, and how they examined the effects of covariates. Results: Twenty-four out of 44 analyses focused on time-to-event outcomes, and most analyses (28) estimated treatment effects within each trial and then combined the results assuming a common treatment effect across trials. Three analyses failed to stratify by trial, analysing the data as if they came from a single mega-trial. Only nine analyses used random effects methods. Covariate-treatment interactions were generally investigated by subgrouping patients. Seven of the meta-analyses included data from less than 80% of the randomized patients sought, but did not address the resulting potential biases. Conclusions: Although IPD meta-analyses have many advantages in assessing the effects of health care, there are several aspects that could be further developed to make fuller use of the potential of these time-consuming projects. In particular, IPD could be used to more fully investigate the influence of covariates on heterogeneity of treatment effects, both within and between trials. The impact of heterogeneity, or of the use of random effects, is seldom discussed. There is thus considerable scope for enhancing the methods of analysis and presentation of IPD meta-analysis.
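The two-stage approach mentioned above (estimate the treatment effect within each trial, then combine) can be sketched as follows, contrasting a common-effect pooled estimate with a random-effects one. The random-effects step uses the DerSimonian-Laird moment estimator of the between-trial variance, which is one widely used choice; the abstract does not say which methods the reviewed analyses applied, so this is an assumption, as are the function name `pool_effects` and the example numbers.

```python
def pool_effects(effects, variances):
    """Two-stage pooling of per-trial effect estimates.

    Returns (common_effect, random_effect, tau2), where tau2 is the
    DerSimonian-Laird estimate of the between-trial variance.
    """
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    sum_w = sum(weights)
    common = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q measures how far the trials scatter around the common estimate.
    q = sum(w * (y - common) ** 2 for w, y in zip(weights, effects))
    k = len(effects)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # truncated at zero

    # Random-effects weights add tau2 to every trial's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    random_effect = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return common, random_effect, tau2

if __name__ == "__main__":
    # Hypothetical per-trial effect estimates and their variances.
    effects = [0.10, 0.35, 0.60, 0.20]
    variances = [0.01, 0.02, 0.05, 0.01]
    common, random_effect, tau2 = pool_effects(effects, variances)
    print(f"common effect: {common:.3f}")
    print(f"random effect: {random_effect:.3f} (tau^2 = {tau2:.3f})")
```

When the trials agree closely, the estimated between-trial variance is zero and the two pooled estimates coincide; as heterogeneity grows, the random-effects weights become more equal across trials.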

Relevance:

20.00%

Abstract:

We consider the case of a multicenter trial in which the center-specific sample sizes are potentially small. Under homogeneity, the conventional procedure is to pool information using a weighted estimator where the weights used are inverse estimated center-specific variances. Whereas this procedure is efficient for conventional asymptotics (e.g. center-specific sample sizes become large, number of centers fixed), it is commonly believed that the efficiency of this estimator holds true also for meta-analytic asymptotics (e.g. center-specific sample size bounded, potentially small, and number of centers large). In this contribution we demonstrate that this estimator fails to be efficient. In fact, it shows a persistent bias with increasing number of centers, showing that it is not meta-consistent. In addition, we show that the Cochran and Mantel-Haenszel weighted estimators are meta-consistent and, in more generality, provide conditions on the weights such that the associated weighted estimator is meta-consistent.
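A small simulation can illustrate how weights built from estimated variances may produce a bias that does not vanish as centers are added. The setting below is a toy one chosen for this sketch, not the paper's: each center contributes a handful of exponentially distributed observations with the same true mean, so a center's sample mean and sample variance are correlated. Weights based only on sample size stand in for the Cochran and Mantel-Haenszel type weights; the function name `simulate_pooling` and all parameter values are assumptions.

```python
import random
import statistics

def simulate_pooling(n_centers=5000, n_per_center=6, true_mean=1.0, seed=1):
    """Pool center means with inverse-estimated-variance vs. sample-size weights."""
    rng = random.Random(seed)
    iv_num = iv_den = 0.0  # inverse estimated variance weights
    ss_num = ss_den = 0.0  # sample-size weights (no estimated quantities)
    for _ in range(n_centers):
        xs = [rng.expovariate(1.0 / true_mean) for _ in range(n_per_center)]
        center_mean = statistics.fmean(xs)
        est_var_of_mean = statistics.variance(xs) / n_per_center
        w_iv = 1.0 / est_var_of_mean
        iv_num += w_iv * center_mean
        iv_den += w_iv
        ss_num += n_per_center * center_mean
        ss_den += n_per_center
    return iv_num / iv_den, ss_num / ss_den

if __name__ == "__main__":
    for k in (500, 5000, 50000):
        iv, ss = simulate_pooling(n_centers=k)
        print(f"centers={k:6d}  inverse-variance={iv:.3f}  sample-size={ss:.3f}")
```

In this toy setting, centers that happen to draw small values also tend to have small estimated variances and therefore receive large weights, which pulls the inverse-variance estimate below the true mean however many centers are pooled. For exponential data with n observations per center, the limit works out to (n - 2)/n times the true mean, about 0.67 here, while the sample-size-weighted estimate converges to the true mean.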