557 results for ENSEMBLES





Cdc25 phosphatases involved in cell cycle checkpoints are now active targets for the development of anti-cancer therapies. Rational drug design would certainly benefit from detailed structural information for Cdc25s. However, only apo- or sulfate-bound crystal structures of the Cdc25 catalytic domain have been described so far. Together with previously available crystallographic data, results from molecular dynamics simulations, bioinformatic analysis, and computer-generated conformational ensembles shown here indicate that the last 30-40 residues in the C-terminus of Cdc25B are partially unfolded or disordered in solution. The effect of C-terminal flexibility upon binding of two potent small-molecule inhibitors to Cdc25B is then analyzed by using three structural models with variable levels of flexibility, including an equilibrium-distributed ensemble of Cdc25B backbone conformations. The three Cdc25B structural models are used in combination with flexible docking, clustering, and calculation of binding free energies by the linear interaction energy approximation to construct and validate Cdc25B-inhibitor complexes. Two binding sites are identified, on top of and beside the Cdc25B active site. The diversity of interaction modes found increases with receptor flexibility. Backbone flexibility allows the formation of transient cavities or compact hydrophobic units on the surface of the stable, folded protein core that are unexposed or unavailable for ligand binding in rigid and densely packed crystal structures. The present results may help to speculate on the mechanisms of small-molecule complexation to partially unfolded or locally disordered proteins.





The use of ensemble models in many problem domains has increased significantly in the last few years. Ensemble modeling, in particular boosting, has shown great promise in improving the predictive performance of a model. Combining the ensemble members is normally done in a co-operative fashion, where each of the ensemble members performs the same task and their predictions are aggregated to obtain the improved performance. However, it is also possible to combine the ensemble members in a competitive fashion, where the best prediction of a relevant ensemble member is selected for a particular input. This option has previously been somewhat overlooked. The aim of this article is to investigate and compare the competitive and co-operative approaches to combining the models in the ensemble. A comparison is made between a competitive ensemble model and MARS with bagging, mixture of experts, hierarchical mixture of experts and a neural network ensemble over several public domain regression problems that have a high degree of nonlinearity and noise. The empirical results show a substantial advantage of competitive learning over co-operative learning for all the regression problems investigated. The requirements for creating efficient ensembles and the available guidelines are also discussed.
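As a rough illustration of the two combination styles contrasted in this abstract, the sketch below averages all members' predictions (co-operative) and, alternatively, lets a gate pick a single member per input (competitive). The member functions and the gating rule are hypothetical placeholders chosen for illustration; they are not the models or the selection mechanism studied in the article.

```python
from statistics import mean
from typing import Callable, Sequence

Model = Callable[[float], float]


def cooperative_predict(models: Sequence[Model], x: float) -> float:
    """Co-operative combination: every member predicts, outputs are averaged."""
    return mean(m(x) for m in models)


def competitive_predict(models: Sequence[Model],
                        gate: Callable[[float], int],
                        x: float) -> float:
    """Competitive combination: the gate selects one member for this input."""
    return models[gate(x)](x)


# Hypothetical members, each assumed to be accurate on one input region.
def low_region_model(x: float) -> float:
    return 2.0 * x


def high_region_model(x: float) -> float:
    return 10.0 - x


def region_gate(x: float) -> int:
    # Assumed gating rule: member 0 handles x < 3, member 1 handles the rest.
    return 0 if x < 3.0 else 1


members = [low_region_model, high_region_model]
```

Calling `cooperative_predict(members, 1.0)` blends both members, whereas `competitive_predict(members, region_gate, 1.0)` returns only the output of the member assigned to that region.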





This paper presents an innovative email categorization approach using a serialized multi-stage classification ensemble technique. Many approaches are used in practice for email categorization to control the menace of spam emails in different ways. Content-based email categorization employs filtering techniques that use classification algorithms to learn to predict spam emails given a corpus of training emails. This process achieves substantial performance, with some trade-off in false positives (FP). Investigations with different classification algorithms have found that the outputs vary from one classifier to another on the same email corpora. In this paper we propose a multi-stage classification technique that uses different popular learning algorithms together with an analyser, which substantially reduces the FP problem and increases classification accuracy compared to similar existing techniques.





In this paper, the impact of the size of the training set on the benefit from ensemble, i.e. the gains obtained by employing ensemble learning paradigms, is empirically studied. Experiments on Bagged/Boosted J4.8 decision trees with and without pruning show that enlarging the training set tends to improve the benefit from Boosting but does not significantly impact the benefit from Bagging. This phenomenon is then explained from the view of bias-variance reduction. Moreover, it is shown that even for Boosting, the benefit does not always increase consistently with the training set size, since single learners may sometimes learn relatively more than ensembles do from additional, randomly provided training data. Furthermore, it is observed that the benefit from ensembles of unpruned decision trees is usually bigger than that from ensembles of pruned decision trees. This phenomenon is then explained from the view of error-ambiguity balance.





In this paper, a new variant of Bagging named DepenBag is proposed. The algorithm first obtains bootstrap samples. Then, it employs a causal discoverer to induce from each sample a dependency model expressed as a Directed Acyclic Graph (DAG). The attributes without connections to the class attribute in all the DAGs are then removed. Finally, a component learner is trained from each of the resulting samples to constitute the ensemble. An empirical study shows that DepenBag is effective in building ensembles of nearest neighbor classifiers.
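The steps listed in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the causal discoverer is passed in as a hypothetical callable `discover_dag`, a "connection" is assumed to mean a direct edge to or from the class attribute, and training of the component learners is left out.

```python
import random
from typing import Callable, Sequence

Edge = tuple[str, str]        # directed edge (parent, child) in a DAG
Row = dict[str, object]       # one example: attribute name -> value


def bootstrap(rows: Sequence[Row], rng: random.Random) -> list[Row]:
    """Draw len(rows) examples with replacement."""
    return [rng.choice(rows) for _ in rows]


def attributes_linked_to_class(dags: Sequence[set[Edge]],
                               class_attr: str) -> set[str]:
    """Attributes sharing an edge with the class in at least one DAG."""
    linked: set[str] = set()
    for edges in dags:
        for parent, child in edges:
            if parent == class_attr:
                linked.add(child)
            elif child == class_attr:
                linked.add(parent)
    return linked


def depenbag_samples(rows: Sequence[Row],
                     class_attr: str,
                     discover_dag: Callable[[list[Row]], set[Edge]],
                     n_samples: int,
                     seed: int = 0) -> list[list[Row]]:
    """Bootstrap, induce one DAG per sample, then drop unlinked attributes."""
    rng = random.Random(seed)
    samples = [bootstrap(rows, rng) for _ in range(n_samples)]
    # discover_dag is a hypothetical stand-in for the causal discoverer.
    dags = [discover_dag(sample) for sample in samples]
    keep = attributes_linked_to_class(dags, class_attr) | {class_attr}
    return [[{k: v for k, v in row.items() if k in keep} for row in sample]
            for sample in samples]
```

Each returned sample would then be handed to a component learner, for example a nearest neighbor classifier, to form the ensemble.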





This paper evaluates six commonly available part-of-speech tagging tools over corpora other than those upon which they were originally trained. In particular, this investigation measures the performance of the selected tools over varying styles and genres of text without retraining, under the assumption that domain-specific training data is not always available. An investigation is performed to determine whether improved results can be achieved by combining the set of tagging tools into ensembles that use voting schemes to determine the best tag for each word. It is found that while accuracy drops due to non-domain-specific training and tag-mapping between corpora, accuracy remains very high, with the support vector machine-based tagger and the decision tree-based tagger performing best over different corpora. It is also found that an ensemble containing a support vector machine-based tagger, a probabilistic tagger, a decision tree-based tagger and a rule-based tagger produces the largest increase in accuracy and the largest reduction in error across different corpora, using the Precision-Recall voting scheme.
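To make the idea of combining taggers by voting concrete, here is a minimal sketch of plain majority voting over aligned tag sequences. It is a simpler stand-in and does not implement the Precision-Recall voting scheme named in the abstract, whose details are not given here; the tie-breaking rule (earliest tagger wins) is an assumption.

```python
from collections import Counter
from typing import Sequence


def majority_vote(tags: Sequence[str]) -> str:
    """Return the most frequent tag; ties go to the earliest tagger's choice."""
    if not tags:
        raise ValueError("no tags supplied")
    counts = Counter(tags)
    best = max(counts.values())
    # Scan in tagger order so that ties are resolved deterministically.
    return next(tag for tag in tags if counts[tag] == best)


def ensemble_tag(per_tagger_output: Sequence[Sequence[str]]) -> list[str]:
    """Combine aligned tag sequences (one per tagger) word by word."""
    return [majority_vote(word_tags) for word_tags in zip(*per_tagger_output)]
```

The sketch assumes all taggers already share one tag set; in practice the tag-mapping between corpora mentioned in the abstract would have to be applied first.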







The divergent syntheses of 2-(selenophen-2-yl)pyrroles and their N-vinyl derivatives from available 2-acylselenophenes and acetylenes in a one-pot procedure make these exotic heterocyclic ensembles accessible. This opens a potentially vast area for exploration with far-reaching consequences, including conducting electrochromic polymers with repeating pyrrole and selenophene units (emerging rivals to polypyrroles and polyselenophenes), the synthesis of functionalized pyrrole–selenophene assemblies for advanced materials, biochemistry and medicine, and exciting models for the theory of polymer conductivity.





This study highlights the sensitivity of capital structure determinants in each sector within the ensembles of Malaysian listed companies. Based on pooled OLS, fixed effects and Generalized Method of Moments analysis, the findings reveal that capital structure determinants vary across sectors owing to each sector's nature or characteristics.





This article is devoted to large multi-tier ensemble classifiers generated as ensembles of ensembles and applied to phishing websites. Our new ensemble construction is a special case of the general and productive multi-tier approach well known in information security. Many efficient multi-tier classifiers have been considered in the literature. Our new contribution is in generating new large systems as ensembles of ensembles by linking a top-tier ensemble to another middle-tier ensemble instead of a base classifier so that the top-tier ensemble can generate the whole system. This automatic generation capability includes many large ensemble classifiers in two tiers simultaneously and automatically combines them into one hierarchical unified system so that one ensemble is an integral part of another one. This new construction makes it easy to set up and run such large systems. The present article concentrates on the investigation of performance of these new multi-tier ensembles for the example of detection of phishing websites. We carried out systematic experiments evaluating several essential ensemble techniques as well as more recent approaches and studying their performance as parts of multi-level ensembles with three tiers. The results presented here demonstrate that new three-tier ensemble classifiers performed better than the base classifiers and standard ensembles included in the system. This example of application to the classification of phishing websites shows that the new method of combining diverse ensemble techniques into a unified hierarchical three-tier ensemble can be applied to increase the performance of classifiers in situations where data can be processed on a large computer.





This paper is devoted to multi-tier ensemble classifiers for the detection and filtering of phishing emails. We introduce a new construction of ensemble classifiers, based on the well known and productive multi-tier approach. Our experiments evaluate their performance for the detection and filtering of phishing emails. The multi-tier constructions are well known and have been used to design effective classifiers for email classification and other applications previously. We investigate new multi-tier ensemble classifiers, where diverse ensemble methods are combined in a unified system by incorporating different ensembles at a lower tier as an integral part of another ensemble at the top tier. Our novel contribution is to investigate the possibility and effectiveness of combining diverse ensemble methods into one large multi-tier ensemble for the example of detection and filtering of phishing emails. Our study handled a few essential ensemble methods and more recent approaches incorporated into a combined multi-tier ensemble classifier. The results show that new large multi-tier ensemble classifiers achieved better performance compared with the outcomes of the base classifiers and ensemble classifiers incorporated in the multi-tier system. This demonstrates that the new method of combining diverse ensembles into one unified multi-tier ensemble can be applied to increase the performance of classifiers if diverse ensembles are incorporated in the system.





This paper is devoted to an empirical investigation of novel multi-level ensemble meta classifiers for the detection and monitoring of progression of cardiac autonomic neuropathy (CAN) in diabetes patients. Our experiments relied on an extensive database and concentrated on ensembles of ensembles, or multi-level meta classifiers, for the classification of cardiac autonomic neuropathy progression. First, we carried out a thorough investigation comparing the performance of various base classifiers for several known sets of the most essential features in this database and determined that Random Forest significantly and consistently outperforms all other base classifiers in this new application. Second, we used feature selection and ranking implemented in Random Forest. It was able to identify a new set of features, which has turned out to be better than all other sets previously considered for this large and well-known database. Random Forest remained the very best classifier for the new set of features too. Third, we investigated meta classifiers and new multi-level meta classifiers based on Random Forest, which have improved its performance. The results obtained show that novel multi-level meta classifiers achieved further improvement and obtained new outcomes that are significantly better than the outcomes previously published in the literature for cardiac autonomic neuropathy.







This paper examines and analyzes different aggregation algorithms to improve the accuracy of forecasts obtained using neural network (NN) ensembles. These algorithms include equal-weights combination of the best NN models, combination of trimmed forecasts, and Bayesian Model Averaging (BMA). The predictive performance of these algorithms is evaluated using Australian electricity demand data. The outputs of the aggregation algorithms of NN ensembles are compared with a naive approach. Mean absolute percentage error is applied as the performance index for assessing the quality of aggregated forecasts. Through comprehensive simulations, it is found that the aggregation algorithms can significantly improve forecasting accuracy. The BMA algorithm also demonstrates the best performance amongst the aggregation algorithms investigated in this study.
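Two of the aggregation rules named in this abstract, together with the error index, can be sketched as follows. This is an illustrative reading rather than the paper's procedure: how the "best" models are chosen (here, the lowest validation error) and how much is trimmed (here, a fixed count from each end) are assumptions, and all function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * mean(abs((a - f) / a) for a, f in zip(actual, forecast))


def equal_weight_best(forecasts: Sequence[float],
                      validation_errors: Sequence[float],
                      k: int) -> float:
    """Average the forecasts of the k models with the lowest validation error."""
    ranked = sorted(range(len(forecasts)), key=lambda i: validation_errors[i])
    return mean(forecasts[i] for i in ranked[:k])


def trimmed_mean(forecasts: Sequence[float], trim: int) -> float:
    """Drop the `trim` lowest and `trim` highest forecasts, then average."""
    if 2 * trim >= len(forecasts):
        raise ValueError("trim removes every forecast")
    ordered = sorted(forecasts)
    return mean(ordered[trim:len(ordered) - trim])
```

Both combiners take the forecasts of all ensemble members for a single time step; applying them step by step yields an aggregated series that `mape` can score against the observed demand.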





Creating a set of neural network (NN) models in an ensemble and combining them can achieve better overview capability than a single neural network. Neural network ensembles are designed to provide solutions to particular problems. Many researchers and academics have adopted the NN ensemble technique, especially in machine learning, and it has been applied in various fields of engineering, medicine and information technology. This paper presents a robust aggregation methodology for load demand forecasting based on Bayesian Model Averaging (BMA) of a set of neural network models in an ensemble. The paper estimates a vector of coefficients for the individual NN models' forecasts using a validation data set. These coefficients, also known as weights, are equal to the posterior probabilities of the models generating the forecasts. The BMA weights are then used to combine the forecasts generated by the NN models on the test data set. Comparing the Bayesian results with the simple averaging method shows that benefits are obtained by using an advanced method like BMA for forecast combinations.
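One way to picture weights that equal posterior model probabilities is the sketch below. It rests on assumptions that the abstract does not state: equal prior probability for every model, Gaussian forecast errors, and a per-model error variance estimated by the validation mean squared error. Under those assumptions the posterior is proportional to the validation likelihood. The function names are hypothetical, and the paper's actual estimation procedure may differ.

```python
import math
from typing import Sequence


def bma_weights(actual: Sequence[float],
                model_forecasts: Sequence[Sequence[float]]) -> list[float]:
    """Posterior model probabilities from validation data (assumed Gaussian errors)."""
    n = len(actual)
    log_liks = []
    for forecasts in model_forecasts:
        # Validation MSE serves as the variance estimate; it must be nonzero.
        mse = sum((a - f) ** 2 for a, f in zip(actual, forecasts)) / n
        log_liks.append(-0.5 * n * (math.log(2.0 * math.pi * mse) + 1.0))
    # Normalise in log space to avoid overflow or underflow.
    top = max(log_liks)
    unnorm = [math.exp(ll - top) for ll in log_liks]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def combine(weights: Sequence[float], test_forecasts: Sequence[float]) -> float:
    """Weighted combination of the members' forecasts for one test point."""
    return sum(w * f for w, f in zip(weights, test_forecasts))
```

Simple averaging corresponds to fixing every weight at 1/M for M models, so the comparison in the abstract amounts to replacing those fixed weights with the data-driven ones.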