62 results for rules


Relevance: 20.00%

Abstract:

Inducing rules from very large datasets is one of the most challenging areas in data mining. Several approaches exist to scaling up classification rule induction to large datasets, namely data reduction and the parallelisation of classification rule induction algorithms. In the area of parallelisation of classification rule induction algorithms, most of the work has concentrated on the Top Down Induction of Decision Trees (TDIDT), also known as the ‘divide and conquer’ approach. However, powerful alternative algorithms exist that induce modular rules. Most of these alternative algorithms follow the ‘separate and conquer’ approach of inducing rules, but very little work has been done to make the ‘separate and conquer’ approach scale better on large training data. This paper examines the potential of the recently developed blackboard-based J-PMCRI methodology for parallelising modular classification rule induction algorithms that follow the ‘separate and conquer’ approach. A concrete implementation of the methodology is evaluated empirically on very large datasets.

Relevance: 20.00%

Abstract:

The Prism family of algorithms induces modular classification rules which, in contrast to decision tree induction algorithms, do not necessarily fit together into a decision tree structure. Classifiers induced by Prism algorithms achieve accuracy comparable to that of decision trees and in some cases even outperform them. Both kinds of algorithms tend to overfit on large and noisy datasets, and this has led to the development of pruning methods. Pruning methods use various metrics to truncate decision trees or to eliminate whole rules or single rule terms from a Prism rule set. For decision trees many pre-pruning and post-pruning methods exist; for Prism algorithms, however, only one pre-pruning method has been developed, J-pruning. Recent work with Prism algorithms examined J-pruning in the context of very large datasets and found that the current method does not use its full potential. This paper revisits the J-pruning method for the Prism family of algorithms, develops a new pruning method, Jmax-pruning, discusses it in theoretical terms and evaluates it empirically.

Relevance: 20.00%

Abstract:

The Prism family of algorithms induces modular classification rules, in contrast to the Top Down Induction of Decision Trees (TDIDT) approach, which induces classification rules in the intermediate form of a tree structure. Both approaches achieve a comparable classification accuracy. However, in some cases Prism outperforms TDIDT. For both approaches pre-pruning facilities have been developed in order to prevent the induced classifiers from overfitting on noisy datasets, by cutting rule terms or whole rules or by truncating decision trees according to certain metrics. Many pre-pruning mechanisms have been developed for the TDIDT approach, but for the Prism family the only existing pre-pruning facility is J-pruning. J-pruning works not only on Prism algorithms but also on TDIDT. Although it has been shown that J-pruning produces good results, this work points out that J-pruning does not use its full potential. The original J-pruning facility is examined and the use of a new pre-pruning facility, called Jmax-pruning, is proposed and evaluated empirically. A possible pre-pruning facility for TDIDT based on Jmax-pruning is also discussed.

Relevance: 20.00%

Abstract:

In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classification rule induction, parallelisation of classification rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classification rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classification rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.
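As a rough illustration of the separate and conquer approach mentioned in this abstract, the following sketch induces rules for a single class on categorical data. It is a simplified, sequential reading of the basic Prism idea, not the parallel framework the abstract describes; the function name `induce_rules_for_class`, the toy dataset and the tie-breaking by coverage are all assumptions made for the example.

```python
from collections import Counter


def induce_rules_for_class(rows, target, label):
    """Separate and conquer for one class (hypothetical helper).

    rows: list of dicts mapping attribute name -> categorical value,
    including the class attribute `target`.
    Returns a list of rules; each rule is a dict attribute -> value.
    """
    remaining = list(rows)
    rules = []
    while any(r[target] == label for r in remaining):
        covered = remaining
        rule = {}
        # Conquer: add rule terms until only `label` instances are
        # covered or no unused attribute is left.
        while any(r[target] != label for r in covered):
            totals, hits = Counter(), Counter()
            for r in covered:
                for attr, value in r.items():
                    if attr == target or attr in rule:
                        continue
                    totals[(attr, value)] += 1
                    if r[target] == label:
                        hits[(attr, value)] += 1
            best = None  # (precision, coverage, attribute, value)
            for (attr, value), n in totals.items():
                h = hits[(attr, value)]
                candidate = (h / n, h, attr, value)
                if best is None or candidate[:2] > best[:2]:
                    best = candidate
            if best is None:
                break  # attributes exhausted; accept an impure rule
            _, _, attr, value = best
            rule[attr] = value
            covered = [r for r in covered if r[attr] == value]
        rules.append(rule)
        # Separate: remove the instances covered by the new rule.
        covered_ids = {id(r) for r in covered}
        remaining = [r for r in remaining if id(r) not in covered_ids]
    return rules


# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]
rules = induce_rules_for_class(rows, "play", "yes")
print(rules)
```

Each rule is built independently of the others, which is why the resulting rule set need not fit into a single tree structure.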

Relevance: 20.00%

Abstract:

There are several scoring rules that one can choose from in order to score probabilistic forecasting models or estimate model parameters. Whilst it is generally agreed that proper scoring rules are preferable, there is no clear criterion for preferring one proper scoring rule above another. This manuscript compares and contrasts some commonly used proper scoring rules and provides guidance on scoring rule selection. In particular, it is shown that the logarithmic scoring rule prefers erring with more uncertainty, the spherical scoring rule prefers erring with lower uncertainty, whereas the other scoring rules are indifferent to either option.
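To make the scoring rules named in this abstract concrete, here is a minimal sketch of the logarithmic, quadratic (Brier-type) and spherical scores for a categorical forecast, in the positively oriented convention where higher is better. The function names, the example probabilities and the choice of convention are assumptions for illustration; the manuscript's own notation may differ.

```python
import math


def log_score(p, i):
    """Logarithmic score: log of the probability given to outcome i."""
    return math.log(p[i])


def quadratic_score(p, i):
    """Quadratic (Brier-type) score, positively oriented."""
    return 2 * p[i] - sum(q * q for q in p)


def spherical_score(p, i):
    """Spherical score: p[i] divided by the Euclidean norm of p."""
    return p[i] / math.sqrt(sum(q * q for q in p))


def expected_score(score, forecast, truth):
    """Expected score of `forecast` when outcomes follow `truth`."""
    return sum(t * score(forecast, i) for i, t in enumerate(truth) if t > 0)


# Hypothetical example: a true distribution and two imperfect forecasts,
# one sharper and one more uncertain than the truth.
truth = [0.7, 0.3]
forecasts = {"truth": truth, "sharper": [0.9, 0.1], "wider": [0.5, 0.5]}
for name, score in [("log", log_score),
                    ("quadratic", quadratic_score),
                    ("spherical", spherical_score)]:
    for label, f in forecasts.items():
        print(name, label, round(expected_score(score, f, truth), 4))
```

All three rules are proper, so in expectation each gives its best value to the forecast equal to the true distribution. Comparing how much each rule penalises the sharper and the wider forecast gives a feel for the kind of preference the abstract discusses.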

Relevance: 20.00%

Abstract:

Infrared polarization and intensity imagery provide complementary and discriminative information in image understanding and interpretation. In this paper, a novel fusion method is proposed that merges this information using various combination rules. It makes use of both low-frequency and high-frequency image components from the support value transform (SVT), and applies fuzzy logic in the combination process. The images to be fused (both infrared polarization and intensity images) are first decomposed into low-frequency component images and support value image sequences by the SVT. Then the low-frequency component images are combined using a fuzzy combination rule blending three sub-combination methods: (1) region feature maximum, (2) region feature weighting average, and (3) pixel value maximum; and the support value image sequences are merged using a fuzzy combination rule fusing two sub-combination methods: (1) pixel energy maximum and (2) region feature weighting. Using two newly defined features as variables, i.e. the low-frequency difference feature for low-frequency component images and the support-value difference feature for support value image sequences, trapezoidal membership functions are proposed and developed for tuning the fuzzy fusion process. Finally, the fused image is obtained by inverse SVT operations. Experimental results of visual inspection and quantitative evaluation both indicate the superiority of the proposed method over its counterparts in image fusion of infrared polarization and intensity images.
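The abstract does not give the membership functions or the blending formula, so the following is only a generic sketch of how a trapezoidal membership function could weight two candidate combination results. The breakpoints `a, b, c, d`, the function names and the linear blend are assumptions, not the paper's definitions.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), rising on [a, b],
    1 on [b, c], falling on [c, d]. Assumes a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def fuzzy_blend(feature, result_a, result_b, a, b, c, d):
    """Blend two sub-combination results with a fuzzy weight.

    `feature` stands in for a difference feature at one pixel;
    `result_a` and `result_b` are the values two sub-combination
    methods would produce there (all hypothetical inputs).
    """
    w = trapezoid(feature, a, b, c, d)
    return w * result_a + (1.0 - w) * result_b


# Hypothetical usage with made-up breakpoints and pixel values.
print(fuzzy_blend(0.5, 200.0, 100.0, 0.0, 1.0, 2.0, 3.0))
```

The membership value acts as a soft switch: where the feature falls in the flat top of the trapezoid the first result is used as is, and outside its support the second result takes over.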

Relevance: 20.00%

Abstract:

Native-like use of preterit and imperfect morphology in all contexts by English learners of L2 Spanish is the exception rather than the rule, even for successful learners. Nevertheless, recent research has demonstrated that advanced English learners of L2 Spanish attain a native-like morphosyntactic competence for the preterit/imperfect contrast, as evidenced by their native-like knowledge of associated semantic entailments (Goodin-Mayeda and Rothman 2007, Montrul and Slabakova 2003, Slabakova and Montrul 2003, Rothman and Iverson 2007). In addition to an L2 disassociation of morphology and syntax (e.g., Bruhn de Garavito 2003, Lardiere 1998, 2000, 2005, Prévost and White 1999, 2000, Schwartz 2003), I hypothesize that a system of learned pedagogical rules contributes to target-deviant L2 performance in this domain through the most advanced stages of L2 acquisition via its competition with the generative system. I call this hypothesis the Competing Systems Hypothesis. To test its predictions, I compare and contrast the use of the preterit and imperfect in two production tasks by native, tutored (classroom), and naturalistic learners of L2 Spanish.

Relevance: 20.00%

Abstract:

The experience of learning and using a second language (L2) has been shown to affect the grey matter (GM) structure of the brain. Importantly, GM density in several cortical and subcortical areas has been shown to be related to performance in L2 tasks. Here we show that bilingualism can lead to increased GM volume in the cerebellum, a structure that has been related to the processing of grammatical rules. Additionally, the cerebellar GM volume of highly proficient L2 speakers is correlated with their performance in a task tapping grammatical processing in an L2, demonstrating the importance of the cerebellum for the establishment and use of grammatical rules in an L2.

Relevance: 20.00%

Abstract:

The financial crisis of 2007-2009 and the subsequent reaction of the G20 have created a new global regulatory landscape. Within the EU, change of regulatory institutions is ongoing. The research objective of this study is to understand how institutional changes to the EU regulatory landscape may affect corresponding institutionalized operational practices within financial organizations and to understand the role of agency within this process. Our motivation is to provide insight into these changes from an operational management perspective, as well as to test Thelen and Mahoney's (2010) modes of institutional change. Consequently, the study researched implementations of an Investment Management System with a rules-based compliance module within financial organizations. The research consulted compliance and risk managers, as well as systems experts. The study suggests that prescriptive regulations are likely to create isomorphic configurations of rules-based compliance systems, which consequently will enable the institutionalization of associated compliance practices. The study reveals the ability of some agents within financial organizations to control the impact of regulatory institutions, not directly, but through the systems and processes they adopt to meet requirements. Furthermore, the research highlights the boundaries and relationships between each mode of change as future avenues of research.

Relevance: 20.00%

Abstract:

Prism is a modular classification rule generation method based on the ‘separate and conquer’ approach, an alternative to the rule induction approach using decision trees, also known as ‘divide and conquer’. Prism often achieves a level of classification accuracy similar to that of decision trees, but tends to produce a more compact, noise-tolerant set of classification rules. As with other classification rule generation methods, a principal problem arising with Prism is that of overfitting due to over-specialised rules. In addition, over-specialised rules increase the associated computational complexity. These problems can be addressed by pruning methods. For the Prism method, two pruning algorithms have been introduced recently for reducing overfitting of classification rules: J-pruning and Jmax-pruning. Both algorithms are based on the J-measure, an information-theoretic means of quantifying the theoretical information content of a rule. Jmax-pruning attempts to exploit the J-measure to its full potential, because J-pruning does not actually achieve this and may even lead to underfitting. A series of experiments has shown that Jmax-pruning may outperform J-pruning in reducing overfitting. However, Jmax-pruning is computationally relatively expensive and may also lead to underfitting. This paper reviews the Prism method and the two existing pruning algorithms above. It also proposes a novel pruning algorithm called Jmid-pruning. The latter is based on the J-measure and reduces overfitting to a similar level as the other two algorithms, but is better at avoiding underfitting and unnecessary computational effort. The authors conduct an experimental study on the performance of the Jmid-pruning algorithm in terms of classification accuracy and computational efficiency. The algorithm is also evaluated comparatively with the J-pruning and Jmax-pruning algorithms.
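For readers unfamiliar with the J-measure referred to in this abstract, the sketch below computes it for a rule of the form IF Y = y THEN X = x, using the definition commonly attributed to Smyth and Goodman. The function name and the example probabilities are assumptions; the pruning algorithms themselves are not reproduced here.

```python
import math


def j_measure(p_y, p_x, p_x_given_y):
    """J-measure of a rule IF Y = y THEN X = x, in bits.

    p_y:         probability that the rule's left-hand side fires
    p_x:         prior probability of the right-hand side, 0 < p_x < 1
    p_x_given_y: probability of the right-hand side given the left
    """
    def term(posterior, prior):
        # By convention 0 * log(0) is taken as 0.
        if posterior == 0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    # Cross-entropy between posterior and prior of the class.
    j = term(p_x_given_y, p_x) + term(1 - p_x_given_y, 1 - p_x)
    # Weight by how often the rule applies.
    return p_y * j


# Hypothetical numbers: a rule that fires on 20% of the instances and
# raises the class probability from 0.5 to 0.9.
print(j_measure(0.2, 0.5, 0.9))
```

The value grows both with how often a rule fires and with how far it moves the class probability away from the prior, so a very specific rule that covers few instances can score lower than a more general one.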

Relevance: 20.00%

Abstract:

Despite an extensive market segmentation literature, applied academic studies which bridge segmentation theory and practice remain a priority for researchers. The need for studies which examine the segmentation implementation barriers faced by organisations is particularly acute. We explore segmentation implementation through the eyes of a European utilities business, by following its progress through a major segmentation project. The study reveals the character and impact of implementation barriers occurring at different stages in the segmentation process. By classifying the barriers, we develop implementation "rules" for practitioners which are designed to minimise their occurrence and impact. We further contribute to the literature by developing a deeper understanding of the mechanisms through which these implementation rules can be applied.