809 results for Rule principles
Abstract:
In a world where massive amounts of data are recorded on a large scale, we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. One such alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale up results of a first prototype implementation.
Abstract:
In a world where data is captured on a large scale, the major challenge for data mining algorithms is to be able to scale up to large datasets. There are two main approaches to inducing classification rules: one is the divide and conquer approach, also known as the top down induction of decision trees; the other is the separate and conquer approach. A considerable amount of work has been done on scaling up the divide and conquer approach. However, very little work has been conducted on scaling up the separate and conquer approach. In this work we describe a parallel framework that allows the parallelisation of a certain family of separate and conquer algorithms, the Prism family. Parallelisation helps the Prism family of algorithms to harvest additional computer resources in a network of computers in order to make the induction of classification rules scale better on large datasets. Our framework also incorporates a pre-pruning facility for parallel Prism algorithms.
Abstract:
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules, which are qualitatively better than the rules induced by TDIDT. However, along with the increasing size of databases, many existing rule learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
Abstract:
Induction of classification rules is one of the most important technologies in data mining. Most of the work in this field has concentrated on the Top Down Induction of Decision Trees (TDIDT) approach. However, alternative approaches have been developed such as the Prism algorithm for inducing modular rules. Prism often produces qualitatively better rules than TDIDT but suffers from higher computational requirements. We investigate approaches that have been developed to minimize the computational requirements of TDIDT, in order to find analogous approaches that could reduce the computational requirements of Prism.
Abstract:
The fast increase in the size and number of databases demands data mining approaches that are scalable to large amounts of data. This has led to the exploration of parallel computing technologies in order to perform data mining tasks concurrently using several processors. Parallelization seems to be a natural and cost-effective way to scale up data mining technologies. One of the most important of these data mining technologies is the classification of newly recorded data. This paper surveys advances in parallelization in the field of classification rule induction.
Abstract:
Advances in hardware and software in the past decade allow the capture, recording and processing of fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances in order to cope with the real time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data of continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time as soon as it is captured. This is the case, for example, when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new and removing old rules. It differs from the more popular decision tree based classifiers in that it tends to leave data instances unclassified rather than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
Abstract:
We analyze the choice between the origin and destination principles of taxation when there is product differentiation and Bertrand competition. If taxes are redistributed to consumers and demand is linear the origin principle dominates the destination principle whatever the degree of product differentiation and extent of economic integration. With nonlinear demand the origin principle dominates if there is sufficient economic integration. When the social value assigned to tax revenue is higher than the private value, the destination principle dominates for intermediate values of product differentiation and economic integration. The same results are also shown to hold with Cournot competition.
Abstract:
Purpose: The article examines principles of Fair Trade in public procurement in Europe, focusing on legal dimensions related to the European Public Procurement Directives. Design/methodology/approach: The article situates public procurement of Fair Trade products in relation to the rise of non-state regulatory initiatives, highlighting how they have entered into new governance dynamics in the public sector and play a part in changing practices in sustainable procurement. A review of the legal position on Fair Trade in procurement law is informed by academic research and campaigning experience from the Fair Trade Advocacy Office. Findings: Key findings are that the introduction of Fair Trade products into European public procurement has been marked by legal ambiguity, having developed outside comprehensive policy or legal guidelines. Following a 2012 ruling by the Court of Justice of the European Union, it is suggested that the legal position for Fair Trade in procurement has become clearer, and that forthcoming change to the Public Procurement Directives may facilitate the uptake of Fair Trade products by public authorities. However, the potential for future expansion of the public sector ‘market’ for Fair Trade is approached with caution: purchasing Fair Trade products as a marker of sustainability, which started to be embedded within procurement practice in the 2000s, is challenged by current European public austerity measures. Research limitations/implications: Suggestions for future research include the need for systematic cross-institutional and multi-country comparison of the legal and governance dimensions of procurement practice with regard to Fair Trade. Practical implications: The article clarifies the current state of play with regard to legal aspects of Fair Trade in public procurement, which is of utility for policy and advocacy discussion. Originality/value: The article provides needed elaboration on an under-researched topic area, of value to academia and policy makers.
Abstract:
Objective: To describe the training undertaken by pharmacists employed in a pharmacist-led information technology-based intervention study to reduce medication errors in primary care (PINCER Trial), evaluate pharmacists’ assessment of the training, and the time implications of undertaking the training. Methods: Six pharmacists received training, which included training on root cause analysis and educational outreach, to enable them to deliver the PINCER Trial intervention. This was evaluated using self-report questionnaires at the end of each training session. The time taken to complete each session was recorded. Data from the evaluation forms were entered onto a Microsoft Excel spreadsheet, independently checked and the summary of results further verified. Frequencies were calculated for responses to the three-point Likert scale questions. Free-text comments from the evaluation forms and pharmacists’ diaries were analysed thematically. Key findings: All six pharmacists received 22 hours of training over five sessions. In four out of the five sessions, the pharmacists who completed an evaluation form (27 out of 30 were completed) stated they were satisfied or very satisfied with the various elements of the training package. Analysis of free-text comments and the pharmacists’ diaries showed that the principles of root cause analysis and educational outreach were viewed as useful tools to help pharmacists conduct pharmaceutical interventions in both the study and other pharmacy roles that they undertook. The opportunity to undertake role play was a valuable part of the training received. Conclusions: Findings presented in this paper suggest that providing the PINCER pharmacists with training in root cause analysis and educational outreach contributed to the successful delivery of PINCER interventions and could potentially be utilised by other pharmacists based in general practice to deliver pharmaceutical interventions to improve patient safety.
Abstract:
We consider the problem of constructing balance dynamics for rapidly rotating fluid systems. It is argued that the conventional Rossby number expansion—namely expanding all variables in a series in Rossby number—is secular for all but the simplest flows. In particular, the higher-order terms in the expansion grow exponentially on average, and for moderate values of the Rossby number the expansion is, at best, useful only for times of the order of the doubling times of the instabilities of the underlying quasi-geostrophic dynamics. Similar arguments apply in a wide class of problems involving a small parameter and sufficiently complex zeroth-order dynamics. A modified procedure is proposed which involves expanding only the fast modes of the system; this is equivalent to an asymptotic approximation of the slaving relation that relates the fast modes to the slow modes. The procedure is systematic and thus capable, at least in principle, of being carried to any order—unlike procedures based on truncations. We apply the procedure to construct higher-order balance approximations of the shallow-water equations. At the lowest order quasi-geostrophy emerges. At the next order the system incorporates gradient-wind balance, although the balance relations themselves involve only linear inversions and hence are easily applied. There is a large class of reduced systems associated with various choices for the slow variables, but the simplest ones appear to be those based on potential vorticity.
Abstract:
We establish Maximum Principles which apply to vectorial approximate minimizers of the general integral functional of Calculus of Variations. Our main result is a version of the Convex Hull Property. The primary advance compared to results already existing in the literature is that we have dropped the quasiconvexity assumption of the integrand in the gradient term. The lack of weak lower semicontinuity is compensated by introducing a nonlinear convergence technique, based on the approximation of the projection onto a convex set by reflections and on the invariance of the integrand in the gradient term under the Orthogonal Group. Maximum Principles are implied for the relaxed solution in the case of non-existence of minimizers and for minimizing solutions of the Euler–Lagrange system of PDEs.
Shaming men, performing power: female authority in Zimbabwe and Tanzania on the eve of colonial rule