942 results for Rule-Based Classification


Relevance: 100.00%

Abstract:

Automatic generation of classification rules has become an increasingly popular technique in commercial applications such as Big Data analytics, rule-based expert systems and decision-making systems. However, a principal problem with most methods for generating classification rules is overfitting of the training data. With Big Data, this may result in a large number of complex rules, which may not only increase computational cost but also lower the accuracy in predicting unseen instances. This has led to the need for pruning methods that simplify rules. In addition, once generated, classification rules are used to make predictions. For efficiency, the first rule that fires should be found as quickly as possible when searching through a rule set, so a suitable structure is required to represent the rule set effectively. In this chapter, the authors introduce a unified framework for the construction of rule-based classification systems consisting of three operations on Big Data: rule generation, rule simplification and rule representation. The authors also review some existing methods and techniques used for each of the three operations and highlight their limitations. They introduce some novel methods and techniques that they developed recently, and discuss these in comparison with existing ones with respect to efficient processing of Big Data.
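
As an illustration of the "first rule that fires" idea mentioned in the abstract above, the following is a minimal sketch that assumes the rule set is stored as a plain ordered list. The names `Rule`, `classify` and the example attribute `temperature` are hypothetical and are not taken from the chapter, which argues for more suitable representations than a linear list.

```python
from typing import Callable, Dict, List, Optional, Tuple

Instance = Dict[str, float]
# A rule is a (condition, class label) pair; this layout is an assumption.
Rule = Tuple[Callable[[Instance], bool], str]


def classify(rules: List[Rule], instance: Instance,
             default: Optional[str] = None) -> Optional[str]:
    """Return the label of the first rule whose condition holds."""
    for condition, label in rules:
        if condition(instance):
            return label  # stop at the first rule that fires
    return default  # no rule fired


# Hypothetical rule set, checked in order.
rules: List[Rule] = [
    (lambda x: x["temperature"] > 38.0, "fever"),
    (lambda x: x["temperature"] < 35.0, "hypothermia"),
]

print(classify(rules, {"temperature": 39.0}, default="normal"))
print(classify(rules, {"temperature": 36.5}, default="normal"))
```

With a linear list, the worst case checks every rule before falling back to the default, which is the cost that a better rule representation would aim to reduce.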

Relevance: 100.00%

Abstract:

Advances in hardware and software technologies allow streaming data to be captured. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as they are generated in real time. Data stream classification is one of the most important DSM techniques, allowing previously unseen data instances to be classified. Unlike traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules, which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as a heuristic for building rule terms on continuous attributes. Using eRules as an example, we show that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We term this new version of eRules, extended with our approach, G-eRules.
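
The abstract above does not spell out how the Gaussian distribution is turned into a rule term, so the following is only a rough sketch under a stated assumption: the attribute values observed for the target class are summarised by their mean and standard deviation, and the rule term is an interval around the mean. The function names and the width parameter `k` are hypothetical and should not be read as the G-eRules procedure.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def gaussian_rule_term(values: Sequence[float], k: float = 1.0) -> Tuple[float, float]:
    """Build an interval (lower, upper) = (mu - k*sigma, mu + k*sigma).

    `values` are the attribute values seen for the target class;
    at least two values are needed to estimate the standard deviation.
    """
    mu = mean(values)
    sigma = stdev(values)
    return mu - k * sigma, mu + k * sigma


def covers(term: Tuple[float, float], x: float) -> bool:
    """Check whether value x satisfies the rule term lower < x < upper."""
    lower, upper = term
    return lower < x < upper


# Hypothetical attribute values for one class.
term = gaussian_rule_term([4.8, 4.9, 5.0, 5.1, 5.2])
print(term, covers(term, 5.0), covers(term, 6.0))
```

The appeal of this kind of heuristic is that a single pass over the values yields both bounds, rather than evaluating many candidate split points on the attribute.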

Relevance: 100.00%

Abstract:

In this paper, we propose a rule pruning strategy to reduce the number of rules in a fuzzy rule-based classification system. A confidence factor, formulated from the compatibility of the rules with the input patterns, is deployed for rule pruning. The pruning strategy aims to reduce the complexity of the fuzzy classification system while maintaining the accuracy rate at a good level. To evaluate the effectiveness of the pruning strategy, two benchmark data sets are first tested. Then, a fault classification problem with real sensor measurements collected from a power generation plant is evaluated. The results obtained are analyzed and explained, and the implications of the proposed rule pruning strategy for the fuzzy classification system are discussed.
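
The abstract above does not give the formula for the confidence factor. The sketch below assumes one common definition from the fuzzy classification literature: the summed compatibility of a rule with the patterns of its own class, divided by its summed compatibility with all patterns. The class `FuzzyRule`, the triangular membership functions and the threshold value are illustrative assumptions, not the authors' method.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Membership = Callable[[float], float]


def triangular(a: float, b: float, c: float) -> Membership:
    """Triangular membership function with feet at a and c and peak at b."""
    def mu(x: float) -> float:
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


@dataclass
class FuzzyRule:
    antecedents: List[Membership]  # one membership function per attribute
    label: str

    def compatibility(self, pattern: Sequence[float]) -> float:
        """Product of the membership grades over all attributes."""
        grade = 1.0
        for mu, x in zip(self.antecedents, pattern):
            grade *= mu(x)
        return grade


def confidence(rule: FuzzyRule, patterns, labels) -> float:
    """Share of the rule's total compatibility that falls on its own class."""
    total = sum(rule.compatibility(p) for p in patterns)
    if total == 0.0:
        return 0.0
    own = sum(rule.compatibility(p) for p, y in zip(patterns, labels) if y == rule.label)
    return own / total


def prune(rules, patterns, labels, threshold: float = 0.5):
    """Keep only rules whose confidence reaches the (assumed) threshold."""
    return [r for r in rules if confidence(r, patterns, labels) >= threshold]


patterns = [[0.1], [0.2], [0.8], [0.9]]
labels = ["low", "low", "high", "high"]
low = triangular(-0.5, 0.0, 0.5)
rules = [FuzzyRule([low], "low"), FuzzyRule([low], "high")]
kept = prune(rules, patterns, labels)
print([r.label for r in kept])
```

In this toy data the second rule pairs a "low" antecedent with the "high" class, so none of its compatibility falls on its own class and it is pruned.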

Relevance: 100.00%

Abstract:

A rule-based approach for classifying previously identified medical concepts in clinical free text into an assertion category is presented. There are six categories of assertion for the task: Present, Absent, Possible, Conditional, Hypothetical and Not associated with the patient. The assertion classification algorithms were largely based on extending the popular NegEx and Context algorithms. In addition, a health-based clinical terminology called SNOMED CT and other publicly available dictionaries were used to classify assertions that did not fit the NegEx/Context model. The data for this task include discharge summaries from Partners HealthCare and from Beth Israel Deaconess Medical Centre, as well as discharge summaries and progress notes from University of Pittsburgh Medical Centre. The set consists of 349 discharge reports, each with pairs of ground truth concept and assertion files for system development, and 477 reports for evaluation. The system’s performance on the evaluation data set was 0.83, 0.83 and 0.83 for recall, precision and F1-measure, respectively. Although the rule-based system shows promise, further improvements can be made by incorporating machine learning approaches.
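
To give a feel for trigger-phrase assertion rules of the kind the abstract above builds on, here is a heavily simplified sketch. It is not NegEx or Context: the trigger lists, the five-token window and the function `classify_assertion` are hypothetical, only triggers that precede the concept are considered, and only three of the six categories are covered.

```python
import re
from typing import List

# Hypothetical trigger words; real systems use much larger curated lists.
ABSENT_TRIGGERS = ("no", "denies", "without")
POSSIBLE_TRIGGERS = ("possible", "probable", "suspected")
WINDOW = 5  # assumed number of tokens inspected before the concept


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify_assertion(sentence: str, concept: str) -> str:
    """Assign Absent, Possible or Present to a concept found in a sentence."""
    tokens = tokenize(sentence)
    target = tokenize(concept)
    n = len(target)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == target:
            break
    else:
        raise ValueError("concept not found in sentence")
    preceding = tokens[max(0, start - WINDOW):start]
    if any(t in preceding for t in ABSENT_TRIGGERS):
        return "Absent"
    if any(t in preceding for t in POSSIBLE_TRIGGERS):
        return "Possible"
    return "Present"


print(classify_assertion("The patient denies chest pain.", "chest pain"))
print(classify_assertion("Possible pneumonia on chest x-ray.", "pneumonia"))
print(classify_assertion("Patient has fever.", "fever"))
```

Cases that such surface rules miss, for example conditions attributed to a family member, are the kind for which the abstract reports using terminology resources and suggests machine learning.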

Relevance: 100.00%

Abstract:

Several websites utilise a rule-based recommendation system, which generates choices based on a series of questionnaires, for recommending products to users. This approach carries a high risk of customer attrition, and the bottleneck is the questionnaire set. If the questioning process is too long, complex or tedious, users are likely to quit the questionnaire before a product is recommended to them. If the questioning process is too short, the user's intentions cannot be gathered. The commonly used feature selection methods do not provide a satisfactory solution. We propose a novel process combining clustering, decision trees and association rule mining for group-oriented question reduction. The question set is reduced according to common properties shared by a specific group of users. When applied to a real-world website, the proposed combined method outperforms methods in which the questions are reduced using only association rule mining or only by observing the distribution within the group.

Relevance: 100.00%

Abstract:

Expert systems have become increasingly popular because of their commercial importance. A rule-based system is a special type of expert system, which consists of a set of ‘if-then’ rules and can be applied as a decision support system in many areas such as healthcare, transportation and security. Rule-based systems can be constructed from both expert knowledge and data. This paper aims to introduce the theory of rule-based systems, especially the categorization and construction of such systems, from a conceptual point of view. It also introduces rule-based systems for classification tasks in detail.

Relevance: 100.00%

Abstract:

The use of a fuzzy aspect within data analysis attempts to move from a quantitative to a more qualitative investigative environment. As such, it may give less quantitatively oriented researchers results they can use, based on sets of linguistic terms. In this paper, an inductive fuzzy decision tree approach is used to construct a fuzzy rule-based system for the first time in a biological setting. The specific biological problem considered is to identify the antecedents (conditions in the fuzzy decision rules) that characterize the length of the song flight of the male sedge warbler when attempting to attract a mate. Hence, for a non-quantitative investigator, the resulting set of fuzzy rules gives insight into the linguistic interpretation of the relationship between the associated characteristics and the respective song flight duration.

Relevance: 100.00%

Abstract:

The paper suggests a classification of dynamic rule-based systems. For each class of systems, limit behavior is studied. Systems with stabilizing limit states or stabilizing limit trajectories are identified, and such states and trajectories are found. The structure of the set of limit states and trajectories is investigated.

Relevance: 100.00%

Abstract:

The effective management of bridge stock involves making decisions as to when to repair, remedy, or do nothing, taking into account the financial and service life implications. Such decisions require a reliable diagnosis of the cause of distress and an understanding of the likely future degradation. Such diagnoses are based on a combination of visual inspections, laboratory tests on samples and expert opinions. In addition, the choice of appropriate laboratory tests requires an understanding of the degradation mechanisms involved. Under these circumstances, the use of expert systems or evaluation tools developed from “realtime” case studies provides a promising solution in the absence of expert knowledge. This paper addresses the issues in bridge infrastructure management in Queensland, Australia. Bridges affected by alkali silica reaction and chloride-induced corrosion have been investigated and the results presented using a mind mapping tool. The analysis highlights that several levels of rules are required to assess the mechanism causing distress. The systematic development of a rule-based approach is presented. An example of its application to a case study bridge is used to demonstrate that preliminary results are satisfactory.

Relevance: 100.00%

Abstract:

One of the key issues facing public asset owners is the decision whether to refurbish aged built assets. This decision requires an assessment of the “remaining service life” of the key components in a building. The remaining service life depends significantly on the existing condition of the asset and on future degradation patterns, considering durability and functional obsolescence. Recently developed methods for Residual Service Life modelling require sophisticated data that are not readily available. Most of the data available are in the form of reports prepared prior to undertaking major repairs or in the form of sessional audit reports. Valuable information from these available sources can serve as benchmarks for estimating the reference service life. The authors have acquired such information from a public asset building in Melbourne. Using this information, the residual service life of a case study building façade has been estimated in this paper based on state-of-the-art approaches. These estimates have been evaluated against expert opinion. Though the results are encouraging, it is clear that the state-of-the-art methodologies can only provide meaningful estimates if data of sufficient level and quality are available. This investigation resulted in the development of a new framework for maintenance that integrates the condition assessment procedures and the factors influencing residual service life.