942 resultados para Rule-Based Classification


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Knowledge discovery in databases is the non-trivial process of identifying valid, novel potentially useful and ultimately understandable patterns from data. The term Data mining refers to the process which does the exploratory analysis on the data and builds some model on the data. To infer patterns from data, data mining involves different approaches like association rule mining, classification techniques or clustering techniques. Among the many data mining techniques, clustering plays a major role, since it helps to group the related data for assessing properties and drawing conclusions. Most of the clustering algorithms act on a dataset with uniform format, since the similarity or dissimilarity between the data points is a significant factor in finding out the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert different formats into a uniform format. The research study explores the various techniques to convert the mixed data sets to a numerical equivalent, so as to make it equipped for applying the statistical and similar algorithms. The results of clustering mixed category data after conversion to numeric data type have been demonstrated using a crime data set. The thesis also proposes an extension to the well known algorithm for handling mixed data types, to deal with data sets having only categorical data. The proposed conversion has been validated on a data set corresponding to breast cancer. Moreover, another issue with the clustering process is the visualization of output. Different geometric techniques like scatter plot, or projection plots are available, but none of the techniques display the result projecting the whole database but rather demonstrate attribute-pair wise analysis

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The country has witnessed tremendous increase in the vehicle population and increased axle loading pattern during the last decade, leaving its road network overstressed and leading to premature failure. The type of deterioration present in the pavement should be considered for determining whether it has a functional or structural deficiency, so that appropriate overlay type and design can be developed. Structural failure arises from the conditions that adversely affect the load carrying capability of the pavement structure. Inadequate thickness, cracking, distortion and disintegration cause structural deficiency. Functional deficiency arises when the pavement does not provide a smooth riding surface and comfort to the user. This can be due to poor surface friction and texture, hydro planning and splash from wheel path, rutting and excess surface distortion such as potholes, corrugation, faulting, blow up, settlement, heaves etc. Functional condition determines the level of service provided by the facility to its users at a particular time and also the Vehicle Operating Costs (VOC), thus influencing the national economy. Prediction of the pavement deterioration is helpful to assess the remaining effective service life (RSL) of the pavement structure on the basis of reduction in performance levels, and apply various alternative designs and rehabilitation strategies with a long range funding requirement for pavement preservation. In addition, they can predict the impact of treatment on the condition of the sections. The infrastructure prediction models can thus be classified into four groups, namely primary response models, structural performance models, functional performance models and damage models. The factors affecting the deterioration of the roads are very complex in nature and vary from place to place. Hence there is need to have a thorough study of the deterioration mechanism under varied climatic zones and soil conditions before arriving at a definite strategy of road improvement. Realizing the need for a detailed study involving all types of roads in the state with varying traffic and soil conditions, the present study has been attempted. This study attempts to identify the parameters that affect the performance of roads and to develop performance models suitable to Kerala conditions. A critical review of the various factors that contribute to the pavement performance has been presented based on the data collected from selected road stretches and also from five corporations of Kerala. These roads represent the urban conditions as well as National Highways, State Highways and Major District Roads in the sub urban and rural conditions. This research work is a pursuit towards a study of the road condition of Kerala with respect to varying soil, traffic and climatic conditions, periodic performance evaluation of selected roads of representative types and development of distress prediction models for roads of Kerala. In order to achieve this aim, the study is focused into 2 parts. The first part deals with the study of the pavement condition and subgrade soil properties of urban roads distributed in 5 Corporations of Kerala; namely Thiruvananthapuram, Kollam, Kochi, Thrissur and Kozhikode. From selected 44 roads, 68 homogeneous sections were studied. The data collected on the functional and structural condition of the surface include pavement distress in terms of cracks, potholes, rutting, raveling and pothole patching. The structural strength of the pavement was measured as rebound deflection using Benkelman Beam deflection studies. In order to collect the details of the pavement layers and find out the subgrade soil properties, trial pits were dug and the in-situ field density was found using the Sand Replacement Method. Laboratory investigations were carried out to find out the subgrade soil properties, soil classification, Atterberg limits, Optimum Moisture Content, Field Moisture Content and 4 days soaked CBR. The relative compaction in the field was also determined. The traffic details were also collected by conducting traffic volume count survey and axle load survey. From the data thus collected, the strength of the pavement was calculated which is a function of the layer coefficient and thickness and is represented as Structural Number (SN). This was further related to the CBR value of the soil and the Modified Structural Number (MSN) was found out. The condition of the pavement was represented in terms of the Pavement Condition Index (PCI) which is a function of the distress of the surface at the time of the investigation and calculated in the present study using deduct value method developed by U S Army Corps of Engineers. The influence of subgrade soil type and pavement condition on the relationship between MSN and rebound deflection was studied using appropriate plots for predominant types of soil and for classified value of Pavement Condition Index. The relationship will be helpful for practicing engineers to design the overlay thickness required for the pavement, without conducting the BBD test. Regression analysis using SPSS was done with various trials to find out the best fit relationship between the rebound deflection and CBR, and other soil properties for Gravel, Sand, Silt & Clay fractions. The second part of the study deals with periodic performance evaluation of selected road stretches representing National Highway (NH), State Highway (SH) and Major District Road (MDR), located in different geographical conditions and with varying traffic. 8 road sections divided into 15 homogeneous sections were selected for the study and 6 sets of continuous periodic data were collected. The periodic data collected include the functional and structural condition in terms of distress (pothole, pothole patch, cracks, rutting and raveling), skid resistance using a portable skid resistance pendulum, surface unevenness using Bump Integrator, texture depth using sand patch method and rebound deflection using Benkelman Beam. Baseline data of the study stretches were collected as one time data. Pavement history was obtained as secondary data. Pavement drainage characteristics were collected in terms of camber or cross slope using camber board (slope meter) for the carriage way and shoulders, availability of longitudinal side drain, presence of valley, terrain condition, soil moisture content, water table data, High Flood Level, rainfall data, land use and cross slope of the adjoining land. These data were used for finding out the drainage condition of the study stretches. Traffic studies were conducted, including classified volume count and axle load studies. From the field data thus collected, the progression of each parameter was plotted for all the study roads; and validated for their accuracy. Structural Number (SN) and Modified Structural Number (MSN) were calculated for the study stretches. Progression of the deflection, distress, unevenness, skid resistance and macro texture of the study roads were evaluated. Since the deterioration of the pavement is a complex phenomena contributed by all the above factors, pavement deterioration models were developed as non linear regression models, using SPSS with the periodic data collected for all the above road stretches. General models were developed for cracking progression, raveling progression, pothole progression and roughness progression using SPSS. A model for construction quality was also developed. Calibration of HDM–4 pavement deterioration models for local conditions was done using the data for Cracking, Raveling, Pothole and Roughness. Validation was done using the data collected in 2013. The application of HDM-4 to compare different maintenance and rehabilitation options were studied considering the deterioration parameters like cracking, pothole and raveling. The alternatives considered for analysis were base alternative with crack sealing and patching, overlay with 40 mm BC using ordinary bitumen, overlay with 40 mm BC using Natural Rubber Modified Bitumen and an overlay of Ultra Thin White Topping. Economic analysis of these options was done considering the Life Cycle Cost (LCC). The average speed that can be obtained by applying these options were also compared. The results were in favour of Ultra Thin White Topping over flexible pavements. Hence, Design Charts were also plotted for estimation of maximum wheel load stresses for different slab thickness under different soil conditions. The design charts showed the maximum stress for a particular slab thickness and different soil conditions incorporating different k values. These charts can be handy for a design engineer. Fuzzy rule based models developed for site specific conditions were compared with regression models developed using SPSS. The Riding Comfort Index (RCI) was calculated and correlated with unevenness to develop a relationship. Relationships were developed between Skid Number and Macro Texture of the pavement. The effort made through this research work will be helpful to highway engineers in understanding the behaviour of flexible pavements in Kerala conditions and for arriving at suitable maintenance and rehabilitation strategies. Key Words: Flexible Pavements – Performance Evaluation – Urban Roads – NH – SH and other roads – Performance Models – Deflection – Riding Comfort Index – Skid Resistance – Texture Depth – Unevenness – Ultra Thin White Topping

Relevância:

90.00% 90.00%

Publicador:

Resumo:

During the past few years, there has been much discussion of a shift from rule-based systems to principle-based systems for natural language processing. This paper outlines the major computational advantages of principle-based parsing, its differences from the usual rule-based approach, and surveys several existing principle-based parsing systems used for handling languages as diverse as Warlpiri, English, and Spanish, as well as language translation.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Free-word order languages have long posed significant problems for standard parsing algorithms. This thesis presents an implemented parser, based on Government-Binding (GB) theory, for a particular free-word order language, Warlpiri, an aboriginal language of central Australia. The words in a sentence of a free-word order language may swap about relatively freely with little effect on meaning: the permutations of a sentence mean essentially the same thing. It is assumed that this similarity in meaning is directly reflected in the syntax. The parser presented here properly processes free word order because it assigns the same syntactic structure to the permutations of a single sentence. The parser also handles fixed word order, as well as other phenomena. On the view presented here, there is no such thing as a "configurational" or "non-configurational" language. Rather, there is a spectrum of languages that are more or less ordered. The operation of this parsing system is quite different in character from that of more traditional rule-based parsing systems, e.g., context-free parsers. In this system, parsing is carried out via the construction of two different structures, one encoding precedence information and one encoding hierarchical information. This bipartite representation is the key to handling both free- and fixed-order phenomena. This thesis first presents an overview of the portion of Warlpiri that can be parsed. Following this is a description of the linguistic theory on which the parser is based. The chapter after that describes the representations and algorithms of the parser. In conclusion, the parser is compared to related work. The appendix contains a substantial list of test cases ??th grammatical and ungrammatical ??at the parser has actually processed.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Behavior-based navigation of autonomous vehicles requires the recognition of the navigable areas and the potential obstacles. In this paper we describe a model-based objects recognition system which is part of an image interpretation system intended to assist the navigation of autonomous vehicles that operate in industrial environments. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using a rule-based cooperative expert system

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We describe a model-based objects recognition system which is part of an image interpretation system intended to assist autonomous vehicles navigation. The system is intended to operate in man-made environments. Behavior-based navigation of autonomous vehicles involves the recognition of navigable areas and the potential obstacles. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using CEES, the C++ embedded expert system shell developed in the Systems Engineering and Automatic Control Laboratory (University of Girona) as a specific rule-based problem solving tool. It has been especially conceived for supporting cooperative expert systems, and uses the object oriented programming paradigm

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The activated sludge and anaerobic digestion processes have been modelled in widely accepted models. Nevertheless, these models still have limitations when describing operational problems of microbiological origin. The aim of this thesis is to develop a knowledge-based model to simulate risk of plant-wide operational problems of microbiological origin.For the risk model heuristic knowledge from experts and literature was implemented in a rule-based system. Using fuzzy logic, the system can infer a risk index for the main operational problems of microbiological origin (i.e. filamentous bulking, biological foaming, rising sludge and deflocculation). To show the results of the risk model, it was implemented in the Benchmark Simulation Models. This allowed to study the risk model's response in different scenarios and control strategies. The risk model has shown to be really useful providing a third criterion to evaluate control strategies apart from the economical and environmental criteria.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The proposal presented in this thesis is to provide designers of knowledge based supervisory systems of dynamic systems with a framework to facilitate their tasks avoiding interface problems among tools, data flow and management. The approach is thought to be useful to both control and process engineers in assisting their tasks. The use of AI technologies to diagnose and perform control loops and, of course, assist process supervisory tasks such as fault detection and diagnose, are in the scope of this work. Special effort has been put in integration of tools for assisting expert supervisory systems design. With this aim the experience of Computer Aided Control Systems Design (CACSD) frameworks have been analysed and used to design a Computer Aided Supervisory Systems (CASSD) framework. In this sense, some basic facilities are required to be available in this proposed framework: ·

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A new robust neurofuzzy model construction algorithm has been introduced for the modeling of a priori unknown dynamical systems from observed finite data sets in the form of a set of fuzzy rules. Based on a Takagi-Sugeno (T-S) inference mechanism a one to one mapping between a fuzzy rule base and a model matrix feature subspace is established. This link enables rule based knowledge to be extracted from matrix subspace to enhance model transparency. In order to achieve maximized model robustness and sparsity, a new robust extended Gram-Schmidt (G-S) method has been introduced via two effective and complementary approaches of regularization and D-optimality experimental design. Model rule bases are decomposed into orthogonal subspaces, so as to enhance model transparency with the capability of interpreting the derived rule base energy level. A locally regularized orthogonal least squares algorithm, combined with a D-optimality used for subspace based rule selection, has been extended for fuzzy rule regularization and subspace based information extraction. By using a weighting for the D-optimality cost function, the entire model construction procedure becomes automatic. Numerical examples are included to demonstrate the effectiveness of the proposed new algorithm.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Ensemble learning techniques generate multiple classifiers, so called base classifiers, whose combined classification results are used in order to increase the overall classification accuracy. In most ensemble classifiers the base classifiers are based on the Top Down Induction of Decision Trees (TDIDT) approach. However, an alternative approach for the induction of rule based classifiers is the Prism family of algorithms. Prism algorithms produce modular classification rules that do not necessarily fit into a decision tree structure. Prism classification rulesets achieve a comparable and sometimes higher classification accuracy compared with decision tree classifiers, if the data is noisy and large. Yet Prism still suffers from overfitting on noisy and large datasets. In practice ensemble techniques tend to reduce the overfitting, however there exists no ensemble learner for modular classification rule inducers such as the Prism family of algorithms. This article describes the first development of an ensemble learner based on the Prism family of algorithms in order to enhance Prism’s classification accuracy by reducing overfitting.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The purpose of this work is to develop a web based decision support system, based onfuzzy logic, to assess the motor state of Parkinson patients on their performance in onscreenmotor tests in a test battery on a hand computer. A set of well defined rules, basedon an expert’s knowledge, were made to diagnose the current state of the patient. At theend of a period, an overall score is calculated which represents the overall state of thepatient during the period. Acceptability of the rules is based on the absolute differencebetween patient’s own assessment of his condition and the diagnosed state. Anyinconsistency can be tracked by highlighted as an alert in the system. Graphicalpresentation of data aims at enhanced analysis of patient’s state and performancemonitoring by the clinic staff. In general, the system is beneficial for the clinic staff,patients, project managers and researchers.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The aim of this work was to design a set of rules for levodopa infusion dose adjustment in Parkinson’s disease based on a simulation experiments. Using this simulator, optimal infusions dose in different conditions were calculated. There are seven conditions (-3 to +3)appearing in a rating scale for Parkinson’s disease patients. By finding mean of the differences between conditions and optimal dose, two sets of rules were designed. The set of rules was optimized by several testing. Usefulness for optimizing the titration procedure of new infusion patients based on rule-based reasoning was investigated. Results show that both of the number of the steps and the errors for finding optimal dose was shorten by new rules. At last, the dose predicted with new rules well on each single occasion of majority of patients in simulation experiments.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The aim of this paper is to evaluate the performance of two divergent methods for delineating commuting regions, also called labour market areas, in a situation that the base spatial units differ largely in size as a result of an irregular population distribution. Commuting patterns in Sweden have been analyzed with geographical information system technology by delineating commuting regions using two regionalization methods. One, a rule-based method, uses one-way commuting flows to delineate local labour market areas in a top-down procedure based on the selection of predefined employment centres. The other method, the interaction-based Intramax analysis, uses two-way flows in a bottom-up procedure based on numerical taxonomy principles. A comparison of these methods will expose a number of strengths and weaknesses. For both methods, the same data source has been used. The performance of both methods has been evaluated for the country as a whole using resident employed population, self-containment levels and job ratios for criteria. A more detailed evaluation has been done in the Goteborg metropolitan area by comparing regional patterns with the commuting fields of a number of urban centres in this area. It is concluded that both methods could benefit from the inclusion of additional control measures to identify improper allocations of municipalities.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

There are many proposals for managing biodiversity by using surrogates, such as umbrella, indicator, focal, and flagship species. We use the term biodiversity management unit for any ecosystem-based classificatory scheme for managing biodiversity. The sufficiency of biodiversity management unit classification schemes depends upon (1) whether different biotic elements (e.g., trees, birds, reptiles) distinguish between biodiversity management units within a classification (i.e., coherence within classes}; and (2) whether different biotic elements agree upon similarities and dissimilarities among biodiversity management unit classes (i.e., conformance among classes). Recent evaluations suggest that biodiversity surrogates based on few or single taxa are not useful. Ecological vegetation classes are an ecosystem-based classification scheme used as one component for biodiversity management in Victoria, Australia. Here we evaluated the potential for ecological vegetation classes to be used as biodiversity management units in the box-ironbark ecosystem of central Victoria, Australia. Eighty sites distributed among 14 ecological vegetation classes were surveyed in the same ways for tree species, birds, mammals, reptiles, terrestrial invertebrates, and nocturnal flying insects. Habitat structure and geographic separations also were measured, which, with the biotic elements, are collectively referred to as variables. Less than half of the biotic element-ecological vegetation class pairings were coherent. Generalized Mantel tests were used to examine conformance among variables with respect to ecological vegetation classes. While most tests were not significant, birds, mammals, tree species, and habitat structure together showed significant agreement on the rating of similarities among ecological vegetation classes. In this system, use of ecological vegetation classes as biodiversity management units may account reasonably well for birds, mammals, and trees; but reptiles and invertebrates would not be accommodated. We conclude that surrogates will usually have to be augmented or developed as hierarchies to provide general representativeness.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Appropriate choice of a kernel is the most important ingredient of the kernel-based learning methods such as support vector machine (SVM). Automatic kernel selection is a key issue given the number of kernels available, and the current trial-and-error nature of selecting the best kernel for a given problem. This paper introduces a new method for automatic kernel selection, with empirical results based on classification. The empirical study has been conducted among five kernels with 112 different classification problems, using the popular kernel based statistical learning algorithm SVM. We evaluate the kernels’ performance in terms of accuracy measures. We then focus on answering the question: which kernel is best suited to which type of classification problem? Our meta-learning methodology involves measuring the problem characteristics using classical, distance and distribution-based statistical information. We then combine these measures with the empirical results to present a rule-based method to select the most appropriate kernel for a classification problem. The rules are generated by the decision tree algorithm C5.0 and are evaluated with 10 fold cross validation. All generated rules offer high accuracy ratings.