17 results for rule-based algorithms
in Aston University Research Archive
Abstract:
Objective: Recently, much research has investigated the use of nature-inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants, such as cooperation and adaptation, to allow a flexible, robust search for a good candidate solution. Results: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared with other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithm such as K-means or agglomerative hierarchical clustering. For associative classification, while several well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
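For orientation, the sketch below shows the classical Lumer-Faieta style pick-up/drop rule that underlies many ant-based clustering schemes; the function names and parameter values (k1, k2, alpha) are illustrative assumptions and do not reproduce the actual Ant-C algorithm.

import numpy as np

def local_density(item, neighbours, alpha=0.5):
    # Average similarity of `item` to the items in its local grid
    # neighbourhood; alpha scales dissimilarity (an illustrative value).
    if len(neighbours) == 0:
        return 0.0
    similarity = np.mean([1.0 - np.linalg.norm(item - n) / alpha for n in neighbours])
    return max(similarity, 0.0)

def pick_up_probability(f, k1=0.1):
    # An unladen ant tends to pick up an item lying in a dissimilar
    # neighbourhood (low local density f).
    return (k1 / (k1 + f)) ** 2

def drop_probability(f, k2=0.15):
    # A laden ant tends to drop its item in a similar neighbourhood
    # (high local density f).
    return (f / (k2 + f)) ** 2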
Abstract:
The starting point of this research was the belief that manufacturing and similar industries need help with the concept of e-business, especially in assessing the relevance of possible e-business initiatives. The research hypothesis was that it should be possible to produce a systematic model that defines, at a useful level of detail, the probable e-business requirements of an organisation based on objective criteria, with an accuracy of 85%-90%. This thesis describes the development and validation of such a model. A preliminary model was developed from a variety of sources, including a survey of current and planned e-business activity and representative examples of e-business material produced by e-business solution providers. The model was subjected to a process of testing and refinement based on recursive case studies, with controls over the improving accuracy and stability of the model. Useful conclusions were also possible as to the relevance of e-business functions to the case study participants themselves. Techniques were evolved to synthesise the e-business requirements of an organisation and present them at a management-summary level of detail. The results of applying these techniques to all the case studies used in this research are discussed. The conclusion of the research was that the case study methodology employed was successful. A model suitable for practical application in a manufacturing organisation requiring help with a requirements definition process was achieved.
Abstract:
Multi-agent algorithms inspired by the division of labour in social insects and by markets are applied to a constrained problem of distributed task allocation. The efficiency (average number of tasks performed), the flexibility (ability to react to changes in the environment), and the sensitivity to load (ability to cope with differing demands) are investigated in both static and dynamic environments. A hybrid algorithm combining both approaches is shown to exhibit improved efficiency and robustness. We employ nature-inspired particle swarm optimisation to obtain optimised parameters for all algorithms in a range of representative environments. Although results are obtained for large population sizes to avoid finite-size effects, the influence of population size on performance is also analysed. From a theoretical point of view, we analyse the causes of efficiency loss, derive theoretical upper bounds for the efficiency, and compare these with the experimental results.
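As background, the fixed response-threshold rule from the social-insect division-of-labour literature is sketched below; the threshold values and the stimulus used in the example are illustrative assumptions, not parameters optimised in the study.

import random

def engage_probability(stimulus, threshold, n=2):
    # Fixed response-threshold rule: the probability of engaging a task
    # grows with the task stimulus and falls with the agent's threshold.
    return stimulus ** n / (stimulus ** n + threshold ** n)

def allocate(thresholds, stimulus):
    # Each idle agent decides independently whether to take on the task.
    return [random.random() < engage_probability(stimulus, t) for t in thresholds]

# Example: three agents with different degrees of specialisation for one task type.
print(allocate([0.2, 0.5, 0.9], stimulus=0.4))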
Abstract:
Most research on emotion detection in written text has focused on detecting explicit expressions of emotion. In this paper, we present a rule-based pipeline approach, based on the OCC Model, for detecting implicit emotions in written text that contains no emotion-bearing words. We have evaluated our approach on three different datasets with five emotion categories. Our results show that the proposed approach consistently outperforms the lexicon matching method across all three datasets by a large margin of 17–30% in F-measure and gives competitive performance compared with a supervised classifier. In particular, when dealing with formal text that follows grammatical rules strictly, our approach gives an average F-measure of 82.7% on “Happy”, “Angry-Disgust” and “Sad”, even outperforming the supervised baseline by nearly 17% in F-measure. Our preliminary results show the feasibility of the approach for the task of implicit emotion detection in written text.
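For reference, the F-measure quoted above is the standard harmonic mean of precision P and recall R:

F = \frac{2PR}{P + R}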
An improved conflicting evidence combination approach based on a new supporting probability distance
Abstract:
To avoid the counter-intuitive results of the classical Dempster's combination rule when dealing with highly conflicting information, many improved combination methods have been developed that modify the basic probability assignments (BPAs) of bodies of evidence (BOEs) using some measure of the degree of conflict or of uncertain information, such as Jousselme's distance, the pignistic probability distance or the ambiguity measure. However, if BOEs contain some non-singleton elements and the differences among their BPAs are larger than 0.5, current conflict measures have limitations in describing the interrelationship among the conflicting BOEs and may even lead to wrong combination results. To solve this problem, a new distance function, called the supporting probability distance, is proposed to characterize the differences among BOEs. With the new distance, the information of how much a focal element is supported by the other focal elements in the BOEs can be given. A new combination rule based on the supporting probability distance is also proposed for the combination of conflicting evidence. The credibility and the discounting factor of each BOE are generated from the supporting probability distance, and the weighted BOEs are combined directly using Dempster's rule. Analytical results of numerical examples show that the new distance has a better capability of describing the interrelationships among BOEs, especially for highly conflicting BOEs containing non-singleton elements, and that the proposed combination method has better applicability and effectiveness than existing methods.
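For reference, the classical Dempster combination rule mentioned above merges two BPAs m_1 and m_2 over the frame \Theta as follows (standard textbook form, not the modified rule proposed in the paper):

K = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C), \qquad m(A) = \frac{1}{1-K} \sum_{B \cap C = A} m_1(B)\, m_2(C), \quad A \neq \emptyset,

and discounting a BOE with factor \alpha before combination takes the usual form

m^{\alpha}(A) = \alpha\, m(A) \ \text{for}\ A \subset \Theta, \qquad m^{\alpha}(\Theta) = 1 - \alpha + \alpha\, m(\Theta).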
Abstract:
Microposts are small fragments of social media content that have been published using a lightweight paradigm (e.g. Tweets, Facebook likes, foursquare check-ins). Microposts have been used for a variety of applications (e.g. sentiment analysis, opinion mining, trend analysis) by gleaning useful information, often using third-party concept extraction tools. There has been a very large uptake of such tools in the last few years, along with the creation and adoption of new methods for concept extraction. However, the evaluation of such efforts has been largely confined to document corpora (e.g. news articles), calling into question the suitability of concept extraction tools and methods for Micropost data. This report describes the Making Sense of Microposts Workshop (#MSM2013) Concept Extraction Challenge, hosted in conjunction with the 2013 World Wide Web conference (WWW'13). The Challenge dataset comprised a manually annotated training corpus of Microposts and an unlabelled test corpus. Participants were set the task of engineering a concept extraction system for a defined set of concepts. Out of a total of 22 complete submissions, 13 were accepted for presentation at the workshop; the submissions covered methods ranging from sequence mining algorithms for attribute extraction to part-of-speech tagging for Micropost cleaning and rule-based and discriminative models for token classification. In this report we describe the evaluation process and explain the performance of different approaches in different contexts.
Abstract:
Substantial behavioural and neuropsychological evidence has been amassed to support the dual-route model of morphological processing, which distinguishes between a rule-based system for regular items (walk–walked, call–called) and an associative system for irregular items (go–went). Some neural-network models attempt to explain the neuropsychological and brain-mapping dissociations in terms of single-system associative processing. We show that there are problems with the accounts of homogeneous networks in the light of recent brain-mapping evidence of systematic double dissociation. We also examine the superior capabilities of more internally differentiated connectionist models, which, under certain conditions, display systematic double dissociations. It appears that the more differentiation such models show, the more easily they account for dissociation patterns, yet without implementing symbolic computations.
Abstract:
A major application of computers has been to control physical processes, in which the computer is embedded within some large physical process and is required to control concurrent physical processes. The main difficulty with these systems is their event-driven characteristics, which complicate their modelling and analysis. Although a number of researchers in the process systems community have approached the problems of modelling and analysis of such systems, there is still a lack of standardised software development formalisms for system (controller) development, particularly at the early stages of the system design cycle. This research forms part of a larger research programme concerned with the development of real-time process-control systems in which software is used to control concurrent physical processes. The general objective of the research in this thesis is to investigate the use of formal techniques in the analysis of such systems at the early stages of their development, with a particular bias towards application to high-speed machinery. Specifically, the research aims to generate a standardised software development formalism for real-time process-control systems, particularly for software controller synthesis. In this research, a graphical modelling formalism called Sequential Function Chart (SFC), a variant of Grafcet, is examined. SFC, which is defined in the international standard IEC 1131 as a graphical description language, has been used widely in industry and has achieved an acceptable level of maturity and acceptance. A comparative study between SFC and Petri nets is presented in this thesis. To overcome identified inaccuracies in SFC, a formal definition of the firing rules for SFC is given. To provide a framework in which SFC models can be analysed formally, an extended time-related Petri net model for SFC is proposed and the transformation method is defined. The SFC notation lacks a systematic way of synthesising system models from real-world systems. Thus a standardised approach to the development of real-time process-control systems is required, such that the system (software) functional requirements can be identified, captured and analysed. A rule-based approach and a method called the system behaviour driven method (SBDM) are proposed as a development formalism for real-time process-control systems.
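To make the firing-rule discussion concrete, the sketch below gives the ordinary (untimed) Petri net firing semantics; it is a minimal illustration only and does not reproduce the extended time-related model or the SFC evolution rules developed in the thesis.

def enabled(marking, pre):
    # A transition is enabled when every input place holds at least as
    # many tokens as its arc weight requires.
    return all(marking.get(place, 0) >= weight for place, weight in pre.items())

def fire(marking, pre, post):
    # Firing consumes tokens from the input places and produces tokens in
    # the output places; the marking is returned unchanged if the
    # transition is not enabled.
    if not enabled(marking, pre):
        return marking
    new_marking = dict(marking)
    for place, weight in pre.items():
        new_marking[place] -= weight
    for place, weight in post.items():
        new_marking[place] = new_marking.get(place, 0) + weight
    return new_marking

# Example: a transition with one input place p1 and one output place p2.
print(fire({"p1": 1, "p2": 0}, pre={"p1": 1}, post={"p2": 1}))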
Abstract:
In a certain automobile factory, batch-painting of the body types in colours is controlled by an allocation system. This tries to balance production with orders, whilst making optimally-sized batches of colours. Sequences of cars entering painting cannot be optimised for easy selection of colour and batch size. `Over-production' is not allowed, in order to reduce buffer stocks of unsold vehicles. Paint quality is degraded by random effects. This thesis describes a toolkit which supports IKBS in an object-centred formalism. The intended domain of use for the toolkit is flexible manufacturing. A sizeable application program was developed, using the toolkit, to test the validity of the IKBS approach in solving the real manufacturing problem above, for which an existing conventional program was already being used. A detailed statistical analysis of the operating circumstances of the program was made to evaluate the likely need for the more flexible type of program for which the toolkit was intended. The IKBS program captures the many disparate and conflicting constraints in the scheduling knowledge and emulates the behaviour of the program installed in the factory. In the factory system, many possible newly-discovered heuristics would be awkward to represent and it would be impossible to make many new extensions. The representation scheme is capable of admitting changes to the knowledge, relying on the inherent encapsulating properties of object-centred programming to protect and isolate data. The object-centred scheme is supported by an enhancement of the `C' programming language and runs under BSD 4.2 UNIX. The structuring technique, using objects, provides a mechanism for separating control of the expression of rule-based knowledge from the knowledge itself, allowing explicit `contexts' within which appropriate expression of knowledge can be done. Facilities are provided for acquisition of knowledge in a consistent manner.
Abstract:
Diagnosing faults in wastewater treatment, like diagnosis of most problems, requires bi-directional plausible reasoning. This means that both predictive (from causes to symptoms) and diagnostic (from symptoms to causes) inferences have to be made, depending on the evidence available, in reasoning towards the final diagnosis. The use of computer technology for diagnosing faults in the wastewater process has been explored, and a rule-based expert system was initiated. It was found that such an approach has serious limitations in its ability to reason bi-directionally, which makes it unsuitable for diagnostic tasks under conditions of uncertainty. The probabilistic approach known as Bayesian Belief Networks (BBNs) was then critically reviewed and was found to be well suited to diagnosis under uncertainty. The theory and application of BBNs are outlined. A full-scale BBN for the diagnosis of faults in a wastewater treatment plant based on the activated sludge system has been developed in this research. Results from the BBN show good agreement with the predictions of wastewater experts. It can be concluded that BBNs are far superior to rule-based systems based on certainty factors in their ability to diagnose faults and make predictions in complex operating systems with inherently uncertain behaviour.
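The bi-directional reasoning described above rests, at bottom, on Bayes' rule, shown here in its standard form for reference: the same conditional model P(symptom | cause) supports both predictive and diagnostic inference.

P(\text{cause} \mid \text{symptom}) = \frac{P(\text{symptom} \mid \text{cause})\, P(\text{cause})}{P(\text{symptom})}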
Abstract:
This study was concerned with the computer automation of land evaluation. This is a broad subject with many issues to be resolved, so the study concentrated on three key problems: knowledge-based programming; the integration of spatial information from remote sensing and other sources; and the inclusion of socio-economic information in the land evaluation analysis. Land evaluation and land use planning were considered in the context of overseas projects in the developing world. Knowledge-based systems were found to provide significant advantages over conventional programming techniques for some aspects of the land evaluation process. Declarative languages, in particular Prolog, were ideally suited to the integration of social information, which changes with every situation. Rule-based expert system shells were also found to be suitable for this role, including knowledge acquisition at the interview stage. All the expert system shells examined suffered from severe constraints on problem size, but new products now overcome this. Inductive expert system shells were useful as a guide to knowledge gaps and possible relationships, but the number of examples required was unrealistic for typical land use planning situations. The accuracy of classified satellite imagery was significantly enhanced by integrating spatial information on soil distribution for the Thailand data. Estimates of the rice-producing area were substantially improved (a 30% change in area) by the addition of soil information. Image processing work on Mozambique showed that satellite remote sensing was a useful tool in stratifying vegetation cover at provincial level to identify key development areas, but its full utility could not be realised on typical planning projects without treatment as part of a complete spatial information system.
Abstract:
The primary objective of this research was to understand what kinds of knowledge and skills people use in `extracting' relevant information from text and to assess the extent to which expert systems techniques could be applied to automate the process of abstracting. The approach adopted in this thesis is based on research in cognitive science, information science, psycholinguistics and text linguistics. The study addressed the significance of domain knowledge and heuristic rules by developing an information extraction system, called INFORMEX. This system, which was implemented partly in SPITBOL and partly in PROLOG, used a set of heuristic rules to analyse five scientific papers of an expository type, to interpret the content in relation to the key abstract elements, and to extract a set of sentences recognised as relevant for abstracting purposes. The analysis of these extracts revealed that an adequate abstract could be generated. Furthermore, INFORMEX showed that a rule-based system was a suitable computational model to represent experts' knowledge and strategies. This computational technique provided the basis for a new approach to the modelling of cognition. It showed how experts tackle the task of abstracting by integrating formal knowledge as well as experiential learning. This thesis demonstrated that empirical and theoretical knowledge can be effectively combined in expert systems technology to provide a valuable starting approach to automatic abstracting.
Abstract:
Initially this thesis examines the various mechanisms by which technology is acquired within anodizing plants. In so doing, the history of the evolution of anodizing technology is recorded, with particular reference to the growth of major markets and to the contribution of the marketing efforts of the aluminium industry. The business economics of various types of anodizing plants are analyzed. Consideration is also given to the impact of developments in anodizing technology on production economics and market growth. The economic costs associated with work rejected for process defects are considered. Recent changes in the industry have created conditions whereby information technology has a potentially important role to play in retaining existing knowledge. One such contribution is exemplified by the expert system which has been developed for the identification of anodizing process defects. Instead of using a 'rule-based' expert system, a commercial neural networks program has been adapted for the task. The advantage of neural networks over 'rule-based' systems is that they are better suited to production problems, since the actual conditions prevailing when the defect was produced are often not known with certainty. In using the expert system, the user first identifies the process stage at which the defect probably occurred and is then directed to a file enabling the actual defect to be identified. After making this identification, the user can consult a database which gives a more detailed description of the defect, advises on remedial action and provides a bibliography of papers relating to the defect. The database uses a proprietary hypertext program, which also provides rapid cross-referencing to similar types of defect. Additionally, a graphics file can be accessed which (where appropriate) will display a graphic of the defect on screen. A total of 117 defects are included, together with 221 literature references, supplemented by 48 cross-reference hyperlinks. The main text of the thesis contains 179 literature references. (DX186565)
Abstract:
Orthogonal frequency division multiplexing (OFDM) is becoming a fundamental technology in future-generation wireless communications. Call admission control is an effective mechanism to guarantee resilient, efficient, and quality-of-service (QoS) services in wireless mobile networks. In this paper, we present several call admission control algorithms for OFDM-based wireless multiservice networks. Call connection requests are differentiated into narrow-band calls and wide-band calls. For either class of calls, the traffic process is characterized as a batch arrival process, since each call may request multiple subcarriers to satisfy its QoS requirement. The batch size is a random variable following a probability mass function (PMF) with a realistic maximum value. In addition, the service times for wide-band and narrow-band calls are different. We then perform a tele-traffic queueing analysis for OFDM-based wireless multiservice networks. Formulae for the key performance metrics, call blocking probability and bandwidth utilization, are developed. Numerical investigations are presented to demonstrate the interaction between key parameters and performance metrics. The performance tradeoff among the different call admission control algorithms is discussed. Moreover, the analytical model has been validated by simulation. The methodology, as well as the results, provides an efficient tool for planning next-generation OFDM-based broadband wireless access systems.
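As a simple baseline for the blocking-probability analysis, the classical Erlang-B formula for a single-service system with m channels and offered load E can be evaluated with the standard recursion sketched below; this is only a reference point and does not reproduce the paper's batch-arrival, multiservice model.

def erlang_b(offered_load, channels):
    # Standard Erlang-B recursion: B(E, 0) = 1 and
    # B(E, m) = E * B(E, m-1) / (m + E * B(E, m-1)).
    blocking = 1.0
    for m in range(1, channels + 1):
        blocking = offered_load * blocking / (m + offered_load * blocking)
    return blocking

# Example: blocking probability for 20 Erlangs offered to 25 channels.
print(erlang_b(20.0, 25))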