73 resultados para Applied artificial intelligence
Resumo:
Generally classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers like any ensemble learner from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
Resumo:
The Twitter network has been labelled the most commonly used microblogging application around today. With about 500 million estimated registered users as of June, 2012, Twitter has become a credible medium of sentiment/opinion expression. It is also a notable medium for information dissemination; including breaking news on diverse issues since it was launched in 2007. Many organisations, individuals and even government bodies follow activities on the network in order to obtain knowledge on how their audience reacts to tweets that affect them. We can use postings on Twitter (known as tweets) to analyse patterns associated with events by detecting the dynamics of the tweets. A common way of labelling a tweet is by including a number of hashtags that describe its contents. Association Rule Mining can find the likelihood of co-occurrence of hashtags. In this paper, we propose the use of temporal Association Rule Mining to detect rule dynamics, and consequently dynamics of tweets. We coined our methodology Transaction-based Rule Change Mining (TRCM). A number of patterns are identifiable in these rule dynamics including, new rules, emerging rules, unexpected rules and ?dead' rules. Also the linkage between the different types of rule dynamics is investigated experimentally in this paper.
Resumo:
Planning is one of the key problems for autonomous vehicles operating in road scenarios. Present planning algorithms operate with the assumption that traffic is organised in predefined speed lanes, which makes it impossible to allow autonomous vehicles in countries with unorganised traffic. Unorganised traffic is though capable of higher traffic bandwidths when constituting vehicles vary in their speed capabilities and sizes. Diverse vehicles in an unorganised exhibit unique driving behaviours which are analysed in this paper by a simulation study. The aim of the work reported here is to create a planning algorithm for mixed traffic consisting of both autonomous and non-autonomous vehicles without any inter-vehicle communication. The awareness (e.g. vision) of every vehicle is restricted to nearby vehicles only and a straight infinite road is assumed for decision making regarding navigation in the presence of multiple vehicles. Exhibited behaviours include obstacle avoidance, overtaking, giving way for vehicles to overtake from behind, vehicle following, adjusting the lateral lane position and so on. A conflict of plans is a major issue which will almost certainly arise in the absence of inter-vehicle communication. Hence each vehicle needs to continuously track other vehicles and rectify plans whenever a collision seems likely. Further it is observed here that driver aggression plays a vital role in overall traffic dynamics, hence this has also been factored in accordingly. This work is hence a step forward towards achieving autonomous vehicles in unorganised traffic, while similar effort would be required for planning problems such as intersections, mergers, diversions and other modules like localisation.
Resumo:
Ensemble learning can be used to increase the overall classification accuracy of a classifier by generating multiple base classifiers and combining their classification results. A frequently used family of base classifiers for ensemble learning are decision trees. However, alternative approaches can potentially be used, such as the Prism family of algorithms that also induces classification rules. Compared with decision trees, Prism algorithms generate modular classification rules that cannot necessarily be represented in the form of a decision tree. Prism algorithms produce a similar classification accuracy compared with decision trees. However, in some cases, for example, if there is noise in the training and test data, Prism algorithms can outperform decision trees by achieving a higher classification accuracy. However, Prism still tends to overfit on noisy data; hence, ensemble learners have been adopted in this work to reduce the overfitting. This paper describes the development of an ensemble learner using a member of the Prism family as the base classifier to reduce the overfitting of Prism algorithms on noisy datasets. The developed ensemble classifier is compared with a stand-alone Prism classifier in terms of classification accuracy and resistance to noise.
Resumo:
One of the most challenging tasks in financial management for large governmental and industrial organizations is Planning and Budgeting (P&B). The processes involved with P&B are cost and time intensive, especially when dealing with uncertainties and budget adjustments during the planning horizon. This work builds on our previous research in which we proposed and evaluated a fuzzy approach that allows optimizing the budget interactively beyond the initial planning stage. In this research we propose an extension that handles financial stress (i.e. drastic budget cuts) occurred during the budget period. This is done by introducing fuzzy stress parameters which are used to re-distribute the budget in order to minimize the negative impact of the financial stress. The benefits and possible issues of this approach are analyzed critically using a real world case study from the Nuremberg Institute of Technology (NIT). Additionally, ongoing and future research directions are presented.
Resumo:
This paper considers the use of Association Rule Mining (ARM) and our proposed Transaction based Rule Change Mining (TRCM) to identify the rule types present in tweet’s hashtags over a specific consecutive period of time and their linkage to real life occurrences. Our novel algorithm was termed TRCM-RTI in reference to Rule Type Identification. We created Time Frame Windows (TFWs) to detect evolvement statuses and calculate the lifespan of hashtags in online tweets. We link RTI to real life events by monitoring and recording rule evolvement patterns in TFWs on the Twitter network.
Resumo:
This paper considers variations of a neuron pool selection method known as Affordable Neural Network (AfNN). A saliency measure, based on the second derivative of the objective function is proposed to assess the ability of a trained AfNN to provide neuronal redundancy. The discrepancies between the various affordability variants are explained by correlating unique sub group selections with relevant saliency variations. Overall this study shows that the method in which neurons are selected from a pool is more relevant to how salient individual neurons are, than how often a particular neuron is used during training. The findings herein are relevant to not only providing an analogy to brain function but, also, in optimizing the way a neural network using the affordability method is trained.
Resumo:
Whilst common sense knowledge has been well researched in terms of intelligence and (in particular) artificial intelligence, specific, factual knowledge also plays a critical part in practice. When it comes to testing for intelligence, testing for factual knowledge is, in every-day life, frequently used as a front line tool. This paper presents new results which were the outcome of a series of practical Turing tests held on 23rd June 2012 at Bletchley Park, England. The focus of this paper is on the employment of specific knowledge testing by interrogators. Of interest are prejudiced assumptions made by interrogators as to what they believe should be widely known and subsequently the conclusions drawn if an entity does or does not appear to know a particular fact known to the interrogator. The paper is not at all about the performance of machines or hidden humans but rather the strategies based on assumptions of Turing test interrogators. Full, unedited transcripts from the tests are shown for the reader as working examples. As a result, it might be possible to draw critical conclusions with regard to the nature of human concepts of intelligence, in terms of the role played by specific, factual knowledge in our understanding of intelligence, whether this is exhibited by a human or a machine. This is specifically intended as a position paper, firstly by claiming that practicalising Turing's test is a useful exercise throwing light on how we humans think, and secondly, by taking a potentially controversial stance, because some interrogators adopt a solipsist questioning style of hidden entities with a view that it is a thinking intelligent human if it thinks like them and knows what they know. The paper is aimed at opening discussion with regard to the different aspects considered.
Resumo:
Paraconsistent logics are non-classical logics which allow non-trivial and consistent reasoning about inconsistent axioms. They have been pro- posed as a formal basis for handling inconsistent data, as commonly arise in human enterprises, and as methods for fuzzy reasoning, with applica- tions in Artificial Intelligence and the control of complex systems. Formalisations of paraconsistent logics usually require heroic mathe- matical efforts to provide a consistent axiomatisation of an inconsistent system. Here we use transreal arithmetic, which is known to be consis- tent, to arithmetise a paraconsistent logic. This is theoretically simple and should lead to efficient computer implementations. We introduce the metalogical principle of monotonicity which is a very simple way of making logics paraconsistent. Our logic has dialetheaic truth values which are both False and True. It allows contradictory propositions, allows variable contradictions, but blocks literal contradictions. Thus literal reasoning, in this logic, forms an on-the- y, syntactic partition of the propositions into internally consistent sets. We show how the set of all paraconsistent, possible worlds can be represented in a transreal space. During the development of our logic we discuss how other paraconsistent logics could be arithmetised in transreal arithmetic.
Resumo:
Advances in hardware and software technologies allow to capture streaming data. The area of Data Stream Mining (DSM) is concerned with the analysis of these vast amounts of data as it is generated in real-time. Data stream classification is one of the most important DSM techniques allowing to classify previously unseen data instances. Different to traditional classifiers for static data, data stream classifiers need to adapt to concept changes (concept drift) in the stream in real-time in order to reflect the most recent concept in the data as accurately as possible. A recent addition to the data stream classifier toolbox is eRules which induces and updates a set of expressive rules that can easily be interpreted by humans. However, like most rule-based data stream classifiers, eRules exhibits a poor computational performance when confronted with continuous attributes. In this work, we propose an approach to deal with continuous data effectively and accurately in rule-based classifiers by using the Gaussian distribution as heuristic for building rule terms on continuous attributes. We show on the example of eRules that incorporating our method for continuous attributes indeed speeds up the real-time rule induction process while maintaining a similar level of accuracy compared with the original eRules classifier. We termed this new version of eRules with our approach G-eRules.
Resumo:
Advances in hardware technologies allow to capture and process data in real-time and the resulting high throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real-time. The creation and real-time adaption of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even so these algorithms are fast, they are challenged by high velocity data streams, where data instances are incoming at a fast rate. This is problematic if the applications desire that there is no or only a very little delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.